5 Distributed computing
5.1 BOSS user guide
5.1.1 Preparation
Apply for a grid certificate and join the BES VO before using distributed resources to run BOSS jobs.
5.1.2 Environment set-up
Copy the Ganga configuration file to your directory:
$ cp /afs/.ihep.ac.cn/bes3/offline/ExternalLib/gangadist/.gangarc ~/.gangarc
Then modify the parameter gangadir (line 66) so that it points to a directory you have defined.
Next, set up the software environment, including both BOSS and the Grid. The Grid environment is set up as follows:
source /cvmfs/dcomputing.ihep.ac.cn/frontend/gangadist/env_grid.sh
5.1.3 Job management
This section uses Bhabha production as an example to introduce job submission and monitoring.
5.1.3.1 Job submission
1. Use the environment scripts above to initialize your environment, and enter your certificate password when prompted.
2. Edit a Ganga job option file in Python. Here is an example (sim_dirac_by_event.py):
# ganga job file for BOSS simulation only
bossVersion = '6.6.4.p03'
optionsFile = 'jobOptions_sim_bhabha.txt'
jobGroup = '150420_bhabha_01'
metadata = {'resonance': '4360',
'eventType': 'bhabha',
}
splitter = UserSplitterByEvent(evtMaxPerJob = 1000, evtTotal = 1000*200)
app = Boss(version=bossVersion, optsfile=optionsFile, metadata=metadata)
j = Job(application=app, splitter=splitter)
j.backend = Dirac()
j.backend.settings['JobGroup'] = jobGroup
j.backend.settings['Destination'] = ['BES.UMN.us']
#j.inputsandbox.append("mypdt.table")
j.submit()
print '\nThe DFC path for output data is:'
print app.get_output_dir()
print '\nThe dataset name for output data is:'
print app.get_dataset_name()
Note: the parameter "evtMaxPerJob" should not exceed 20,000, to avoid a high failure rate. "jobGroup" can be used to filter jobs on the monitoring page. Extra input files besides the job options and decay cards need to be added to the inputsandbox.
Copy the BOSS job files, including job options and decay cards, to the same directory as the Ganga script, and submit with the following command:
$ ganga sim_dirac_by_event.py
After the jobs complete, the output will be stored in the DFC as a dataset, for example:
User_tyan_4360_664p01_bhabha_round06_31001_31100_stream001_rtraw
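The dataset name encodes the submission metadata. As an illustration only (the field layout below is inferred from the example above and is an assumption, not an official naming API), a small helper can split such a name into its parts:

```python
# Illustrative sketch: decode the underscore-separated fields of a user
# dataset name of the form shown above. The field meanings are inferred
# from the example, not taken from an official specification.
def parse_dataset_name(name):
    parts = name.split('_')
    return {
        'prefix': parts[0],        # "User" for user-produced datasets
        'username': parts[1],
        'resonance': parts[2],
        'boss_version': parts[3],  # e.g. "664p01" for BOSS 6.6.4.p01
        'event_type': parts[4],
        'round': parts[5],
        'run_from': parts[6],
        'run_to': parts[7],
        'stream': parts[8],
        'file_type': parts[9],     # rtraw / dst / rec / root
    }

info = parse_dataset_name(
    'User_tyan_4360_664p01_bhabha_round06_31001_31100_stream001_rtraw')
print(info['event_type'], info['file_type'])
```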
5.1.3.2 Job monitoring
The job status can be checked on the "Job monitor" page of the DIRAC web portal.
5.1.3.3 Get output
The output data files can be found under /scratchfs in your own directory. /scratchfs can be accessed through lxlogin.ihep.ac.cn at IHEP.
5.1.4 Job types supported
The BESIII distributed computing system currently supports three kinds of BOSS jobs:
(1) simulation (output: rtraw files)
(2) simulation + reconstruction (output: rtraw/dst/rec files)
(3) simulation + reconstruction + analysis (output: rtraw/dst/rec/root files)
5.1.4.1 Simulation
The example ganga script is as follows:
bossVersion = '6.6.4.p02'
optionsFile = 'jobOptions_sim_rhopi.txt'
jobGroup = 'sim_rhopi_141102'
metadata = {'resonance': 'jpsi', 'eventType': 'rhopi'}
splitter = UserSplitterByRun(evtMaxPerJob = 100, evtTotal = 100*10)
app = Boss(version=bossVersion, optsfile=optionsFile, metadata=metadata)
j = Job(application=app, splitter=splitter)
j.backend = Dirac()
j.backend.settings['JobGroup'] = jobGroup
j.submit()
5.1.4.2 Simulation+reconstruction
The example ganga script is as follows:
bossVersion = '6.6.4.p02'
optionsFile = 'jobOptions_sim_rhopi.txt'
recOptionsFile = 'jobOptions_rec_rhopi.txt'
jobGroup = 'simrec_rhopi_141102'
metadata = {'resonance': 'jpsi', 'eventType': 'rhopi'}
splitter = UserSplitterByRun(evtMaxPerJob = 100, evtTotal = 100*10)
app = Boss(version=bossVersion, optsfile=optionsFile,
recoptsfile=recOptionsFile, metadata=metadata)
j = Job(application=app, splitter=splitter)
j.backend = Dirac()
j.backend.settings['JobGroup'] = jobGroup
j.submit()
5.1.4.3 Simulation+Reconstruction+Analysis
The example ganga script is as follows:
bossVersion = '6.6.4.p02'
optionsFile = 'jobOptions_sim_rhopi.txt'
recOptionsFile = 'jobOptions_rec_rhopi.txt'
anaOptionsFile = 'jobOptions_ana_rhopi.txt'
jobGroup = 'simrecana_rhopi_141102'
metadata = {'resonance': 'jpsi', 'eventType': 'rhopi'}
splitter = UserSplitterByRun(evtMaxPerJob = 100, evtTotal = 100*10)
app = Boss(version=bossVersion, optsfile=optionsFile, recoptsfile=recOptionsFile,
anaoptsfile=anaOptionsFile, use_custom_package=True, metadata=metadata)
j = Job(application=app, splitter=splitter)
j.backend = Dirac()
j.backend.settings['JobGroup'] = jobGroup
j.submit()
If users need to upload their own code to the Grid, they need to set "use_custom_package=True".
5.1.5 Job Splitting
Jobs can be split in two ways:
(1) Split by run: jobs are split according to the luminosity of each run; each job contains exactly one run, and its number of events is proportional to the luminosity of that run.
(2) Split by event: every job covers the whole run range specified in the job option file, and each job contains the same number of events.
5.1.5.1 Split by Run
Use the splitter to define the total number of events and the maximum number of events per job:
splitter = UserSplitterByRun(evtMaxPerJob = 500, evtTotal = 500*100)
In the job option file, set the run range with the smaller run (in absolute value) first:
RealizationSvc.RunIdList = {-11414, 0, -13988, -14395, 0, -14604};
or
RealizationSvc.RunIdList = {11414, 0, 13988, 14395, 0, 14604};
Note: the minus sign is optional. Make sure the run range does not span more than one round.
If users want to know the number of events in each job, define an output file name in the splitter:
splitter = UserSplitterByRun(evtMaxPerJob = 500, evtTotal = 500*100, outputEvtNum = 'run_event.log')
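The split-by-run idea can be sketched in plain Python. This is an illustration only, not the actual splitter code: the run numbers and luminosities below are made up, whereas the real splitter obtains them from the offline database.

```python
# Illustrative sketch of split-by-run: distribute evt_total events over runs
# in proportion to each run's luminosity, capping each job at evt_max_per_job.
# Run numbers and luminosities are invented for this example.
def split_by_run(run_lumi, evt_total, evt_max_per_job):
    total_lumi = sum(run_lumi.values())
    jobs = []
    for run, lumi in run_lumi.items():
        # events for this run, proportional to its luminosity
        n_evt = int(round(evt_total * lumi / total_lumi))
        # one run per job; a large run is split into several jobs
        while n_evt > 0:
            n = min(n_evt, evt_max_per_job)
            jobs.append((run, n))
            n_evt -= n
    return jobs

jobs = split_by_run({11414: 2.0, 11415: 1.0, 11416: 1.0},
                    evt_total=400, evt_max_per_job=150)
```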
5.1.5.2 Split by Event
Use the splitter to define the total number of events and the maximum number of events per job:
splitter = UserSplitterByEvent(evtMaxPerJob = 500, evtTotal = 500*100)
Note: only use this method in special cases, since it puts heavy pressure on the local file system when many jobs run in parallel.
5.1.5.3 Random seed
A random seed is assigned by default, but users can also define their own, for example:
splitter = UserSplitterByRun(evtMaxPerJob = 100, evtTotal = 100*10, seed = 10001)
The seeds will then be 10001, 10002, 10003, ...
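The seed behaviour described above can be sketched as follows (an illustration of the incrementing pattern only, not the actual Ganga splitter code):

```python
# Illustrative sketch: the first subjob takes the user-defined seed and each
# following subjob increments it by one, mirroring "10001, 10002, 10003, ...".
def subjob_seeds(seed, n_subjobs):
    return [seed + i for i in range(n_subjobs)]

print(subjob_seeds(10001, 3))  # [10001, 10002, 10003]
```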
5.1.6 Others
5.1.6.1 Select output files
By default, only the final output of the jobs is kept. If users want to keep the intermediate results as well, they can set output_step:
app = Boss(version=bossVersion, optsfile=optionsFile, recoptsfile=recOptionsFile, anaoptsfile=anaOptionsFile, metadata=metadata, use_custom_package=True, output_step=['sim','rec','ana'])
5.1.6.2 User-defined generators
If users need to run simulation with their own generator, such as the BesEvtGen DIY mode, they need to set "use_custom_package=True", which allows the packages under their work directory to be uploaded:
app = Boss(version=bossVersion, optsfile=optionsFile, use_custom_package=True, metadata=metadata)
5.1.6.3 Other input files
If users want to upload extra individual files of their own, they can use auto_upload.append:
bossVersion = '6.6.4.p02'
optionsFile = 'jobOptions_sim_rhopi.txt'
recOptionsFile = 'jobOptions_rec_rhopi.txt'
anaOptionsFile = 'jobOptions_ana_rhopi.txt'
jobGroup = 'simrecana_rhopi_141102'
metadata = {'resonance': 'jpsi', 'eventType': 'rhopi'}
splitter = UserSplitterByRun(evtMaxPerJob = 100, evtTotal = 100*10)
app = Boss(version=bossVersion, optsfile=optionsFile, recoptsfile=recOptionsFile,
anaoptsfile=anaOptionsFile, use_custom_package=True, metadata=metadata)
app.auto_upload.append('/your_extra_file_path_1/extra_input_1.txt')
app.auto_upload.append('/your_extra_file_path_2/extra_input_2.txt')
j = Job(application=app, splitter=splitter)
j.backend = Dirac()
j.backend.settings['JobGroup'] = jobGroup
j.submit()
Note: make sure that the paths for these files in the code are relative to the current directory, not absolute paths. A good approach is to set these file paths in the job option files.
5.2 CEPC User Guide
5.2.1 Preparation
Apply for a grid certificate and join the VO before using distributed resources to run CEPC jobs.
5.2.2 Environment set-up
source /cvmfs/dcomputing.ihep.ac.cn/frontend/dsub/setup/env_dsub.sh
5.2.3 Prepare job configuration files
Prepare job.cfg. Here is an example; you can also refer to the examples under the directory /cvmfs/dcomputing.ihep.ac.cn/frontend/dsub:
job_type = cepc_sr
repo_dir = ./repo
work_dir = ./work
input_filelist = ./stdhep.list
output_dir = test_001
evtmax = 10
evtstart = 0
batch = 1
job_group = CEPC_test_001
5.2.4 Job submission
Submit the jobs with the following command:
dsub job.cfg
5.2.5 Job status monitoring
Users can go to the DIRAC web portal (http://dirac.ihep.ac.cn) and choose the "Job monitor" page. The JobGroup parameter can be used to query your jobs.
Here are the exit codes for common problems:
10 error in preparation
11 CVMFS is not ready
20 error in simulation
21 error in database connection
22 too many substeps
23 no input events
30 error in reconstruction
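For scripting around job monitoring, the table above can be turned into a small lookup. This is a convenience sketch, not part of the DIRAC tools:

```python
# Map the exit codes listed above to their meanings so that a monitoring
# script can print a readable reason for a failed job.
EXIT_CODES = {
    10: 'error in preparation',
    11: 'CVMFS is not ready',
    20: 'error in simulation',
    21: 'error in database connection',
    22: 'too many substeps',
    23: 'no input events',
    30: 'error in reconstruction',
}

def explain(code):
    return EXIT_CODES.get(code, 'unknown exit code %d' % code)

print(explain(21))  # error in database connection
```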
5.2.6 Get outputs
When the jobs complete, you can find your outputs under /gridfs:
/gridfs/cepc/user/<initial>/<username>/<output_dir>/sim
/gridfs/cepc/user/<initial>/<username>/<output_dir>/rec