3.2.1 HTCondor Guide

3.2.1.1 Introduction

HTCondor is a popular job management system in the HEP field. It is flexible and powerful for high-throughput computing on very large clusters. Most local computing resources at IHEP have been moved to HTCondor, and the system is configured and optimized for the IHEP cluster environment. We have also implemented a toolkit, HepJob, which helps users manage their jobs and contains customizations for the IHEP cluster. Users should use HepJob instead of the native HTCondor commands unless there are specific requirements.

3.2.1.2 HepJob Toolkit

1) Preparations

HepJob is installed in the following directory. It is recommended to add the directory to your PATH environment variable:

  • for bash

    $ export PATH=/afs/ihep.ac.cn/soft/common/sysgroup/hep_job/bin:$PATH
    
  • for tcsh

    $ setenv PATH /afs/ihep.ac.cn/soft/common/sysgroup/hep_job/bin:$PATH
    

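To verify the setup, check that hep_sub now resolves to the HepJob directory:

  $ which hep_sub
  /afs/ihep.ac.cn/soft/common/sysgroup/hep_job/bin/hep_sub
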
The application file of the job must be executable. We can check and change the file permission as follows:

  • Show the job permission

    $ /bin/ls -l job.sh
    -rw-r--r-- 1 jiangxw u07 85 Aug 29 18:23 job.sh
    

    The file job.sh is not executable. It can be made executable with the command chmod

    $ /bin/chmod +x job.sh
    

    Then, the additional ‘x’ characters in the first column indicate the executable permission

    $ /bin/ls -l job.sh
    -rwxr-xr-x 1 jiangxw u07 85 Aug 29 18:23 job.sh
    

2) Job Submission

Command:

hep_sub [-h] [-g {physics,juno,dybrun,dyw,u07,offlinerun,pku,longq}]
             [-p {virtual,local,ali}] [-u {vanilla,grid,docker}] [-o OUT]
             [-e ERROR] [-n NUMBER] [-os OPERATINGSYSTEM]
             [-t {atlasbm,hxmtbm,wljMC}] [-prio PRIORITY]
             [-np NUMBERPROCESS] [-argu ARGUMENTS [ARGUMENTS ...]]
             [-dir DIRECTORY] [-mem MEMORY] [-quiet] [-part PARTITION]
             [-name NAME] [-slurm] [-site SITENAME] [-jf JOBFILE]
             [-tf TRANSFERFILE] [-wn WORKNODE] [-wt WALLTIME]
             jobscript

Options:

  • jobscript: the job application name; both absolute and relative paths are supported. For example

    $ hep_sub job.sh
    
  • -g: to indicate the job group. The user's primary group is used by default if it is not set. For example, if you want to use the computing resources of juno

    $ hep_sub -g juno job.sh
    
  • -p: to indicate the resource pool. Currently there are two types of resource pools: the local physical resource pool and the virtual machine resource pool. The local physical resource pool is used by default if it is not set. For example, if we want to use the virtual machine resource pool

    $ hep_sub -p virtual job.sh
    
  • -u: to indicate the job universe. The vanilla universe is used by default if it is not set, and it is not necessary to set this option on the IHEP local cluster in most cases. However, grid jobs can be submitted with -u grid. For example

    $ hep_sub -u grid job.sh
    
  • -o: to write the standard output of the job to a file. When it is not set, the standard output is written to a file named after the job with a ".out" suffix.

  • -e: to write the standard error of the job to a file. When it is not set, the standard error is written to a file named after the job with a ".err" suffix.
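
    For example, to write the output and error to chosen files (the file names here are illustrative):

    $ hep_sub -o job.out -e job.err job.sh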

  • -l: to write the job log to a file. The job log file is not generated by default. It is meaningless in most cases and can be ignored if you are uncertain.

  • -os: to indicate the operating system version for the job. It is SL6 by default if it is not set. For example, we can set the job to run on an SL7 node as follows

    $ hep_sub -os SL7 job.sh
    
  • -np: to indicate the number of CPU cores for an MPI job.
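
    For example, to request 20 CPU cores (used together with the cepcmpi template; see the CEPC MPI section below for a full example):

    $ hep_sub -np 20 -t cepcmpi /cefs/CEPCMPI/username/mpijob.sh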

  • -argu: to indicate the job application arguments. For example, we can pass an argument "1" to "job.sh" as follows

    $ hep_sub -argu 1 job.sh
    
  • -mem: to indicate the memory size (in MB) used by the job. For example, we can request more than 3 GB of memory for a job

    $ hep_sub -mem 3000 job.sh
    

    If -mem is used without a value, the job will be scheduled to the node with the largest memory.

  • -site: to submit the job to a remote site.
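
    For example (the site name is a placeholder; consult the administrators for the available sites):

    $ hep_sub -site <sitename> job.sh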

  • Options for simplification: in addition to the parameters above, there are templates for different job types that simplify users' commands (there is a cepcmpi template at present; more templates will be supplied). The parameter -t indicates a template. For example, we can submit a CEPC MPI job in this way

    $ hep_sub -t cepcmpi job.sh
    
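These options can be combined in one command. For example, a sketch that submits a juno job to an SL7 node requesting 4000 MB of memory (the values here are illustrative):

$ hep_sub -g juno -os SL7 -mem 4000 job.sh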

Attention

There might be side effects on the -t template if other parameters are used together with -t.

3) Job Querying

Command:

hep_q [-h] [-u [USER]] [-i ID] [-run] [-p {virtual,local,ali}]
      [-t {atlasbm,hxmtbm,wljMC}] [-st STARTTIME]
      [-stat {run,idle,other} [{run,idle,other} ...]] [-slurm]

Options:

  • -u: to query the jobs of the specified user. The current user is queried by default. For example

    $ hep_q -u <username>
    

    The current user's jobs are queried if we use “hep_q -u” without a username.

  • -i: to query a job by JobID or clusterid. There is no default value. A JobID consists of a clusterid and a processid, in the form clusterid.processid (the JobID 3745232.1 contains the clusterid 3745232 and the processid 1). Take the JobID 3745232.1 as an example

    $ hep_q -i 3745232.1
    

    The processid can be omitted to query all the jobs belonging to the same clusterid

    $ hep_q -i 3745232
    
  • -run: to query only the running jobs. All jobs are queried when it is not used. The output looks like

    JOBID   OWNER  SUBMITTED    RUN_TIME    ST  PRI  SIZE  CMD
    5519.0  user1  04/11 15:11  0+00:00:00  I   0    0.0   myjob.sh
    3656.0  user2  04/11 17:12  0+00:00:00  I   0    0.1   job.sh
    

    in which JOBID indicates the ID of the job, OWNER the user, SUBMITTED the job submission time, RUN_TIME the job running time, ST the job status, PRI the job priority, SIZE the virtual memory occupied by the job, and CMD the job application.

  • -t: to indicate the job template. For example, it is necessary for cepcmpi users

    $ hep_q -i 3745232 -t cepcmpi
    
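The synopsis also lists a -stat filter. Assuming it accepts one or more of run, idle, and other, as the usage line suggests, the current user's running and idle jobs might be queried like this:

$ hep_q -stat run idle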

4) Job Removing

Command:

hep_rm [-h] [-a] [-t {atlasbm,hxmtbm,wljMC}] [-p {virtual,local,ali}]
       [-name NAME] [-slurm]
       [jobs [jobs ...]]

Options:

  • jobs: to indicate the JobIDs to remove. One or more JobIDs are supported in each invocation

    $ hep_rm 3745232 3745233.0
    

    All jobs with clusterid 3745232 and the job with JobID 3745233.0 will be removed at the same time.

  • -a: to remove all the jobs belonging to the current user. For example

    $ hep_rm -a
    
  • -t: to indicate a job template. For example, it is necessary for cepcmpi users

    $ hep_rm 3745232 -t cepcmpi
    
  • -forcex: to force the removal of a job stuck in state "X". Please note that this parameter only takes effect on "X" jobs. State "X" generally indicates a problem between the job server and the worker node while the job is in a deleting status. To remove job 3745232.0, which is stuck in state "X", run the following command:

    $ hep_rm 3745232.0 -forcex
    

5) More Information

We can find a brief introduction to each HepJob command with a "-h" or "--help" parameter.
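
For example:

$ hep_q --help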

3.2.1.3 Job and Resource Information

1) Limitation of Job Running Time and Maximum Number of Running Jobs

On the login node (lxlogin.ihep.ac.cn), the walltime limits can be queried by group. For example, to see the walltime limits of group juno, execute the command:

$ hep_clus -g juno --walltime
Experiment   short job walltime   normal job walltime   mid job walltime limit (hour)
             limit (hour)         limit (hour)          & the rate of available resources
BES          <0.5                 <40                   <100:10%
JUNO         <0.5                 <20                   <100:10%
DYW          <0.5                 <10                   <100:10%
CEPC         <0.5                 <10                   <100:10%
ATLAS        <0.5                 <10                   <100:10%
CMS          <0.5                 <10                   <100:10%
HXMT         <0.5                 <14                   <100:10%
GECAM        <0.5                 <24                   <100:10%
LHCb         <0.5                 <100                  -
LHAASO       <0.5                 <15                   <100:10%

Note: an experiment without a value for the mid job walltime has no resources for mid jobs.

3.2.1.4 Tips of Using HepJob

1) Macro Usage in Submitting Jobs

Macros can be used to pass variables to the scheduler system when you want to get attribute information about your jobs, such as ClusterId, ProcId, AcctGroup, and so on. A macro is written as:

  %{macro_name}

The typical usage scenario is submitting a batch of jobs, as described below.

2) Submit A Batch of Jobs

By setting the argument ‘-n’, a number of jobs can be submitted at one time; these are called batch jobs. Run the command as:

  $ hep_sub job.sh.%{ProcId} -n 5000

%{ClusterId} is the job id; %{ProcId} is a natural number, increasing from 0 in steps of 1. It can be used to distinguish the different jobs.

Meanwhile, %{ProcId} can be passed into the job script with the parameter -argu, so that specific files can be accessed by mapping the procid to a file name. E.g.

  $ hep_sub job.sh -argu %{ProcId} -n 5000
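
A minimal sketch of job.sh for this case (the input file names are illustrative assumptions):

#!/bin/bash
# receive the ProcId passed in via -argu
procid=$1
# each job then processes its own input file, e.g. input_0.dat ... input_4999.dat
echo "processing input_${procid}.dat"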

Note: please make sure your job script exists and is executable, and do not submit too large a number of jobs to the computing cluster at once.

  • Example 1: prepare all the job scripts before submission, and ensure all scripts have the executable permission. E.g.:
real_job_20191204_0.sh
real_job_20191204_1.sh
real_job_20191204_2.sh
real_job_20191204_3.sh
real_job_20191204_4.sh
real_job_20191204_5.sh
real_job_20191204_6.sh
real_job_20191204_7.sh
real_job_20191204_8.sh
real_job_20191204_9.sh
real_job_20191204_10.sh

The names of these job scripts follow a regular format, real_job_20191204_*.sh, where the key part is an increasing sequence starting with 0 (0, 1, 2, ...). To submit this kind of jobs, run:

$ hep_sub real_job_20191204_%{ProcId}.sh -n 11
  • Example 2: if the names of the job scripts do not contain a sequence starting with 0, but still follow a specific regular format, just like:
real_job_20191201.sh
real_job_20191202.sh
real_job_20191203.sh
real_job_20191204.sh
real_job_20191205.sh
real_job_20191206.sh
real_job_20191207.sh
...
real_job_20191230.sh
real_job_20191231.sh

Since the script names contain a date-format substring, we can prepare an extra script to map %{ProcId} to the date string. By passing %{ProcId} to the extra script, we can do any kind of mapping as expected. For example 2, write a script real_job_parent.sh:

#!/bin/bash

# get procid from command line
procid=$1

# map 0,1,2,...,30 to 1,2,3,...,31
sub_name_number=`expr $procid + 1`

# format 1,2,3,...,31 to 01,02,03,...,31
sub_name=`printf "%02d\n" $sub_name_number` 

# run the real job script by the formatted file name
bash real_job_201912"${sub_name}".sh

To submit jobs, run:

$ hep_sub real_job_parent.sh -argu %{ProcId} -n 31
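
Before submitting, the mapping can be checked locally with a quick loop (a sketch that only prints the script name each ProcId would run):

#!/bin/bash
# print the script each ProcId from 0 to 30 would execute
for procid in $(seq 0 30); do
    printf "procid %2d -> real_job_201912%02d.sh\n" "$procid" "$(expr $procid + 1)"
done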

3) Get Basic Job Info in Job Script during Job Running

  • Get the JobID in the job script: the JobID is available in the environment variable "_CONDOR_IHEP_JOB_ID" while the job is running. For example, we can get it in a bash script
#!/bin/bash
JobId=$_CONDOR_IHEP_JOB_ID
  • Get the executing worker node by accessing environment variable "_CONDOR_IHEP_REMOTE_HOST". Here is an example:
#!/bin/bash
ExecWorkNode=$_CONDOR_IHEP_REMOTE_HOST
  • Get the job submission time by accessing environment variable "_CONDOR_IHEP_SUBMISSION_TIME". Here is an example:
#!/bin/bash
SubmissionTime=$_CONDOR_IHEP_SUBMISSION_TIME
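
A minimal sketch that records all three values at the start of a job script:

#!/bin/bash
# log the basic job info provided by the scheduler environment
echo "JobID:           $_CONDOR_IHEP_JOB_ID"
echo "Worker node:     $_CONDOR_IHEP_REMOTE_HOST"
echo "Submission time: $_CONDOR_IHEP_SUBMISSION_TIME"
# ... the real workload follows here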

3.2.1.5 Specifications to Different Job Types

1) BESIII Users

For the standard boss jobs, we can use the simplified command boss.condor:

$ boss.condor joboptions.txt

If you need to specify the job memory size, you can use the parameter -mem:

$ boss.condor -mem 5000 joboptions.txt

If you need to use multiple CPU cores, you can use the parameter -cpu:

$ boss.condor -cpu 2 joboptions.txt

If you want to adjust the maximum running time of the job, you can use the parameter -wt:

$ boss.condor -wt mid joboptions.txt

Note: We provide the following job types: test, short, mid, long, special
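
Assuming these parameters can be combined (the guide shows them individually), a single submission might look like:

$ boss.condor -mem 5000 -cpu 2 -wt mid joboptions.txt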

If you want to submit a batch of jobs with continuous job numbers, you can use the parameter -n:

# Suppose there are user jobs: job0, job1, job2. You can use "-n" and "%{ProcId}" to submit them as a batch

$ boss.condor -n 3 job%"{ProcId}"

# Here, "{ProcId}" replaces the number starting from 0 in the job number. Before that, you need to add %. In this way, you can submit job0, job1, and job2 jobs at one time.

Note: The job number must start from 0.

boss.condor now supports a job flow service (packaging the SIM, REC, and ANA steps into a single job). You can use the following command:

$ boss.condor sim_joboption.txt rec_joboption.txt ana_joboption.txt

One submission can complete three jobs.

Similarly, the flow service supports batch submission.

# Suppose there are user jobs: sim0, rec0, ana0, sim1, rec1, ana1

$ boss.condor -n 2 sim%"{ProcId}" rec%"{ProcId}" ana%"{ProcId}"

# This syntax submits two user jobs: the first consists of sim0, rec0, ana0, and the second of sim1, rec1, ana1

For other BESIII jobs, please set your group as physics:

$ hep_sub -g physics job.sh

Job querying and removing are the same as previous descriptions.

2) CEPC MPI Jobs

To submit a MPI job:

$ hep_sub /cefs/CEPCMPI/username/mpijob.sh -g cepcmpi -t cepcmpi -np 20

where -t indicates the cepcmpi template and -np the number of CPU cores; an absolute application path should be provided in this case.
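
A minimal sketch of what mpijob.sh might contain (the mpirun invocation and program path are illustrative assumptions, not part of the guide):

#!/bin/bash
# launch the MPI program across the 20 cores requested with -np
mpirun -np 20 /cefs/CEPCMPI/username/my_mpi_program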

Job querying and removing are the same as previous descriptions.

3) COMET Jobs

Job submission, querying, and removing are the same as for BESIII.

4) Others

No specifications. Please see the previous section.
