PBS Job Batch Submission
The PBSBatch class is a tool to launch many jobs simultaneously.
The basic steps are:
- Instantiating a PBS object that will be used to submit the jobs.
- Creating a list of BatchJob objects that hold the name of the job and a list of the commands to run.
- Setting up the job directory with the appropriate input files.
- Giving the PBS object and list of BatchJob objects to the PBSBatch constructor and then calling one of the launch methods.
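The steps above can be sketched as follows. This is a minimal sketch under stated assumptions, not a definitive recipe: make_job_specs and launch_batch are hypothetical helpers, the ./run_case command is a placeholder, and the pbs argument is assumed to be an already-configured PBS object. Only the BatchJob and PBSBatch calls come from the API documented on this page, and they are imported lazily so the helper that builds names and commands can be tried without pbs4py installed.

```python
from typing import List, Tuple


def make_job_specs(mach_numbers: List[float]) -> List[Tuple[str, List[str]]]:
    """Build (name, body) pairs; the command is a placeholder, not a real solver."""
    specs = []
    for mach in mach_numbers:
        name = f"mach_{mach:.2f}"
        body = [f"./run_case --mach {mach} > run.log"]
        specs.append((name, body))
    return specs


def launch_batch(pbs, specs, max_jobs_at_a_time: int = 20):
    """Submit the batch; assumes pbs4py is installed and `pbs` is a configured PBS object."""
    from pbs4py.pbs_batch import BatchJob, PBSBatch  # imported only when launching

    jobs = [BatchJob(name, body) for name, body in specs]
    batch = PBSBatch(pbs, jobs)
    batch.create_directories()
    # Copy the input files into each job directory here, before launching.
    batch.launch_jobs_with_limit(max_jobs_at_a_time=max_jobs_at_a_time)
    return jobs
```

Keeping the job specifications as plain data makes it easy to inspect the names and commands before anything is sent to the queue.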
Setting up the Job Directories
By default, jobs are launched in directories with the same name as the job. This prevents concurrent jobs in the batch from overwriting each other’s output files.
To set up a job, these directories can be created and populated with code like this:
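import os

batch = PBSBatch(pbs, jobs)
batch.create_directories()  # one directory per job, named after the job
common_inputs_to_copy = ['fun3d.nml', '*.cfg']
for job in jobs:
    for input_file in common_inputs_to_copy:
        # cp runs through the shell, which expands wildcards such as *.cfg
        os.system(f'cp {input_file} {job.name}')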
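The same copy step can be done without shelling out to cp. The sketch below is one possible alternative, assuming the job directories are named after the jobs as described above; copy_common_inputs is a hypothetical helper, not part of pbs4py, and it relies only on the standard-library glob and shutil modules.

```python
import glob
import os
import shutil
from typing import Iterable, List


def copy_common_inputs(patterns: Iterable[str], job_names: Iterable[str]) -> List[str]:
    """Copy every file matching `patterns` into each job directory.

    Returns the list of destination paths that were written.
    """
    # Expand wildcards once, relative to the current working directory
    sources = []
    for pattern in patterns:
        sources.extend(sorted(glob.glob(pattern)))

    copied = []
    for name in job_names:
        os.makedirs(name, exist_ok=True)  # harmless if the directory already exists
        for source in sources:
            copied.append(shutil.copy(source, name))
    return copied
```

With a list of BatchJob objects, this could be called as copy_common_inputs(['fun3d.nml', '*.cfg'], [job.name for job in jobs]).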
Launch Methods
The batch jobs can be submitted with two different methods of the PBSBatch class.
launch_jobs_with_limit() will launch every job in the list, but it will only allow a certain number of jobs to be active in the queue system (queued, running, or held) at a time. This is the preferred launch method if you have many jobs and, as a courtesy to your fellow HPC users, don’t want to submit hundreds of jobs into the queue at once.
launch_all_jobs() will launch every job in the list. Its optional wait_for_jobs_to_finish argument controls whether the call blocks until the jobs finish or returns immediately after all of the jobs are submitted to the queue.
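To make the idea of a limited launch concrete, here is a minimal sketch of the throttling logic. It illustrates the technique only and is not pbs4py's implementation: submit_with_limit, submit, get_state, and sleep are hypothetical names, and the sketch assumes that only the states 'Q', 'R', and 'H' count as active.

```python
import time
from typing import Callable, List, Sequence

ACTIVE_STATES = {"Q", "R", "H"}  # queued, running, held (assumed to be the active set)


def submit_with_limit(
    job_names: Sequence[str],
    submit: Callable[[str], str],
    get_state: Callable[[str], str],
    max_jobs_at_a_time: int = 20,
    check_frequency_in_secs: float = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Submit all jobs while keeping at most `max_jobs_at_a_time` active.

    Blocks until every job has been submitted and none is active any more.
    """
    if max_jobs_at_a_time < 1:
        raise ValueError("max_jobs_at_a_time must be at least 1")

    job_ids: List[str] = []
    pending = list(job_names)
    while True:
        # Count the submitted jobs that still occupy a slot in the queue
        active = sum(1 for job_id in job_ids if get_state(job_id) in ACTIVE_STATES)
        # Fill any free slots with jobs that have not been submitted yet
        while pending and active < max_jobs_at_a_time:
            job_ids.append(submit(pending.pop(0)))
            active += 1
        if not pending and active == 0:
            return job_ids
        sleep(check_frequency_in_secs)
```

Passing submit, get_state, and sleep as arguments keeps the loop independent of any particular scheduler, so it can be exercised with a fake queue before being pointed at a real one.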
Batch Job Class
- class pbs4py.pbs_batch.BatchJob(name, body)
Class for individual PBS jobs within a batch of jobs. Can be used as a context manager to enter/exit a directory with the job’s name.
- name
Name of the job.
- Type: str
- body
List of commands to run in the PBS job.
- Type: List[str]
- id
PBS job identifier returned by qsub.
- Type: str
- get_pbs_job_state()
Get the job’s status after it has been submitted. Returns the job_state entry of the qstat information, e.g., ‘Q’, ‘R’, ‘F’, ‘H’, etc.
- Return type: str
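The context-manager behavior described for BatchJob can be pictured with a small stand-in class. This is a sketch of the general pattern, not the library's code: DirectoryJob is a hypothetical class that only mimics the documented name and body attributes, and it assumes the directory named after the job already exists.

```python
import os


class DirectoryJob:
    """Hypothetical stand-in for a job that can enter and exit its own directory."""

    def __init__(self, name, body):
        self.name = name
        self.body = body
        self._previous_dir = None

    def __enter__(self):
        self._previous_dir = os.getcwd()
        os.chdir(self.name)  # the directory must already exist
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        os.chdir(self._previous_dir)  # restore the directory even if the block raised
        return False  # do not swallow exceptions
```

Inside a with block, relative paths resolve within the job's directory, and the original working directory is restored when the block exits.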
PBSBatch Class
- class pbs4py.pbs_batch.PBSBatch(pbs, jobs, use_separate_directories=True)
Batch of PBS jobs. Assumes all jobs require the same job request size. By default, separate directories with the job’s name will be used to separate output files.
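- Parameters
pbs (PBS) – PBS object that will be used to submit the jobs.
jobs (List[BatchJob]) – The jobs to launch.
use_separate_directories (bool) – If True, each job is launched in a directory with the same name as the job.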
- create_directories()
Create the set of directories with the jobs’ names.
- launch_all_jobs(wait_for_jobs_to_finish=False, check_frequency_in_secs=30)
Launch all of the jobs in the list. Stores the PBS job id in the job objects.
- Parameters
wait_for_jobs_to_finish (bool) – If True, the jobs will be submitted, and this function will not return until all of the jobs are finished.
check_frequency_in_secs (float) – Time interval to wait between checks of whether all jobs are done. Only relevant if wait_for_jobs_to_finish is True.
- launch_jobs_with_limit(max_jobs_at_a_time=20, check_frequency_in_secs=30)
The “courteous” version of launch_all_jobs(wait_for_jobs_to_finish=True): a limit is set on the maximum number of jobs running or in the queue at a time, since some people may not like it if you submit 1000 jobs at once.
- Parameters
max_jobs_at_a_time (int) – Limit on the number of jobs queued, running, or held at a time.
check_frequency_in_secs (float) – Time interval to wait between checks of the jobs’ statuses.
- wait_for_all_jobs_to_finish(check_frequency_in_secs=30)
A blocking check for all the jobs in the batch to finish. Can be paired with launch_all_jobs.
- Parameters
check_frequency_in_secs (float) – How often to check and print the jobs’ states.
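When launch_all_jobs returns immediately, it can be useful to check progress without blocking. The sketch below is one way to do that using only the documented get_pbs_job_state() method. summarize_states and all_finished are hypothetical helpers, and treating 'F' as the finished state is an assumption based on the example states listed above.

```python
from collections import Counter


def summarize_states(jobs):
    """Count how many jobs are in each state reported by get_pbs_job_state()."""
    return Counter(job.get_pbs_job_state() for job in jobs)


def all_finished(jobs, finished_state="F"):
    """Return True when every job reports the (assumed) finished state."""
    return all(job.get_pbs_job_state() == finished_state for job in jobs)
```

Both helpers accept any objects that provide get_pbs_job_state(), so they can be tried with simple fakes before being used on a submitted batch.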