PBS Job Launcher

The PBS class is a tool to define the properties of the PBS setup you want to use, write PBS scripts, and launch jobs. It has several classmethods that serve as alternate constructors and fill in the properties of some NASA HPC systems and queues. Examples of instantiating with these methods are shown below. For systems or queues not covered by these classmethods, the basic queue attributes are set in the standard constructor, and less common ones can be adjusted by changing the attributes of the object.

PBS Class

class pbs4py.pbs.PBS(queue_name='K4-route', ncpus_per_node=40, ngpus_per_node=0, queue_node_limit=10, time=72, mem=None, profile_filename='~/.bashrc', requested_number_of_nodes=1)
A class for creating and running pbs jobs. Default queue properties are for K4.
Defaults not set during instantiation can be adjusted by directly modifying attributes.
Parameters
  • queue_name (str) – Queue name which goes on the “#PBS -q {queue_name}” line of the pbs header

  • ncpus_per_node (int) – Number of CPU cores per node

  • ngpus_per_node (int) – Number of GPUs per node

  • queue_node_limit (int) – Maximum number of nodes allowed in this queue

  • time (int) – The requested job walltime in hours

  • mem (str) – The requested memory size. String to allow specifying in G, MB, etc.

  • profile_filename (str) – The file setting the environment to source inside the PBS job. Set to ‘’ if you do not wish to source a file.

  • requested_number_of_nodes (int) – The number of compute nodes to request

queue_name: str

The name of the queue, which goes on the #PBS -q {queue_name} line of the pbs header.

model: Optional[str]

The processor model if it needs to be specified. The associated PBS header line is #PBS -l select=#:ncpus=#:mpiprocs=#:model={model}. If left as None, the :model={model} will not be added to the header line.

group_list: Optional[str]

The group for the group_list entry of the pbs header if necessary. The associated PBS header line is #PBS -W group_list={group_list}

mem: Optional[str]

Requested memory size on the select line. Need to include units in the str. The associated PBS header line is #PBS -l select=#:mem={mem}
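The optional model and mem entries on the select line can be illustrated with a small standalone sketch. The helper name build_select_line is hypothetical and is not part of pbs4py, and the relative order of the :model and :mem entries is an assumption; the sketch only shows how optional attributes left as None are dropped from the header line.

```python
from typing import Optional


def build_select_line(nodes: int, ncpus: int, mpiprocs: int,
                      model: Optional[str] = None,
                      mem: Optional[str] = None) -> str:
    """Assemble a '#PBS -l select=...' line (illustrative helper, not pbs4py code)."""
    line = f"#PBS -l select={nodes}:ncpus={ncpus}:mpiprocs={mpiprocs}"
    if model is not None:
        # Only added when a processor model has been specified
        line += f":model={model}"
    if mem is not None:
        # The string must carry its own units, e.g. '100G'
        # (placing mem after model is an assumption of this sketch)
        line += f":mem={mem}"
    return line


print(build_select_line(2, 40, 40))
print(build_select_line(1, 28, 28, model="bro", mem="100G"))
```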

array_range: Optional[str]

Index range for a PBS array of jobs. The associated PBS header line is #PBS -J {array_range}

mail_options: str

pbs -m mail options. ‘e’ at exit, ‘b’ at beginning, ‘a’ at abort

mail_list: Optional[str]

pbs -M mail list. Who to email when mail_options are triggered

dependency_type: str

Type of dependency if dependency active. Default is ‘afterok’ which only launches the new job if the previous one was successful.
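As a rough illustration of how a dependency type and a set of job ids combine, the sketch below builds the standard PBS "-W depend" directive, in which job ids are separated by colons. The helper name build_dependency_line and the example job ids are hypothetical; this is not necessarily how pbs4py writes the line internally.

```python
from typing import List


def build_dependency_line(job_ids: List[str], dependency_type: str = "afterok") -> str:
    """Build a '#PBS -W depend=...' line (illustrative helper, not pbs4py code)."""
    # PBS expects the job ids as one colon-separated string
    dependency = ":".join(job_ids)
    return f"#PBS -W depend={dependency_type}:{dependency}"


# Hypothetical job ids, as they might be returned by a previous submission
print(build_dependency_line(["1234.pbsserver", "1235.pbsserver"]))
```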

mpiexec: str

The mpi execution command name: mpiexec, mpirun, mpiexec_mpt, etc.

ranks_per_node_flag: str

Command line option for mpiexec to specify the number of MPI ranks per host/node. Default is to set it based on the mpiexec version.

property requested_number_of_nodes

The number of nodes to request. That is, the ‘select’ number in the #PBS -l select={requested_number_of_nodes}:ncpus=40:mpiprocs=40.

Type

int

classmethod k4(time=72, profile_filename='~/.bashrc', requested_number_of_nodes=1)

Constructor for the K4 queues on LaRC’s K cluster including K4-standard-512.

Parameters
  • time (int) – The requested job walltime in hours

  • profile_filename (str) – The file setting the environment to source inside the PBS job

  • requested_number_of_nodes (int) – The number of compute nodes to request

classmethod k3c(time=72, profile_filename='~/.bashrc', requested_number_of_nodes=1)

Constructor for the K3c queues on LaRC’s K cluster.

Parameters
  • time (int) – The requested job walltime in hours

  • profile_filename (str) – The file setting the environment to source inside the PBS job

  • requested_number_of_nodes (int) – The number of compute nodes to request

classmethod k3b(time=72, profile_filename='~/.bashrc', requested_number_of_nodes=1)

Constructor for the K3b queues on LaRC’s K cluster.

Parameters
  • time (int) – The requested job walltime in hours

  • profile_filename (str) – The file setting the environment to source inside the PBS job

  • requested_number_of_nodes (int) – The number of compute nodes to request

classmethod k3a(time=72, profile_filename='~/.bashrc', requested_number_of_nodes=1)

Constructor for the K3a queue on LaRC’s K cluster.

Parameters
  • time (int) – The requested job walltime in hours

  • profile_filename (str) – The file setting the environment to source inside the PBS job

  • requested_number_of_nodes (int) – The number of compute nodes to request

classmethod nas(group_list, proc_type='broadwell', queue_name='long', time=72, mem=None, profile_filename='~/.bashrc', requested_number_of_nodes=1)

Constructor for the queues at NAS. The group_list must be specified.

Parameters
  • group_list (str) – The charge number or group for the group_list entry of the pbs header. The associated PBS header line is “#PBS -W group_list={group_list}”.

  • proc_type (str) – The type of processor to submit to. Either the full name or just its first 3 letters may be given: ‘cas’, ‘sky’, ‘bro’, ‘has’, ‘ivy’, ‘san’.

  • queue_name (str) – Which queue to submit to: devel, debug, normal, long, etc.

  • time (int) – The requested job walltime in hours

  • profile_filename (str) – The file setting the environment to source inside the PBS job

create_mpi_command(command, output_root_name=None, openmp_threads=None, ranks_per_node=None)

Wrap a command with mpiexec and route its standard and error output to a file

Parameters
  • command (str) – The command that needs to run in parallel

  • output_root_name (str) – The root name of the output file, {output_root_name}.out.

  • openmp_threads (int) – The number of openmp threads per mpi process.

  • ranks_per_node (int) – The number of MPI ranks per compute node.

Returns

full_command – The full command string.

Return type

str
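The idea of wrapping a command and routing its output can be sketched without pbs4py. The function below is a hypothetical stand-in, not the library's implementation: it assumes a bash-like shell for the redirection syntax, assumes that no redirection is added when output_root_name is None, and leaves out the rank and OpenMP thread options entirely.

```python
from typing import Optional


def wrap_with_mpiexec(command: str, output_root_name: Optional[str] = None,
                      mpiexec: str = "mpiexec", tee_output: bool = False) -> str:
    """Wrap a command with an MPI launcher (illustrative sketch, not pbs4py code)."""
    parts = [mpiexec, command]
    if output_root_name is not None:
        output_file = f"{output_root_name}.out"
        if tee_output:
            # Merge stderr into stdout, then copy the stream to the file and the terminal
            parts.append(f"2>&1 | tee {output_file}")
        else:
            # Send both stdout and stderr to the file (bash-style redirection)
            parts.append(f"> {output_file} 2>&1")
    return " ".join(parts)


print(wrap_with_mpiexec("./solver --input case.nml", output_root_name="solver"))
print(wrap_with_mpiexec("./solver", output_root_name="solver", tee_output=True))
```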

launch(job_name, job_body, blocking=True, dependency=None)

Create a job script and launch the job

Parameters
  • job_name (str) – The name of the job.

  • job_body (List[str]) – List of commands to run in the body of the job.

  • blocking (bool) – If true, this function will wait for the job to complete before returning. If false, this function will launch the job but not wait for it to finish.

  • dependency (str) – Jobs that this one depends on. For PBS, these are colon separated in the string

Returns

command_output – The stdout of the launch command. If the job is successfully launched, this will be the job id.

Return type

str

property mpiprocs_per_node

The number of requested mpiprocs per node. If not set, the launcher will default to the number of cpus per node. The associated PBS header line is #PBS -l select=1:ncpus=40:mpiprocs={mpiprocs_per_node}.

Type

int

property profile_filename

The file to source at the start of the pbs script to set the environment. Typical names include ‘~/.profile’, ‘~/.bashrc’, and ‘~/.cshrc’. If you do not wish to source a file, set to ‘’.

Type

str

write_job_file(job_filename, job_name, job_body, dependency=None)

Create a launch script file in the current directory for the commands defined in job_body.

Parameters
  • job_filename (str) – name of file to write to

  • job_name (str) – The name of the job.

  • job_body (List[str]) – List of commands to run in the body of the job.

  • dependency (str) – Jobs that this one depends on. For PBS, these are colon separated in the string
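To give a feel for what a generated launch script might contain, the following standalone sketch assembles a hashbang, a few standard PBS directives, an optional profile to source, and the job body, then writes the result to a file. The function sketch_job_script is hypothetical; the set and order of header lines that pbs4py actually writes is not documented here and may differ.

```python
import os
import tempfile
from typing import List


def sketch_job_script(job_name: str, job_body: List[str], queue_name: str = "K4-route",
                      nodes: int = 1, ncpus: int = 40, hours: int = 72,
                      profile_filename: str = "~/.bashrc", shell: str = "bash") -> str:
    """Return the text of a simple PBS job script (illustrative, not pbs4py output)."""
    lines = [
        f"#!/usr/bin/env {shell}",
        f"#PBS -N {job_name}",      # standard PBS: -N sets the job name
        f"#PBS -q {queue_name}",    # standard PBS: -q selects the queue
        f"#PBS -l select={nodes}:ncpus={ncpus}:mpiprocs={ncpus}",
        f"#PBS -l walltime={hours}:00:00",  # walltime given in whole hours
    ]
    if profile_filename:
        # An empty string means no environment file is sourced
        lines.append(f"source {profile_filename}")
    lines.extend(job_body)
    return "\n".join(lines) + "\n"


script = sketch_job_script("demo", ["echo start", "mpiexec ./solver > solver.out 2>&1"], hours=2)

# Write the script to a temporary directory and read it back
with tempfile.TemporaryDirectory() as workdir:
    job_filename = os.path.join(workdir, "demo.pbs")
    with open(job_filename, "w") as f:
        f.write(script)
    with open(job_filename) as f:
        written = f.read()

print(written)
```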

hashbang: str

The hashbang line which sets the shell for the PBS script. If unset, the default is #!/usr/bin/env {self.shell}.

shell

The shell flavor to use in the PBS job

queue_node_limit: int

The maximum number of nodes allowed by the queue

time: int

The requested wall time for the pbs job(s) in hours

ncpus_per_node: int

The number of CPU cores per node.

ngpus_per_node: int

The number of GPUs per node.

tee_output: bool

If true, the output of mpi commands is sent through tee rather than only redirected to the file.

PBS’s classmethod constructors

from pbs4py import PBS

k4 = PBS.k4(time=48)
k3b = PBS.k3b()
k3a = PBS.k3a()
nas = PBS.nas(group_list='n1337', proc_type='skylake', time=72)

FakePBS Class

Some scripts may be set up with the PBS job handler originally, but you may want to run the script within an existing PBS job without launching new PBS jobs. The FakePBS object appears to driving scripts as a standard PBS object, but runs the commands directly instead of putting them into a PBS job and launching the job.

class pbs4py.fake_pbs.FakePBS(profile_filename='', stop_at_first_failure=False)

A fake PBS class for directly running commands while still being called as if it were a standard PBS driver. This can be used to switch seamlessly between a mode where a PBS job is launched for each “job” and a mode where a FakePBS object runs the commands directly, e.g., driving a script while already within a PBS job.

launch(job_name, job_body, blocking=True, dependency=None)

Runs the commands in the job_body and determines if any failed based on status flags

Parameters
  • job_name (str) – [ignored]

  • job_body (List[str]) – List of commands to run

  • blocking (bool) – [ignored]

  • dependency (str) – [ignored]

Returns

pbs_command_output – Empty string but returning something to match true PBS launch output

Return type

str
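The behavior described for FakePBS.launch, running each command directly and checking status flags, can be approximated with the standard library. The sketch below is an assumption-laden illustration rather than the FakePBS implementation: run_directly is a hypothetical name, and the meaning given to stop_at_first_failure (stop running further commands after a nonzero return code) is inferred from the parameter name only.

```python
import subprocess
from typing import List


def run_directly(job_body: List[str], stop_at_first_failure: bool = False) -> List[int]:
    """Run each command in a shell and collect return codes (illustrative sketch)."""
    return_codes = []
    for command in job_body:
        result = subprocess.run(command, shell=True)
        return_codes.append(result.returncode)
        if result.returncode != 0 and stop_at_first_failure:
            # Assumed behavior: skip the remaining commands after a failure
            break
    return return_codes


# 'exit N' makes the shell finish with return code N
codes_all = run_directly(["exit 0", "exit 3", "exit 0"])
codes_stop = run_directly(["exit 0", "exit 3", "exit 0"], stop_at_first_failure=True)
print(codes_all, codes_stop)
```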