Simulation tasks submitted through the CLI are expected to run in "batch" mode. Batch jobs are controlled by so-called Batch Scripts (also referred to as Job Scripts), which are written by the user and then submitted to the resource management system. These scripts specify, at a minimum, how many nodes and cores the job will use, how long the job will run, the name of the application to be run, and other important compute parameters.
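As a minimal sketch, a Torque/PBS Batch Script covering these parameters might look like the following. The node and core counts, walltime, job name, and executable `my_application` are illustrative placeholders, not values prescribed by this platform:

```shell
#!/bin/bash
# Request 2 nodes with 8 cores each (illustrative values)
#PBS -l nodes=2:ppn=8
# Request a maximum walltime of 1 hour
#PBS -l walltime=01:00:00
# Give the job a descriptive name
#PBS -N my_simulation

# Change to the directory from which the job was submitted
cd $PBS_O_WORKDIR

# Launch the (hypothetical) application on all 16 allocated cores
mpirun -np 16 ./my_application
```

The `#PBS` lines are directives read by the resource manager at submission time; everything else is an ordinary shell script executed on the first allocated node.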
Interactive parallel jobs are not supported on our platform by design. Instead, users are encouraged to prototype calculations on the login node (using 2-8 CPU cores with less than 1 minute of walltime per user) and to submit larger test runs to the Debug queue, which is designed specifically for testing purposes.
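A test run could then be submitted with Torque's `qsub` command, for example as sketched below. The queue name `debug` and the script name `my_script.pbs` are assumptions for illustration; consult the site's queue listing for the actual names:

```shell
# Submit a batch script to the Debug queue (queue name assumed here)
qsub -q debug my_script.pbs

# Check the status of your queued and running jobs
qstat -u $USER
```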
Our batch system is based on the PBS model [^1], implemented with the Maui scheduler [^2] and Torque resource manager [^3]. The actual execution of the parallel job, however, is handled by a special command, called a job launcher, which is provided by a parallel (MPI) library. In a generic Linux environment, this utility is often labelled "mpirun".
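Inside a Batch Script, the job launcher is typically given the number of MPI processes to start and the executable to run. A generic invocation might look like this; the process count and program name are illustrative, and the exact flags can vary between MPI libraries:

```shell
# Start 16 MPI processes of the (hypothetical) executable ./my_app;
# -np sets the total number of processes across all allocated nodes
mpirun -np 16 ./my_app
```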
The general structure of Batch Scripts is the subject of this discussion.
The main "working" directory, which is used for defining Batch Scripts, storing simulation files, and submitting jobs via the CLI, is described in this page.
We provide examples of how to specify the relevant job information in Batch Scripts here.