alces-job is a command-line tool that helps users quickly generate high-quality, ready-to-submit Slurm job scripts using templates, parameters, profiles, and site-specific defaults.
Ensure gem is installed by running:
$ gem --version
Go to the releases page and download the gem file, or download it directly with:
$ wget https://github.com/alces-software/alces-job/releases/download/v0.5.0/alces-job-0.5.0.gem
Install it with:
$ gem install alces-job-0.5.0.gem
Verify the installation with:
$ alces-job version
To build and install from source instead, first ensure gem is installed by running:
$ gem --version
Clone the repository:
$ git clone https://github.com/alces-software/alces-job
$ cd alces-job
Build the gem with:
$ gem build alces-job.gemspec
Install it with:
$ gem install alces-job-0.5.0.gem
Verify the installation with:
$ alces-job version
The basic utility of the command can be run with:
$ alces-job base [OPTIONS]
This will generate an output file, by default called job.sbatch, with any SBATCH options provided.
The default command supports the following flags:
--job-name NAME - Sets the Slurm job name (#SBATCH --job-name=NAME).
--nodes N - Sets the number of cluster nodes to request.
--ntasks N - Sets the total number of tasks for the job.
--cpus-per-task N - Sets the number of CPU cores to allocate per task.
--mem SIZE - Sets the amount of memory requested (e.g. 4G, 2000M).
--time DURATION - Sets the walltime limit for the job (e.g. 02:00:00).
--partition PARTITION - Sets the Slurm partition/queue to submit to.
--account ACCOUNT - Sets the Slurm account to charge.
--gres GRES - Sets generic resources like GPUs (gpu:1, gpu:2).
--output PATH - Sets the job output file path (#SBATCH --output=PATH).
--error PATH - Sets the job error file path (#SBATCH --error=PATH).
--mail-user ADDRESS - Sets the email address for Slurm notifications.
--mail-type TYPE - Sets the Slurm mail notification type (BEGIN, END, FAIL, etc.).
--module NAME - Loads one or more environment modules before running the job.
--workdir PATH - Changes to the specified working directory inside the job script.
--command COMMAND - The shell command to run inside the generated job script.
--array ARRAY_SPEC - Sets the Slurm array specification (#SBATCH --array=...).
--dependency DEPENDENCY - Sets the Slurm job dependency string (#SBATCH --dependency=...).
--output-file PATH - Writes the generated script to a specific filename instead of job.sbatch.
--submit - If present, submits the generated script to Slurm with sbatch after generation.
--dry-run - If present, does not save the file, and instead outputs what would be saved to the console.
Use these flags together to customize the generated Slurm script and optionally submit it automatically.
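As a rough illustration of how the flags above correspond to #SBATCH directives, the following Ruby sketch turns an options hash into directive lines. It is not taken from the alces-job source: the method name sbatch_lines and the DIRECTIVE_FLAGS constant are hypothetical, and the sketch assumes each of these flags maps one-to-one onto the sbatch option of the same name.

```ruby
# Hypothetical sketch: map CLI-style options to #SBATCH directive lines.
# Names here are illustrative and not part of alces-job.
DIRECTIVE_FLAGS = %i[
  job_name nodes ntasks cpus_per_task mem time partition account
  gres output error mail_user mail_type array dependency
].freeze

def sbatch_lines(options)
  DIRECTIVE_FLAGS.filter_map do |key|
    value = options[key]
    next if value.nil? # flags that were not supplied produce no directive

    # job_name -> --job-name, cpus_per_task -> --cpus-per-task
    "#SBATCH --#{key.to_s.tr('_', '-')}=#{value}"
  end
end

puts sbatch_lines(job_name: 'test-job', nodes: 1, mem: '4G', time: '01:00:00')
```

Flags such as --module, --workdir, and --command are left out of the sketch because, per the descriptions above, they affect the body of the script rather than its #SBATCH header.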
The tool has an interactive wizard that can be accessed by using the -i or --interactive flag on the base command:
$ alces-job --interactive
The generated Slurm script is produced from an ERB template in the templates/ directory.
By default the base command renders templates/default.erb, and the CLI passes the command options into the template as @context values. For example, --job-name, --nodes, --command, and other flags are available inside the template as <%= @context.job_name %>, <%= @context.nodes %>, and <%= @context.command %>.
Specialized commands such as mpi, gpu, and array select a different template by name before rendering so they can generate job scripts with the correct SBATCH boilerplate for that workload.
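To make the @context mechanism concrete, here is a minimal Ruby sketch of rendering an ERB template against an OpenStruct built from CLI-style options. The SketchRenderer class and the inline template are illustrative assumptions; the real templates/default.erb and generator code may look different.

```ruby
require 'erb'
require 'ostruct'

# Illustrative stand-in for a template; the real templates/default.erb may differ.
TEMPLATE = <<~'TPL'
  #!/bin/bash
  #SBATCH --job-name=<%= @context.job_name %>
  #SBATCH --nodes=<%= @context.nodes %>

  <%= @context.command %>
TPL

# Hypothetical renderer that holds the options in @context.
class SketchRenderer
  def initialize(options)
    @context = OpenStruct.new(options)
  end

  def render(template)
    # Passing this object's binding lets the template read @context.
    ERB.new(template).result(binding)
  end
end

puts SketchRenderer.new(job_name: 'test-job', nodes: 1, command: 'echo hello').render(TEMPLATE)
```

An option that was never set simply reads as nil on an OpenStruct, so a template written this way would render an empty value rather than raise an error.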
Custom templates can be added by creating a new ERB file in .alces-job/templates, following the example of one given in ./templates. You can then call these templates by using the --template flag and specifying the name.
System-wide templates can be created in a similar manner, by creating an ERB file in /etc/alces-job/templates/. These templates can be called by any user, but a user's own template of the same name takes precedence over the system-wide one.
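The precedence between user and system-wide templates can be pictured as a first-match search over an ordered list of directories. The Ruby sketch below is only an assumption about how such a lookup could work: resolve_template is a hypothetical helper, templates are assumed to be stored as NAME.erb, and where the bundled templates/ directory sits in the search order is not covered here.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical lookup: return the first NAME.erb found, searching the
# directories in priority order (user templates before system-wide ones).
def resolve_template(name, search_dirs)
  search_dirs
    .map { |dir| File.join(dir, "#{name}.erb") }
    .find { |path| File.file?(path) }
end

# Temporary directories stand in for .alces-job/templates and
# /etc/alces-job/templates so the sketch can run anywhere.
Dir.mktmpdir do |root|
  user_dir = File.join(root, 'user')
  system_dir = File.join(root, 'system')
  FileUtils.mkdir_p([user_dir, system_dir])

  File.write(File.join(system_dir, 'serial.erb'), "# system version\n")
  File.write(File.join(user_dir, 'serial.erb'), "# user version\n")
  File.write(File.join(system_dir, 'site.erb'), "# site-only template\n")

  dirs = [user_dir, system_dir]
  puts File.read(resolve_template('serial', dirs)) # the user's copy wins
  puts File.read(resolve_template('site', dirs))   # falls back to the system copy
end
```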
List all available templates:
$ alces-job template list
Output the contents of a template to the console:
$ alces-job template show TEMPLATE
The modify subcommand modifies an existing sbatch script with additional flags. It accepts all the options that the base command does, with the exception of --dry-run and --output-file.
$ alces-job modify SBATCH_SCRIPT [OPTIONS]
The validate subcommand takes an existing sbatch script as input and reports whether the script is valid.
$ alces-job validate SBATCH_SCRIPT
Generate a basic job script:
$ alces-job base --job-name test-job --nodes 1 --ntasks 1 --cpus-per-task 2 --mem 4G --time 01:00:00 --command 'echo hello'
Generate a GPU job script:
$ alces-job gpu --job-name gpu-job --nodes 1 --ntasks 1 --cpus-per-task 4 --mem 16G --gres gpu:1 --time 02:00:00 --command 'python train.py'
Generate an MPI job script:
$ alces-job mpi --job-name mpi-job --nodes 2 --ntasks 32 --cpus-per-task 2 --mem 8G --time 04:00:00 --command 'mpirun ./app'
Generate an array job script:
$ alces-job array --job-name array-job --nodes 1 --mem 2G --time 01:00:00 --array '1-10%2' --command 'echo task $SLURM_ARRAY_TASK_ID'
Generate a serial job via a template:
$ alces-job base --job-name serial-job --mem 1G --time 01:00:00 --template serial
Use the interactive mode to answer prompts instead of supplying flags manually:
$ alces-job interactive
Show help for any supported command:
$ alces-job --help
$ alces-job base --help
To work on this project locally:
- Install dependencies:
$ bundle install
- Run tests:
$ bundle exec rspec
- Run the default Rake task:
$ bundle exec rake
- Run style checks:
$ bundle exec rubocop
- Build the gem locally:
$ gem build alces-job.gemspec
- Run the CLI from source:
$ bundle exec ruby bin/alces-job base --help
The project is structured around a simple CLI registry and a generator service:
- bin/alces-job is the executable entrypoint.
- lib/cli/cli.rb defines the AlcesJob::CLI registry and loads all commands.
- Command classes live in lib/cli/commands/ and register themselves with Dry::CLI.
  - base, gpu, mpi, array, serial
  - config init, config update, interactive, validate, template-validate, modify, template list, template show, profile create, profile list, profile show, and version
- lib/services/generator.rb is responsible for rendering templates, saving the generated script, and submitting it to Slurm.
- Templates live in templates/ and are selected by command-specific logic.
  - default.erb is used for the base command.
  - gpu.erb, mpi.erb, and array.erb are used by their respective commands.
- The generator converts CLI options into @context using OpenStruct, making options available inside ERB templates.
- lib/services/sysinfo/ and lib/services/interactive_wizard.rb support interactive mode and config generation.
- CLI loads bin/alces-job.
- lib/cli/cli.rb loads the command classes.
- The selected command class builds the options hash.
- The command creates AlcesJob::Services::Generator with those options.
- The generator renders the chosen ERB template and writes it to disk.
- If --submit is set, the generator calls sbatch on the generated file.
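The last three steps of this flow (render, write, optionally submit) can be sketched in Ruby as follows. SketchGenerator is a hypothetical stand-in, not the real AlcesJob::Services::Generator: its constructor arguments, the injectable submitter, and the choice to return the rendered text on a dry run are all assumptions made to keep the example runnable without Slurm.

```ruby
require 'erb'
require 'ostruct'
require 'tmpdir'

# Hypothetical stand-in for the generator described above.
class SketchGenerator
  def initialize(options, template:, submitter: ->(path) { system('sbatch', path) })
    @options = options
    @context = OpenStruct.new(options)
    @template = template
    @submitter = submitter
  end

  def run
    script = ERB.new(@template).result(binding)
    return script if @options[:dry_run] # dry run: hand back the text, write nothing

    path = @options.fetch(:output_file, 'job.sbatch')
    File.write(path, script)
    @submitter.call(path) if @options[:submit]
    path
  end
end

Dir.mktmpdir do |dir|
  submitted = []
  generator = SketchGenerator.new(
    { job_name: 'demo', command: 'echo hello', submit: true,
      output_file: File.join(dir, 'job.sbatch') },
    template: "#!/bin/bash\n#SBATCH --job-name=<%= @context.job_name %>\n<%= @context.command %>\n",
    submitter: ->(path) { submitted << path } # record the path instead of calling sbatch
  )
  puts File.read(generator.run)
end
```

Injecting the submitter keeps the sbatch call replaceable, which is convenient on machines where Slurm is not installed.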