This document describes some basic commands for the newhorizons cluster (2015), which uses the Slurm job scheduler.
Sample job submission script
#!/bin/bash
#SBATCH -o output.txt
#SBATCH -e errors.txt
#SBATCH -J my_job
#SBATCH -n 4
#SBATCH --mem=2G
#SBATCH --time=1:00:00
#SBATCH -p short
#SBATCH --mail-type=END,FAIL,TIME_LIMIT_80
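With a time limit of 1:00:00, the TIME_LIMIT_80 mail event is sent once the job has used 80 percent of that limit. The arithmetic can be sketched as follows; hms_to_seconds is a hypothetical helper written for this illustration, not a Slurm command, and it assumes the H:MM:SS form used above.

```shell
#!/bin/bash
# Hypothetical helper (not part of Slurm): convert H:MM:SS to seconds.
hms_to_seconds() {
    local h m s
    IFS=: read -r h m s <<< "$1"
    # 10# forces base 10 so that fields like "08" are not read as octal
    echo $(( 10#$h * 3600 + 10#$m * 60 + 10#$s ))
}

limit=$(hms_to_seconds "1:00:00")
warn_at=$(( limit * 80 / 100 ))
echo "Time limit: ${limit} s, TIME_LIMIT_80 mail after ${warn_at} s"
```

For the one-hour limit above this works out to 2880 seconds, i.e. 48 minutes into the run.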
Run script.sh in the medium queue with 1 CPU and 8 GB of RAM.
sbatch -p medium --mem=8G script.sh
Run script.sh in the medium queue with 4 CPUs on 1 node and 32 GB of RAM (4 x 8 GB).
sbatch -p medium -n 4 -N 1 --mem-per-cpu=8G script.sh
Run script.sh in the medium queue with a 2 hour time limit, on a node with a GPU, requesting 16 CPUs and 16 GB of RAM (16 x 1 GB).
sbatch -p medium --time=2:00:00 --constraint=gpu -n 16 -N 1 --mem-per-cpu=1G script.sh
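The RAM totals quoted in the examples above are the CPU count multiplied by a per-CPU amount (4 x 8 GB, 16 x 1 GB). A small sketch for checking that arithmetic before submitting; total_mem_gb is a hypothetical helper, not a Slurm command.

```shell
#!/bin/bash
# Hypothetical helper (not part of Slurm): total memory in GB for a job
# that gets a fixed amount of memory per CPU.
total_mem_gb() {
    local cpus=$1 gb_per_cpu=$2
    echo $(( cpus * gb_per_cpu ))
}

total_mem_gb 4 8     # 4 CPUs x 8 GB each  -> prints 32
total_mem_gb 16 1    # 16 CPUs x 1 GB each -> prints 16
```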
Run a parallel MPI job on 64 CPUs, executing script.sh.
sbatch -p medium -n 64 script.sh
Run an array job, executing script.sh 1000 times in parallel, each task with a different $SLURM_ARRAY_TASK_ID parameter between 1 and 1000.
sbatch -a 1-1000 script.sh
The limit on the number of array tasks is 50,000. You can cap the number of tasks running concurrently by appending %limit to the range:
sbatch -a 1-10000%200 script.sh
This runs 10,000 tasks in total, but at most 200 at any one time.
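What script.sh does with $SLURM_ARRAY_TASK_ID is up to you. A minimal sketch, assuming one input file per task; the input_NNNN.dat naming scheme and the input_for_task helper are made up for illustration.

```shell
#!/bin/bash
# Sketch of an array-task script. Slurm sets SLURM_ARRAY_TASK_ID for each
# task; the default of 1 only lets the script run outside Slurm for testing.
task_id=${SLURM_ARRAY_TASK_ID:-1}

# Hypothetical naming scheme: one zero-padded input file per task.
input_for_task() {
    printf 'input_%04d.dat\n' "$1"
}

input_file=$(input_for_task "$task_id")
echo "Task ${task_id} would process ${input_file}"
```

Each task then works on its own file, so the tasks do not need to coordinate with each other.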
A more complete example:
#!/bin/bash
#SBATCH -e myerror.out
#SBATCH --mem=16G
#SBATCH -p medium
# Important: the file above is written to by all array tasks as they finish, which may result in gibberish.
# So instead of setting -o here, pass it on the command line:
# sbatch -o slurm-%A_%a.out --array=1-3 -N1 script
# where %A=SLURM_ARRAY_JOB_ID and %a=SLURM_ARRAY_TASK_ID
# This takes the index given in the array submission and passes it to wait_array.sh
cd /home/holton/Slurm
./wait_array.sh $SLURM_JOBID $SLURM_ARRAY_JOB_ID $SLURM_ARRAY_TASK_ID
# Or, you can just redirect the above command using the variables, like this:
# ./wait_array.sh $SLURM_JOBID $SLURM_ARRAY_JOB_ID $SLURM_ARRAY_TASK_ID > $SLURM_ARRAY_TASK_ID.file.out
# (same difference)
echo ""
echo "it is done. this is the last message that will appear after the output from the above job"
and run it like this:
sbatch -o slurm-%A_%a.out -a 1-4 -N1 array_simple.sh
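To make the output naming concrete, the sketch below mimics how %A and %a would be filled in for the four tasks. The job ID 12345 is made up, and expand_pattern is a hypothetical helper that only imitates the substitution; Slurm does this itself.

```shell
#!/bin/bash
# Illustration only: imitate the substitution of %A (array job ID) and
# %a (array task ID) in an -o pattern. Slurm performs this internally.
expand_pattern() {
    local pattern=$1 job_id=$2 task_id=$3
    local name=${pattern//\%A/$job_id}   # replace every literal %A
    echo "${name//\%a/$task_id}"         # then every literal %a
}

# 12345 is a made-up job ID for the example
for task in 1 2 3 4; do
    expand_pattern 'slurm-%A_%a.out' 12345 "$task"
done
```

This prints slurm-12345_1.out through slurm-12345_4.out, one file per task, so the tasks no longer overwrite each other's output.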
Start an interactive job with 1 CPU and 8 GB of RAM.
qlogin -l mem=8G
Start an interactive job with 4 CPUs and 8 GB (4 x 2 GB) of RAM.
qlogin -l mem=2G -n 4