OpenOnDemand Job Templates User Guide

This guide explains how to use job templates in OpenOnDemand to submit SLURM jobs efficiently.

What are Job Templates?

Job templates are pre-configured SLURM job scripts that help you quickly submit common types of jobs without writing scripts from scratch. Each template includes:

  • Pre-configured SLURM directives (cores, memory, time limits)
  • Example code or workflows
  • Documentation and best practices
  • Ready-to-run scripts

Accessing the Job Composer

  1. Log in to OpenOnDemand at your cluster's URL
  2. Click on "Jobs" in the top navigation menu
  3. Select "Job Composer" from the dropdown

You'll see the Job Composer interface with:

  • List of your existing jobs (left sidebar)
  • Job details and files (main panel)
  • Action buttons (Submit, Edit, Delete, etc.)

Creating Jobs from Templates

Step 1: Create New Job from Template

  1. In the Job Composer, click "New Job" button
  2. Select "From Template"
  3. Choose a template from the list (see Available Templates)
  4. Click "Create New Job"

The template will be copied to your jobs directory with all necessary files.

Step 2: Review Job Location

Your new job is created in:

~/ondemand/data/sys/myjobs/projects/default/<job-id>/

Each job gets a unique directory containing:

  • script.sh - The SLURM job script
  • Template-specific files (e.g., example R scripts, definition files)
  • README.md - Documentation (if included)

Step 3: Understand the Job Structure

Every job template includes a script.sh file with SLURM directives at the top:

#!/bin/bash
#SBATCH --job-name=my_job        # Job name
#SBATCH --time=01:00:00          # Time limit (HH:MM:SS)
#SBATCH --partition=normal       # Queue/partition
#SBATCH -n 1                     # Number of tasks
#SBATCH -c 4                     # CPU cores per task
#SBATCH --mem=8G                 # Memory
#SBATCH --output=%x-%j.out       # Output file
#SBATCH --error=%x-%j.err        # Error file
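
Below the directives comes the command section, which runs like any ordinary shell script. A minimal sketch of what might follow the #SBATCH lines (the container path is the one used elsewhere in this guide; hello.R is the template's example script):

cd $SLURM_SUBMIT_DIR                              # start in the directory the job was submitted from
echo "Job $SLURM_JOB_ID started on $(hostname) at $(date)"
srun apptainer exec /data/apps/rstudio.sif Rscript hello.R
echo "Job finished at $(date)"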

Editing Job Scripts

Using the Built-in Editor

  1. In Job Composer, select your job from the left sidebar
  2. Click on script.sh in the file list
  3. Click "Edit" button
  4. Make your changes in the editor
  5. Click "Save" when done

Common Edits

Change Resource Requirements

Edit the SLURM directives to match your needs:

#SBATCH --time=04:00:00          # Increase time limit
#SBATCH -c 8                     # Use more CPU cores
#SBATCH --mem=32G                # Request more memory
#SBATCH --partition=gpu          # Use GPU partition

Add Your Data Files

Edit the script to point to your actual data:

# Change this:
Rscript hello.R

# To this:
Rscript /path/to/your/analysis.R

Modify Job Name and Output

#SBATCH --job-name=my_analysis   # Descriptive name
#SBATCH --output=results_%j.out  # Custom output filename

Uploading Additional Files

  1. In Job Composer, select your job
  2. Click "Open Dir" to open the job directory in the file browser
  3. Use "Upload" button to add your data files
  4. Update the script to reference your uploaded files
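
When submitted through the Job Composer, the job starts in the job directory, so uploaded files can usually be referenced by plain filename. A minimal sketch for the command section (analysis.R and my_data.csv are placeholder names for files you uploaded):

cd $SLURM_SUBMIT_DIR                              # the job directory
srun apptainer exec /data/apps/rstudio.sif Rscript analysis.R my_data.csv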

Submitting Jobs

Submit Your Job

  1. Select your job in the Job Composer
  2. Review the script and ensure all settings are correct
  3. Click the "Submit" button

You'll see a confirmation message with the job ID (e.g., "Job submitted successfully with ID: 12345").

What Happens Next?

  1. Queued: Job enters the SLURM queue
  2. Running: Job starts when resources are available
  3. Completed: Job finishes (check output files for results)
  4. Failed: Job encountered an error (check error file)
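
If you also have shell access to the cluster, the same states can be checked with standard SLURM commands. A quick sketch, using the example job ID from above:

squeue -u $USER                                   # all of your pending and running jobs
squeue -j 12345                                   # one specific job
sacct -j 12345 --format=JobID,State,Elapsed,MaxRSS   # state and memory usage, including finished jobs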

Monitoring Jobs

View Active Jobs

  1. Click "Jobs""Active Jobs" in the top menu
  2. You'll see all your running and pending jobs
  3. Information displayed:
    • Job ID
    • Job Name
    • Status (Running, Pending, Completed, Failed)
    • Time elapsed
    • Nodes/cores used

Check Job Output

While the job is running or after completion:

  1. In Job Composer, select your job
  2. Click on the output file (e.g., my_job-12345.out)
  3. Click "View" to see the contents
  4. Click "Refresh" to update (for running jobs)

View Job Details

For detailed job information:

  1. Go to "Jobs""Active Jobs"
  2. Click on your job ID
  3. View comprehensive details:
    • Start time
    • Resource usage
    • Node assignment
    • Full job parameters

Available Templates

Basic R Serial Job

Template: rscript

Purpose: Run R scripts on a single core

Includes:

  • script.sh - SLURM job script
  • hello.R - Example R script with system information

Use cases:

  • Data analysis
  • Statistical computing
  • Report generation

How to customize:

  1. Replace hello.R with your R script or upload your own

  2. Edit script.sh to reference your script (a fuller sketch with arguments follows this list):

    srun /usr/bin/apptainer exec /data/apps/rstudio.sif Rscript your_script.R
    
  3. Adjust resources (memory, cores, time) as needed
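
If your script needs input or output locations, they can be passed as extra arguments on the same line; inside R they are available via commandArgs(trailingOnly = TRUE). A sketch with placeholder names (analysis.R, input.csv, results/):

srun /usr/bin/apptainer exec /data/apps/rstudio.sif Rscript analysis.R input.csv results/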

Build Custom Apptainer Image

Template: apptainer_builder

Purpose: Build custom container images based on RStudio

Includes:

  • script.sh - Build automation script
  • rstudio_custom.def - Apptainer definition file
  • README.md - Detailed instructions

Use cases:

  • Installing additional R packages
  • Adding system dependencies (GDAL, PROJ, etc.)
  • Creating reproducible environments
  • Custom software stacks

How to customize:

  1. Edit rstudio_custom.def to add your packages:

    %post
        apt-get update
        apt-get install -y your-system-packages
    
        R --slave -e 'install.packages(c("your", "packages"))'
    
  2. Submit the job (build takes 1-4 hours)

  3. Image is saved to $HOME/apps/rstudio_custom.sif

  4. Use in future jobs or RStudio sessions
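
Before relying on the new image, it is worth checking that it runs and contains what you added. A quick sketch (run from a shell or a short test job; yourpackage is a placeholder):

apptainer exec $HOME/apps/rstudio_custom.sif R --version
apptainer exec $HOME/apps/rstudio_custom.sif Rscript -e 'library(yourpackage)'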

Advanced Usage

Creating Job Arrays

Run the same script multiple times with different parameters:

  1. Edit your script.sh and add:

    #SBATCH --array=1-10           # Run 10 instances
    
  2. Use $SLURM_ARRAY_TASK_ID in your script:

    Rscript analysis.R $SLURM_ARRAY_TASK_ID
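
A common pattern is to let the array index pick the input, so each array task processes a different file. A sketch for the command section (the data/ directory and file naming are placeholders):

INPUT=data/sample_${SLURM_ARRAY_TASK_ID}.csv      # e.g. data/sample_1.csv ... data/sample_10.csv
srun apptainer exec /data/apps/rstudio.sif Rscript analysis.R "$INPUT"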
    

Job Dependencies

Run jobs in sequence:

  1. Submit first job and note the job ID (e.g., 12345)

  2. Create second job with dependency:

    #SBATCH --dependency=afterok:12345
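
afterok starts the second job only if the first one completed successfully. Other standard SLURM dependency types can be used the same way, for example:

#SBATCH --dependency=afterany:12345    # start once job 12345 ends, regardless of outcome
#SBATCH --dependency=afternotok:12345  # start only if job 12345 failed (e.g. for cleanup)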
    

Using Custom Container Images

After building a custom image:

  1. Edit your job script

  2. Change the container path:

    # Instead of:
    apptainer exec /data/apps/rstudio.sif Rscript script.R
    
    # Use:
    apptainer exec $HOME/apps/rstudio_custom.sif Rscript script.R
    

Email Notifications

Get notified when jobs complete:

#SBATCH --mail-type=END,FAIL     # Email on end or failure
#SBATCH --mail-user=your.email@example.com

Using GPU Resources

For GPU-accelerated jobs:

#SBATCH --partition=gpu          # GPU partition
#SBATCH --gres=gpu:1             # Request 1 GPU
#SBATCH --gres=gpu:2             # Or request 2 GPUs
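
Inside the job it can be useful to confirm that a GPU was actually allocated before starting the real work. A minimal sketch for the command section (on most SLURM clusters CUDA_VISIBLE_DEVICES is set automatically for GPU jobs, but this depends on the cluster configuration):

echo "Allocated GPUs: $CUDA_VISIBLE_DEVICES"      # indices of the GPUs assigned to this job
nvidia-smi                                        # list the visible GPU(s) and driver version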

Parallel Processing in R

For multi-core R jobs:

#SBATCH -c 8                     # Request 8 cores

In your R script:

library(parallel)
library(doParallel)   # also attaches foreach, which provides %dopar%

# Use the cores SLURM allocated to this task (fall back to 1 if the variable is unset)
n_cores <- as.numeric(Sys.getenv("SLURM_CPUS_PER_TASK", unset = "1"))
registerDoParallel(cores = n_cores)

# Your parallel code here (placeholder computation shown)
results <- foreach(i = 1:1000) %dopar% {
    sqrt(i)
}

Troubleshooting

Job Stays in Pending State

Possible causes:

  • Requested resources not available
  • Partition full or not accessible
  • Time limit too high for requested partition
  • Account/QOS limits reached

Solutions:

  1. Check active jobs: "Jobs" → "Active Jobs"
  2. Reduce resource requests (cores, memory, time)
  3. Try a different partition
  4. Contact admin if issue persists

Job Fails Immediately

Check the error file (*.err):

  1. In Job Composer, select your job
  2. Open the .err file
  3. Look for error messages

Common issues:

  • File not found: Check that paths are either absolute or relative to the job directory
  • Permission denied: Ensure files are readable
  • Module not loaded: Container may be missing dependencies
  • Syntax errors: Review script for typos

Out of Memory Errors

Symptoms:

  • Job fails with "out of memory" or "killed" message
  • Exit code 137

Solutions:

  1. Increase memory request:

    #SBATCH --mem=32G              # Instead of 8G
    
  2. Use memory-efficient approaches in your code

  3. Split job into smaller chunks

Container Image Not Found

Error: FATAL: container not found

Solutions:

  1. Check the container path in your script

  2. Verify the container exists:

    ls -l /data/apps/rstudio.sif
    
  3. If using custom image, ensure build completed:

    ls -l $HOME/apps/rstudio_custom.sif
    

Job Takes Too Long

Options:

  1. Request more time:

    #SBATCH --time=12:00:00
    
  2. Optimize your code (vectorization, parallel processing)

  3. Use more CPU cores if parallelizable

  4. Check if you're in the correct partition for long jobs

Can't Edit Files

If the editor doesn't work:

  1. Click "Open Dir" in Job Composer
  2. Use the file browser's built-in editor
  3. Or download file, edit locally, re-upload

Need to Cancel a Job

  1. Go to "Jobs""Active Jobs"
  2. Find your job in the list
  3. Click "Delete" or "Cancel" button
  4. Confirm the cancellation

Best Practices

Resource Estimation

  • Start small: Begin with minimal resources and increase if needed
  • Monitor usage: Check actual resource usage after jobs complete
  • Be realistic: Don't over-request resources (wastes queue time)

File Organization

  • Keep each analysis in its own job directory
  • Use descriptive names for scripts, data, and output files
  • Keep large input data in a shared location and reference it by path rather than copying it into every job

Testing

  • Test with small datasets first
  • Use short time limits for testing
  • Verify output before running large batches
  • Use interactive sessions for debugging

Reproducibility

  • Document software versions (see the sketch below)
  • Use container images for consistent environments
  • Save SLURM scripts with your results
  • Note date and job ID in your analysis notes
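
One lightweight way to capture this is to record versions at the end of the job script, so they land in the .out file next to the results. A sketch using the container image from this guide:

echo "Job ID: $SLURM_JOB_ID, finished: $(date)"
apptainer --version
apptainer exec /data/apps/rstudio.sif Rscript -e 'sessionInfo()'   # R and package versions inside the container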

Getting Help

Resources

  • Documentation: Check README files in templates
  • Active Jobs: Monitor job status and resource usage
  • Error Logs: Always check .err files for failures

Contact Support

If you encounter persistent issues:

  1. Note the job ID
  2. Save error messages
  3. Document what you've tried
  4. Contact your HPC admin

Quick Reference

Common SLURM Directives

#SBATCH --job-name=name          # Job name
#SBATCH --partition=normal       # Queue/partition
#SBATCH --time=HH:MM:SS          # Time limit
#SBATCH -n 1                     # Number of tasks
#SBATCH -c 4                     # CPUs per task
#SBATCH --mem=8G                 # Memory per node
#SBATCH --mem-per-cpu=2G         # Memory per CPU
#SBATCH --output=file.out        # Output file
#SBATCH --error=file.err         # Error file
#SBATCH --mail-type=ALL          # Email notifications
#SBATCH --mail-user=email        # Email address
#SBATCH --gres=gpu:1             # GPU request
#SBATCH --array=1-10             # Job array
#SBATCH --dependency=afterok:123 # Job dependency

Environment Variables in Jobs

$SLURM_JOB_ID                    # Job ID
$SLURM_JOB_NAME                  # Job name
$SLURM_SUBMIT_DIR                # Submission directory
$SLURM_JOB_NODELIST              # Assigned nodes
$SLURM_NTASKS                    # Number of tasks
$SLURM_CPUS_PER_TASK             # CPUs per task
$SLURM_ARRAY_TASK_ID             # Array index

Useful Commands (in scripts)

cd $SLURM_SUBMIT_DIR             # Go to submission directory
echo "Job ID: $SLURM_JOB_ID"    # Print job info
date                              # Timestamp
hostname                          # Node name

Example Workflow

Complete Example: Running an R Analysis

  1. Create job from template:

    • Jobs → Job Composer → New Job → From Template
    • Select "Basic R Serial Job"
  2. Upload your R script:

    • Click "Open Dir"
    • Upload your analysis.R file
  3. Edit the job script:

    #!/bin/bash
    #SBATCH --job-name=my_analysis
    #SBATCH --time=02:00:00
    #SBATCH -c 4
    #SBATCH --mem=16G
    
    cd $SLURM_SUBMIT_DIR
    srun apptainer exec /data/apps/rstudio.sif Rscript analysis.R
    
  4. Submit the job:

    • Click "Submit"
    • Note the job ID
  5. Monitor progress:

    • Jobs → Active Jobs
    • Check output file periodically
  6. Review results:

    • Open .out file when job completes
    • Download output files if needed

Last Updated: 2025-12-03

Questions? Contact your SciIT team.