December 18, 2020

RoadMap
- Filesystem
  - Paths
  - Quotas
  - Usage
- Software
  - Module System
  - Installs
  - Management
- Job Scheduling
  - Node
  - Partition
  - Limits
  - Jobs
Symlink (dotted lines) - A shortcut to another directory or file
Mount (Local/Shared) - An entry point to a disk or storage device (i.e. 'C:/' or Google Drive)
Case sensitive
All paths and commands are case sensitive; an uppercase letter is not the same as a lowercase letter.
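For example, these refer to two different directories (the second is hypothetical and only exists if you created it):
ls ~/workshop_dir
ls ~/Workshop_Dir # a different path, even though only the letter case differs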
Path Types
Absolute - starts from the filesystem root: /rhome/username/workshop_dir/
Relative - starts from the current directory: workshop_dir/
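A quick sketch of the difference (username is a placeholder for your own account):
cd /rhome/username/workshop_dir/ # absolute: works from any location
cd ~ # return to your home directory
cd workshop_dir/ # relative: resolves from the current directory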
All storage has limits.
Local Storage (i.e. laptop hard drive)
Only exists on a single machine (node) and is limited by disk size.
Shared Storage (i.e. Google Drive)
Exists across all machines (nodes) and is limited by a quota.
Make workshop directory, if it does not already exist:
mkdir -p ~/workshop_dir
Check directory size:
du -hs ~/workshop_dir
Check local node storage:
df -h /tmp
df -h /scratch
Check GPFS storage; in the output, "blocks" is your used space and "quota" is your available space:
check_quota home
check_quota bigdata
https://hpcc.ucr.edu/manuals_linux-cluster_storage
The module system allows multiple versions of software to be loaded and unloaded.
To view software that is available:
module avail
To search for a specific software:
module avail samtools # OR hpcc-software samtools
Load software into current environment:
module load samtools
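Once loaded, the software is on your PATH and can be run directly, for example:
samtools --version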
List currently loaded software modules:
module list
Remove software from current environment:
module unload samtools
https://hpcc.ucr.edu/manuals_linux-cluster_start#modules
Python
For a basic Python package (PyPI) you can use pip to install it:
pip install PKGNAME --user
For example, here is how you would install the camelcase package:
pip install camelcase --user
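To confirm the install worked, a minimal import check is usually enough:
python -c 'import camelcase' # no output means the package imported cleanly
pip show camelcase # prints the installed version and location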
R
For an R package you can use the install function (CRAN):
R
install.packages('PKGNAME')
Or you can use the install function from BiocManager:
R
BiocManager::install('PKGNAME')
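As a concrete sketch, installing and then loading a real CRAN package (ggplot2 is used here purely as an example) looks like:
R
install.packages('ggplot2') # install from CRAN
library(ggplot2) # load the installed package into the current session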
https://hpcc.ucr.edu/manuals_linux-cluster_package-manage.html#r-1
Conda can be used to manage environments for both the R and Python languages. Full instructions regarding conda setup can be found here.
Some singularity examples can be found here.
A previous workshop regarding custom software installs utilizing the above technologies can be found here.
Conda
List current conda virtual environments:
conda env list
Create a Python 3 environment named python3:
conda create -n python3 python=3
Install Python package with conda:
conda install -n python3 numpy
Note: If the package fails to be found, search the Anaconda website. After searching, click one of the results and the install command will be provided. Remember to add your -n python3 environment name.
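For example, a search result may suggest a channel-specific install command; the environment flag still applies (the pandas package and conda-forge channel below are a hypothetical result):
conda install -n python3 -c conda-forge pandas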
Conda
After the conda environment is set up and numpy is installed, we can test it with the following:
conda activate python3
python -c 'import numpy as np; a = np.arange(15).reshape(3, 5); print(a)'
Singularity
Warning: This is a demo; this approach should only be used for advanced projects.
You may need a singularity image if…
Ubuntu Singularity
First you must get your own Linux machine and install Singularity. Perhaps the easiest way to do this is mentioned here.
After this you can use pre-built images or try to build a custom singularity image:
Pre-Built
singularity exec docker://ubuntu:latest echo "Hello Dinosaur!"
Custom
Definition File
Make a file named myLinuxEnv.def with the following content:
Bootstrap: docker
From: ubuntu:latest

%post
    apt-get update
    apt-get install -y apache2 # note: Ubuntu packages Apache as 'apache2'; 'apt install httpd' would fail
Build Container Image
Run the following command using the definition file:
singularity build myLinuxEnv.sing myLinuxEnv.def
Test
Test the image by going inside it:
singularity shell myLinuxEnv.sing
Once the Singularity image is tested, transfer it to the cluster (SCP/SFTP), and execute it within a job like so:
module load singularity
singularity exec myLinuxEnv.sing cat /etc/lsb-release
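As a minimal sketch, a non-interactive submission script wrapping the container might look like the following (the partition name and resource values are assumptions, not part of the workshop files):
#!/bin/bash
#SBATCH --partition=short # assumed partition, as in the srun example later in this section
#SBATCH --ntasks=1
#SBATCH --time=00:10:00

module load singularity
singularity exec myLinuxEnv.sing cat /etc/lsb-release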
https://slurm.schedmd.com/archive/slurm-19.05.0/
What is a Compute Node?
What is a Partition?
Logical groups of nodes that allow more efficient allocation and management of resources.
Intel Partition
Default?
The value used when one is not explicitly requested.
Maximum?
The upper limit of what can be requested.
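Explicit flags override these defaults, up to the partition maximums; the partition name and values below are purely illustrative:
srun -p intel --cpus-per-task=4 --mem=8g --time=2:00:00 --pty bash -l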
For more details regarding our partitions, please review our Cluster Jobs: Partitions manual page.
List all jobs owned by you and status:
squeue -u $USER
List all group jobs and status:
squeue -A $GROUP
List current Slurm limits:
slurm_limits
List CPUs currently used by you:
user_cpus
List CPUs currently used by entire group (primary):
group_cpus
Submission
Move into workshop directory:
cd ~/workshop_dir
Download example job submission script:
# Non-Stats
wget -O basic_job.sh https://bit.ly/33rozLX
# Stats Department
wget -O basic_job.sh https://bit.ly/2KBaIOs
Check job submission script contents (use arrow keys to navigate and ctrl+x to quit):
nano basic_job.sh
Submission
Submit as non-interactive job:
sbatch basic_job.sh
Submit interactive job:
srun -p short --pty bash -l # OR srun -p statsdept --pty bash -l
Status
Check job status:
squeue -u $USER
Check results (the number in the filename is the job ID reported by sbatch):
cat slurm-2909103.out
https://hpcc.ucr.edu/manuals_linux-cluster_jobs.html#submitting-jobs