December 18, 2020

RoadMap
- Filesystem
  - Paths
  - Quotas
  - Usage
- Software
  - Module System
  - Installs
  - Management
- Job Scheduling
  - Node
  - Partition
  - Limits
  - Jobs
Symlink (dotted lines) - A shortcut to another directory or file
Mount (Local/Shared) - An entry point to a disk or storage device (i.e. 'C:/' or Google Drive)
Case sensitive
All paths and commands are case sensitive; an uppercase letter is not the same as a lowercase letter.
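For example, these refer to two different directories (the second is hypothetical and only exists if you created it):
ls ~/workshop_dir
ls ~/Workshop_Dir # a different path, even though only the letter case differs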
Path Types
Absolute - starts from the filesystem root: /rhome/username/workshop_dir/
Relative - starts from the current directory: workshop_dir/
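A quick sketch of the difference (username is a placeholder for your own account):
cd /rhome/username/workshop_dir/ # absolute: works from any location
cd ~ # return to your home directory
cd workshop_dir/ # relative: resolves from the current directory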
All storage has limits.
Local Storage (i.e. laptop hard drive)
Only exists on a single machine (node) and is limited by disk size.
Shared Storage (i.e. Google Drive)
Exists across all machines (nodes) and is limited by a quota.
Make workshop directory, if it does not already exist:
mkdir -p ~/workshop_dir
Check directory size:
du -hs ~/workshop_dir
Check local node storage:
df -h /tmp
df -h /scratch
Check GPFS storage; in the output, "blocks" is your used space and "quota" is your available space:
check_quota home
check_quota bigdata
https://hpcc.ucr.edu/manuals_linux-cluster_storage
The module system allows multiple versions of software to be loaded and unloaded.
To view software that is available:
module avail
To search for a specific software:
module avail samtools # OR hpcc-software samtools
Load software into current environment:
module load samtools
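Once loaded, the software is on your PATH and can be run directly, for example:
samtools --version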
List currently loaded software modules:
module list
Remove software from current environment:
module unload samtools
https://hpcc.ucr.edu/manuals_linux-cluster_start#modules
Python
For a basic Python package (PyPI) you can use pip to install it:
pip install PKGNAME --user
For example, here is how you would install the camelcase package:
pip install camelcase --user
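To confirm the install worked, a minimal import check is usually enough:
python -c 'import camelcase' # no output means the package imported cleanly
pip show camelcase # prints the installed version and location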
R
For an R package you can use the install function (CRAN):
R
install.packages('PKGNAME')
Or you can use the install function from BiocManager:
R
BiocManager::install('PKGNAME')
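As a concrete sketch, installing and then loading a real CRAN package (ggplot2 is used here purely as an example) looks like:
R
install.packages('ggplot2') # install from CRAN
library(ggplot2) # load the installed package into the current session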
https://hpcc.ucr.edu/manuals_linux-cluster_package-manage.html#r-1
Conda can be used to manage environments for both the R and Python languages. Full instructions regarding conda setup can be found here.
Some singularity examples can be found here.
A previous workshop regarding custom software installs utilizing the above technologies can be found here.
Conda
List current conda virtual environments:
conda env list
Create a Python 3 environment named python3:
conda create -n python3 python=3
Install Python package with conda:
conda install -n python3 numpy
Note: If the package fails to be found, search the Anaconda website. After searching, click one of the results and the install command will be provided. Remember to add your -n python3 environment name.
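For example, a search result may suggest a channel-specific install command; the environment flag still applies (the pandas package and conda-forge channel below are a hypothetical result):
conda install -n python3 -c conda-forge pandas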
Conda
After the conda environment is set up and numpy is installed, we can test it with the following:
conda activate python3
python -c 'import numpy as np; a = np.arange(15).reshape(3, 5); print(a)'
Singularity
Warning: This is a demo; this approach should only be used for advanced projects.
You may need a singularity image if…
Ubuntu Singularity
First you must get your own Linux machine and install Singularity. Perhaps the easiest way to do this is mentioned here.
After this you can use pre-built images or try to build a custom singularity image:
Pre-Built
singularity exec docker://ubuntu:latest echo "Hello Dinosaur!"
Custom
Definition File
Make a file named myLinuxEnv.def with the following content:
Bootstrap: docker
From: ubuntu:latest

%post
    apt-get update
    apt-get install -y apache2 # note: Ubuntu packages Apache as 'apache2'; 'apt install httpd' would fail
Build Container Image
Run the following command using the definition file:
singularity build myLinuxEnv.sing myLinuxEnv.def
Test
Test the image by going inside it:
singularity shell myLinuxEnv.sing
Once the Singularity image is tested, transfer it to the cluster (SCP/SFTP), and execute it within a job like so:
module load singularity
singularity exec myLinuxEnv.sing cat /etc/lsb-release
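As a minimal sketch, a non-interactive submission script wrapping the container might look like the following (the partition name and resource values are assumptions, not part of the workshop files):
#!/bin/bash
#SBATCH --partition=short # assumed partition, as in the srun example later in this section
#SBATCH --ntasks=1
#SBATCH --time=00:10:00

module load singularity
singularity exec myLinuxEnv.sing cat /etc/lsb-release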
https://slurm.schedmd.com/archive/slurm-19.05.0/
What is a Compute Node?
What is a Partition?
Logical groups of nodes that allow more efficient allocation and management of resources.
Intel Partition
Default?
The value used when one is not explicitly requested.
Maximum?
The upper limit of what can be requested.
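Explicit flags override these defaults, up to the partition maximums; the partition name and values below are purely illustrative:
srun -p intel --cpus-per-task=4 --mem=8g --time=2:00:00 --pty bash -l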
For more details regarding our partitions, please review our Cluster Jobs: Partitions manual page.
List all jobs owned by you and status:
squeue -u $USER
List all group jobs and status:
squeue -A $GROUP
List current Slurm limits:
slurm_limits
List CPUs currently used by you:
user_cpus
List CPUs currently used by entire group (primary):
group_cpus
Submission
Move into workshop directory:
cd ~/workshop_dir
Download example job submission script:
# Non-Stats
wget -O basic_job.sh https://bit.ly/33rozLX
# Stats Department
wget -O basic_job.sh https://bit.ly/2KBaIOs
Check job submission script contents (use arrow keys to navigate and ctrl+x to quit):
nano basic_job.sh
Submission
Submit as non-interactive job:
sbatch basic_job.sh
Submit interactive job:
srun -p short --pty bash -l # OR srun -p statsdept --pty bash -l
Status
Check job status:
squeue -u $USER
Check results (the number in the filename is the job ID reported by sbatch):
cat slurm-2909103.out
https://hpcc.ucr.edu/manuals_linux-cluster_jobs.html#submitting-jobs