<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>HPCC – HPC Cluster</title><link>https://hpcc.ucr.edu/manuals/hpc_cluster/</link><description>Recent content in HPC Cluster on HPCC</description><generator>Hugo -- gohugo.io</generator><atom:link href="https://hpcc.ucr.edu/manuals/hpc_cluster/index.xml" rel="self" type="application/rss+xml"/><item><title>Manuals: Introduction</title><link>https://hpcc.ucr.edu/manuals/hpc_cluster/intro/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://hpcc.ucr.edu/manuals/hpc_cluster/intro/</guid><description>
&lt;h2 id="introduction">Introduction&lt;/h2>
&lt;p>This manual provides an introduction to the usage of the HPCC cluster.
All servers and compute resources of the HPCC cluster are available to researchers from all departments and colleges at UC Riverside for a minimal recharge fee &lt;a href="../../about/facility/rates">(see rates)&lt;/a>.
To request an account, please email &lt;a href="mailto:support@hpcc.ucr.edu">support@hpcc.ucr.edu&lt;/a>.
The latest hardware/facility description for grant applications is available &lt;a href="https://goo.gl/43eOwQ">here&lt;/a>.&lt;/p>
&lt;h2 id="overview">Overview&lt;/h2>
&lt;h3 id="storage">Storage&lt;/h3>
&lt;ul>
&lt;li>Four enterprise class HPC storage systems&lt;/li>
&lt;li>Approximately 6 PB of total network storage (3,072 TB production and 3,072 TB backup)&lt;/li>
&lt;li>GPFS (NFS and SAMBA via GPFS)&lt;/li>
&lt;li>Automatic snapshots and archival backups&lt;/li>
&lt;/ul>
&lt;h3 id="network">Network&lt;/h3>
&lt;ul>
&lt;li>Ethernet
&lt;ul>
&lt;li>1 Gb/s switch x 5&lt;/li>
&lt;li>1 Gb/s switch with 10 Gb/s uplink&lt;/li>
&lt;li>10 Gb/s switch for Campus wide Science DMZ&lt;/li>
&lt;li>redundant, load balanced, robust mesh topology&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Interconnect
&lt;ul>
&lt;li>56 Gb/s InfiniBand (FDR)&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="head-nodes">Head Nodes&lt;/h3>
&lt;p>All users should access the cluster via ssh through cluster.hpcc.ucr.edu. This address automatically balances traffic across the available head nodes.&lt;/p>
&lt;ul>
&lt;li>Jay
&lt;ul>
&lt;li>Resources: 64 cores, 512 GB memory&lt;/li>
&lt;li>Primary function: submitting jobs to the queuing system&lt;/li>
&lt;li>Secondary function: development; code editing and running small (under 50 % CPU and under 1 GB RAM) sample jobs&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Lark
&lt;ul>
&lt;li>Resources: 64 cores, 512 GB memory&lt;/li>
&lt;li>Primary function: submitting jobs to the queuing system&lt;/li>
&lt;li>Secondary function: development; code editing and running small (under 50 % CPU and under 1 GB RAM) sample jobs&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h3 id="worker-nodes">Worker Nodes&lt;/h3>
&lt;ul>
&lt;li>Batch
&lt;ul>
&lt;li>c01-c48: each with 64 AMD cores and 512 GB memory&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Intel
&lt;ul>
&lt;li>i01-i40: each with 32 Intel Broadwell cores and 512 GB memory&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Epyc
&lt;ul>
&lt;li>r21-r38: each with 64 AMD EPYC cores and 1 TB memory&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Highmem
&lt;ul>
&lt;li>h01-h06: each with 32 Intel cores and 1024 GB memory&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>GPU
&lt;ul>
&lt;li>gpu01-gpu02: each with 32 (HT) cores Intel Haswell CPUs and 2 x NVIDIA Kepler K80 GPUs (12GB and 2496 CUDA cores per GPU) and 128 GB memory&lt;/li>
&lt;li>gpu03-gpu04: each with 48 (HT) cores Intel Broadwell CPUs and 4 x NVIDIA Kepler K80 GPUs (12GB and 2496 CUDA cores per GPU) and 512 GB memory&lt;/li>
&lt;li>gpu05: 64 (HT) cores Intel Broadwell CPUs and 2 x NVIDIA Pascal P100 GPUs (16GB and 3584 CUDA cores per GPU) and 256 GB memory&lt;/li>
&lt;li>gpu06-gpu08: each with 64-128 (HT) cores AMD CPUs and 8 x NVIDIA A100 GPUs (80GB and 6912 CUDA cores per GPU) and 1,024 GB memory&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul></description></item><item><title>Manuals: Getting Started</title><link>https://hpcc.ucr.edu/manuals/hpc_cluster/start/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://hpcc.ucr.edu/manuals/hpc_cluster/start/</guid><description>
&lt;h2 id="login-from-mac-linux-mobaxterm">Login from Mac, Linux, MobaXTerm&lt;/h2>
&lt;p>The initial login brings users into one of the cluster head nodes (jay or lark). From there, users can submit jobs via &lt;code>srun&lt;/code>/&lt;code>sbatch&lt;/code> to the compute nodes to perform intensive tasks.
Since all machines are mounting a centralized file system, users will always see the same home directory on all systems. Therefore, there is no need to copy files from one machine to another.&lt;/p>
&lt;p>Open the terminal and type&lt;/p>
&lt;pre>&lt;code class="language-bash">ssh -X username@cluster.hpcc.ucr.edu
&lt;/code>&lt;/pre>
&lt;h2 id="login-from-windows">Login from Windows&lt;/h2>
&lt;p>Please refer to the login instructions of our &lt;a href="../../manuals/linux_basics/intro/#windows">Linux Basics manual&lt;/a>.&lt;/p>
&lt;h2 id="change-password">Change Password&lt;/h2>
&lt;ol>
&lt;li>Login via SSH using the Terminal on Mac/Linux or MobaXTerm on Windows&lt;/li>
&lt;/ol>
&lt;ul>
&lt;li>Once you have logged in type the following command:&lt;/li>
&lt;/ul>
&lt;pre>&lt;code>passwd
&lt;/code>&lt;/pre>
&lt;ul>
&lt;li>Enter the old password (the random characters that you were given as your initial password)&lt;/li>
&lt;li>Enter your new password&lt;/li>
&lt;/ul>
&lt;p>The password minimum requirements are:&lt;/p>
&lt;ul>
&lt;li>Total length of at least 8 characters&lt;/li>
&lt;li>Must have at least 3 of the following:
&lt;ul>
&lt;li>Lowercase character&lt;/li>
&lt;li>Uppercase character&lt;/li>
&lt;li>Number&lt;/li>
&lt;li>Punctuation character&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;h2 id="modules">Modules&lt;/h2>
&lt;p>All software used on the HPC cluster is managed through a simple module system.
You must explicitly load and unload each package as needed.
More advanced users may want to load modules within their bashrc, bash_profile, or profile files.&lt;/p>
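&lt;p>As a minimal sketch of the last point, the lines below could be added to &lt;code>~/.bashrc&lt;/code>. The guard and the choice of modules are assumptions for illustration; the module names are ones used as examples in this manual.&lt;/p>
&lt;pre>&lt;code class="language-bash"># Sketch for ~/.bashrc: load commonly used modules in every new shell.
# Guard: only call module if the command is defined in this shell.
if command -v module &amp;gt;/dev/null 2&amp;gt;&amp;amp;1; then
    module load R/4.1.2     # example version, adjust to what module avail lists
    module load samtools    # no version given, so the default is loaded
fi
&lt;/code>&lt;/pre>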
&lt;h3 id="available-modules">Available Modules&lt;/h3>
&lt;p>To list all available software modules, execute the following:&lt;/p>
&lt;pre>&lt;code class="language-bash">module avail
&lt;/code>&lt;/pre>
&lt;p>This should output something like:&lt;/p>
&lt;pre>&lt;code class="language-bash">------------------------ /opt/linux/rocky/8.x/x86_64/modules -------------------------
AAFTF/0.5.0 workspace/scratch &amp;lt;aL&amp;gt;
abyss/2.3.4 wtdbg2/2.5
almabte/1.3.2 xpdf/4.03
alphafold/2.3.0 xsv/0.13.0
amber/22_mpi_cuda yq/4.35.1
amptk/1.6 zoem/21-341
...
&lt;/code>&lt;/pre>
&lt;h3 id="using-modules">Using Modules&lt;/h3>
&lt;p>To load a module, run:&lt;/p>
&lt;pre>&lt;code class="language-bash">module load &amp;lt;software name&amp;gt;[/&amp;lt;version&amp;gt;]
&lt;/code>&lt;/pre>
&lt;p>For example, to load R version 4.1.2, run:&lt;/p>
&lt;pre>&lt;code class="language-bash">module load R/4.1.2
&lt;/code>&lt;/pre>
&lt;p>To load the default version of the tophat module, run:&lt;/p>
&lt;pre>&lt;code class="language-bash">module load tophat
&lt;/code>&lt;/pre>
&lt;h3 id="show-loaded-modules">Show Loaded Modules&lt;/h3>
&lt;p>To show what modules you have loaded at any time, you can run:&lt;/p>
&lt;pre>&lt;code class="language-bash">module list
&lt;/code>&lt;/pre>
&lt;p>Depending on what modules you have loaded, it will produce something like this:&lt;/p>
&lt;pre>&lt;code class="language-bash">Currently Loaded Modulefiles:
1) vim/7.4.1952 3) slurm/16.05.4 5) R/3.3.0 7) less-highlight/1.0 9) python/3.6.0
2) tmux/2.2 4) openmpi/2.0.1-slurm-16.05.4 6) perl/5.20.2 8) iigb_utilities/1
&lt;/code>&lt;/pre>
&lt;h3 id="unloading-software">Unloading Software&lt;/h3>
&lt;p>Sometimes you no longer want a piece of software in your PATH. To remove it, unload the module by running:&lt;/p>
&lt;pre>&lt;code class="language-bash">module unload &amp;lt;software name&amp;gt;
&lt;/code>&lt;/pre>
&lt;h2 id="databases">Databases&lt;/h2>
&lt;h3 id="loading-databases">Loading Databases&lt;/h3>
&lt;p>&lt;a href="http://www.ncbi.nlm.nih.gov/">NCBI&lt;/a>, &lt;a href="http://en.wikipedia.org/wiki/Pfam#External_links">PFAM&lt;/a>, and &lt;a href="http://www.uniprot.org/">Uniprot&lt;/a> do not need to be downloaded by users. They are installed as modules on the cluster.&lt;/p>
&lt;pre>&lt;code>module load db-ncbi
module load db-pfam
module load db-uniprot
&lt;/code>&lt;/pre>
&lt;p>Specific database release numbers can be identified by the version label on the module:&lt;/p>
&lt;pre>&lt;code>module avail db-ncbi
----------------- /usr/local/Modules/3.2.9/modulefiles -----------------
db-ncbi/20140623(default)
&lt;/code>&lt;/pre>
&lt;h3 id="using-databases">Using Databases&lt;/h3>
&lt;p>In order to use the loaded database users can simply provide the corresponding environment variable (NCBI_DB, UNIPROT_DB, PFAM_DB, etc&amp;hellip;) for the proper path in their executables.&lt;/p>
&lt;p>This is the old, deprecated BLAST, which may stop working in the near future. If you still require it:&lt;/p>
&lt;pre>&lt;code>blastall -p blastp -i proteins.fasta -d $NCBI_DB/nr -o blastp.out
&lt;/code>&lt;/pre>
&lt;p>You can also use this method if you require the old version of BLAST (old BLAST with legacy support):&lt;/p>
&lt;pre>&lt;code>BLASTBIN=`which legacy_blast.pl | xargs dirname`
legacy_blast.pl blastall -p blastp -i proteins.fasta -d $NCBI_DB/nr -o blast.out --path $BLASTBIN
&lt;/code>&lt;/pre>
&lt;p>This is the preferred/recommended method (BLAST+):&lt;/p>
&lt;pre>&lt;code>blastp -query proteins.fasta -db $NCBI_DB/nr -out proteins_blastp.txt
&lt;/code>&lt;/pre>
&lt;p>Usually, we store the most recent release and 2-3 previous releases of each database. This way time consuming projects can use the same database version throughout their lifetime without always updating to the latest releases.&lt;/p>
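&lt;p>A project that needs to stay on one release could pin the version explicitly when loading the module. The sketch below assumes the &lt;code>db-ncbi/20140623&lt;/code> label from the listing above is still installed; check &lt;code>module avail db-ncbi&lt;/code> for the labels that currently exist.&lt;/p>
&lt;pre>&lt;code class="language-bash"># Load one specific database release instead of the default
module load db-ncbi/20140623
# NCBI_DB now points at that release
echo &amp;quot;Using NCBI databases in: $NCBI_DB&amp;quot;
blastp -query proteins.fasta -db &amp;quot;$NCBI_DB/nr&amp;quot; -out proteins_blastp.txt
&lt;/code>&lt;/pre>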
&lt;h3 id="additional-features">Additional Features&lt;/h3>
&lt;p>There are additional features and operations that can be done with the module command. Please run the following to get more information:&lt;/p>
&lt;pre>&lt;code class="language-bash">module help
&lt;/code>&lt;/pre>
&lt;h2 id="quotas">Quotas&lt;/h2>
&lt;h3 id="cpu-and-memory">CPU and Memory&lt;/h3>
&lt;p>Please refer to our &lt;a href="https://hpcc.ucr.edu/manuals/hpc_cluster/queue/">Queue Policies&lt;/a> page for details regarding CPU and Memory limits.&lt;/p>
&lt;h3 id="data-storage">Data Storage&lt;/h3>
&lt;p>A standard user account has a storage quota of 20GB. Much more storage space, in the range of many TBs, can be made available in a user account&amp;rsquo;s bigdata directory. The amount of storage space available in bigdata depends on a user group&amp;rsquo;s annual subscription. The pricing for extending the storage space in the bigdata directory is available &lt;a href="../../about/overview/access/">here&lt;/a>.&lt;/p>
&lt;h2 id="whats-next">What&amp;rsquo;s Next?&lt;/h2>
&lt;p>You should now know the following:&lt;/p>
&lt;ol>
&lt;li>Basic organization of the cluster&lt;/li>
&lt;li>How to log in to the cluster&lt;/li>
&lt;li>How to use the Module system to gain access to the cluster software&lt;/li>
&lt;li>CPU, storage, and memory limitations (quotas and hardware limits)&lt;/li>
&lt;/ol>
&lt;p>Now you can start using the cluster.&lt;/p>
&lt;p>The HPCC cluster uses the Slurm queuing system and thus the recommended way to run your jobs (scripts, pipelines, experiments, etc&amp;hellip;) is to submit them to this queuing system by using &lt;code>sbatch&lt;/code>.
Please &lt;strong>DO NOT RUN ANY&lt;/strong> computationally intensive tasks on any head node (i.e. jay, lark). If this policy is violated, your process will either run very slowly or be killed automatically.
The head nodes (login nodes) are a shared resource and should be accessible by all users. Negatively impacting performance would affect all users on the system and will not be tolerated.&lt;/p></description></item><item><title>Manuals: Managing Jobs</title><link>https://hpcc.ucr.edu/manuals/hpc_cluster/jobs/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://hpcc.ucr.edu/manuals/hpc_cluster/jobs/</guid><description>
&lt;h2 id="what-is-a-job">What is a Job?&lt;/h2>
&lt;p>Submitting and managing jobs is at the heart of using the cluster. A &amp;lsquo;job&amp;rsquo; refers to the script, pipeline or experiment that you run on the nodes in the cluster.&lt;/p>
&lt;h2 id="partitions">Partitions&lt;/h2>
&lt;p>Jobs are submitted to so-called partitions (or queues). Each partition is a group of nodes, often with similar hardware specifications (e.g. CPU or RAM configurations). The quota policies applying to each partition are outlined on the &lt;a href="https://hpcc.ucr.edu/manuals/hpc_cluster/queue/">Queue Policies&lt;/a> page. For more detailed hardware info, see the &lt;a href="https://hpcc.ucr.edu/about/hardware/details/#worker-nodes">Hardware Details&lt;/a> page.&lt;/p>
&lt;ul>
&lt;li>epyc
&lt;ul>
&lt;li>Nodes: r21-r38&lt;/li>
&lt;li>CPU: AMD&lt;/li>
&lt;li>Supported Extensions&lt;sup id="fnref:1">&lt;a href="#fn:1" class="footnote-ref" role="doc-noteref">1&lt;/a>&lt;/sup>: AVX, AVX2, SSE, SSE2, SSE4&lt;/li>
&lt;li>RAM: 1 GB default&lt;/li>
&lt;li>Time (walltime): 168 hours (7 days) default&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>intel
&lt;ul>
&lt;li>Default partition&lt;/li>
&lt;li>Nodes: i01-i02, i17-i40&lt;/li>
&lt;li>CPU: Intel&lt;/li>
&lt;li>Supported Extensions&lt;sup id="fnref:1">&lt;a href="#fn:1" class="footnote-ref" role="doc-noteref">1&lt;/a>&lt;/sup>: AVX, AVX2, SSE, SSE2, SSE4&lt;/li>
&lt;li>RAM: 1 GB default&lt;/li>
&lt;li>Time (walltime): 168 hours (7 days) default&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>batch
&lt;ul>
&lt;li>Nodes: c01-c48&lt;/li>
&lt;li>CPU: AMD&lt;/li>
&lt;li>Supported Extensions&lt;sup id="fnref:1">&lt;a href="#fn:1" class="footnote-ref" role="doc-noteref">1&lt;/a>&lt;/sup>: AVX, SSE, SSE2, SSE4&lt;/li>
&lt;li>RAM: 1 GB default&lt;/li>
&lt;li>Time (walltime): 168 hours (7 days) default&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>highmem
&lt;ul>
&lt;li>Nodes: h01-h06&lt;/li>
&lt;li>CPU: Intel&lt;/li>
&lt;li>Supported Extensions&lt;sup id="fnref:1">&lt;a href="#fn:1" class="footnote-ref" role="doc-noteref">1&lt;/a>&lt;/sup>: AVX, SSE, SSE2, SSE4&lt;/li>
&lt;li>RAM: 100 GB to 1000 GB&lt;/li>
&lt;li>Time (walltime): 48 hours (2 days) default&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>highclock
&lt;ul>
&lt;li>Nodes: hz01-hz04&lt;/li>
&lt;li>CPU: Intel&lt;/li>
&lt;li>Supported Extensions&lt;sup id="fnref:1">&lt;a href="#fn:1" class="footnote-ref" role="doc-noteref">1&lt;/a>&lt;/sup>: AVX, SSE, SSE2, SSE4&lt;/li>
&lt;li>RAM: 1 GB default&lt;/li>
&lt;li>Time (walltime): 168 hours (7 days) default&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>gpu
&lt;ul>
&lt;li>Nodes: gpu01-gpu08&lt;/li>
&lt;li>CPU: AMD/Intel&lt;/li>
&lt;li>GPUs: NVIDIA k80, p100, a100, h100&lt;/li>
&lt;li>RAM: 1 GB default&lt;/li>
&lt;li>Time (walltime): 48 hours (2 days) default&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>short
&lt;ul>
&lt;li>Nodes: Mixed set of nodes from batch, intel, and group partitions&lt;/li>
&lt;li>CPU: AMD/Intel&lt;/li>
&lt;li>RAM: 1 GB default&lt;/li>
&lt;li>Time (walltime): 2 hours maximum&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>short_gpu
&lt;ul>
&lt;li>Nodes: gpu01-gpu10&lt;/li>
&lt;li>CPU: AMD/Intel&lt;/li>
&lt;li>GPUs: NVIDIA k80, p100, a100, h100, ada6000&lt;/li>
&lt;li>RAM: 1 GB default&lt;/li>
&lt;li>Time (walltime): 2 hours maximum&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Lab Partitions
&lt;ul>
&lt;li>If your lab has purchased nodes, then you will have a priority partition with the same name as your group (e.g. girkelab).&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>To submit a job to a specific partition, add the optional &lt;code>-p&lt;/code> parameter with the name of the partition you want to use:&lt;/p>
&lt;pre>&lt;code class="language-bash">sbatch -p batch SBATCH_SCRIPT.sh
sbatch -p highmem SBATCH_SCRIPT.sh
sbatch -p epyc SBATCH_SCRIPT.sh
sbatch -p gpu SBATCH_SCRIPT.sh
sbatch -p intel SBATCH_SCRIPT.sh
sbatch -p highclock SBATCH_SCRIPT.sh
sbatch -p mygroup SBATCH_SCRIPT.sh
&lt;/code>&lt;/pre>
&lt;h2 id="slurm">Slurm&lt;/h2>
&lt;p>Slurm is used as a queuing system across all head nodes. &lt;a href="#getting-started">SSH directly into the cluster&lt;/a> and your connection will be automatically load balanced to a head node:&lt;/p>
&lt;pre>&lt;code class="language-bash">ssh -XY cluster.hpcc.ucr.edu
&lt;/code>&lt;/pre>
&lt;h3 id="resources-and-limits">Resources and Limits&lt;/h3>
&lt;p>To see your limits you can do the following:&lt;/p>
&lt;pre>&lt;code class="language-bash">slurm_limits
&lt;/code>&lt;/pre>
&lt;p>Check the total number of cores used by your group across all partitions:&lt;/p>
&lt;pre>&lt;code class="language-bash">group_cpus
&lt;/code>&lt;/pre>
&lt;p>However, this does not tell you when your job will start, since that depends on the duration of each job.
The best way to estimate the start time is with the &lt;code>--start&lt;/code> flag on the squeue command:&lt;/p>
&lt;pre>&lt;code class="language-bash">squeue --start -u $USER
&lt;/code>&lt;/pre>
&lt;h3 id="submitting-jobs">Submitting Jobs&lt;/h3>
&lt;p>There are two basic ways to submit jobs: non-interactive and interactive. Slurm starts each job in the directory from which you submitted it, so keep that in mind when you use relative file paths.&lt;/p>
&lt;h4 id="non-interactive-submission">Non-interactive Submission&lt;/h4>
&lt;p>Non-interactive jobs are submitted as SBATCH scripts. For example:&lt;/p>
&lt;pre>&lt;code class="language-bash">sbatch SBATCH_SCRIPT.sh
&lt;/code>&lt;/pre>
&lt;p>Here is an example of an SBATCH script:&lt;/p>
&lt;pre>&lt;code class="language-bash">#!/bin/bash -l
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=10
#SBATCH --mem=10G
#SBATCH --time=1-00:15:00 # 1 day and 15 minutes
#SBATCH --mail-user=useremail@address.com
#SBATCH --mail-type=ALL
#SBATCH --job-name=&amp;quot;just_a_test&amp;quot;
#SBATCH -p epyc # You can use any of the following; epyc, intel, batch, highmem, gpu
# Print current date
date
# Load samtools
module load samtools
# Concatenate BAMs
samtools cat -h header.sam -o out.bam in1.bam in2.bam
# Print name of node
hostname
&lt;/code>&lt;/pre>
&lt;p>The above job will request 1 node, 10 cores (parallel threads), 10GB of memory, for 1 day and 15 minutes. An email will be sent to the user when the status of the job changes (Start, Failed, Completed).
For more information regarding parallel/multi core jobs refer to &lt;a href="#parallelization">Parallelization&lt;/a>.&lt;/p>
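&lt;p>Requesting cores does not by itself make a program use them. One common pattern, sketched below, is to pass the core count from Slurm to the program. &lt;code>SLURM_CPUS_PER_TASK&lt;/code> is set by Slurm when &lt;code>--cpus-per-task&lt;/code> is given; the &lt;code>samtools sort&lt;/code> step and the file names are assumptions chosen only for illustration.&lt;/p>
&lt;pre>&lt;code class="language-bash">#!/bin/bash -l
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=10
#SBATCH --mem=10G
#SBATCH --time=02:00:00
module load samtools
# Fall back to 1 thread if the variable is not set (e.g. when testing outside Slurm)
THREADS=${SLURM_CPUS_PER_TASK:-1}
# -@ sets the number of sorting and compression threads
samtools sort -@ &amp;quot;$THREADS&amp;quot; -o sorted.bam in1.bam
&lt;/code>&lt;/pre>
&lt;p>Reading the value from the environment keeps the thread count and the Slurm request in sync when the request is changed later.&lt;/p>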
&lt;h4 id="interactive-submission">Interactive Submission&lt;/h4>
&lt;p>Interactive jobs are submitted using &lt;code>srun&lt;/code>. An example is as follows:&lt;/p>
&lt;pre>&lt;code class="language-bash">srun --pty bash -l
&lt;/code>&lt;/pre>
&lt;p>If you do not specify a partition then the &amp;ldquo;epyc&amp;rdquo; partition is used by default.&lt;/p>
&lt;p>Here is a more complete example:&lt;/p>
&lt;pre>&lt;code class="language-bash">srun --mem=1gb --cpus-per-task 1 --ntasks 1 --time 10:00:00 --x11 --pty bash -l
&lt;/code>&lt;/pre>
&lt;p>The above example enables X11 forwarding and requests 1GB of memory and 1 core for 10 hours within an interactive session.&lt;/p>
&lt;h3 id="feature-constraints">Feature Constraints&lt;/h3>
&lt;p>The &lt;code>--constraint&lt;/code> (or &lt;code>-C&lt;/code>) flag allows you to fine-tune what type of machine your job can run on. It is mainly useful on the &amp;ldquo;short&amp;rdquo; partitions. Our &lt;a href="https://docs.google.com/spreadsheets/d/1SVH1-c1i075vjt-B0wNPiK87wmLkPltWJlIPgLkmoqU/">Node List&lt;/a> contains all of the different nodes that we have (both public and private) as well as any feature constraints they have.&lt;/p>
&lt;p>&lt;a href="https://docs.google.com/spreadsheets/d/1SVH1-c1i075vjt-B0wNPiK87wmLkPltWJlIPgLkmoqU/">Node List&lt;/a>&lt;/p>
&lt;p>For more info on hardware details, see our &lt;a href="https://hpcc.ucr.edu/about/hardware/details/#worker-nodes">Hardware Details&lt;/a> page.&lt;/p>
&lt;h4 id="constraint-examples">Constraint Examples&lt;/h4>
&lt;p>Since jobs on the &amp;ldquo;short&amp;rdquo; partition can run on any node, jobs can be narrowed down using constraints.&lt;/p>
&lt;p>If you require an Intel node of any generation:&lt;/p>
&lt;pre>&lt;code>srun -p short -t 2:00:00 -c 8 --mem 8GB --constraint intel --pty bash -l
&lt;/code>&lt;/pre>
&lt;p>If you require an AMD node, but want it to be Rome or Milan generation (i.e. &lt;strong>not&lt;/strong> Abu Dhabi):&lt;/p>
&lt;pre>&lt;code>srun -p short -t 2:00:00 -c 8 --mem 8GB --constraint &amp;quot;amd&amp;amp;(rome|milan)&amp;quot; --pty bash -l
&lt;/code>&lt;/pre>
&lt;p>If you want to run on a modern GPU machine, requesting 1 GPU:&lt;/p>
&lt;pre>&lt;code>srun -p short_gpu -t 2:00:00 -c 8 --mem 8GB --gpus=1 --constraint &amp;quot;gpu_latest&amp;quot; --pty bash -l
&lt;/code>&lt;/pre>
&lt;blockquote>
&lt;p>When using constraints with GPUs, make sure to request a generic GPU&lt;/p>
&lt;/blockquote>
&lt;h3 id="monitoring-jobs">Monitoring Jobs&lt;/h3>
&lt;p>To check the state of your jobs, run the following:&lt;/p>
&lt;pre>&lt;code class="language-bash">squeue -u $USER --start
&lt;/code>&lt;/pre>
&lt;p>To list all the details of a specific job (the JOBID can be found using &lt;code>squeue&lt;/code>), run the following:&lt;/p>
&lt;pre>&lt;code class="language-bash">scontrol show job JOBID
&lt;/code>&lt;/pre>
&lt;p>To view past jobs and their details, run the following:&lt;/p>
&lt;pre>&lt;code class="language-bash">sacct -u $USER -l
&lt;/code>&lt;/pre>
&lt;p>You can also restrict the time window with the start (&lt;code>-S&lt;/code>) and end (&lt;code>-E&lt;/code>) options, using the YYYY-MM-DD format.
For example, the following command uses start and end times:&lt;/p>
&lt;pre>&lt;code class="language-bash">sacct -u $USER -S 2018-01-01 -E 2018-08-30 -l | less -S # Type 'q' to quit
&lt;/code>&lt;/pre>
&lt;p>A custom command summarizes the activity of all users on the cluster:&lt;/p>
&lt;pre>&lt;code class="language-bash">jobMonitor # or qstatMonitor
&lt;/code>&lt;/pre>
&lt;h3 id="canceling-jobs">Canceling Jobs&lt;/h3>
&lt;p>To cancel/stop your job, run the following:&lt;/p>
&lt;pre>&lt;code class="language-bash">scancel JOBID
&lt;/code>&lt;/pre>
&lt;p>You can also cancel multiple jobs:&lt;/p>
&lt;pre>&lt;code class="language-bash">scancel JOBID1 JOBID2 JOBID3
&lt;/code>&lt;/pre>
&lt;p>If you want to cancel/stop/kill ALL your jobs it is possible with the following:&lt;/p>
&lt;pre>&lt;code class="language-bash"># Be very careful when running this, it will kill all your jobs.
squeue --user $USER --noheader --format '%i' | xargs scancel
&lt;/code>&lt;/pre>
&lt;p>For more information please refer to &lt;a href="https://slurm.schedmd.com/scancel.html" title="Slurm scancel doc">Slurm scancel documentation&lt;/a>.&lt;/p>
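&lt;p>To cancel only a subset of your jobs, &lt;code>scancel&lt;/code> can also filter by user, state, or job name. The following is a short sketch; the job name is the one from the example SBATCH script in this manual and is only illustrative.&lt;/p>
&lt;pre>&lt;code class="language-bash"># Cancel only your pending jobs; running jobs are left alone
scancel --user=$USER --state=PENDING
# Cancel only your jobs that have a given job name
scancel --user=$USER --name=just_a_test
&lt;/code>&lt;/pre>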
&lt;h3 id="optimizing-jobs">Optimizing Jobs&lt;/h3>
&lt;p>After a job has been completed, you can use &lt;code>seff ##&lt;/code> (&amp;quot;##&amp;quot; being your Slurm Job ID) to check how many resources your job consumed during its run. &lt;code>seff&lt;/code> is only useful &lt;strong>after&lt;/strong> a job has completed, and will not give useful information on currently-running jobs.&lt;/p>
&lt;p>For example:&lt;/p>
&lt;pre>&lt;code>$ seff 123123
Job ID: 123123
Cluster: hpcc
User/Group: your_username/yourlab
State: COMPLETED (exit code 0)
Nodes: 1
Cores per node: 20
CPU Utilized: 03:26:14
CPU Efficiency: 95.04% of 03:37:00 core-walltime
Job Wall-clock time: 00:10:51
Memory Utilized: 81.20 GB
Memory Efficiency: 81.20% of 100.00 GB
&lt;/code>&lt;/pre>
&lt;p>In the above example, we can see good utilization of the CPU cores (95%) as well as good utilization of memory usage (81%).&lt;/p>
&lt;p>If CPU Efficiency is low, make sure that the program(s) you are running makes use of multi-threading correctly. Requesting more cores for a job will not make your program run faster if it does not properly take advantage of them.&lt;/p>
&lt;p>If Memory Efficiency is low, then you can try reducing the requested memory for a job. &lt;strong>Note:&lt;/strong> Just because you see your job uses 81.20GB of memory &lt;strong>does not&lt;/strong> mean that next time you should request exactly 81.20GB of memory. Variations in input data &lt;strong>will&lt;/strong> cause different memory usage characteristics. You should aim to request ~20% more memory than will actually be used to account for any spikes in memory usage. Slurm might miss some quick spikes of memory usage, but the Operating System will not. In this regard it&amp;rsquo;s better to overestimate on initial runs, and scale back once you find a good limit.&lt;/p>
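&lt;p>The ~20% headroom rule can be turned into a small calculation. The function below is a hypothetical helper, not a command installed on the cluster: it multiplies the &amp;ldquo;Memory Utilized&amp;rdquo; value reported by &lt;code>seff&lt;/code> by 1.2 and rounds up to a whole number of GB.&lt;/p>
&lt;pre>&lt;code class="language-bash"># Hypothetical helper: suggest a memory request in GB with 20% headroom.
# Argument 1: memory actually used, in GB (decimals allowed).
mem_request_gb() {
    awk -v used=&amp;quot;$1&amp;quot; 'BEGIN {
        v = used * 120 / 100        # add 20% headroom
        r = int(v)                  # drop the fractional part
        if (r &amp;lt; v) r++              # round up if anything was dropped
        print r
    }'
}

# 81.20 GB used, as in the seff example above: 81.20 * 1.2 = 97.44
mem_request_gb 81.20    # prints 98
&lt;/code>&lt;/pre>
&lt;p>The result could then be passed to the next submission, for example &lt;code>sbatch --mem=98G SBATCH_SCRIPT.sh&lt;/code>.&lt;/p>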
&lt;h3 id="slurm-job-reasonerror-codes">Slurm Job Reason/Error Codes&lt;/h3>
&lt;p>If a job is stuck in the queue or fails to start, there are typically Slurm error codes assigned that explain the reason. Typically these are a bit hard to parse, so below is a table of common error codes and how to work around them.&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Error Code&lt;/th>
&lt;th>Reason&lt;/th>
&lt;th>Fix&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>Resources&lt;/td>
&lt;td>This isn&amp;rsquo;t an error, but rather why your job can&amp;rsquo;t start immediately.&lt;/td>
&lt;td>Once requested resources are available, then your job will start.&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Priority&lt;/td>
&lt;td>A job with a higher priority than yours is pending and needs to run first.&lt;/td>
&lt;td>You have likely submitted many jobs in a short period of time and Slurm&amp;rsquo;s Fair-Share algorithm is allowing other higher priority jobs to run first.&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>QOSMaxWallDurationPerJobLimit&lt;/td>
&lt;td>The time limit requested on the selected partition goes over the limits. For example, requesting 3 days on the &amp;ldquo;short&amp;rdquo; partition.&lt;/td>
&lt;td>Make sure that you are within the partition&amp;rsquo;s time limit. Please refer to the &lt;a href="https://hpcc.ucr.edu/manuals/hpc_cluster/queue/#partition-quotas">Queue Policies&lt;/a> page for the per-partition time limits.&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>AssocGrpCpuLimit&lt;/td>
&lt;td>You are exceeding the Per-User CPU limit on a specific partition.&lt;/td>
&lt;td>You must wait until jobs finish within a partition to free up resources to allow additional jobs to run.&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>AssocGrpMemLimit&lt;/td>
&lt;td>You are exceeding the Per-User Memory limit on a specific partition.&lt;/td>
&lt;td>You must wait until jobs finish within a partition to free up resources to allow additional jobs to run.&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>AssocGrpGRES&lt;/td>
&lt;td>You are exceeding the Per-User GRES (GPU) limit&lt;/td>
&lt;td>You must wait until your GPU jobs finish to free up resources to allow additional jobs to run.&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>MaxSubmitJobLimit&lt;/td>
&lt;td>You are trying to submit more than 5000 jobs. There is a 5000 job limit per-user for queued and running jobs.&lt;/td>
&lt;td>Wait until some of your jobs finish, then you can continue submitting jobs.&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>ReqNodeNotAvail, Reserved for maintenance&lt;/td>
&lt;td>The time limit of your job would cause it to overlap with an upcoming maintenance.&lt;/td>
&lt;td>You can either reduce your job&amp;rsquo;s runtime or wait for the maintenance to complete.&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>PartitionConfig&lt;/td>
&lt;td>The job has been queued to the wrong partition under the wrong account.&lt;/td>
&lt;td>Some partitions require that you queue under a specific account. For example, preempt jobs need to use the preempt account (&lt;code>-A preempt&lt;/code>).&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>QOSMinGRES&lt;/td>
&lt;td>The job has not requested the minimum resources required for the specified partition.&lt;/td>
&lt;td>Some partitions require that you request a minimum number of resources. For example, for &amp;ldquo;highmem&amp;rdquo; you must request &amp;gt;= 100GB, and for &amp;ldquo;gpu&amp;rdquo; you must request a GPU using the &lt;code>--gres&lt;/code> flag.&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;p>This is only a small number of the most common reasons. For a full list please see Slurm&amp;rsquo;s &lt;a href="https://slurm.schedmd.com/job_reason_codes.html">Job Reason Codes&lt;/a> page. If you are confused as to why you&amp;rsquo;re getting a specific reason, please reach out to support.&lt;/p>
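&lt;p>To see which reason Slurm has assigned to your pending jobs, the reason field can be added to the &lt;code>squeue&lt;/code> output. This is a sketch; the column widths are arbitrary choices.&lt;/p>
&lt;pre>&lt;code class="language-bash"># List your pending jobs with job ID (%i), partition (%P), state (%T) and reason (%r)
squeue -u $USER -t PENDING -o '%.18i %.9P %.8T %r'
&lt;/code>&lt;/pre>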
&lt;h3 id="advanced-jobs">Advanced Jobs&lt;/h3>
&lt;p>There is a third way of submitting jobs by using steps.
Single Step submission:&lt;/p>
&lt;pre>&lt;code class="language-bash">srun &amp;lt;command&amp;gt;
&lt;/code>&lt;/pre>
&lt;p>Under a single step job your command will hang until appropriate resources are found and when the step command is finished the results will be sent back on STDOUT. This may take some time depending on the job load of the cluster.
Multi Step submission:&lt;/p>
&lt;pre>&lt;code class="language-bash">salloc -N 4 bash -l
srun &amp;lt;command&amp;gt;
...
srun &amp;lt;command&amp;gt;
exit
&lt;/code>&lt;/pre>
&lt;p>Under a multi step job the salloc command will request resources and then your parent shell will be running on the head node. This means that all commands will be executed on the head node unless preceded by the srun command. You will also need to exit this shell in order to terminate your job.&lt;/p>
&lt;h4 id="array-jobs">Array Jobs&lt;/h4>
&lt;p>If a large batch of fairly similar jobs needs to be submitted, an Array Job might be a good option. For an array job, include the &lt;code>--array&lt;/code> parameter in your sbatch script, similar to the following:&lt;/p>
&lt;pre>&lt;code class="language-bash">#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=2 # This will be the number of CPUs per individual array job
#SBATCH --mem=1G # This will be the memory per individual array job
#SBATCH --time=0-00:15:00 # 15 minutes
#SBATCH --array=1-2500
#SBATCH --job-name=&amp;quot;just_a_test&amp;quot;
echo &amp;quot;I have array ID ${SLURM_ARRAY_TASK_ID}&amp;quot;
&lt;/code>&lt;/pre>
&lt;p>Within each job, the &lt;code>SLURM_ARRAY_TASK_ID&lt;/code> environment variable is set and can be used to slightly change how each job is run.&lt;/p>
&lt;p>Note that there is a 2500 job limit for array jobs.&lt;/p>
&lt;p>More information can be found on the &lt;a href="https://slurm.schedmd.com/job_array.html">Slurm Documentation&lt;/a> including other Environment Variables that are set per-job.&lt;/p>
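&lt;p>One common way to use the task ID, sketched below, is to let each array task pick its own line from a file that lists one input per line. The helper &lt;code>nth_input&lt;/code> and the file name &lt;code>inputs.txt&lt;/code> are hypothetical and only serve as an illustration.&lt;/p>
&lt;pre>&lt;code class="language-bash"># Hypothetical helper: print line N of a file that lists one input per line.
nth_input() {
    sed -n &amp;quot;${1}p&amp;quot; &amp;quot;$2&amp;quot;
}

# inputs.txt is an assumed file name; task 1 reads line 1, task 2 reads line 2, ...
if [ -f inputs.txt ]; then
    INPUT=$(nth_input &amp;quot;${SLURM_ARRAY_TASK_ID:-1}&amp;quot; inputs.txt)
    echo &amp;quot;Task ${SLURM_ARRAY_TASK_ID:-1} will process: ${INPUT}&amp;quot;
fi
&lt;/code>&lt;/pre>
&lt;p>With this pattern the &lt;code>--array&lt;/code> range should match the number of lines in the input list.&lt;/p>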
&lt;h3 id="highmem-jobs">Highmem Jobs&lt;/h3>
&lt;p>The highmem partition does not have a default amount of memory set; however, it does have a minimum limit of 100GB per job. This means that you need to explicitly request at least 100GB of memory.&lt;/p>
&lt;p>Non-Interactive:&lt;/p>
&lt;pre>&lt;code class="language-bash">sbatch -p highmem --mem=100g --time=24:00:00 SBATCH_SCRIPT.sh
&lt;/code>&lt;/pre>
&lt;p>Interactive:&lt;/p>
&lt;pre>&lt;code class="language-bash">srun -p highmem --mem=100g --time=24:00:00 --pty bash -l
&lt;/code>&lt;/pre>
&lt;p>Of course you should adjust the time argument according to your job requirements.&lt;/p>
&lt;h3 id="gpu-jobs">GPU Jobs&lt;/h3>
&lt;p>GPU nodes have multiple GPUs, and vary in type (K80, P100, A100, or H100). This means you need to specify how many GPUs, and optionally which type, you would like to use.&lt;/p>
&lt;p>To request a GPU of any type, only indicate how many GPUs you would like to use.&lt;/p>
&lt;p>Non-Interactive:&lt;/p>
&lt;pre>&lt;code class="language-bash">sbatch -p gpu --gres=gpu:1 --mem=100g --time=1:00:00 SBATCH_SCRIPT.sh
&lt;/code>&lt;/pre>
&lt;p>Interactive:&lt;/p>
&lt;pre>&lt;code class="language-bash">srun -p gpu --gres=gpu:4 --mem=100g --time=1:00:00 --pty bash -l
&lt;/code>&lt;/pre>
&lt;p>Since the HPCC Cluster has many different types of GPUs installed (eg. K80, P100, A100, H100), GPUs can be requested explicitly by type. More info on what GPUs are available can be found in the &lt;a href="https://hpcc.ucr.edu/about/hardware/details/#worker-nodes">Worker Node&lt;/a> section of our &lt;a href="https://hpcc.ucr.edu/about/hardware/details/">Hardware Details&lt;/a> page.&lt;/p>
&lt;p>Non-Interactive:&lt;/p>
&lt;pre>&lt;code class="language-bash">sbatch -p gpu --gres=gpu:k80:1 --mem=100g --time=1:00:00 SBATCH_SCRIPT.sh
sbatch -p gpu --gres=gpu:p100:1 --mem=100g --time=1:00:00 SBATCH_SCRIPT.sh
sbatch -p gpu --gres=gpu:a100:1 --mem=100g --time=1:00:00 SBATCH_SCRIPT.sh
&lt;/code>&lt;/pre>
&lt;p>Interactive:&lt;/p>
&lt;pre>&lt;code class="language-bash">srun -p gpu --gres=gpu:k80:1 --mem=100g --time=1:00:00 --pty bash -l
srun -p gpu --gres=gpu:p100:1 --mem=100g --time=1:00:00 --pty bash -l
srun -p gpu --gres=gpu:a100:1 --mem=100g --time=1:00:00 --pty bash -l
&lt;/code>&lt;/pre>
&lt;p>Of course you should adjust the time argument according to your job requirements.&lt;/p>
&lt;p>Once your job starts, your code must reference the environment variable &amp;ldquo;CUDA_VISIBLE_DEVICES&amp;rdquo;, which indicates which GPUs have been assigned to your job. Most CUDA-enabled software, like MegaHIT, will check this environment variable and automatically limit itself to the assigned GPUs.&lt;/p>
&lt;p>For example, after reserving 4 GPUs for a NAMD2 job:&lt;/p>
&lt;pre>&lt;code class="language-bash">echo $CUDA_VISIBLE_DEVICES
0,1,2,3
namd2 +idlepoll +devices $CUDA_VISIBLE_DEVICES MD1.namd
&lt;/code>&lt;/pre>
&lt;p>Each group is limited to a maximum of 8 GPUs on the gpu partition. Please be respectful of others and keep in mind that the GPU nodes are a limited shared resource.
Since the CUDA libraries will only run with GPU hardware, development and compiling of code must be done within a job session on a GPU node.&lt;/p>
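&lt;p>For software that expects a GPU count rather than a device list, a script could derive the count from &lt;code>CUDA_VISIBLE_DEVICES&lt;/code>. The following is only a sketch: the variable names are made up, and the fallback value &lt;code>0,1,2,3&lt;/code> exists purely so the snippet can be tried outside of a GPU job.&lt;/p>
&lt;pre>&lt;code class="language-bash"># Comma-separated list of GPUs assigned to this job
# The fallback 0,1,2,3 is an assumption used only when testing outside a job
DEVICES=${CUDA_VISIBLE_DEVICES:-0,1,2,3}
# Count the comma-separated fields to get the number of assigned GPUs
NUM_GPUS=$(echo $DEVICES | awk -F, '{print NF}')
echo Assigned devices: $DEVICES
echo Number of GPUs: $NUM_GPUS
&lt;/code>&lt;/pre>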
&lt;p>Here are a few more examples of jobs that utilize more complex features (e.g. array, dependency, MPI, etc.):
&lt;a href="https://github.com/ucr-hpcc/hpcc_slurm_examples">Slurm Examples&lt;/a>&lt;/p>
&lt;h3 id="web-browser-access">Web Browser Access&lt;/h3>
&lt;h4 id="ports">Ports&lt;/h4>
&lt;p>Some jobs require web browser access in order to utilize the software effectively.
These kinds of jobs typically use (bind) ports in order to provide a graphical user interface (GUI) through a web browser.
Users are able to run jobs that use (bind) ports on a compute node.
Any port can be used on any compute node, as long as the port number is greater than 1000 and it is not already in use (bound).&lt;/p>
&lt;h4 id="tunneling">Tunneling&lt;/h4>
&lt;p>Once a job is running on a compute node and bound to a port, you may access this compute node via a web browser.
This is accomplished by using 2 chained SSH tunnels to route traffic through our firewall.
This acts much like 2 runners in a relay race, handing the baton to the next runner, to get past a security checkpoint.&lt;/p>
&lt;p>Running the following command on your local machine will create a tunnel that goes through a head node and connects to a
compute node on a particular port.&lt;/p>
&lt;pre>&lt;code class="language-bash">ssh -NL 8888:NodeName:8888 username@cluster.hpcc.ucr.edu
&lt;/code>&lt;/pre>
&lt;p>Port 8888 (first) is the local port you will be using on your local machine.
NodeName is the compute node where your job is running, which can be found by using the &lt;code>squeue -u $USER&lt;/code> command.
Port 8888 (second) is the remote port on the compute node.
Again, the NodeName and ports will be different depending on where your job runs and what port your job uses.&lt;/p>
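&lt;p>The local and remote ports do not have to match. As a sketch, assuming a hypothetical node name &lt;code>i54&lt;/code>, a job bound to port 8888, and a free local port 9999, the tunnel command could be assembled like this (the command is only printed here, not executed):&lt;/p>
&lt;pre>&lt;code class="language-bash">NODE=i54          # hypothetical node name taken from the squeue output
REMOTE_PORT=8888  # port that your job bound on the compute node
LOCAL_PORT=9999   # any free port on your local machine
# Print the tunnel command; run the printed command on your local machine
echo ssh -NL $LOCAL_PORT:$NODE:$REMOTE_PORT username@cluster.hpcc.ucr.edu
&lt;/code>&lt;/pre>
&lt;p>With these assumed values, the browser on your local machine would use port 9999 instead of 8888, i.e. &lt;code>http://localhost:9999&lt;/code>.&lt;/p>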
&lt;p>At this point you may need to provide a password to make the SSH tunnel.
Once this has succeeded, the command will hang (this is normal).
Leave this session connected; if you close it, your tunnel will be closed.&lt;/p>
&lt;p>Then open a browser on your local computer (PC/laptop) and point it to:&lt;/p>
&lt;pre>&lt;code>http://localhost:8888
&lt;/code>&lt;/pre>
&lt;p>If your job uses TLS/SSL, you may need to try https if the above does not work:&lt;/p>
&lt;pre>&lt;code>https://localhost:8888
&lt;/code>&lt;/pre>
&lt;h4 id="examples">Examples&lt;/h4>
&lt;ol>
&lt;li>
&lt;p>A perfect example of this method is used for Jupyter Lab/Notebook. For more details please refer to the &lt;a href="https://hpcc.ucr.edu/manuals/linux_basics/text/#jupyter-server">JupyterLab Usage&lt;/a> page.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>RStudio Server instances can also be started directly on a compute node and accessed via an SSH tunnel. For details see &lt;a href="https://hpcc.ucr.edu/manuals/linux_basics/text/#2-compute-node-instance">here&lt;/a>.&lt;/p>
&lt;/li>
&lt;/ol>
&lt;h3 id="desktop-environments">Desktop Environments&lt;/h3>
&lt;h4 id="vnc-server-cluster">VNC Server (cluster)&lt;/h4>
&lt;p>&lt;strong>Start VNC Server&lt;/strong>&lt;/p>
&lt;p>Log into the cluster:&lt;/p>
&lt;pre>&lt;code class="language-bash">ssh username@cluster.hpcc.ucr.edu
&lt;/code>&lt;/pre>
&lt;p>The VNC programs are only available on Compute Nodes, and additionally the first time you run the vncserver it will need to be configured:&lt;/p>
&lt;pre>&lt;code class="language-bash">srun -p epyc -c 2 --mem 4GB -t 10:00 --pty bash -l # Start compute session
vncserver -fg # Configure VNC
exit # Leave compute session
&lt;/code>&lt;/pre>
&lt;p>You should set a password for yourself, and the read-only password is optional.&lt;/p>
&lt;p>After your vncserver is configured, submit a vncserver job to get it started:&lt;/p>
&lt;pre>&lt;code class="language-bash">sbatch -p epyc --cpus-per-task=4 --mem=10g --time=2:00:00 --wrap='vncserver -fg' --output='vncserver-%j.out'
&lt;/code>&lt;/pre>
&lt;blockquote>
&lt;p>Note: Appropriate job resources should be requested based on the processes you will be running from within the VNC session.&lt;/p>
&lt;/blockquote>
&lt;p>Check the contents of your job log to determine the &lt;code>NodeName&lt;/code> and &lt;code>Port&lt;/code> you were assigned:&lt;/p>
&lt;pre>&lt;code class="language-bash">cat vncserver-*.out
&lt;/code>&lt;/pre>
&lt;p>The contents of your slurm job log should be similar to the following:&lt;/p>
&lt;pre>&lt;code class="language-bash">vncserver
New 'i54:1' desktop is i54:1
Creating default startup script /rhome/username/.vnc/xstartup
Starting applications specified in /rhome/username/.vnc/xstartup
Log file is /rhome/username/.vnc/i54:1.log
&lt;/code>&lt;/pre>
&lt;p>The VNC &lt;code>Port&lt;/code> is 5900+N, where N is the display number shown above in the format &lt;code>NodeName&lt;/code>:&lt;code>DisplayNumber&lt;/code> (e.g. &lt;code>i54:1&lt;/code>).
In this example (the default), the port is &lt;code>5901&lt;/code>. If this &lt;code>Port&lt;/code> is already in use, vncserver will automatically increment the DisplayNumber, and you might find something like &lt;code>i54:2&lt;/code> or &lt;code>i54:3&lt;/code> instead.&lt;/p>
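&lt;p>The port arithmetic can also be done in the shell. This is a small sketch that assumes you have copied the &lt;code>NodeName&lt;/code>:&lt;code>DisplayNumber&lt;/code> value from your job log into a variable; the variable names are arbitrary.&lt;/p>
&lt;pre>&lt;code class="language-bash">DISPLAY_ID=i54:1             # value copied from the job log (example value)
NODE=${DISPLAY_ID%%:*}       # text before the colon is the node name
NUMBER=${DISPLAY_ID##*:}     # text after the colon is the display number
PORT=$((5900 + NUMBER))      # VNC port is 5900 plus the display number
echo NodeName=$NODE Port=$PORT
&lt;/code>&lt;/pre>
&lt;p>For the example value &lt;code>i54:1&lt;/code> this prints &lt;code>NodeName=i54 Port=5901&lt;/code>.&lt;/p>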
&lt;p>&lt;strong>Stop VNC Server&lt;/strong>&lt;/p>
&lt;p>To stop the vncserver, you can click on the logout option from the upper right hand menu from within your VNC desktop environment.
If you want to kill your vncserver manually, then you will need to do the following:&lt;/p>
&lt;pre>&lt;code class="language-bash">ssh NodeName 'vncserver -kill :DisplayNumber'
&lt;/code>&lt;/pre>
&lt;p>You will need to replace &lt;code>NodeName&lt;/code> with the name of the node where your job is running, and &lt;code>DisplayNumber&lt;/code> with the DisplayNumber from your slurm job log.&lt;/p>
&lt;h4 id="vnc-client-desktoplaptop">VNC Client (Desktop/Laptop)&lt;/h4>
&lt;p>After you know the &lt;code>NodeName&lt;/code> and VNC &lt;code>Port&lt;/code> you should be able to create an SSH tunnel to your vncserver, like so:&lt;/p>
&lt;pre>&lt;code class="language-bash">ssh -N -L Port:NodeName:Port cluster.hpcc.ucr.edu
&lt;/code>&lt;/pre>
&lt;p>For the example above, the command to run on your local machine (desktop/laptop) would be:&lt;/p>
&lt;pre>&lt;code class="language-bash">ssh -L 5901:i54:5901 cluster.hpcc.ucr.edu
&lt;/code>&lt;/pre>
&lt;p>After you have logged into the cluster with this shell, log into the node where your VNC server is running:&lt;/p>
&lt;pre>&lt;code class="language-bash">ssh NodeName
&lt;/code>&lt;/pre>
&lt;p>After you have logged into the correct &lt;code>NodeName&lt;/code>, just let this terminal sit here, do not close it.&lt;/p>
&lt;p>Then launch vncviewer on your local system (laptop/workstation), like so:&lt;/p>
&lt;pre>&lt;code class="language-bash">vncviewer localhost:5901
&lt;/code>&lt;/pre>
&lt;p>After launching the vncviewer, and providing your VNC password (not your cluster password), you should be able to see a Linux desktop environment.&lt;/p>
&lt;p>For more information regarding tunnels and VNC in MS Windows, please refer to &lt;a href="https://docs.ycrc.yale.edu/clusters-at-yale/access/vnc/">More VNC Info&lt;/a>.&lt;/p>
&lt;!--
### Licenses
The cluster currently supports [Commercial Software](/about/software/commercial/). Since most of the licenses are campus wide there is no need to track individual jobs. One exception is the Intel Parallel Suite, which contains the Intel compilers.
The `--licenses` flag is used to request a license for Intel compilers, for example:
```bash
srun --license=intel:1 -p short --mem=10g --cpus-per-task=10 --time=2:00:00 --pty bash -l
module load intel
icc -help
```
The above interactive submission will request 1 Intel license, 10GB of RAM, 10 CPU cores for 2 hours on the short partition.
The short parititon can only be used for a maximum of 2 hours, however for compilation this could be sufficient.
It is recommended that you separate your compilation job from your computation/analysis job.
This way you will only have the license checked out for the duration of compilation, and not the during the execution of the analysis.
-->
&lt;h2 id="parallelization">Parallelization&lt;/h2>
&lt;p>There are 3 major ways to parallelize work on the cluster:&lt;/p>
&lt;ol>
&lt;li>Batch&lt;/li>
&lt;li>Thread&lt;/li>
&lt;li>MPI&lt;/li>
&lt;/ol>
&lt;h3 id="parallel-methods">Parallel Methods&lt;/h3>
&lt;p>For &lt;strong>batch&lt;/strong> jobs, all that is required is that you have a way to split up the data and submit multiple jobs running with the different chunks.
Some data sets, for example FASTA files, are very easy to split up (e.g. with fasta-splitter). This can also be more easily achieved by submitting an array job. For more details please refer to &lt;a href="#advanced-jobs">Advanced Jobs&lt;/a>.&lt;/p>
&lt;p>For &lt;strong>threaded&lt;/strong> jobs, your software must have an option referring to &amp;ldquo;number of threads&amp;rdquo; or &amp;ldquo;number of processors&amp;rdquo;. Once the thread/processor option is identified in the software (e.g. blastn flag &lt;code>-num_threads 4&lt;/code>), you can use it as long as you also request the same number of CPU cores (e.g. slurm flag &lt;code>--cpus-per-task=4&lt;/code>).&lt;/p>
&lt;p>For &lt;strong>MPI&lt;/strong> jobs, your software must be MPI enabled. This generally means that it was compiled with MPI libraries. Please refer to the user manual of the software you wish to use as well as our documentation regarding &lt;a href="#mpi">MPI&lt;/a>. It is important that the number of cores used is equal to the number requested.&lt;/p>
&lt;p>In Slurm you will need 2 different flags to request cores, which may seem similar, however they have different purposes:&lt;/p>
&lt;ul>
&lt;li>The &lt;code>--cpus-per-task=N&lt;/code> will provide N number of virtual cores with locality as a factor.
Closer virtual cores can be faster, assuming there is a need for rapid communication between threads.
Generally, this is good for threading, however not so good for independent subprocesses nor for MPI.&lt;/li>
&lt;li>The &lt;code>--ntasks=N&lt;/code> flag will provide N number of physical cores on a single or even multiple nodes.
These cores can be further away, since the need for physical CPUs and dedicated memory is more important.
Generally this is good for independent subprocesses, and MPI, however not so good for threading.&lt;/li>
&lt;/ul>
&lt;p>Here is a table to better explain when to use these Slurm options:&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Slurm Flag&lt;/th>
&lt;th>Single Threaded&lt;/th>
&lt;th>Multi Threaded (OpenMP)&lt;/th>
&lt;th>MPI only&lt;/th>
&lt;th>MPI + Multi Threaded (hybrid)&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;code>--cpus-per-task&lt;/code>&lt;/td>
&lt;td>&lt;/td>
&lt;td>X&lt;/td>
&lt;td>&lt;/td>
&lt;td>X&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;code>--ntasks&lt;/code>&lt;/td>
&lt;td>&lt;/td>
&lt;td>&lt;/td>
&lt;td>X&lt;/td>
&lt;td>X&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;p>As you can see:&lt;/p>
&lt;ol>
&lt;li>A single threaded job would use neither Slurm option, since Slurm already assumes at least a single core.&lt;/li>
&lt;li>A multi threaded OpenMP job would use &lt;code>--cpus-per-task&lt;/code>.&lt;/li>
&lt;li>An MPI job would use &lt;code>--ntasks&lt;/code>.&lt;/li>
&lt;li>A Hybrid job would use both.&lt;/li>
&lt;/ol>
&lt;p>For more details on how these Slurm options work please review &lt;a href="https://slurm.schedmd.com/mc_support.html">Slurm Multi-core/Multi-thread Support&lt;/a>.&lt;/p>
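&lt;p>To illustrate how the two flags combine, here is a sketch of a hybrid sbatch script. The resource values and the program name &lt;code>my_hybrid_app&lt;/code> are hypothetical. Slurm only sets &lt;code>SLURM_NTASKS&lt;/code> and &lt;code>SLURM_CPUS_PER_TASK&lt;/code> when the matching flag was given, so the script falls back to 1 for each.&lt;/p>
&lt;pre>&lt;code class="language-bash">#!/bin/bash -l
#SBATCH -p epyc
#SBATCH --ntasks=4            # 4 MPI ranks
#SBATCH --cpus-per-task=8     # 8 threads per rank
#SBATCH --mem=16gb
#SBATCH --time=01:00:00
# Default to 1 when the variables are unset (flag not given, or run outside Slurm)
TASKS=${SLURM_NTASKS:-1}
THREADS=${SLURM_CPUS_PER_TASK:-1}
# OpenMP programs read their thread count from OMP_NUM_THREADS
export OMP_NUM_THREADS=$THREADS
echo Using $TASKS tasks x $THREADS threads = $((TASKS * THREADS)) cores
# srun ./my_hybrid_app    # hypothetical program, left commented out
&lt;/code>&lt;/pre>
&lt;p>For a purely multi threaded job you would drop the &lt;code>--ntasks&lt;/code> line, and for an MPI only job you would drop the &lt;code>--cpus-per-task&lt;/code> line.&lt;/p>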
&lt;h4 id="mpi">MPI&lt;/h4>
&lt;p>MPI stands for the Message Passing Interface. MPI is a standardized API typically used for parallel and/or distributed computing.
The HPCC cluster has custom-compiled versions of MPI that allow users to run MPI jobs across multiple nodes.
These types of jobs can take advantage of hundreds of CPU cores simultaneously, thus reducing compute time.&lt;/p>
&lt;p>Many implementations of MPI exist; however, we only support the following:&lt;/p>
&lt;ul>
&lt;li>&lt;a href="http://www.open-mpi.org/">Open MPI&lt;/a>&lt;/li>
&lt;li>&lt;a href="http://www.mpich.org/">MPICH&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://software.intel.com/en-us/mpi-developer-guide-linux">IMPI&lt;/a>&lt;/li>
&lt;/ul>
&lt;p>For general information on MPI under Slurm look &lt;a href="https://slurm.schedmd.com/mpi_guide.html">here&lt;/a>.
If you need to compile an MPI application then please email &lt;a href="mailto:support@hpcc.ucr.edu">support@hpcc.ucr.edu&lt;/a> for assistance.&lt;/p>
&lt;p>When submitting MPI jobs it is best to ensure that the nodes are identical, since MPI is sensitive to differences in CPU and/or memory speeds.
The &lt;code>batch&lt;/code> and &lt;code>intel&lt;/code> partitions are designed to be homogeneous, however, the &lt;code>short&lt;/code> partition is a mixed set of nodes.
When using the &lt;code>short&lt;/code> partition for MPI append the constraint flag for Slurm.&lt;/p>
&lt;p>&lt;strong>Short Example&lt;/strong>&lt;/p>
&lt;p>Here is an example that shows how to ensure that your job will only run on &lt;code>intel&lt;/code> nodes from the &lt;code>short&lt;/code> partition:&lt;/p>
&lt;pre>&lt;code class="language-bash">sbatch -p short --constraint=intel myJobScript.sh
&lt;/code>&lt;/pre>
&lt;p>&lt;strong>NAMD Example&lt;/strong>&lt;/p>
&lt;p>To run a NAMD2 process as an OpenMPI job on the cluster:&lt;/p>
&lt;ol>
&lt;li>
&lt;p>Log-in to the cluster&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Create SBATCH script&lt;/p>
&lt;pre>&lt;code class="language-bash">#!/bin/bash -l
#SBATCH -J c3d_cr2_md
#SBATCH -p epyc
#SBATCH --ntasks=32
#SBATCH --mem=16gb
#SBATCH --time=01:00:00
# Load needed modules
# You could also load frequently used modules from within your ~/.bashrc
module load slurm # Should already be loaded
module load openmpi # Should already be loaded
module load namd
# Run job utilizing all requested processors
# Please visit the namd site for usage details: http://www.ks.uiuc.edu/Research/namd/
mpirun --mca btl ^tcp namd2 run.conf &amp;amp;&amp;gt; run_namd.log
&lt;/code>&lt;/pre>
&lt;/li>
&lt;li>
&lt;p>Submit SBATCH script to Slurm queuing system&lt;/p>
&lt;pre>&lt;code class="language-bash">sbatch run_namd.sh
&lt;/code>&lt;/pre>
&lt;/li>
&lt;/ol>
&lt;p>&lt;strong>Maker Example&lt;/strong>&lt;/p>
&lt;p>OpenMPI does not function properly with Maker; you must use MPICH.
Our version of MPICH does not use the mpirun/mpiexec wrappers; use srun instead:&lt;/p>
&lt;pre>&lt;code class="language-bash">#!/bin/bash -l
#SBATCH -p epyc
#SBATCH --ntasks=32
#SBATCH --mem=16gb
#SBATCH --time=01:00:00
# Load maker
module load maker/2.31.11
srun maker # Provide appropriate maker options here
&lt;/code>&lt;/pre>
&lt;h2 id="more-examples">More examples&lt;/h2>
&lt;p>The range of differing jobs and how to submit them is endless:&lt;/p>
&lt;pre>&lt;code>1. Singularity containers
2. Database services
3. Graphical user interfaces
4. Etc ...
&lt;/code>&lt;/pre>
&lt;p>For a growing list of examples please visit &lt;a href="https://github.com/ucr-hpcc/hpcc_slurm_examples">HPCC Slurm Examples&lt;/a>.&lt;/p>
&lt;section class="footnotes" role="doc-endnotes">
&lt;hr>
&lt;ol>
&lt;li id="fn:1" role="doc-endnote">
&lt;p>These only list the most common CPU Extensions for each platform. A full list of supported extensions can be found using the &lt;code>lscpu&lt;/code> command on the respective node type.&amp;#160;&lt;a href="#fnref:1" class="footnote-backref" role="doc-backlink">&amp;#x21a9;&amp;#xfe0e;&lt;/a>&lt;/p>
&lt;/li>
&lt;/ol>
&lt;/section></description></item><item><title>Manuals: Queue Policies</title><link>https://hpcc.ucr.edu/manuals/hpc_cluster/queue/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://hpcc.ucr.edu/manuals/hpc_cluster/queue/</guid><description>
&lt;h2 id="start-times">Start Times&lt;/h2>
&lt;p>Start times are a great way to track your jobs:&lt;/p>
&lt;pre>&lt;code class="language-bash">squeue -u $USER --start
&lt;/code>&lt;/pre>
&lt;p>Start times are rough estimates based on the current state of the queue.&lt;/p>
&lt;h2 id="partition-quotas">Partition Quotas&lt;/h2>
&lt;p>Each partition has a specific use case. The table below outlines each partition, its use case, and any job/user/group limits that are in place.
Empty boxes imply no limit at that level, but the job is still bound by the next higher limit. Job limits are capped by user limits, and user limits are capped by group limits.&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Partition Name&lt;/th>
&lt;th>Usecase&lt;/th>
&lt;th>Per-User Limit&lt;/th>
&lt;th>Per-Job Limit&lt;/th>
&lt;th>Max Job Time&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>epyc (2021 CPU), intel (2016 CPU), batch (2012 CPU)&lt;/td>
&lt;td>CPU Intensive Workloads, Multithreaded, MPI, OpenMP&lt;/td>
&lt;td>384 Cores, 1TB memory &lt;sup id="fnref:1">&lt;a href="#fn:1" class="footnote-ref" role="doc-noteref">1&lt;/a>&lt;/sup>&lt;/td>
&lt;td>64GB memory per Core &lt;sup id="fnref:2">&lt;a href="#fn:2" class="footnote-ref" role="doc-noteref">2&lt;/a>&lt;/sup>,&lt;sup id="fnref:3">&lt;a href="#fn:3" class="footnote-ref" role="doc-noteref">3&lt;/a>&lt;/sup>&lt;/td>
&lt;td>30 Days&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>short&lt;/td>
&lt;td>Short CPU Intensive Workloads, Multithreaded, MPI, OpenMP&lt;/td>
&lt;td>384 Cores, 1TB memory&lt;/td>
&lt;td>64GB memory per Core, 2-hour time limit&lt;/td>
&lt;td>2 Hours&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>highmem&lt;/td>
&lt;td>Memory Intensive Workloads&lt;/td>
&lt;td>32 Cores, 2TB memory&lt;/td>
&lt;td>&lt;/td>
&lt;td>30 Days&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>highclock&lt;/td>
&lt;td>Low Parallelism Workloads&lt;/td>
&lt;td>32 Cores, 256GB memory&lt;/td>
&lt;td>&lt;/td>
&lt;td>30 Days&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>gpu&lt;/td>
&lt;td>GPU-Enabled Workloads&lt;/td>
&lt;td>4 GPUs&lt;sup id="fnref:4">&lt;a href="#fn:4" class="footnote-ref" role="doc-noteref">4&lt;/a>&lt;/sup>,48 Cores, 512GB memory&lt;/td>
&lt;td>16 Cores, 256GB memory &lt;sup id="fnref:2">&lt;a href="#fn:2" class="footnote-ref" role="doc-noteref">2&lt;/a>&lt;/sup>,&lt;sup id="fnref:5">&lt;a href="#fn:5" class="footnote-ref" role="doc-noteref">5&lt;/a>&lt;/sup>&lt;/td>
&lt;td>7 Days&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;p>In addition to the above limits, there is:&lt;/p>
&lt;ul>
&lt;li>A 768 core group limit that spans across all users in a group across all partitions.&lt;/li>
&lt;li>An 8 GPU group limit that spans across all users in a group across all GPU-enabled partitions.&lt;/li>
&lt;/ul>
&lt;p>For more detailed information on node hardware, see our &lt;a href="https://docs.google.com/spreadsheets/d/1SVH1-c1i075vjt-B0wNPiK87wmLkPltWJlIPgLkmoqU/">Node List&lt;/a> spreadsheet.&lt;/p>
&lt;p>Attempting to allocate more memory than a node can support, e.g. 500GB on an Intel node, will cause the job to fail immediately. Limits apply to actively running jobs,
and any newly queued job that exceeds a limit will remain queued until resources become available. If you require additional
resources beyond the listed limits, please see the &amp;ldquo;&lt;a href="#additional-resource-request">Additional Resource Request&lt;/a>&amp;rdquo; section below.&lt;/p>
&lt;p>Partition quotas can also be viewed on the cluster using the &lt;code>slurm_limits&lt;/code> command.&lt;/p>
&lt;p>Additionally, users can have up to 5000 jobs queued or running at the same time. Attempting to queue more than 5000 jobs will cause job submissions to fail with the reason &amp;ldquo;MaxSubmitJobLimit&amp;rdquo;.&lt;/p>
&lt;h3 id="external-labs">External Labs&lt;/h3>
&lt;p>Labs external to UCR will have reduced resource limits as follows:&lt;/p>
&lt;ul>
&lt;li>Labs will have a CPU quota of 256 cores across all lab users across all partitions&lt;/li>
&lt;li>Per user CPU quotas on epyc, intel, batch, and short will be 128 cores&lt;/li>
&lt;li>Per user CPU quotas on highmem will be 16&lt;/li>
&lt;li>GPU quotas on the gpu partition will be 4 per-lab, and 2 per-user&lt;/li>
&lt;li>CPU quotas on the gpu partition will be 24 per-user and 8 per-job&lt;/li>
&lt;/ul>
&lt;h3 id="private-node-ownership">Private Node Ownership&lt;/h3>
&lt;p>Labs have the ability to purchase nodes and connect them to the cluster for increased quotas. More information can be found in the &lt;a href="https://hpcc.ucr.edu/about/overview/access/#ownership-models">Ownership Model&lt;/a> section of our Access page.&lt;/p>
&lt;h3 id="additional-resource-request">Additional Resource Request&lt;/h3>
&lt;p>Sometimes, whether it be due to deadlines or technical limitations, more resources might be needed than are supplied by default. If you require
a temporary increase in quotas, please reach out to &lt;a href="mailto:support@hpcc.ucr.edu">support@hpcc.ucr.edu&lt;/a> with a complete &amp;ldquo;&lt;a href="https://docs.google.com/document/d/1Ate2yOdmaYrwzcjNp8S4-8VeAuH2WYwAuyVxAv0FDfo/">Justification for Quota Exception&lt;/a>&amp;rdquo; form.
The following are typical circumstances that could justify increased quotas:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Urgent Deadlines&lt;/strong>: e.g. grant submissions, conference presentations, paper deadlines&lt;/li>
&lt;li>&lt;strong>Special Technical Needs&lt;/strong>: The limits do not meet the technical requirements of the program(s) you are trying to run.&lt;/li>
&lt;/ul>
&lt;p>The amount of additional resources, the length of time that the resources are needed, and the frequency of the requests are all factors that determine whether your request
will be accepted. It also must be within the capacity of the HPCC&amp;rsquo;s infrastructure while also ensuring minimal disruption to other users. The final decision of approving
exception requests, and how many extra resources to provide, will be decided by the HPCC Staff, the Director, and in exceptional cases the HPCC Oversight Committee.&lt;/p>
&lt;p>Requests limited by unoptimized code/datasets or strictly for the sake of convenience will be denied.&lt;/p>
&lt;p>Additionally, at this time we are unable to grant additional resource requests for external labs because our cluster is partially subsidized by our campus. We apologize for this,
and suggest looking into national computing facilities or cloud offerings to fill the gap in compute.&lt;/p>
&lt;h3 id="example-scenarios">Example Scenarios&lt;/h3>
&lt;h4 id="per-job-limit">Per-Job Limit&lt;/h4>
&lt;p>A job is submitted on the gpu partition. The job requests 32 cores.&lt;/p>
&lt;blockquote>
&lt;p>This job will not be able to be submitted, as 32 cores is above the partition&amp;rsquo;s 16 core per-job limit.&lt;/p>
&lt;/blockquote>
&lt;h4 id="per-user-limit">Per-User Limit&lt;/h4>
&lt;p>You submit a job to the highmem partition, requesting 32 cores.&lt;/p>
&lt;blockquote>
&lt;p>This job will start successfully, as it is within the partition&amp;rsquo;s core limit.&lt;/p>
&lt;/blockquote>
&lt;p>You submit a second job while the first job is still running. The new job is requesting 32 cores.&lt;/p>
&lt;blockquote>
&lt;p>Because you are at your per-user core limit on the highmem partition, the second job will be queued until the first job finishes.&lt;/p>
&lt;/blockquote>
&lt;h4 id="per-lab-limit">Per-Lab Limit&lt;/h4>
&lt;p>User A submits a job requesting 384 cores. User B submits a job requesting 384 cores.&lt;/p>
&lt;blockquote>
&lt;p>Because each user is within their per-user limits and the lab is within their limit, the jobs will run in parallel.&lt;/p>
&lt;/blockquote>
&lt;p>User C submits a job, requesting 16 cores.&lt;/p>
&lt;blockquote>
&lt;p>Because User A and User B are using all 768 cores within the lab, User C&amp;rsquo;s job will be queued until either User A&amp;rsquo;s or User B&amp;rsquo;s job finishes.&lt;/p>
&lt;/blockquote>
&lt;h2 id="changing-partitions">Changing Partitions&lt;/h2>
&lt;p>In &lt;code>srun&lt;/code> commands and &lt;code>sbatch&lt;/code> scripts, the &lt;code>-p&lt;/code> or &lt;code>--partition&lt;/code> flag controls which partition/queue a job will run on. For example,
using &lt;code>-p epyc&lt;/code> will have your job queued and run on the &lt;code>epyc&lt;/code> partition. For more examples and information on running jobs,
see the &lt;a href="https://hpcc.ucr.edu/manuals/hpc_cluster/jobs/">Managing Jobs&lt;/a> page of our documentation.&lt;/p>
&lt;h2 id="fair-share">Fair-Share&lt;/h2>
&lt;p>Users that have not submitted any jobs in a long time usually have a higher priority than others that have run jobs recently.
Thus the estimated start times can be extended to allow everyone their fair share of the system.
This prevents a few large groups from dominating the queuing system for long periods of time.&lt;/p>
&lt;p>You can see with the &lt;code>sqmore&lt;/code> command what priority your job has (list is sorted from lowest to highest priority).
You can also check to see how your group&amp;rsquo;s priority is compared to other groups on the cluster with the &amp;ldquo;sshare&amp;rdquo; command.&lt;/p>
&lt;p>For example:&lt;/p>
&lt;pre>&lt;code class="language-bash">sshare
&lt;/code>&lt;/pre>
&lt;p>It may also be useful to see your entire group&amp;rsquo;s fairshare score and who has used the most shares:&lt;/p>
&lt;pre>&lt;code class="language-bash">sshare -A $GROUP --all
&lt;/code>&lt;/pre>
&lt;p>Lastly, if you only want to see your own fairshare score:&lt;/p>
&lt;pre>&lt;code class="language-bash">sshare -A $GROUP -u $USER
&lt;/code>&lt;/pre>
&lt;p>The fairshare score is a number between 0 and 1. The best score being 1, and the worst being 0.
The fairshare score approaches zero the more resources you (or your group) consume.
Your individual consumption of resources (usage) does affect your entire group&amp;rsquo;s fairshare score.
The effects of your running/completed jobs on your fairshare score are halved each day (half-life).
Thus, after waiting several days without running any jobs, you should see an improvement in your fairshare score.&lt;/p>
&lt;p>Here is a very good &lt;a href="https://www.rc.fas.harvard.edu/fairshare/">explanation of fairshare&lt;/a>.&lt;/p>
&lt;h2 id="priority">Priority&lt;/h2>
&lt;p>The fairshare score and your job&amp;rsquo;s queue wait time are used to calculate your job&amp;rsquo;s priority.
You can use the &lt;code>sprio&lt;/code> command to check the priority of your jobs:&lt;/p>
&lt;pre>&lt;code>sprio -u $USER
&lt;/code>&lt;/pre>
&lt;p>Even if your group has a lower fairshare score, your job may still have a very high priority.
This would be likely due to the job&amp;rsquo;s queue wait time, and it should start as soon as possible regardless of fairshare score.
You can use the &lt;code>sqmore&lt;/code> command to see a list of all jobs sorted by priority.&lt;/p>
&lt;h2 id="backfill">Backfill&lt;/h2>
&lt;p>Some small jobs may start before yours, but only if they can complete before your job is scheduled to start, so they do not negatively affect your start time.&lt;/p>
&lt;h2 id="priority-partition">Priority Partition&lt;/h2>
&lt;p>Some groups on our system have purchased additional hardware. These nodes will not be affected by the fairshare score.
This is because jobs submitted to the group&amp;rsquo;s partition will be evaluated first before any other jobs that have been submitted to those nodes from a different partition.&lt;/p>
&lt;h2 id="using-the-preempt-partitions-tenative">Using the Preempt Partitions (TENATIVE)&lt;/h2>
&lt;p>&lt;strong>NOTE&lt;/strong> The full release of the preempt partition is planned for future release and &lt;strong>is not&lt;/strong> yet available!&lt;/p>
&lt;p>This guide assumes that you know how to run Interactive and Batch jobs through Slurm. This is a more advanced job that expands on other jobs, so if you are not familiar with running jobs then please see the &lt;a href="https://hpcc.ucr.edu/manuals/hpc_cluster/jobs/">Managing Jobs&lt;/a> page of our documentation.&lt;/p>
&lt;p>There are two partitions that will have preemption enabled: &amp;ldquo;preempt&amp;rdquo; for CPU jobs, and &amp;ldquo;preempt_gpu&amp;rdquo; for GPU jobs.&lt;/p>
&lt;p>To fully take advantage of preemption, your jobs must be able to tolerate being cancelled at a random time and restarted at some later point in the future. When your job is preempted, it will be cancelled and requeued. When the job is eligible to start again, it will start from the beginning of the sbatch script as if it were newly run. Jobs run under the preemption-enabled partitions run at a lower priority than the other public partitions, as preemption&amp;rsquo;s main purpose is to fill capacity that would otherwise be idle.&lt;/p>
&lt;p>Your job is only guaranteed &lt;strong>1 minute&lt;/strong> of uninterrupted runtime after it starts before it is eligible to be preempted by higher priority jobs.&lt;/p>
&lt;h3 id="job-limitations">Job Limitations&lt;/h3>
&lt;h4 id="time">Time&lt;/h4>
&lt;p>As mentioned above, jobs can be killed at any time after the 1 minute grace period. Jobs should be set up such that any initialization steps that cannot tolerate being randomly killed happen within that first minute. The max walltime of a job is currently set to 1 day (24 hours). The time limit may be changed in the future depending on how the community utilizes the partitions.&lt;/p>
&lt;h4 id="resources">Resources&lt;/h4>
&lt;p>Currently, users are allowed to use an equal number of CPU cores as their current CPU limit. If you&amp;rsquo;re currently allowed to use 384 cores on the public partitions, then you can use 384 cores on the preempt partition. The same applies to memory. For the GPU partition, users are currently allowed to use 1 GPU on the &amp;ldquo;preempt_gpu&amp;rdquo; partition. Resource limits might be changed in the future depending on how the community utilizes the partitions.&lt;/p>
&lt;h3 id="starting-a-job">Starting a job&lt;/h3>
&lt;p>Similar to other partitions, you must specifically queue jobs to the &lt;code>preempt&lt;/code> partition. One special thing that is required is to also specify the &lt;code>preempt&lt;/code> account using &lt;code>-A preempt&lt;/code>. Jobs started on the preempt partition &lt;strong>do not&lt;/strong> count against your lab&amp;rsquo;s CPU quota.&lt;/p>
&lt;h4 id="interactive-example">Interactive Example&lt;/h4>
&lt;p>To start a CPU preemptable interactive job, you can build off of the following command:&lt;/p>
&lt;pre>&lt;code class="language-bash">srun -A preempt -p preempt -c 8 --mem 8GB --pty bash -l
&lt;/code>&lt;/pre>
&lt;p>This will start a job with 8 cores and 8GB of memory on the &lt;code>preempt&lt;/code> partition under the &lt;code>preempt&lt;/code> account. Jobs that do not explicitly state &lt;code>-A preempt&lt;/code> will fail to start. Note that because this is a preemptable job, your session can be terminated at any moment without notice after the 1 minute grace period.&lt;/p>
&lt;p>To start a GPU preemptable interactive job, you can build off of the following command:&lt;/p>
&lt;pre>&lt;code class="language-bash">srun -A preempt -p preempt_gpu --gres=gpu:1 -c 8 --mem 8GB --pty bash -l
&lt;/code>&lt;/pre>
&lt;h4 id="non-interactive-batch-example">Non-interactive (batch) Example&lt;/h4>
&lt;p>As with all preemptable jobs, batch jobs can be cancelled at any time without notice and your programs/scripts &lt;em>must&lt;/em> be able to tolerate this. Jobs that have been preempted will automatically be requeued to resume running at a later time when resources become available. The &lt;code>$SLURM_RESTART_COUNT&lt;/code> environment variable can be used to check if the job has been preempted and restarted to allow you to recover and resume running.&lt;/p>
&lt;p>To start a batch job, you can build off of the following sbatch file:&lt;/p>
&lt;pre>&lt;code>#!/bin/bash -l
#SBATCH -A preempt # Remember to use the &amp;quot;preempt&amp;quot; account!
#SBATCH -p preempt
#SBATCH -c 8
#SBATCH --mem 8GB
#SBATCH --time 1-00:00:00
# Check if this is the first run or a resumed job
# SLURM_RESTART_COUNT is only set after a requeue, so default to 0 when it is unset
if [ &amp;quot;${SLURM_RESTART_COUNT:-0}&amp;quot; -eq 0 ]; then
echo &amp;quot;This is the first time running the job&amp;quot;
# Put the code for the first run here
# Example: initializing data or setting up environment
# Remember that a job only has 1 minute of guaranteed runtime. Keep
# any initialization/recovery short otherwise it might be interrupted
else
echo &amp;quot;The job is being resumed after a preemption&amp;quot;
# Put the code for a resumed job here
# Example: resuming from a checkpoint or continuing work
# Remember that a job only has 1 minute of guaranteed runtime. Keep
# any initialization/recovery short otherwise it might be interrupted
fi
# Common job code that runs regardless of first run or resume
echo &amp;quot;Running main job tasks...&amp;quot;
# Put your main job code here
&lt;/code>&lt;/pre>
&lt;p>Jobs that do not explicitly state &lt;code>#SBATCH -A preempt&lt;/code> will fail to start. Note that because this is a preemptable job, your job can be cancelled at any moment without notice.&lt;/p>
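&lt;p>One way to make a batch job tolerate preemption is to record progress in a checkpoint file and skip completed work after a requeue. The following is a minimal sketch under stated assumptions, not a site-provided recipe: the checkpoint path, the step count, and the &lt;code>process_step&lt;/code> function are hypothetical placeholders for your own workload.&lt;/p>
&lt;pre>&lt;code class="language-bash">#!/bin/bash -l
#SBATCH -A preempt
#SBATCH -p preempt
#SBATCH -c 1
#SBATCH --mem 1GB
#SBATCH --time 1-00:00:00

# Hypothetical checkpoint file that stores the number of the last completed step
CHECKPOINT=&amp;quot;$HOME/my_job.checkpoint&amp;quot;
TOTAL_STEPS=100   # Hypothetical number of work units

# Placeholder for the real work done in each step
process_step() {
    echo &amp;quot;Processing step $1&amp;quot;
}

# Resume after the last completed step, or start at step 1 on the first run
LAST_DONE=0
if [ -f &amp;quot;$CHECKPOINT&amp;quot; ]; then
    LAST_DONE=$(cat &amp;quot;$CHECKPOINT&amp;quot;)
fi

for step in $(seq $((LAST_DONE + 1)) &amp;quot;$TOTAL_STEPS&amp;quot;); do
    process_step &amp;quot;$step&amp;quot;
    # Record progress only after the step has finished; write to a temporary
    # file first so that a kill during the write cannot corrupt the checkpoint
    echo &amp;quot;$step&amp;quot; &amp;gt; &amp;quot;$CHECKPOINT.tmp&amp;quot;
    mv &amp;quot;$CHECKPOINT.tmp&amp;quot; &amp;quot;$CHECKPOINT&amp;quot;
done

rm -f &amp;quot;$CHECKPOINT&amp;quot;   # Remove the checkpoint once all steps are done
&lt;/code>&lt;/pre>
&lt;p>Because the checkpoint is updated after every step, a preempted job loses at most the step that was in progress when it was killed.&lt;/p>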
&lt;h4 id="selecting-resources">Selecting Resources&lt;/h4>
&lt;p>Similar to the &amp;ldquo;short&amp;rdquo; partition, the &amp;ldquo;preempt&amp;rdquo; partition is a union of all public and private machines, excluding specialty partitions like highmem, highclock, GPU, etc. This means that if you do not specify any restrictions, your job can run on nodes in the batch, intel, or epyc partition. If a certain architecture is required for your job, then you can use the &lt;code>--constraint&lt;/code> flag. Similarly, the &amp;ldquo;preempt_gpu&amp;rdquo; partition is a union of all public and private GPU machines, similar to &amp;ldquo;short_gpu&amp;rdquo;. Constraints can be used on the &amp;ldquo;preempt_gpu&amp;rdquo; partition as well to use specific resources.&lt;/p>
&lt;p>For example, if you want your job to run on an Intel machine, you can include &lt;code>#SBATCH --constraint=intel&lt;/code> in your sbatch script, or &lt;code>--constraint=intel&lt;/code> in your srun command. If you want either an Intel or Epyc Rome machine, then you could use &lt;code>#SBATCH --constraint=intel|rome&lt;/code> in your sbatch script, or &lt;code>--constraint=&amp;quot;intel|rome&amp;quot;&lt;/code> in your srun command (quoted so that the shell does not interpret &lt;code>|&lt;/code> as a pipe). More information on constraints is available in the &lt;a href="https://slurm.schedmd.com/sbatch.html#OPT_constraint">Slurm Documentation&lt;/a>.&lt;/p>
&lt;p>To view which nodes provide which features, see the &lt;a href="https://hpcc.ucr.edu/manuals/hpc_cluster/jobs/#feature-constraints">Feature Constraints&lt;/a> page.&lt;/p>
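&lt;p>As a short illustration of constraints on the command line, the sketch below lists node features with &lt;code>sinfo&lt;/code> and then starts an interactive preemptable job restricted to Intel or Epyc Rome nodes. The resource sizes are only examples taken from the commands above.&lt;/p>
&lt;pre>&lt;code class="language-bash"># List the feature tags of the nodes in the preempt partition
# (%N = node names, %f = features)
sinfo -p preempt -o &amp;quot;%N %f&amp;quot;

# Interactive preemptable job restricted to Intel or Epyc Rome nodes
# The quotes stop the shell from treating &amp;quot;|&amp;quot; as a pipe
srun -A preempt -p preempt --constraint=&amp;quot;intel|rome&amp;quot; -c 8 --mem 8GB --pty bash -l
&lt;/code>&lt;/pre>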
&lt;section class="footnotes" role="doc-endnotes">
&lt;hr>
&lt;ol>
&lt;li id="fn:1" role="doc-endnote">
&lt;p>The 384 core/1TB limit is per-user across all CPU compute partitions (epyc, intel, and batch). Jobs that would bring a user above 384 cores, even if spread across multiple CPU compute partitions, will be queued until resources become available. Specialized partitions (e.g. short, highmem, gpu) do not count against the CPU compute partitions&amp;rsquo; quotas.&amp;#160;&lt;a href="#fnref:1" class="footnote-backref" role="doc-backlink">&amp;#x21a9;&amp;#xfe0e;&lt;/a>&lt;/p>
&lt;/li>
&lt;li id="fn:2" role="doc-endnote">
&lt;p>A 64GB-per-core limit is placed to prevent over allocating memory compared to CPUs. If more than a 64GB-per-core ratio is requested, the core count will be increased to match.&amp;#160;&lt;a href="#fnref:2" class="footnote-backref" role="doc-backlink">&amp;#x21a9;&amp;#xfe0e;&lt;/a>&lt;/p>
&lt;/li>
&lt;li id="fn:3" role="doc-endnote">
&lt;p>Allocatable memory per-node in the &lt;strong>epyc&lt;/strong> partition is limited to &lt;strong>~950GB&lt;/strong> to allow for system overhead.&amp;#160;&lt;a href="#fnref:3" class="footnote-backref" role="doc-backlink">&amp;#x21a9;&amp;#xfe0e;&lt;/a>&lt;/p>
&lt;/li>
&lt;li id="fn:4" role="doc-endnote">
&lt;p>If a user needs more than 4 GPUs, please contact &lt;a href="mailto:support@hpcc.ucr.edu">support@hpcc.ucr.edu&lt;/a> with a short justification for a temporary increase.&amp;#160;&lt;a href="#fnref:4" class="footnote-backref" role="doc-backlink">&amp;#x21a9;&amp;#xfe0e;&lt;/a>&lt;/p>
&lt;/li>
&lt;li id="fn:5" role="doc-endnote">
&lt;p>Allocatable memory per-node in the &lt;strong>gpu&lt;/strong> partition depends on the node: 115GB for gpu[01-02], 500GB for gpu[03-04], 200GB for gpu05, 922GB for gpu06, and 950GB for gpu[07-08].&amp;#160;&lt;a href="#fnref:5" class="footnote-backref" role="doc-backlink">&amp;#x21a9;&amp;#xfe0e;&lt;/a>&lt;/p>
&lt;/li>
&lt;/ol>
&lt;/section></description></item><item><title>Manuals: Package Management</title><link>https://hpcc.ucr.edu/manuals/hpc_cluster/package_manage/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://hpcc.ucr.edu/manuals/hpc_cluster/package_manage/</guid><description>
&lt;h2 id="python">Python&lt;/h2>
&lt;p>The scope of this manual is a brief introduction on how to manage Python packages.&lt;/p>
&lt;h3 id="python-versions">Python Versions&lt;/h3>
&lt;p>Different Python versions do not play nice with each other. It is best to only load one Python module at any given time.
The miniconda3 module for Python is the default version. This will enable users to leverage the conda installer, but with as few Python packages pre-installed as possible. This is to avoid conflicts with future needs of individuals.&lt;/p>
&lt;h4 id="conda">Conda&lt;/h4>
&lt;p>We have several Conda software modules:&lt;/p>
&lt;ol>
&lt;li>miniconda3 - Basic Python 3 install (Default)&lt;/li>
&lt;li>anaconda - Full Python 3 install&lt;/li>
&lt;/ol>
&lt;p>For more information regarding our module system, please refer to &lt;a href="../../manuals/hpc_cluster/start/#modules">Environment Modules&lt;/a>.&lt;/p>
&lt;p>The miniconda3 module is a very basic install. Users can unload it and load the fuller anaconda module instead, like so:&lt;/p>
&lt;pre>&lt;code class="language-bash">module unload miniconda3
module load anaconda
&lt;/code>&lt;/pre>
&lt;p>After loading anaconda, you will see that there are many more Python packages installed (e.g. numpy, scipy, pandas, jupyter, etc&amp;hellip;).
For a list of installed Python packages try the following:&lt;/p>
&lt;pre>&lt;code class="language-bash">pip list
&lt;/code>&lt;/pre>
&lt;h4 id="virtual-environments">Virtual Environments&lt;/h4>
&lt;p>Sometimes it is best to create your own environment in which you have full control over package installs.
Conda allows you to do this through virtual environments.&lt;/p>
&lt;h5 id="initialize">Initialize&lt;/h5>
&lt;p>Conda now initializes automatically when you load the corresponding module. There is no need to run &lt;code>conda init&lt;/code> or make any modifications to your &lt;code>~/.bashrc&lt;/code> file.&lt;/p>
&lt;h5 id="configure">Configure&lt;/h5>
&lt;p>Installing many packages can consume a large amount of disk space (e.g. &amp;gt;20GB), so it is recommended to store conda environments under your bigdata space.
If you have bigdata, configure this with a &lt;code>.condarc&lt;/code> file; otherwise conda environments will be created under your home directory.&lt;/p>
&lt;p>Create the file &lt;code>.condarc&lt;/code> in your home directory with the following content:&lt;/p>
&lt;pre>&lt;code>channels:
- defaults
pkgs_dirs:
- ~/bigdata/.conda/pkgs
envs_dirs:
- ~/bigdata/.conda/envs
auto_activate_base: false
&lt;/code>&lt;/pre>
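&lt;p>To confirm that conda picked up the new settings, you can print the relevant configuration keys. This is a small sketch that assumes the &lt;code>.condarc&lt;/code> shown above is in place and a conda module is loaded.&lt;/p>
&lt;pre>&lt;code class="language-bash"># Print the package cache and environment directories conda will use
conda config --show pkgs_dirs envs_dirs
&lt;/code>&lt;/pre>
&lt;p>The bigdata paths should appear first in both lists.&lt;/p>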
&lt;blockquote>
&lt;p>After changing the configuration, an existing environment can be moved to the new bigdata location by renaming it twice: first run &lt;code>conda rename -n NAME NAME_tmp&lt;/code>, then &lt;code>conda rename -n NAME_tmp NAME&lt;/code> to return it to its original name, replacing &lt;code>NAME&lt;/code> with the name of the environment you wish to move. If you receive an error while trying to rename, try activating the base conda environment using &lt;code>conda activate base&lt;/code> and run the &lt;code>conda rename&lt;/code> commands again.&lt;/p>
&lt;/blockquote>
&lt;blockquote>
&lt;p>It&amp;rsquo;s also recommended to clean your old conda packages using &lt;code>conda clean -a&lt;/code>. Note that this command can take a long time (potentially much longer than 1 hour) if there are a lot of downloaded packages.&lt;/p>
&lt;/blockquote>
&lt;p>Create a Python 3.10 conda environment, like so:&lt;/p>
&lt;pre>&lt;code class="language-bash">module load miniconda3 # Should already be auto-loaded during login
conda create -n NameForNewEnv python=3.10 # Many Python versions are available
&lt;/code>&lt;/pre>
&lt;h5 id="activating">Activating&lt;/h5>
&lt;p>Once your virtual environment has been created, you need to activate it before you can use it:&lt;/p>
&lt;pre>&lt;code class="language-bash">conda activate NameForNewEnv
&lt;/code>&lt;/pre>
&lt;p>With more modules being added as conda environments, it&amp;rsquo;s sometimes required to &amp;ldquo;stack&amp;rdquo; user environments on top of module-provided environments.
By default, running &lt;code>conda activate&lt;/code> deactivates the current environment before activating the new one.
To counter this, the &lt;code>--stack&lt;/code> flag can be used to effectively &amp;ldquo;combine&amp;rdquo; environments. For example &lt;code>conda activate --stack NameForNewEnv&lt;/code>. Please see the conda page
on &lt;a href="https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#nested-activation">Nested Activation&lt;/a> for more details.&lt;/p>
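&lt;p>A stacked session might look like the sketch below. The module name &lt;code>SomeCondaBasedModule&lt;/code> is a hypothetical stand-in for any module that activates its own conda environment.&lt;/p>
&lt;pre>&lt;code class="language-bash">module load SomeCondaBasedModule        # Hypothetical module that activates its own conda environment
conda activate --stack NameForNewEnv   # Add your environment on top without deactivating the first one
conda deactivate                        # Leave NameForNewEnv and return to the module environment
&lt;/code>&lt;/pre>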
&lt;h5 id="deactivating">Deactivating&lt;/h5>
&lt;p>In order to exit from your virtual environment, do the following:&lt;/p>
&lt;pre>&lt;code class="language-bash">conda deactivate
&lt;/code>&lt;/pre>
&lt;h5 id="installing-packages">Installing packages&lt;/h5>
&lt;p>Before installing your packages, make sure you are on a compute node. This allows your downloads to complete quickly and with less chance of running out of memory. You can start a session on a compute node using the following command:&lt;/p>
&lt;pre>&lt;code class="language-bash">srun -p short -c 4 --mem=10g --pty bash -l # Adjust the resource request as needed
&lt;/code>&lt;/pre>
&lt;p>Here is a simple example for installing packages under your Python virtual environment via conda:&lt;/p>
&lt;pre>&lt;code class="language-bash">conda install -n NameForNewEnv PackageName
&lt;/code>&lt;/pre>
&lt;p>You may need to enable an additional channel to install the package (refer to your package&amp;rsquo;s documentation):&lt;/p>
&lt;pre>&lt;code class="language-bash">conda install -n NameForNewEnv -c ChannelName PackageName
&lt;/code>&lt;/pre>
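&lt;p>Before and after an install it can help to check what a channel offers and what ended up in the environment. The sketch below reuses the placeholder names &lt;code>NameForNewEnv&lt;/code>, &lt;code>ChannelName&lt;/code>, and &lt;code>PackageName&lt;/code> from the examples above.&lt;/p>
&lt;pre>&lt;code class="language-bash">conda search -c ChannelName PackageName   # List the versions of the package available in the channel
conda list -n NameForNewEnv PackageName   # Show the version installed in the environment, if any
&lt;/code>&lt;/pre>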
&lt;h5 id="cloning">Cloning&lt;/h5>
&lt;p>It is possible for you to copy an existing environment into a new environment:&lt;/p>
&lt;pre>&lt;code class="language-bash">conda create --name AnotherNameForNewEnv --clone NameForNewEnv
&lt;/code>&lt;/pre>
&lt;h5 id="listing-environments">Listing Environments&lt;/h5>
&lt;p>Run the following to get a list of currently installed conda environments:&lt;/p>
&lt;pre>&lt;code class="language-bash">conda env list
&lt;/code>&lt;/pre>
&lt;h5 id="removing">Removing&lt;/h5>
&lt;p>If you wish to remove a conda environment run the following:&lt;/p>
&lt;pre>&lt;code class="language-bash">conda env remove --name myenv
&lt;/code>&lt;/pre>
&lt;h4 id="more-info">More Info&lt;/h4>
&lt;p>For more information regarding conda please visit &lt;a href="https://conda.io/docs/user-guide/">Conda Docs&lt;/a>.&lt;/p>
&lt;h3 id="jupyter">Jupyter&lt;/h3>
&lt;p>You can run jupyter as an interactive job or you can use the web instance, see &lt;a href="https://hpcc.ucr.edu/manuals/linux_basics/text/#jupyter-server">Jupyter Usage&lt;/a> for details.&lt;/p>
&lt;h4 id="virtual-environments-kernels">Virtual Environments (Kernels)&lt;/h4>
&lt;p>In order to use a custom Python/Conda virtual environment within Jupyter, it must be configured as a kernel. You will need to do the following:&lt;/p>
&lt;pre>&lt;code class="language-bash"># Create a virtual environment named &amp;quot;ipykernel_py3&amp;quot;, if you don't already have one
# It can be named whatever you like, &amp;quot;ipykernel_py3&amp;quot; is just an example.
# You can also indicate a more specific version of Python here. Otherwise you'll get
# the latest version provided by Anaconda.
conda create -n ipykernel_py3 python=3 ipykernel
# Load the new environment
conda activate ipykernel_py3
# Install kernel
# --name is used to define the internal name used by Jupyter, and should not contain spaces.
# --display-name is the name you will see in the Jupyter web interface, should be descriptive.
python -m ipykernel install --user --name ipykernel_py3 --display-name &amp;quot;IPyKernel (Python 3)&amp;quot;
&lt;/code>&lt;/pre>
&lt;p>Now when you visit the notebook you should see the option &amp;ldquo;IPyKernel (Python 3)&amp;rdquo; when you click the &amp;ldquo;New&amp;rdquo; dropdown menu in the upper left corner of the home page.&lt;/p>
&lt;p>To remove an unwanted kernel, use the following commands:&lt;/p>
&lt;pre>&lt;code class="language-bash">jupyter kernelspec list # List available kernels
jupyter kernelspec uninstall UNWANTEDKERNEL
&lt;/code>&lt;/pre>
&lt;blockquote>
&lt;p>Replace UNWANTEDKERNEL with the name of the kernel you wish to remove&lt;/p>
&lt;/blockquote>
&lt;p>Further reading: &lt;a href="https://ipython.readthedocs.io/en/stable/install/kernel_install.html">Installing the IPython kernel&lt;/a>&lt;/p>
&lt;h4 id="r">R&lt;/h4>
&lt;p>For instructions on how to configure your R environment please visit &lt;a href="https://github.com/IRkernel/IRkernel">IRkernel&lt;/a>.
Since IRkernel should already be installed in the latest version of R, you only need to run the following within R:&lt;/p>
&lt;pre>&lt;code class="language-R">IRkernel::installspec(name = 'ir44', displayname = 'R 4.0.1')
&lt;/code>&lt;/pre>
&lt;h2 id="r-1">R&lt;/h2>
&lt;p>This section is regarding how to manage R packages.&lt;/p>
&lt;h3 id="current-r-version">Current R Version&lt;/h3>
&lt;blockquote>
&lt;p>NOTE: Please be aware that this version of R is built with &lt;code>GCC/8.3.0&lt;/code>, which means that previously compiled modules may be incompatible.&lt;/p>
&lt;/blockquote>
&lt;p>Currently the default version of R is &lt;code>R/4.3.0&lt;/code> and is loaded automatically for you.&lt;/p>
&lt;p>When a new release of R is available, you should reinstall any local R packages. Keep the following in mind:&lt;/p>
&lt;ul>
&lt;li>Remove redundantly installed local R packages with the &lt;code>RdupCheck&lt;/code> command.&lt;/li>
&lt;li>Newer versions of R packages are not backward compatible; once installed, they only work with the version of &lt;code>R&lt;/code> they were installed under.&lt;/li>
&lt;/ul>
&lt;h3 id="older-r-versions">Older R Versions&lt;/h3>
&lt;p>You can load other versions of R with the following:&lt;/p>
&lt;pre>&lt;code class="language-bash">module unload R
module avail R
module load R/VERSION
&lt;/code>&lt;/pre>
&lt;h3 id="installing-r-packages">Installing R Packages&lt;/h3>
&lt;p>The default version of &lt;code>R&lt;/code> has many of the most popular &lt;code>R&lt;/code> packages already installed and available.
It is also possible for you to install additional R packages in your local environment.&lt;/p>
&lt;p>Only install packages if they are not already available; this will minimize issues later.
You can check whether a package is available in the current version of &lt;code>R&lt;/code> from the command line, like so:&lt;/p>
&lt;pre>&lt;code class="language-bash">Rscript -e &amp;quot;library('some-package-name')&amp;quot;
&lt;/code>&lt;/pre>
&lt;p>Or you can check from within &lt;code>R&lt;/code>, like so:&lt;/p>
&lt;pre>&lt;code class="language-R">library('some-package-name')
&lt;/code>&lt;/pre>
&lt;p>If the package is not available, then proceed with installation.&lt;/p>
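&lt;p>To check several packages at once from the command line, a small loop can report which ones are already available. This is a sketch only; the package names are placeholders.&lt;/p>
&lt;pre>&lt;code class="language-bash"># Print each package name followed by TRUE (available) or FALSE (not installed)
for pkg in some-package-name another-package-name; do
    Rscript -e 'p = commandArgs(TRUE); cat(p, requireNamespace(p, quietly = TRUE), &amp;quot;\n&amp;quot;)' &amp;quot;$pkg&amp;quot;
done
&lt;/code>&lt;/pre>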
&lt;h4 id="bioconductor-packages">Bioconductor Packages&lt;/h4>
&lt;p>To install from Bioconductor you can use the following method:&lt;/p>
&lt;pre>&lt;code class="language-R">BiocManager::install(c(&amp;quot;package-to-install&amp;quot;, &amp;quot;another-packages-to-install&amp;quot;))
Update all/some/none? [a/s/n]: n
&lt;/code>&lt;/pre>
&lt;p>For more information please visit &lt;a href="https://www.bioconductor.org/install/">Bioconductor Install Page&lt;/a>.&lt;/p>
&lt;h4 id="github-packages">GitHub Packages&lt;/h4>
&lt;pre>&lt;code class="language-R"># Load devtools
library(devtools)
# Replace name with the GitHub account/repo
install_github(&amp;quot;duncantl/RGoogleDocs&amp;quot;)
&lt;/code>&lt;/pre>
&lt;h4 id="local-packages">Local Packages&lt;/h4>
&lt;pre>&lt;code class="language-R"># Replace URL with your URL or local path to your .tar.gz file
install.packages(&amp;quot;http://hartleys.github.io/QoRTs/QoRTs_LATEST.tar.gz&amp;quot;,repos=NULL,type=&amp;quot;source&amp;quot;)
&lt;/code>&lt;/pre></description></item><item><title>Manuals: Selected Research Software Usage</title><link>https://hpcc.ucr.edu/manuals/hpc_cluster/selected_software/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://hpcc.ucr.edu/manuals/hpc_cluster/selected_software/</guid><description>
&lt;blockquote>
&lt;p>The links below lead to usage manuals for selected research software that requires more complex usage instructions.&lt;/p>
&lt;/blockquote></description></item><item><title>Manuals: Data Storage</title><link>https://hpcc.ucr.edu/manuals/hpc_cluster/storage/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://hpcc.ucr.edu/manuals/hpc_cluster/storage/</guid><description>
&lt;h2 id="dashboard">Dashboard&lt;/h2>
&lt;p>HPCC cluster users are able to check on their home and bigdata storage usage from the &lt;a href="https://dashboard.hpcc.ucr.edu">Dashboard Portal&lt;/a>. Note that there is a known issue with the dashboard&amp;rsquo;s display of usage when users are a part of multiple labs.&lt;/p>
&lt;h2 id="home">Home&lt;/h2>
&lt;p>Home directories are where you start each session on the HPC cluster and where your jobs start when running on the cluster. This is usually where you place the scripts and various things you are working on. This space is very limited. Please remember that the home storage space quota per user account is 20 GB.&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Path&lt;/th>
&lt;th>/rhome/&lt;code>username&lt;/code>&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>User Availability&lt;/td>
&lt;td>All Users&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Node Availability&lt;/td>
&lt;td>All Nodes&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Quota Responsibility&lt;/td>
&lt;td>User&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;h2 id="bigdata">Bigdata&lt;/h2>
&lt;p>Bigdata is an area where large amounts of storage can be made available to users. A lab purchases bigdata space separately from access to the cluster. This space is then made available to the lab via a shared directory and individual directories for each user.&lt;/p>
&lt;p>&lt;strong>Lab Shared Space&lt;/strong>
This directory can be accessed by the lab as a whole.&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Path&lt;/th>
&lt;th>/bigdata/&lt;code>labname&lt;/code>/shared&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>User Availability&lt;/td>
&lt;td>Labs that have purchased space.&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Node Availability&lt;/td>
&lt;td>All Nodes&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Quota Responsibility&lt;/td>
&lt;td>Lab&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;p>&lt;strong>Individual User Space&lt;/strong>
This directory can be accessed by specific lab members.&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Path&lt;/th>
&lt;th>/bigdata/&lt;code>labname&lt;/code>/&lt;code>username&lt;/code>&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>User Availability&lt;/td>
&lt;td>Labs that have purchased space.&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Node Availability&lt;/td>
&lt;td>All Nodes&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Quota Responsibility&lt;/td>
&lt;td>Lab&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;h2 id="non-persistent-space">Non-Persistent Space&lt;/h2>
&lt;p>Frequently, there is a need for faster temporary storage. For example, activities like the following fall under this category:&lt;/p>
&lt;ol>
&lt;li>Output a significant amount of intermediate data during a job&lt;/li>
&lt;li>Access a dataset from a faster medium than bigdata or the home directories&lt;/li>
&lt;li>Write out lock files&lt;/li>
&lt;/ol>
&lt;p>These types of activities are well suited to the use of fast non-persistent spaces. Below are the filesystems available on the HPC cluster that are best suited for these actions.&lt;/p>
&lt;p>&lt;strong>SSD Backed Scratch Space&lt;/strong>
This space is much faster than the persistent space (/rhome, /bigdata), but slower than RAM-based storage. The scratch space should be used through the &lt;code>$SCRATCH&lt;/code> environment variable, which is automatically set for each job. This location is local to the node the job runs on and is automatically deleted once the job has finished.&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Path&lt;/th>
&lt;th>/scratch&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>User Availability&lt;/td>
&lt;td>All Users&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Node Availability&lt;/td>
&lt;td>All Nodes&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Quota Responsibility&lt;/td>
&lt;td>N/A&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
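&lt;p>A typical pattern is to stage input data on scratch, work there, and copy the results back before the job ends, since the scratch location is deleted afterwards. The following is a minimal sketch: the input and result paths are hypothetical, and the &lt;code>sort&lt;/code> command stands in for a real analysis.&lt;/p>
&lt;pre>&lt;code class="language-bash">#!/bin/bash -l
#SBATCH -p short
#SBATCH -c 4
#SBATCH --mem 8GB
#SBATCH --time 01:00:00

# Hypothetical input file and results directory under bigdata
INPUT=&amp;quot;$HOME/bigdata/my_project/input.dat&amp;quot;
RESULTS=&amp;quot;$HOME/bigdata/my_project/results&amp;quot;

cp &amp;quot;$INPUT&amp;quot; &amp;quot;$SCRATCH/&amp;quot;   # Stage the input on node-local scratch
cd &amp;quot;$SCRATCH&amp;quot;

# Placeholder for the real analysis; it reads and writes inside $SCRATCH
sort input.dat &amp;gt; output.dat

mkdir -p &amp;quot;$RESULTS&amp;quot;
cp output.dat &amp;quot;$RESULTS/&amp;quot;   # Copy results back before scratch is deleted
&lt;/code>&lt;/pre>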
&lt;p>&lt;strong>Temporary Space&lt;/strong>
This is a standard space available on all Linux systems. Please be aware that it is limited to the amount of free disk space on the node you are running on.&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Path&lt;/th>
&lt;th>/tmp&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>User Availability&lt;/td>
&lt;td>All Users&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Node Availability&lt;/td>
&lt;td>All Nodes&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Quota Responsibility&lt;/td>
&lt;td>N/A&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;p>&lt;strong>RAM Space&lt;/strong>
This type of space takes away from physical memory but allows extremely fast access to the files located on it. When submitting a job you will need to factor in the space your job is using in RAM as well. For example, if you have a dataset that is 1G in size and use this space, it will take at least 1G of RAM.&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Path&lt;/th>
&lt;th>/dev/shm&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>User Availability&lt;/td>
&lt;td>All Users&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Node Availability&lt;/td>
&lt;td>All Nodes&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Quota Responsibility&lt;/td>
&lt;td>N/A&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
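&lt;p>When using RAM space in a job, request enough memory for both the program and the files, and clean up afterwards. The sketch below assumes a hypothetical 1GB dataset and a program that needs about 3GB on its own; adjust both numbers and the paths for your case.&lt;/p>
&lt;pre>&lt;code class="language-bash">#!/bin/bash -l
#SBATCH -p short
#SBATCH -c 1
#SBATCH --mem 4GB   # Assumed split: about 3GB for the program plus 1GB for files in /dev/shm

# Create a private directory in RAM and remove it when the script exits
WORKDIR=$(mktemp -d &amp;quot;/dev/shm/${USER}_XXXXXX&amp;quot;)
trap 'rm -rf &amp;quot;$WORKDIR&amp;quot;' EXIT

# Hypothetical dataset; copying it into /dev/shm consumes RAM
cp &amp;quot;$HOME/bigdata/my_project/dataset.dat&amp;quot; &amp;quot;$WORKDIR/&amp;quot;

# Placeholder for the real program reading from RAM space
wc -l &amp;quot;$WORKDIR/dataset.dat&amp;quot;
&lt;/code>&lt;/pre>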
&lt;h2 id="usage-and-quotas">Usage and Quotas&lt;/h2>
&lt;p>To quickly check your usage and quota limits:&lt;/p>
&lt;pre>&lt;code class="language-bash">check_quota home
check_quota bigdata
&lt;/code>&lt;/pre>
&lt;p>To get the usage of your current directory, run the following command:&lt;/p>
&lt;pre>&lt;code class="language-bash">du -sh .
&lt;/code>&lt;/pre>
&lt;p>To calculate the sizes of each separate sub directory, run:&lt;/p>
&lt;pre>&lt;code class="language-bash">du -sch *
du -sch .[!.]* * | sort -h # includes hidden files and directories
&lt;/code>&lt;/pre>
&lt;p>This may take some time to complete; please be patient.&lt;/p>
&lt;p>For more information on your home directory, please see the &lt;a href="../../manuals/linux_basics/cmdline_basics/">Linux Basics Orientation&lt;/a>.&lt;/p>
&lt;h2 id="automatic-backups-and-snapshots">Automatic Backups and Snapshots&lt;/h2>
&lt;p>The cluster creates monthly backups, however it is still advantageous for users to periodically make copies of their critical data to a separate storage device.
The cluster is a production system for research computations with a very expensive high-performance storage infrastructure.&lt;/p>
&lt;p>Home snapshots are created daily and kept for one week.
Bigdata snapshots are created weekly and kept for one month.&lt;/p>
&lt;p>Home and bigdata snapshots are located under the following respective directories:&lt;/p>
&lt;pre>&lt;code class="language-bash">/rhome/.snapshots/
/bigdata/.snapshots/
&lt;/code>&lt;/pre>
&lt;p>The individual snapshot directories have names with numerical values in epoch time format.
The higher the value the more recent the snapshot.&lt;/p>
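&lt;p>The epoch-style names can be converted to readable dates with &lt;code>date&lt;/code>, and a file can be copied back out of a snapshot. The sketch below rests on assumptions: the value &lt;code>1700000000&lt;/code> is only an example name, and the directory layout inside a snapshot is assumed to mirror &lt;code>/rhome&lt;/code>. Check the actual listing before copying.&lt;/p>
&lt;pre>&lt;code class="language-bash">ls /rhome/.snapshots/     # List the available home snapshots
date -d @1700000000       # Convert an example epoch value to a readable date

# Hypothetical restore of a single file from a snapshot
SNAPSHOT=1700000000       # Replace with a directory name from the listing above
cp -a &amp;quot;/rhome/.snapshots/$SNAPSHOT/$USER/lost_file.txt&amp;quot; &amp;quot;$HOME/lost_file.txt.restored&amp;quot;
&lt;/code>&lt;/pre>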
&lt;p>To view the exact time of when each snapshot was taken execute the following commands:&lt;/p>
&lt;pre>&lt;code class="language-bash">mmlssnapshot home
mmlssnapshot bigdata
&lt;/code>&lt;/pre></description></item><item><title>Manuals: Sharing Data</title><link>https://hpcc.ucr.edu/manuals/hpc_cluster/sharing/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://hpcc.ucr.edu/manuals/hpc_cluster/sharing/</guid><description>
&lt;h2 id="permissions">Permissions&lt;/h2>
&lt;p>It is useful to share data and results with other users on the cluster, and we
encourage collaboration. The easiest way to share a file is to place it in a
location that both users can access. Then the second user can simply copy it to
a location of their choice. However, this requires that the file permissions
permit the second user to read the file. Basic file permissions on Linux and
other Unix-like systems are composed of three classes: owner, group, and other.
Each of these represents the permissions for a different set of people:
the user who owns the file, the members of the group that owns the file, and
everyone else, respectively. Each class has 3 permissions: read, write, and
execute, represented as r,w, and x. For example the following file is owned by
the user &lt;code>username&lt;/code> (with read, write, and execute), owned by the group
&lt;code>groupname&lt;/code> (with read and execute), and everyone else cannot access it.&lt;/p>
&lt;pre>&lt;code class="language-bash">username@pigeon:~$ ls -l myFile
-rwxr-x--- 1 username groupname 1.6K Nov 19 12:32 myFile
&lt;/code>&lt;/pre>
&lt;p>If you wanted to share this file with someone outside the &lt;code>groupname&lt;/code> group, read permissions must be added to the file for &amp;lsquo;other&amp;rsquo;:&lt;/p>
&lt;pre>&lt;code class="language-bash">username@pigeon:~$ chmod o+r myFile
&lt;/code>&lt;/pre>
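&lt;p>Read permission on the file alone is not enough: the other user also needs execute (x) permission on every directory leading to it. The following sketch shows one way to inspect that path and to share a whole directory with your group; &lt;code>myProject&lt;/code> is a hypothetical directory name.&lt;/p>
&lt;pre>&lt;code class="language-bash"># Show the owner and permissions of every directory on the way to the file
namei -l &amp;quot;$PWD/myFile&amp;quot;

# Give the group read access to a directory tree; the capital X adds execute
# only to directories and to files that are already executable
chmod -R g+rX myProject
&lt;/code>&lt;/pre>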
&lt;p>To learn more about ownership, permissions, and groups please visit &lt;a href="../../manuals/linux_basics/permissions/">Linux Basics Permissions&lt;/a>.&lt;/p>
&lt;h2 id="set-default-permissions">Set Default Permissions&lt;/h2>
&lt;p>In Linux, it is possible to set the default file permissions for new files. This is useful if you are collaborating on a project, or frequently share files and do not want to be constantly adjusting permissions. The command responsible for this is called &amp;lsquo;umask&amp;rsquo;. You should first check what your default permissions currently are by running &amp;lsquo;umask -S&amp;rsquo;.&lt;/p>
&lt;pre>&lt;code class="language-bash">username@pigeon:~$ umask -S
u=rwx,g=rx,o=rx
&lt;/code>&lt;/pre>
&lt;p>To set your default permissions, simply run umask with the correct options. Please note, that this does not change permissions on any existing files, only new files created after you update the default permissions. For instance, if you wanted to set your default permissions to you having full control, your group being able to read and execute your files, and no one else to have access, you would run:&lt;/p>
&lt;pre>&lt;code class="language-bash">username@pigeon:~$ umask u=rwx,g=rx,o=
&lt;/code>&lt;/pre>
&lt;p>It is also important to note that these settings only affect your current session.
If you log out and log back in, these settings will be reset.
To make your changes permanent you need to add them to your &lt;code>.bashrc&lt;/code> file, which is a hidden file in your home directory (if you do not have a &lt;code>.bashrc&lt;/code> file, you will need to create an empty file called &lt;code>.bashrc&lt;/code> in your home directory).
Adding umask to your &lt;code>.bashrc&lt;/code> file is as simple as adding your umask command (such as &lt;code>umask u=rwx,g=rx,o=r&lt;/code>) to the end of the file.
Then simply log out and back in for the changes to take effect. You can double check that the settings have taken effect by running &lt;code>umask -S&lt;/code>.&lt;/p>
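&lt;p>The steps above can be sketched as follows. The file and directory names are only examples; the symbolic mask &lt;code>u=rwx,g=rx,o=&lt;/code> corresponds to the numeric mask &lt;code>027&lt;/code>.&lt;/p>
&lt;pre>&lt;code class="language-bash"># Make the setting permanent by appending it to your .bashrc
echo 'umask u=rwx,g=rx,o=' &amp;gt;&amp;gt; ~/.bashrc

# Apply it to the current session and check the effect on new files
umask u=rwx,g=rx,o=
touch newFile
mkdir newDir
ls -ld newFile newDir   # newFile is created as rw-r-----, newDir as rwxr-x---
&lt;/code>&lt;/pre>
&lt;p>New regular files never receive execute permission by default, which is why the file ends up without x even though the mask allows it.&lt;/p>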
&lt;p>To learn more about umask please visit &lt;a href="http://www.cyberciti.biz/tips/understanding-linux-unix-umask-value-usage.html">What is Umask and How To Setup Default umask Under Linux?&lt;/a>.&lt;/p>
&lt;h2 id="file-transfers">File Transfers&lt;/h2>
&lt;p>For file transfers and data sharing, both command-line and GUI applications can
be used. For beginners we recommend the
&lt;a href="https://filezilla-project.org/">FileZilla&lt;/a> GUI application (download/install
from &lt;a href="https://filezilla-project.org/">here&lt;/a>) since it is available for most OSs
including macOS, Windows, Linux and ChromeOS. A basic user manual for FileZilla
is &lt;a href="https://wiki.filezilla-project.org/FileZilla_Client_Tutorial_(en)">here&lt;/a>
and a video tutorial is &lt;a href="https://www.youtube.com/watch?v=O3DudpEMPiY">here&lt;/a>.
Alternative user-friendly SCP/SFTP GUI applications include
&lt;a href="https://cyberduck.io/">Cyberduck&lt;/a> and
&lt;a href="https://winscp.net/eng/download.php">WinSCP&lt;/a> for Mac and Windows OSs,
respectively.&lt;/p>
&lt;h3 id="filezilla-usage">FileZilla Usage&lt;/h3>
&lt;p>FileZilla supports both &lt;a href="https://hpcc.ucr.edu/manuals/login/#passwordduo">Password+DUO&lt;/a>
and &lt;a href="https://hpcc.ucr.edu/manuals/login/#ssh-keys">SSH Key&lt;/a> based authentication methods.
Both options are described below.&lt;/p>
&lt;h4 id="authentication-with-passwordduo">Authentication with Password+DUO&lt;/h4>
&lt;p>When using &lt;code>FileZilla&lt;/code> a new site can be created by selecting &lt;code>File&lt;/code> &lt;strong>-&amp;gt;&lt;/strong> &lt;code>Site Manager&lt;/code>.
In the subsequent window, &lt;code>New Site&lt;/code> should be selected. Next the following information should be provided
in the right pane of the &lt;code>General&lt;/code> tab.&lt;/p>
&lt;pre>&lt;code>Protocol: SFTP
Host: cluster.hpcc.ucr.edu
Logon Type: Interactive
User: &amp;lt;username&amp;gt;
&lt;/code>&lt;/pre>
&lt;p>Under &lt;code>&amp;lt;username&amp;gt;&lt;/code> the actual username of an HPCC account should be provided. The &lt;code>Logon Type&lt;/code> can be
&lt;code>Interactive&lt;/code> or &lt;code>Key File&lt;/code> for &lt;a href="#passwordduo">Password+DUO&lt;/a> or &lt;a href="#ssh-keys">SSH Keys&lt;/a> authentication,
respectively. When choosing &lt;code>Password+DUO&lt;/code> authentication, the max connections should be configured.
To do so, navigate to the &lt;code>Transfer Settings&lt;/code> tab and make the following settings.&lt;/p>
&lt;pre>&lt;code> Limit Number of simultaneous connections: checked
Maximum number of connections: 1
&lt;/code>&lt;/pre>
&lt;p>By clicking &amp;ldquo;OK&amp;rdquo; the new site will be saved. Subsequently, one can select the new site from the main window by
clicking the arrow next to the site list, or reopen the Site Manager and click the &amp;ldquo;connect&amp;rdquo;
button from the new site window.&lt;/p>
&lt;h4 id="authentication-with-ssh-key">Authentication with SSH Key&lt;/h4>
&lt;p>For &lt;a href="https://hpcc.ucr.edu/manuals/login/#ssh-keys">SSH key&lt;/a> based access, users
should make the selections shown in the figure below. For this access method
it is important to use the &lt;code>Site Manager&lt;/code>, as FileZilla&amp;rsquo;s Quickconnect
method will not work here.&lt;/p>
&lt;center>&lt;img title="FileZilla_ssh_key" src="https://girke.bioinformatics.ucr.edu/GEN242/tutorials/linux/images/FileZilla_ssh_key.png" width="600">&lt;/center>
&lt;center>FileZilla settings with an SSH key. For generating SSH keys see &lt;a href="https://hpcc.ucr.edu/manuals/login/#ssh-keys">here&lt;/a>.&lt;/center>
&lt;h2 id="command-line-scp">Command-line SCP&lt;/h2>
&lt;p>Advantages of this method include: batch up/downloads and ease of automation. A detailed manual is available &lt;a href="https://linux.die.net/man/1/scp">here&lt;/a>.&lt;/p>
&lt;ul>
&lt;li>
&lt;p>To copy files to the server, run the following on your workstation or laptop:&lt;/p>
&lt;p>&lt;code>scp -r &amp;lt;path_to_directory&amp;gt; &amp;lt;your_username&amp;gt;@&amp;lt;host_name&amp;gt;:&lt;/code>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>To copy files from the server, run the following on your workstation or laptop:&lt;/p>
&lt;p>&lt;code>scp -r &amp;lt;your_username&amp;gt;@&amp;lt;host_name&amp;gt;:&amp;lt;path_to_directory&amp;gt; .&lt;/code>&lt;/p>
&lt;/li>
&lt;/ul>
&lt;h2 id="copying-bigdata">Copying bigdata&lt;/h2>
&lt;p>Rsync can:&lt;/p>
&lt;ul>
&lt;li>Copy (transfer) directories between different locations&lt;/li>
&lt;li>Perform transfers over the network via SSH&lt;/li>
&lt;li>Compare large data sets (&lt;code>-n, --dry-run&lt;/code> option)&lt;/li>
&lt;li>Resume interrupted transfers&lt;/li>
&lt;/ul>
&lt;p>Rsync Notes:&lt;/p>
&lt;ul>
&lt;li>Rsync can be used on Windows, but you must install &lt;a href="https://cygwin.com">Cygwin&lt;/a>. Most Mac and Linux systems already have rsync installed by default.&lt;/li>
&lt;li>Always put a trailing / after both folder names, e.g. &lt;code>FOLDER_A/&lt;/code>. Failing to do so will nest the folders every time you try to resume: without the /, you will get a second FOLDER_A inside FOLDER_A (&lt;code>FOLDER_A/FOLDER_A/&lt;/code>).&lt;/li>
&lt;li>Rsync only copies by default; the source files are left in place.&lt;/li>
&lt;li>Once the rsync command is done, run it again. The second run will be shorter and can be used as a double check. If there was no output from the second run then nothing changed.&lt;/li>
&lt;li>To learn more try &lt;code>man rsync&lt;/code>&lt;/li>
&lt;/ul>
&lt;p>If you are transferring to or from your laptop/workstation, you must run the rsync command locally on your laptop/workstation.&lt;/p>
&lt;p>To transfer to the cluster:&lt;/p>
&lt;pre>&lt;code class="language-bash">rsync -av --progress FOLDER_A/ cluster.hpcc.ucr.edu:FOLDER_A/
&lt;/code>&lt;/pre>
&lt;p>To transfer from the cluster:&lt;/p>
&lt;pre>&lt;code class="language-bash">rsync -av --progress cluster.hpcc.ucr.edu:FOLDER_A/ FOLDER_A/
&lt;/code>&lt;/pre>
&lt;p>Rsync will use SSH and will ask you for your cluster password, the same way SSH or SCP does.&lt;/p>
&lt;p>If your rsync transfer was interrupted, rsync can continue where it left off. Simply run the same command again to resume.&lt;/p>
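&lt;p>The trailing-slash rule from the notes above is easy to get wrong when re-running a transfer. As a minimal sketch, a small shell wrapper can make sure both paths end in a slash before calling rsync. The function name &lt;code>sync_dir&lt;/code> is made up for this illustration and is not an HPCC or rsync tool.&lt;/p>

```shell
# Hypothetical helper, not part of rsync or the HPCC environment.
# Strips one trailing slash from each path and appends one again, so the
# contents of SRC land directly inside DST instead of in a nested copy.
sync_dir() {
    local src dst
    src="${1%/}"
    dst="${2%/}"
    shift 2
    # Any further arguments (for example -v, --progress or -n) go to rsync.
    rsync -a "$@" "$src/" "$dst/"
}
```

&lt;p>For example, &lt;code>sync_dir FOLDER_A cluster.hpcc.ucr.edu:FOLDER_A -v --progress&lt;/code> should behave like the upload command shown above, and adding &lt;code>-n&lt;/code> turns it into a dry run that only reports what would be copied.&lt;/p>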
&lt;h2 id="copying-large-folders-on-the-cluster-between-directories">Copying large folders on the cluster between Directories&lt;/h2>
&lt;p>If you want to synchronize the contents of one directory to another, use the following:&lt;/p>
&lt;pre>&lt;code class="language-bash">rsync -av --progress PATH_A/FOLDER_A/ PATH_B/FOLDER_A/
&lt;/code>&lt;/pre>
&lt;p>Rsync does not move but only copies. Thus you would need to delete the original once you confirm that everything has been transferred.&lt;/p>
&lt;h2 id="copying-large-folders-between-the-cluster-and-other-servers">Copying large folders between the cluster and other servers&lt;/h2>
&lt;p>If you want to copy data from the cluster to your own server, or another remote system, use the following:&lt;/p>
&lt;pre>&lt;code class="language-bash">rsync -ai FOLDER_A/ sever2.xyz.edu:FOLDER_A/
&lt;/code>&lt;/pre>
&lt;p>The sever2.xyz.edu machine must be a server that accepts Rsync connections via SSH.&lt;/p>
&lt;h2 id="sharing-files-on-the-web">Sharing Files on the Web&lt;/h2>
&lt;blockquote>
&lt;p>Note: This is not intended to be used as a long-term solution or referenced in publications. It should be used for internal project purposes only. If a long-term solution is required, please use a web or cloud-based installation.&lt;/p>
&lt;/blockquote>
&lt;p>Simply create a symbolic link or move the files into your html directory when you want to share them.
For example, log into the HPC cluster and run the following:&lt;/p>
&lt;pre>&lt;code class="language-bash"># Make sure you have an html directory
mkdir -p ~/.html
# Make sure permissions are set correctly
chmod a+x ~/
chmod a+rx ~/.html
# Make a new web project directory
mkdir -p ~/www-project
chmod a+rx ~/www-project
# Create a default test file
echo '&amp;lt;h1&amp;gt;Hello!&amp;lt;/h1&amp;gt;' &amp;gt; ~/www-project/index.html
# Create shortcut/link for new web project in html directory
ln -s ~/www-project ~/.html/
&lt;/code>&lt;/pre>
&lt;p>Now, test it out by pointing your web-browser to &lt;a href="https://cluster.hpcc.ucr.edu/~username/www-project/">https://cluster.hpcc.ucr.edu/~username/www-project/&lt;/a>.
Be sure to replace &lt;code>username&lt;/code> with your actual user name. &lt;strong>The forward slash at the end is important.&lt;/strong>&lt;/p>
&lt;h3 id="common-problems">Common Problems&lt;/h3>
&lt;h4 id="403-forbidden--you-dont-have-permissions">&amp;ldquo;403 Forbidden&amp;rdquo; / You don&amp;rsquo;t have permissions&lt;/h4>
&lt;p>If using a symbolic link to data stored elsewhere on the cluster, every folder in the tree leading up to the shared folder must, at a minimum, have the execute permission (&lt;code>chmod a+x folder_name&lt;/code>). For example, if you have a symbolic link to &lt;code>/bigdata/mylab/myuser/data/web-content&lt;/code>, then &lt;code>myuser&lt;/code>, &lt;code>data&lt;/code>, and &lt;code>web-content&lt;/code> must all have the execute permission (&lt;code>bigdata&lt;/code> and &lt;code>mylab&lt;/code> should already have them).&lt;/p>
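&lt;p>Finding the one folder in a long path that lacks the execute permission can be tedious. The following sketch walks from the shared folder up to the filesystem root and reports every directory that others cannot traverse. The function name &lt;code>check_web_path&lt;/code> is hypothetical, and the sketch assumes GNU &lt;code>realpath&lt;/code> and &lt;code>stat -c&lt;/code> as found on Linux.&lt;/p>

```shell
# Hypothetical helper: report each directory from TARGET up to / that
# lacks the execute bit for "others".
# Assumes GNU coreutils (realpath, stat -c), as found on Linux.
check_web_path() {
    local dir perms
    dir=$(realpath "$1")                # resolve symbolic links first
    while [ "$dir" != "/" ]; do
        perms=$(stat -c '%A' "$dir")    # e.g. drwxr-x--x
        case "$perms" in
            *x|*t) ;;                   # last character x or t: others may traverse
            *) echo "missing a+x: $dir" ;;
        esac
        dir=$(dirname "$dir")
    done
}
```

&lt;p>Running &lt;code>check_web_path ~/.html/www-project&lt;/code> prints nothing when every directory on the path is traversable. Each reported directory can then be fixed with &lt;code>chmod a+x&lt;/code> as described above.&lt;/p>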
&lt;h2 id="password-protect-web-pages">Password Protect Web Pages&lt;/h2>
&lt;p>Files in web directories can be password protected.
First create a password file and then create a new user:&lt;/p>
&lt;pre>&lt;code class="language-bash">touch ~/.html/.htpasswd
htpasswd ~/.html/.htpasswd newwebuser
&lt;/code>&lt;/pre>
&lt;p>This will prompt you to enter a password for the new user &amp;lsquo;newwebuser&amp;rsquo;.
Create a new directory, or go to an existing directory, that you want to password protect:&lt;/p>
&lt;pre>&lt;code class="language-bash">mkdir ~/.html/locked_dir
cd ~/.html/locked_dir
&lt;/code>&lt;/pre>
&lt;p>For the above commands you can choose any directory name you want.&lt;/p>
&lt;p>Then place the following content within a file called &lt;code>.htaccess&lt;/code>:&lt;/p>
&lt;pre>&lt;code class="language-apache">AuthName 'Please login'
AuthType Basic
AuthUserFile /rhome/username/.html/.htpasswd
require user newwebuser
&lt;/code>&lt;/pre>
&lt;p>Now, test it out by pointing your web-browser to &lt;a href="http://cluster.hpcc.ucr.edu/~username/locked_dir">http://cluster.hpcc.ucr.edu/~username/locked_dir&lt;/a>.
Be sure to replace &lt;code>username&lt;/code> with your actual user name for the above code and URL.&lt;/p>
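&lt;p>To avoid typing the password file path by hand, the &lt;code>.htaccess&lt;/code> content shown above can also be generated. This is only a sketch: the function name &lt;code>write_htaccess&lt;/code> is made up for illustration, and the default password file location is an assumption based on the example above.&lt;/p>

```shell
# Hypothetical helper: write the four .htaccess lines shown above into DIR.
# The third argument is optional and defaults to ~/.html/.htpasswd.
write_htaccess() {
    local dir webuser passfile
    dir="$1"
    webuser="$2"
    passfile="${3:-$HOME/.html/.htpasswd}"
    printf '%s\n' \
        "AuthName 'Please login'" \
        "AuthType Basic" \
        "AuthUserFile $passfile" \
        "require user $webuser" > "$dir/.htaccess"
}
```

&lt;p>For example, &lt;code>write_htaccess ~/.html/locked_dir newwebuser&lt;/code> would protect the directory created above for the user &lt;code>newwebuser&lt;/code>.&lt;/p>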
&lt;h2 id="google-drive">Google Drive&lt;/h2>
&lt;p>There are several tools for transferring files from Google Drive to the cluster; however, RClone may be the easiest to set up.&lt;/p>
&lt;ol>
&lt;li>
&lt;p>Create an &lt;code>SSH&lt;/code> tunnel to the cluster (MS Windows users should use &lt;code>MobaXterm&lt;/code>):&lt;/p>
&lt;pre>&lt;code>ssh -L 53682:localhost:53682 username@cluster.hpcc.ucr.edu
&lt;/code>&lt;/pre>
&lt;/li>
&lt;li>
&lt;p>Once you have logged into the cluster with the above command, then load &lt;code>rclone&lt;/code> via the module system and run it, like so:&lt;/p>
&lt;pre>&lt;code>module load rclone
rclone config
&lt;/code>&lt;/pre>
&lt;/li>
&lt;li>
&lt;p>After that, follow this &lt;a href="https://rclone.org/drive/">RClone Walkthrough&lt;/a> to complete your setup.&lt;/p>
&lt;/li>
&lt;/ol>
&lt;h2 id="globus">Globus&lt;/h2>
&lt;p>See Globus page &lt;a href="https://hpcc.ucr.edu/manuals/hpc_cluster/data/globus/">here&lt;/a>.&lt;/p></description></item><item><title>Manuals: Security</title><link>https://hpcc.ucr.edu/manuals/hpc_cluster/security/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://hpcc.ucr.edu/manuals/hpc_cluster/security/</guid><description>
&lt;h2 id="protection-levels-and-classification">Protection Levels and Classification&lt;/h2>
&lt;p>UCR protection levels and data classifications are outlined by UCOP as a UC-wide policy: &lt;a href="https://security.ucop.edu/policies/institutional-information-and-it-resource-classification.html">UCOP Institutional Information and IT Resource Classification&lt;/a>.
According to the above documentation, there are 4 levels of protection for 4 classifications of data:&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Protection Level&lt;/th>
&lt;th>Policy&lt;/th>
&lt;th>Examples&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>P1 - Minimal&lt;/td>
&lt;td>IS-1&lt;/td>
&lt;td>Internet facing websites, press releases, anything intended for public use&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>P2 - Low&lt;/td>
&lt;td>IS-2&lt;/td>
&lt;td>Unpublished research work, intellectual property NOT classified as P3 or P4&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>P3 - Moderate&lt;/td>
&lt;td>IS-3&lt;/td>
&lt;td>Research information classified by an Institutional Review Board as P3 (e.g. dbGaP from NIH)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>P4 - High&lt;/td>
&lt;td>IS-4&lt;/td>
&lt;td>Protected Health Information (PHI/HIPAA), patient records, sensitive identifiable human subject research data, Social Security Numbers&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;p>The HPC cluster could be compliant with other security policies (e.g. NIH); however, the policy must be reviewed by our security team.&lt;/p>
&lt;p>At this time the HPC cluster is not an IS-4 (P4) compliant cluster. If you need to work with very sensitive data, it may be best to work with UCSD and their &lt;a href="https://sherlock.sdsc.edu/">Sherlock service&lt;/a>.
Our cluster is IS-3 compliant; however, there are several responsibilities that users will need to adhere to.&lt;/p>
&lt;h2 id="general-guidelines">General Guidelines&lt;/h2>
&lt;p>&lt;span style="color:red">First, please contact us (&lt;a href="mailto:support@hpcc.ucr.edu">support@hpcc.ucr.edu&lt;/a>) before transferring any data to the cluster.
After we have reviewed your needs, data classification and appropriate protection level, then it may be possible to proceed to use the HPCC.&lt;/span>&lt;/p>
&lt;p>Here are a few basic rules to keep in mind:&lt;/p>
&lt;ul>
&lt;li>Always be aware of access control methods (Unix permissions and ACLs); do not allow others to view the data (e.g. &lt;code>chmod 400 filename&lt;/code>)&lt;/li>
&lt;li>Do not make unnecessary copies of the data&lt;/li>
&lt;li>Do not transfer the data to insecure locations&lt;/li>
&lt;li>Encrypt data when/where possible&lt;/li>
&lt;li>Delete all data when it is no longer needed&lt;/li>
&lt;/ul>
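&lt;p>To check the first rule in practice, the sketch below lists everything under a directory that grants any permission to group or others. The function name &lt;code>list_exposed&lt;/code> is hypothetical, and the &lt;code>-perm /077&lt;/code> syntax assumes GNU &lt;code>find&lt;/code>.&lt;/p>

```shell
# Hypothetical helper: list files and directories under DIR that grant
# any read, write or execute permission to group or others.
# Symbolic links are skipped because their own mode bits are not meaningful.
list_exposed() {
    find "$1" ! -type l -perm /077
}
```

&lt;p>If &lt;code>list_exposed /path/to/data&lt;/code> prints anything that should be private, &lt;code>chmod -R go-rwx /path/to/data&lt;/code> removes all group and other permissions below that directory.&lt;/p>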
&lt;h2 id="access-controls">Access Controls&lt;/h2>
&lt;p>When sharing files with others, it is imperative that proper permissions are used.
However, basic Unix permissions (user, group, other) may not be adequate.
It is better to use ACLs in order to allow fine-grained access to sensitive files.&lt;/p>
&lt;h3 id="gpfs-acls">GPFS ACLs&lt;/h3>
&lt;p>GPFS is used for most of our filesystems (/rhome and /bigdata) and it uses nfsv4 style ACLs.
Users are able to explicitly allow many individuals, or groups, access to specific files or directories.&lt;/p>
&lt;pre>&lt;code class="language-bash"># Get current permissions and store in acls file
mmgetacl /path/to/file &amp;gt; ~/acls.txt
# Edit acls file containing permissions
vim ~/acls.txt
# Apply new permissions to file
mmputacl -i ~/acls.txt /path/to/file
# Delete acls file
rm ~/acls.txt
&lt;/code>&lt;/pre>
&lt;p>For more information regarding GPFS ACLs refer to the following: &lt;a href="https://www.ibm.com/docs/en/storage-scale/5.2.2?topic=lists-traditional-gpfs-acl-administration">GPFS ACLs&lt;/a>. An example for granting another user access to a file is given &lt;a href="https://servicenow.iu.edu/kb?id=kb_article_view&amp;amp;sysparm_article=KB0024359#modify">here&lt;/a>.&lt;/p>
&lt;h3 id="xfs-acls">XFS ACLs&lt;/h3>
&lt;p>The XFS filesystem is used for the CentOS operating system and typical Unix locations (/, /var, /tmp, etc.).
For more information on how to use ACLs under XFS, please refer to the following: &lt;a href="https://vishmule.com/2015/06/11/access-control-list-acl-permissions-in-rhel7centos7/">CentOS 7 XFS&lt;/a>&lt;/p>
&lt;blockquote>
&lt;p>Note: ACLs are not applicable to gocryptfs, which is a FUSE filesystem, not GPFS nor XFS.&lt;/p>
&lt;/blockquote>
&lt;h2 id="encryption">Encryption&lt;/h2>
&lt;p>Under the IS-3 policy, P3 data encryption is mandatory.
It is best if you get into the habit of doing encryption in transit, as well as encryption at rest.
This means, when you move the data (transit) or when the data is not in use (rest), it should be encrypted.&lt;/p>
&lt;h3 id="in-transit">In Transit&lt;/h3>
&lt;p>When transferring files make sure that files are encrypted in flight with one of the following transfer protocols:&lt;/p>
&lt;ul>
&lt;li>SCP&lt;/li>
&lt;li>SFTP&lt;/li>
&lt;li>RSYNC (via SSH)&lt;/li>
&lt;/ul>
&lt;p>The destination for sensitive data on the cluster must also be encrypted at rest under one of the following secure locations:&lt;/p>
&lt;ul>
&lt;li>/dev/shm/ - This location is in RAM, so it does not exist at rest (ensure proper ACLs)&lt;/li>
&lt;li>/run/user/$EUID/unencrypted - This location is manually managed, and should be created for access to unencrypted files.&lt;/li>
&lt;/ul>
&lt;p>It is also possible to encrypt your files with GPG (&lt;a href="https://kb.iu.edu/d/awio">GPG Example&lt;/a>), before they are transferred.
Thus, during transfer they will be GPG encrypted. However, decryption must occur in one of the secure locations mentioned above.&lt;/p>
&lt;blockquote>
&lt;p>Note: Never store passphrases/passwords/masterkeys in an insecure location (e.g. a plain text script under /rhome).&lt;/p>
&lt;/blockquote>
&lt;h3 id="at-rest">At Rest&lt;/h3>
&lt;p>The following methods are available on the cluster for encryption at rest:&lt;/p>
&lt;ol>
&lt;li>GPG encryption of files via the command line (&lt;a href="https://kb.iu.edu/d/awio">GPG Example&lt;/a>); however, you must ensure proper ACLs, and decryption must occur in a secure location.&lt;/li>
&lt;li>Create your own location with &lt;a href="https://nuetzlich.net/gocryptfs/forward_mode_crypto/">gocryptfs&lt;/a>.&lt;/li>
&lt;/ol>
&lt;h4 id="gocryptfsmgr">GocryptfsMgr&lt;/h4>
&lt;p>You can use &lt;code>gocryptfs&lt;/code> directly or use &lt;code>gocryptfsmgr&lt;/code>, which automates a few steps in order to simplify things.&lt;/p>
&lt;p>Here are the basics when using &lt;code>gocryptfsmgr&lt;/code>:&lt;/p>
&lt;pre>&lt;code class="language-bash"># Load the gocryptfs module. Not strictly required, but sets a handful of useful environment variables
module load gocryptfs
# Create new encrypted data directory
gocryptfsmgr create bigdata privatedata1
# List all encrypted and unencrypted (access point) directories
gocryptfsmgr list
# Unencrypt privatedata1 (create access point)
gocryptfsmgr open bigdata privatedata1 rw
# Transfer files (ie. SCP,SFTP,RSYNC)
scp user@remote-server:sensitive_file.txt $UNENCRYPTED/privatedata1/sensitive_file.txt
# Remove access point (re-encrypt) privatedata1
gocryptfsmgr close privatedata1
# Remove all access points (re-encrypt all)
gocryptfsmgr quit
&lt;/code>&lt;/pre>
&lt;p>For subsequent access to the encrypted space (e.g. computation or analysis), the following procedure is recommended:&lt;/p>
&lt;pre>&lt;code class="language-bash"># Request a 2hr interactive job on an exclusive node, resources can be adjusted as needed
srun -p short --exclusive=user --pty bash -l
# Unencrypt privatedata1 in read-only mode (create access point)
gocryptfsmgr open bigdata privatedata1 ro
# Read file contents from privatedata1 (simulating work or analysis)
cat $UNENCRYPTED/privatedata1/sensitive_file.txt
# List all encrypted and unencrypted (access points) directories
gocryptfsmgr list
# Make sure we re-encrypt (close access point) for privatedata1
gocryptfsmgr close privatedata1
# Exit from interactive job
exit
&lt;/code>&lt;/pre>
&lt;p>With the above methods you can create multiple encrypted directories and access points and move between them.&lt;/p>
&lt;h4 id="gocryptfs">Gocryptfs&lt;/h4>
&lt;p>When using &lt;code>gocryptfs&lt;/code> directly, you will need to know a bit more about how it works.
The &lt;code>gocryptfs&lt;/code> module on the HPCC cluster uses these predefined variables:&lt;/p>
&lt;ol>
&lt;li>&lt;code>HOME_ENCRYPTED&lt;/code> = &lt;code>/rhome/$USER/encrypted&lt;/code> - Very small encrypted space, not recommended to use&lt;/li>
&lt;li>&lt;code>BIGDATA_ENCRYPTED&lt;/code> = &lt;code>/rhome/$USER/bigdata/encrypted&lt;/code> - Best encrypted space for private data sets&lt;/li>
&lt;li>&lt;code>SHARED_ENCRYPTED&lt;/code> = &lt;code>/rhome/$USER/shared/encrypted&lt;/code> - Encrypted space when intending to share data sets with group&lt;/li>
&lt;li>&lt;code>UNENCRYPTED&lt;/code> = &lt;code>/run/user/$UID/unencrypted&lt;/code> - Access directory where encrypted data will be viewed as unencrypted&lt;/li>
&lt;/ol>
&lt;p>Here is an example how to create an encrypted directory under the &lt;code>BIGDATA_ENCRYPTED&lt;/code> location using &lt;code>gocryptfs&lt;/code>:&lt;/p>
&lt;pre>&lt;code class="language-bash"># Load gocryptfs software
module load gocryptfs
# Create empty data directory
mkdir -p $BIGDATA_ENCRYPTED/privatedata1
# Then initialize empty directory and encrypt it
gocryptfs -aessiv -init $BIGDATA_ENCRYPTED/privatedata1
# Create access point directory where encrypted files will be viewed as unencrypted
mkdir -p $UNENCRYPTED/privatedata1
# After that mount the encrypted directory on the access point and open a new shell within it
gocryptfssh $BIGDATA_ENCRYPTED/privatedata1
# Transfer files (ie. SCP,SFTP,RSYNC)
scp user@remote-server:sensitive_file.txt $UNENCRYPTED/privatedata1/sensitive_file.txt
# Exiting the `gocryptfssh` shell will automatically close the mount
exit
&lt;/code>&lt;/pre>
&lt;p>For subsequent access to the encrypted space (e.g. computation or analysis), the following procedure is recommended:&lt;/p>
&lt;pre>&lt;code class="language-bash"># Request a 2hr interactive job on an exclusive node, resources can be adjusted as needed
srun -p short --exclusive=user --pty bash -l
# Load gocryptfs software
module load gocryptfs
# Create unencrypted directory
mkdir -p $UNENCRYPTED/privatedata1
# Mount encrypted filesystem as read-only and unmount after idling for 1 hour
gocryptfs -ro -i 1h -sharedstorage $BIGDATA_ENCRYPTED/privatedata1 $UNENCRYPTED/privatedata1
# Read file contents (simulating work or analysis)
cat $UNENCRYPTED/privatedata1/sensitive_file.txt
# Manually close access point when analysis has completed
fusermount -u $UNENCRYPTED/privatedata1
# Delete old empty access point
rmdir $UNENCRYPTED/privatedata1
&lt;/code>&lt;/pre>
&lt;blockquote>
&lt;p>WARNING: Avoid writing to the same file at the same time from different nodes. The encrypted file system cannot handle simultaneous writes and will corrupt the file. If simultaneous jobs are necessary, then using write mode from a head node and read-only mode from compute nodes may be the best solution.
Also, be mindful of the remaining job time and make sure that you have unmounted the unencrypted directories before your job ends.&lt;/p>
&lt;/blockquote>
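&lt;p>One way to make the unmount step harder to forget is to tie it to the analysis command with a shell trap. The following is a sketch only: the function name &lt;code>run_then_close&lt;/code> is made up for illustration, and it assumes the access point was mounted with &lt;code>gocryptfs&lt;/code> as shown above so that &lt;code>fusermount -u&lt;/code> can close it.&lt;/p>

```shell
# Hypothetical wrapper: run a command, then always unmount MOUNTPOINT,
# whether the command succeeded or failed.
run_then_close() {
    local mountpoint
    mountpoint="$1"
    shift
    (
        # The EXIT trap fires when this subshell ends, whatever the exit status.
        trap 'fusermount -u "$mountpoint"' EXIT
        "$@"
    )
}
```

&lt;p>For example, &lt;code>run_then_close $UNENCRYPTED/privatedata1 cat $UNENCRYPTED/privatedata1/sensitive_file.txt&lt;/code> reads the file and then closes the access point. A trap like this cannot run if the job is killed outright at its time limit, so the advice above about remaining job time still applies.&lt;/p>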
&lt;p>For another example on how to use gocryptfs on an HPC cluster, see the &lt;a href="https://hpc.uni.lu/blog/2018/sensitive-data-encryption-using-gocryptfs/">Luxembourg HPC gocryptfs Example&lt;/a>.&lt;/p>
&lt;h2 id="deletion">Deletion&lt;/h2>
&lt;p>To ensure the complete removal of data, it is best to &lt;code>shred&lt;/code> files instead of removing them with &lt;code>rm&lt;/code>. The &lt;code>shred&lt;/code> program will overwrite the contents of a file with randomized data such that recovery of this file will be very difficult, if not impossible.&lt;/p>
&lt;p>Instead of using the common &lt;code>rm&lt;/code> command to delete something, please use the &lt;code>shred&lt;/code> command, like so:&lt;/p>
&lt;pre>&lt;code>shred -u somefile
&lt;/code>&lt;/pre>
&lt;p>The above command will overwrite the file with random data, and then remove (unlink) it.&lt;/p>
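&lt;p>Note that &lt;code>shred&lt;/code> works on individual files and does not descend into directories. As a minimal sketch, the hypothetical helper &lt;code>shred_dir&lt;/code> below (not an HPCC tool) combines it with &lt;code>find&lt;/code> to shred every regular file in a directory tree and then remove the emptied directories.&lt;/p>

```shell
# Hypothetical helper: shred every regular file below DIR, then remove
# the directories that are left empty.
shred_dir() {
    # Overwrite and unlink each regular file.
    find "$1" -type f -exec shred -u {} +
    # Remove directories bottom-up; rmdir refuses any that still contain
    # something (for example a file that could not be shredded).
    find "$1" -depth -type d -exec rmdir {} +
}
```

&lt;p>For example, &lt;code>shred_dir old_project&lt;/code> leaves nothing behind if all files could be shredded; otherwise the remaining files and their parent directories stay in place for inspection.&lt;/p>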
&lt;p>If we want to be even more secure, we can pass over the file seven times to ensure that reconstruction is nearly impossible, then remove it:&lt;/p>
&lt;pre>&lt;code>shred -v -n 6 -z -u somefile
&lt;/code>&lt;/pre></description></item><item><title>Manuals: Communicating</title><link>https://hpcc.ucr.edu/manuals/hpc_cluster/users/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://hpcc.ucr.edu/manuals/hpc_cluster/users/</guid><description>
&lt;h2 id="communicating-with-others">Communicating with others&lt;/h2>
&lt;p>The cluster is a shared resource, and communicating with other users can help to schedule large computations.&lt;/p>
&lt;p>&lt;strong>Looking-Up Specific Users&lt;/strong>&lt;/p>
&lt;p>A convenient overview of all users and their lab affiliations can be retrieved with the following command:&lt;/p>
&lt;pre>&lt;code class="language-bash">user_details.sh
&lt;/code>&lt;/pre>
&lt;p>You can search for specific users by running:&lt;/p>
&lt;pre>&lt;code class="language-bash">MATCH1='username1' # Searches by real name, and username, and email address and PI name
MATCH2='username2'
user_details.sh | grep -P &amp;quot;$MATCH1|$MATCH2&amp;quot;
&lt;/code>&lt;/pre>
&lt;p>&lt;strong>Listing Users with Active Jobs on the Cluster&lt;/strong>&lt;/p>
&lt;p>To get a list of usernames:&lt;/p>
&lt;pre>&lt;code class="language-bash">squeue --format '%u' --noheader | sort | uniq
&lt;/code>&lt;/pre>
&lt;p>To get the list of real names:&lt;/p>
&lt;pre>&lt;code class="language-bash">grep &amp;lt;(user_details.sh | awk '{print $2,$3,$4}') -f &amp;lt;(squeue --format '%u' --noheader | sort | uniq) | awk '{print $1,$2}'
&lt;/code>&lt;/pre>
&lt;p>To get the list of emails:&lt;/p>
&lt;pre>&lt;code class="language-bash">grep &amp;lt;(user_details.sh | awk '{print $4,$5}') -f &amp;lt;(squeue --format '%u' --noheader | sort | uniq) | awk '{print $2}'
&lt;/code>&lt;/pre></description></item><item><title>Manuals: Terminal-based Working Environments</title><link>https://hpcc.ucr.edu/manuals/hpc_cluster/terminalide/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://hpcc.ucr.edu/manuals/hpc_cluster/terminalide/</guid><description>
&lt;h2 id="terminal-ides">Terminal IDEs&lt;/h2>
&lt;p>This page introduces &lt;a href="https://github.com/tmux/tmux/wiki">tmux&lt;/a> and &lt;a href="https://neovim.io/">Neovim&lt;/a> as terminal-based working environments for
working efficiently on remote systems like HPC clusters or cloud systems. They can
be used independently or in combination, and provide many useful
functionalities for working in local or remote terminal environments.
For users who prefer a more graphical environment,
&lt;a href="https://hpcc.ucr.edu/manuals/hpc_cluster/selected_software/vscode/">VSCode&lt;/a>
might be a good alternative.&lt;/p>
&lt;h2 id="tmux-virtual-terminal-multiplexer">Tmux: virtual terminal multiplexer&lt;/h2>
&lt;p>&lt;a href="https://github.com/tmux/tmux/wiki">Tmux&lt;/a> is a virtual terminal multiplexer providing persistent terminal sessions
that are de- and re-attachable. It is an incredibly useful tool for terminal-based
work on remote systems. Major advantages of tmux are:&lt;/p>
&lt;ul>
&lt;li>Work on a remote system cannot get lost due to disconnects. One can always re-attach to a session from the same or a different computer.&lt;/li>
&lt;li>Many useful functionalities &amp;lsquo;&lt;em>power charge&lt;/em>&amp;rsquo; terminals.&lt;/li>
&lt;/ul>
&lt;p>&lt;a href="https://www.gnu.org/software/screen/">Screen&lt;/a> is a related virtual terminal multiplexer tool that can be used as an alternative (not covered here).
It has similar functionalities to tmux.&lt;/p>
&lt;p>Tmux can be downloaded and installed from
&lt;a href="https://github.com/tmux/tmux/wiki/Installing">here&lt;/a>. A custom tmux (and nvim)
environment with extensions can be installed by HPCC users with a
single command (here &lt;code>Install_Nvim-R_Tmux&lt;/code>). The same
script also installs several useful Nvim plugins (see
&lt;a href="https://hpcc.ucr.edu/manuals/hpc_cluster/terminalide/#nvim-introduction">below&lt;/a>).
Alternatively, the install script can be downloaded from
&lt;a href="https://github.com/tgirke/Nvim-R_Tmux#2-installation">here&lt;/a>. After installing
the provided tmux environment in a user account, one needs to log out and in
again to activate the environment. Note, installing the custom environment is
optional and not required for any of the following examples. Users also need to
be aware that the install script will make changes to their &lt;code>.bashrc&lt;/code> and &lt;code>.tmux.conf&lt;/code> files. If this is not desirable, then one can install the
components stepwise, or run the install, and then undo any configuration changes by
following the instructions printed to the screen during the install.&lt;/p>
&lt;center>&lt;img title="tmux" src="https://assets-global.website-files.com/607f4f6df411bd9e447dc7d8/607f4f6df411bd02d27dcb8b_tmux-vim-style-nav-with-fzf.gif" width="600">&lt;/center>
&lt;center>Tmux: Window Split into Several Panes&lt;/center>
&lt;h3 id="important-considerations-for-virtual-tmux-sessions">Important considerations for virtual tmux sessions&lt;/h3>
&lt;ul>
&lt;li>Both tmux and screen sessions run on the system where they were initialized.&lt;/li>
&lt;li>To reattach to a specific session on a remote system, like the HPCC cluster, one needs to first log in to the same node (here the head node) and then re-attach to the corresponding tmux session.&lt;/li>
&lt;li>It is important not to run tmux (or screen) sessions on compute nodes since tmux sessions are persistent. Instead, tmux sessions should be run on a head node. From an open tmux session one can then log in to a compute node via &lt;code>srun&lt;/code>, or just submit jobs from a tmux session with &lt;code>sbatch&lt;/code>.&lt;/li>
&lt;/ul>
&lt;h3 id="start-tmux">Start Tmux&lt;/h3>
&lt;ul>
&lt;li>&lt;code>module load tmux&lt;/code>: only required on systems that use environment modules, and the tmux load command is not specified in a user&amp;rsquo;s .bashrc file&lt;/li>
&lt;li>&lt;code>tmux&lt;/code>: starts a new tmux session&lt;/li>
&lt;li>&lt;code>tmux a&lt;/code>: attaches to an existing session, or a default session of a system, &lt;em>e.g.&lt;/em> specified under &lt;code>~/.tmux.conf&lt;/code>&lt;/li>
&lt;li>&lt;code>tmux attach -t &amp;lt;id&amp;gt;&lt;/code>: attaches to a running session selected under &lt;code>&amp;lt;id&amp;gt;&lt;/code>&lt;/li>
&lt;li>&lt;code>tmux ls&lt;/code>: lists existing tmux sessions&lt;/li>
&lt;/ul>
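&lt;p>The commands above can be combined into a small helper that re-attaches to a named session if it already exists and creates it otherwise. This is a sketch: the function name &lt;code>tmux_attach_or_new&lt;/code> and the default session name &lt;code>main&lt;/code> are made up for illustration, and the function could be placed in &lt;code>~/.bashrc&lt;/code>.&lt;/p>

```shell
# Hypothetical helper: re-attach to the named tmux session if it exists,
# otherwise start a new session with that name.
tmux_attach_or_new() {
    local name
    name="${1:-main}"
    # has-session returns a non-zero status when no such session exists.
    if tmux has-session -t "$name" 2>/dev/null; then
        tmux attach -t "$name"
    else
        tmux new -s "$name"
    fi
}
```

&lt;p>After logging in to the same head node, &lt;code>tmux_attach_or_new analysis&lt;/code> then returns to the &lt;code>analysis&lt;/code> session regardless of whether it was started earlier.&lt;/p>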
&lt;h3 id="prefix">Prefix&lt;/h3>
&lt;p>The prefix for controlling tmux depends on a user&amp;rsquo;s settings in their &lt;code>~/.tmux.conf&lt;/code> file.&lt;/p>
&lt;ul>
&lt;li>&lt;code>Ctrl-b&lt;/code>: default is hard to type, and thus often not preferred&lt;/li>
&lt;li>&lt;code>Ctrl-a&lt;/code>: more commonly used, also on HPCC&lt;/li>
&lt;/ul>
&lt;p>The prefix can be changed by placing the following lines into &lt;code>~/.tmux.conf&lt;/code>.&lt;/p>
&lt;pre>&lt;code class="language-sh">unbind C-b
set -g prefix C-a
&lt;/code>&lt;/pre>
&lt;h3 id="mouse-support">Mouse Support&lt;/h3>
&lt;p>Mouse support in tmux can be enabled with the following command.&lt;/p>
&lt;ul>
&lt;li>&lt;code>Ctrl-a : set -g mouse on&lt;/code>&lt;/li>
&lt;/ul>
&lt;p>To turn mouse support on by default, include on a separate line of &lt;code>~/.tmux.conf&lt;/code> this command: &lt;code>set -g mouse on&lt;/code>&lt;/p>
&lt;h3 id="important-keybindings-for-tmux">Important keybindings for tmux&lt;/h3>
&lt;p>Tmux sessions are organized in panes, windows and sessions themselves, where a
window can have a single or several panes, and a session a single or several
windows. The following commands for controlling tmux are organized by pane-,
window- and session-level commands.&lt;/p>
&lt;p>&lt;strong>Pane-level commands&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>&lt;code>Ctrl-a %&lt;/code>: splits pane vertically&lt;/li>
&lt;li>&lt;code>Ctrl-a &amp;quot;&lt;/code>: splits pane horizontally&lt;/li>
&lt;li>&lt;code>Ctrl-a o&lt;/code> or &lt;code>Ctrl-a &amp;lt;arrow keys&amp;gt;&lt;/code>: jumps cursor to next pane&lt;/li>
&lt;li>&lt;code>Ctrl-a Ctrl-o&lt;/code>: swaps panes&lt;/li>
&lt;li>&lt;code>Ctrl-a &amp;lt;space bar&amp;gt;&lt;/code>: rotates pane arrangement&lt;/li>
&lt;li>&lt;code>Ctrl-a Alt &amp;lt;left or right&amp;gt;&lt;/code>: resizes to left or right&lt;/li>
&lt;li>&lt;code>Ctrl-a Esc &amp;lt;up or down&amp;gt;&lt;/code>: resizes up or down&lt;/li>
&lt;li>&lt;code>Ctrl-a z&lt;/code>: zoom into split pane (full window view); press again to zoom out&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Window-level commands&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>&lt;code>Ctrl-a n&lt;/code>: switches to next tmux window&lt;/li>
&lt;li>&lt;code>Ctrl-a Ctrl-a&lt;/code>: switches to previous tmux window&lt;/li>
&lt;li>&lt;code>Ctrl-a c&lt;/code>: creates a new tmux window; any tmux window can be closed by typing &lt;code>exit&lt;/code> on the command prompt&lt;/li>
&lt;li>&lt;code>Ctrl-a 1&lt;/code>: switches to specific tmux window selected by number&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Session-level commands&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>&lt;code>Ctrl-a d&lt;/code>: detaches from current session&lt;/li>
&lt;li>&lt;code>Ctrl-a s&lt;/code>: switch between available tmux sessions&lt;/li>
&lt;li>&lt;code>$ tmux new -s &amp;lt;name&amp;gt;&lt;/code>: starts new session with a specific name&lt;/li>
&lt;li>&lt;code>$ tmux ls&lt;/code>: lists available tmux session(s)&lt;/li>
&lt;li>&lt;code>$ tmux attach -t &amp;lt;id&amp;gt;&lt;/code>: attaches to specific tmux session&lt;/li>
&lt;li>&lt;code>$ tmux attach&lt;/code>: reattaches to session&lt;/li>
&lt;li>&lt;code>$ tmux kill-session -t &amp;lt;id&amp;gt;&lt;/code>: kills a specific tmux session&lt;/li>
&lt;li>&lt;code>Ctrl-a : kill-session&lt;/code>: kills a session from tmux command mode that can be initiated with Ctrl-a :&lt;/li>
&lt;/ul>
&lt;h2 id="vimnvim-overview">Vim/Nvim overview&lt;/h2>
&lt;p>Vim is a widely used, extremely powerful and versatile text editor for coding
that is usually available on most Linux, Unix and macOS systems by default, and
also can be installed on Windows. The newer version is called Neovim or Nvim.
The main advantages of Nvim compared to Vim are its better performance and its
built-in terminal emulator facilitating the communication among Nvim and
interactive programming environments, such as command-lines, octave, R, etc.
Since Vim and Nvim are managed independently, one can easily install and use
them in parallel on the same system without interfering with each other. The
usage of Nvim is almost identical to Vim. Emacs is a powerful alternative
to Nvim.&lt;/p>
&lt;center>&lt;img title="neovim" src="https://user-images.githubusercontent.com/16662357/128590006-0fc1451f-fac1-49b2-bb95-8aba21bfa44e.gif" width="600">&lt;/center>
&lt;center>Neovim Example with Autocompletion&lt;/center>
&lt;h3 id="nvim-introduction">Nvim introduction&lt;/h3>
&lt;p>The following opens a file (here &lt;code>myfile&lt;/code>) with nvim (or vim). If nvim is not
found then one might need to load it with &lt;code>module load neovim&lt;/code> first. A custom
nvim/tmux environment with extensions can be installed by HPCC users with the
&lt;code>Install_Nvim-R_Tmux&lt;/code> command. For details about this install script, see the
corresponding tmux section
&lt;a href="https://hpcc.ucr.edu/manuals/hpc_cluster/terminalide/#tmux-virtual-terminal-multiplexer">above&lt;/a>.&lt;/p>
&lt;h3 id="open-file-with-nvim">Open file with Nvim&lt;/h3>
&lt;pre>&lt;code class="language-sh">nvim myfile.txt # for neovim (or 'vim myfile.txt' for vim)
&lt;/code>&lt;/pre>
&lt;p>Tip: to always load Nvim with the standard &lt;code>vim&lt;/code> command, one can add &lt;code>alias vim=nvim&lt;/code> to &lt;code>~/.bashrc&lt;/code>.&lt;/p>
&lt;h3 id="three-main-modes">Three main modes&lt;/h3>
&lt;p>Within Vim/Nvim, there are three main modes: normal, insert and command mode. The most important commands
for navigating between the three modes are:&lt;/p>
&lt;ul>
&lt;li>&lt;code>i&lt;/code>: The &lt;code>i&lt;/code> key switches from the normal mode to the insert mode. The latter is used for typing.&lt;/li>
&lt;li>&lt;code>Esc&lt;/code>: The &lt;code>Esc&lt;/code> key switches from the insert mode back to the normal mode.&lt;/li>
&lt;li>&lt;code>:&lt;/code>: The &lt;code>:&lt;/code> key starts the command mode at the bottom of the screen.&lt;/li>
&lt;/ul>
&lt;h3 id="most-important-modifier-keys">Most important modifier keys&lt;/h3>
&lt;p>The arrow keys can be used to move the cursor in the text. Pressing &lt;code>Fn Up/Down&lt;/code> pages through
the text more quickly (more on this below). In the following command overview, all commands starting with &lt;code>:&lt;/code> need to be typed in the command mode.
All other commands are typed in the normal mode after pressing the &lt;code>Esc&lt;/code> key.&lt;/p>
&lt;ul>
&lt;li>&lt;code>:w&lt;/code>: save changes to file. If you are in insert mode you have to hit &lt;code>Esc&lt;/code> first.&lt;/li>
&lt;li>&lt;code>:q&lt;/code>: quit file that has not been changed&lt;/li>
&lt;li>&lt;code>:wq&lt;/code>: save and quit file&lt;/li>
&lt;li>&lt;code>:q!&lt;/code>: quit file without saving any changes&lt;/li>
&lt;/ul>
&lt;h3 id="mouse-support-1">Mouse support&lt;/h3>
&lt;p>When enabled, one can position the cursor anywhere with the mouse as well as resize Nvim split windows, and switch the scope from one window split to another.&lt;/p>
&lt;ul>
&lt;li>&lt;code>:set mouse=n&lt;/code> # enables mouse support, also try the &lt;code>a&lt;/code> option&lt;/li>
&lt;li>&lt;code>:set mouse-=n&lt;/code> # disables mouse support&lt;/li>
&lt;/ul>
&lt;p>To enable mouse support by default, add &lt;code>set mouse=n&lt;/code> to Nvim’s config file located in a user’s home under &lt;code>~/.config/nvim/init.vim&lt;/code>. The corresponding config file
for the older Vim version is &lt;code>~/.vimrc&lt;/code>.&lt;/p>
&lt;h3 id="moving-around">Moving around&lt;/h3>
&lt;ul>
&lt;li>&lt;code>arrow_keys&lt;/code>: move cursor in the text&lt;/li>
&lt;li>&lt;code>Fn Up/Down&lt;/code>: faster scrolling via paging.&lt;/li>
&lt;li>&lt;code>$&lt;/code> or &lt;code>0&lt;/code>: jump to end or beginning of line&lt;/li>
&lt;li>&lt;code>G&lt;/code> or &lt;code>gg&lt;/code>: jump to end of document and back to beginning&lt;/li>
&lt;li>&lt;code>w&lt;/code> or &lt;code>b&lt;/code>: move forward and backward by word&lt;/li>
&lt;li>&lt;code>)&lt;/code> or &lt;code>(&lt;/code>: move forward and backward by sentence&lt;/li>
&lt;/ul>
&lt;h3 id="important-keybindings">Important keybindings&lt;/h3>
&lt;ul>
&lt;li>&lt;code>:split&lt;/code> or &lt;code>:vsplit&lt;/code>: splits viewport (similar to pane split in tmux)&lt;/li>
&lt;li>&lt;code>gz&lt;/code>: maximizes size of viewport in normal mode (similar to Tmux&amp;rsquo;s &lt;code>Ctrl-a z&lt;/code> zoom utility)&lt;/li>
&lt;li>&lt;code>Ctrl-w w&lt;/code>: jumps cursor to other viewport and back&lt;/li>
&lt;li>&lt;code>Ctrl-w r&lt;/code>: swaps viewports&lt;/li>
&lt;li>&lt;code>Ctrl-w =&lt;/code>: resizes splits to equal size&lt;/li>
&lt;li>&lt;code>:resize &amp;lt;+5 or -5&amp;gt;&lt;/code>: resizes height by specified value&lt;/li>
&lt;li>&lt;code>:vertical resize &amp;lt;+5 or -5&amp;gt;&lt;/code>: resizes width by specified value&lt;/li>
&lt;li>&lt;code>Ctrl-w H&lt;/code> or &lt;code>Ctrl-w K&lt;/code>: toggles between horizontal/vertical splits&lt;/li>
&lt;li>&lt;code>:vsplit term://bash&lt;/code> or &lt;code>:terminal&lt;/code>: opens terminal in split mode or in a separate window, respectively.&lt;/li>
&lt;li>&lt;code>Ctrl-s&lt;/code> and &lt;code>Ctrl-q&lt;/code>: freezes/unfreezes the terminal running vim (some systems)&lt;/li>
&lt;/ul>
&lt;h3 id="powerful-features-of-command-mode">Powerful features of command mode&lt;/h3>
&lt;p>For example, search and replace with regular expression support. A detailed overview for using regular expressions in vim is &lt;a href="https://learnbyexample.gitbooks.io/vim-reference/content/Regular_Expressions.html">here&lt;/a>.&lt;/p>
&lt;ul>
&lt;li>&lt;code>/&lt;/code> or &lt;code>?&lt;/code>: search in text forward and backward&lt;/li>
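&lt;li>&lt;code>:%s/search_pattern/replace_pattern/cg&lt;/code>: replacement syntax; &lt;code>%&lt;/code> applies it to all lines, &lt;code>g&lt;/code> replaces every match on a line and &lt;code>c&lt;/code> asks for confirmation each time&lt;/li>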
&lt;/ul>
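&lt;p>A few hedged examples of the replacement syntax, typed in command mode. The words &lt;code>foo&lt;/code> and &lt;code>bar&lt;/code> as well as the line range are placeholders for illustration only; the lines starting with &lt;code>"&lt;/code> are comments.&lt;/p>
&lt;pre>&lt;code class="language-vim">" replace every 'foo' with 'bar' in the whole file
:%s/foo/bar/g
" same, but ask for confirmation before each replacement
:%s/foo/bar/gc
" restrict the replacement to lines 10 through 20
:10,20s/foo/bar/g
" match 'foo' only as a whole word
:%s/\&amp;lt;foo\&amp;gt;/bar/g
&lt;/code>&lt;/pre>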
&lt;h3 id="set-command">Set command&lt;/h3>
&lt;p>The &lt;code>set&lt;/code> command typed in the command mode provides access to a large number of additional functions. Only a small number of examples is given here.
For a more complete listing type &lt;code>:set all&lt;/code> or consult the vim help with &lt;code>:help&lt;/code>.&lt;/p>
&lt;ul>
&lt;li>&lt;code>:set wrap&lt;/code> or &lt;code>:set nowrap&lt;/code>: toggle for turning line wrapping on/off&lt;/li>
&lt;li>&lt;code>:set number&lt;/code> or &lt;code>:set nonumber&lt;/code>: toggle for turning line numbers on/off&lt;/li>
&lt;li>&lt;code>:set syntax=bash&lt;/code>: toggle syntax highlighting for different languages (e.g. python, perl, bash, etc) or turn off with &lt;code>set syntax=off&lt;/code>&lt;/li>
&lt;/ul>
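&lt;p>To make such settings persistent, they can be placed without the leading colon in the config file mentioned above (&lt;code>~/.config/nvim/init.vim&lt;/code> for Nvim or &lt;code>~/.vimrc&lt;/code> for Vim). The selection below is only an illustrative sketch, not the configuration installed by &lt;code>Install_Nvim-R_Tmux&lt;/code>.&lt;/p>
&lt;pre>&lt;code class="language-vim">" Example settings for ~/.config/nvim/init.vim (illustrative selection)
" show line numbers
set number
" do not wrap long lines
set nowrap
" enable mouse support in normal mode
set mouse=n
&lt;/code>&lt;/pre>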
&lt;h3 id="visual-mode">Visual mode&lt;/h3>
&lt;ul>
&lt;li>Initialized from normal mode with &lt;code>v&lt;/code>, &lt;code>V&lt;/code> or &lt;code>Ctrl + v&lt;/code>.&lt;/li>
&lt;li>Delete and copy selected text with &lt;code>d&lt;/code> and &lt;code>y&lt;/code>, respectively. For paste use &lt;code>p&lt;/code> from normal mode. The copied (yanked) text is stored in a separate vim clipboard.&lt;/li>
&lt;/ul>
&lt;h3 id="copy-and-delete-lines">Copy and delete lines&lt;/h3>
&lt;ul>
&lt;li>&lt;code>yy&lt;/code>: copies line where cursor is or those that are selected via visual mode. Paste works with &lt;code>p&lt;/code> as above.&lt;/li>
&lt;li>&lt;code>dd&lt;/code>: deletes line where cursor is or those that are selected via visual mode.&lt;/li>
&lt;/ul>
&lt;h3 id="indentation-guides">Indentation guides&lt;/h3>
&lt;p>Vertical indentation lines (guides) are useful for tracking context in code. To
enable indentation lines in nvim, one can use the
&lt;a href="https://github.com/lukas-reineke/indent-blankline.nvim">indent-blankline.nvim&lt;/a>
plugin. Installation and configuration instructions for this plugin are &lt;a href="https://github.com/tgirke/Nvim-R_Tmux#28-indentation-guides">here&lt;/a>.&lt;/p>
&lt;center>&lt;img title="indent_blankline" src="https://private-user-images.githubusercontent.com/12900252/265404807-64a1a3c6-74e6-4183-901d-ad94c1edc59c.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTE0MjAxMTgsIm5iZiI6MTcxMTQxOTgxOCwicGF0aCI6Ii8xMjkwMDI1Mi8yNjU0MDQ4MDctNjRhMWEzYzYtNzRlNi00MTgzLTkwMWQtYWQ5NGMxZWRjNTljLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDAzMjYlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwMzI2VDAyMjMzOFomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTk4Nzk0NTk1YTg0YjZiMjZmMjA5MGYzYTAxNGFhNDg2ZjY4OTJkYTg5NWYxOTkwYTEwMzQxZDc0ZTVhMGY5ZjUmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.ICFg324IVK2rsQBWBmc8MqNAF2Bh9zyRihje4FxFnkg" width="600">&lt;/center>
&lt;center>Indentation Guides with &lt;code>indent-blankline.nvim&lt;/code> Plugin&lt;/center>
&lt;h3 id="help">Help&lt;/h3>
&lt;p>Vim has a comprehensive built-in help system. To access and navigate it, here
are some important commands. For a more detailed overview, visit this
&lt;a href="https://www.seanh.cc/2020/08/02/how-to-use-vim's-built-in-help/">Built-in Vim
Help&lt;/a> page.&lt;/p>
&lt;ul>
&lt;li>&lt;code>:help&lt;/code>: opens vim help system (&lt;code>:q&lt;/code> closes it)&lt;/li>
&lt;li>&lt;code>Ctrl-]&lt;/code> or &lt;code>Ctrl-t&lt;/code>: use in help to jump to the tagged topic under the cursor and back again&lt;/li>
&lt;li>&lt;code>:help helphelp&lt;/code>: opens the help page describing the help commands themselves&lt;/li>
&lt;li>&lt;code>:help quickref&lt;/code> or &lt;code>:help index&lt;/code>: short help overview&lt;/li>
&lt;/ul>
&lt;h3 id="file-browser-built-into-vim-nerdtree">File browser built into vim: &lt;code>NERDtree&lt;/code>&lt;/h3>
&lt;p>NERDtree provides file browser functionality for Vim. To enable it, the NERDtree plugin needs to be installed. It is included in the account configuration
with &lt;code>Install_Nvim-R_Tmux&lt;/code> mentioned &lt;a href="https://hpcc.ucr.edu/manuals/hpc_cluster/terminalide/#tmux-virtual-terminal-multiplexer">above&lt;/a>. To use NERDtree, open
a file with vim/nvim and then type in normal mode &lt;code>zz&lt;/code>. The same command closes NERDtree. Note the default for opening NERDtree is &lt;code>:NERDTree&lt;/code> which has been
remapped here to &lt;code>zz&lt;/code> for quicker access.
The basic NERDtree usage is explained &lt;a href="https://github.com/tgirke/Nvim-R_Tmux#33-basic-nerdtree-usage">here&lt;/a>.&lt;/p>
&lt;center>&lt;img title="nerdtree" src="https://miro.medium.com/v2/resize:fit:828/format:webp/1*yFuOEvHxG9U0AUjrDlpbrQ.png" width="600">&lt;/center>
&lt;center>NERDtree in action&lt;/center>
&lt;h3 id="useful-resources-for-learning-vimnvim">Useful resources for learning vim/nvim&lt;/h3>
&lt;ul>
&lt;li>&lt;a href="http://www.openvim.com">Interactive Vim Tutorial&lt;/a>&lt;/li>
&lt;li>&lt;a href="http://vimdoc.sourceforge.net/">Official Vim Documentation&lt;/a>&lt;/li>
&lt;li>&lt;a href="../../manuals/linux_basics/vim/">HPCC Linux Manual&lt;/a>&lt;/li>
&lt;/ul>
&lt;h2 id="nvim-for-r-users-with-nvim-r-plugin">Nvim for R users with &lt;code>nvim-R&lt;/code> plugin&lt;/h2>
&lt;h3 id="basics">Basics&lt;/h3>
&lt;p>The &lt;code>Nvim-R&lt;/code> plugin provides a powerful command-line working environment for R
where users can send code from an R/Rmd script opened in Nvim to the R console.
Essentially, this provides an RStudio-like working environment within a terminal,
which is often more flexible when working on remote systems than a GUI solution.
It can also be combined with tmux to support &amp;lsquo;persistent&amp;rsquo; R sessions that can be detached and
reattached (see tmux section above).&lt;/p>
&lt;center>&lt;img title="Nvim-R" src="https://raw.githubusercontent.com/jalvesaq/Nvim-R/master/Nvim-R.gif" >&lt;/center>
&lt;center>Nvim-R IDE for R&lt;/center>
&lt;h3 id="quick-configuration-in-user-accounts">Quick configuration in user accounts&lt;/h3>
&lt;p>The following steps 1-3 can be skipped if Nvim, Tmux and nvimR are already configured on a user&amp;rsquo;s system or account. One can also follow the &lt;a href="https://github.com/tgirke/Nvim-R_Tmux">detailed
instructions&lt;/a> for installing &lt;code>Nvim-R-Tmux&lt;/code> from scratch.&lt;/p>
&lt;ol>
&lt;li>Log in to your user account on HPCC and execute &lt;code>Install_Nvim-R_Tmux&lt;/code> (old: &lt;code>install_nvimRtmux&lt;/code>). Additional details on this install are given in the tmux section &lt;a href="https://hpcc.ucr.edu/manuals/hpc_cluster/terminalide/#tmux-virtual-terminal-multiplexer">above&lt;/a>. Alternatively, one can use the step-by-step install &lt;a href="https://github.com/tgirke/Nvim-R_Tmux">here&lt;/a>.&lt;/li>
&lt;li>To enable the nvim-R-tmux environment, log out and in again.&lt;/li>
&lt;li>Follow usage instructions of next section.&lt;/li>
&lt;/ol>
&lt;h3 id="basic-usage-of-nvim-r-tmux">Basic usage of Nvim-R-Tmux&lt;/h3>
&lt;p>The official and much more detailed user manual for &lt;code>Nvim-R&lt;/code> is available &lt;a href="https://github.com/jalvesaq/Nvim-R/blob/master/doc/Nvim-R.txt">here&lt;/a>.
The following gives a short introduction into the basic usage of Nvim-R-Tmux:&lt;/p>
&lt;p>&lt;strong>1. Start tmux session&lt;/strong> (optional)&lt;/p>
&lt;p>Note, running Nvim from within a tmux session is optional. Skip this step if tmux functionality is not required (&lt;em>e.g.&lt;/em> re-attaching to sessions on remote systems).&lt;/p>
&lt;pre>&lt;code class="language-sh">tmux # starts a new tmux session
tmux a # attaches to an existing session
&lt;/code>&lt;/pre>
&lt;p>&lt;strong>2. Open nvim-connected R session&lt;/strong>&lt;/p>
&lt;p>Open a &lt;code>*.R&lt;/code> or &lt;code>*.Rmd&lt;/code> file with &lt;code>nvim&lt;/code> and initialize a connected R session with &lt;code>\rf&lt;/code>. This command can be remapped to other key combinations, e.g. uncommenting lines 10-12 in &lt;code>.config/nvim/init.vim&lt;/code> will remap it to the &lt;code>F2&lt;/code> key. Note, the resulting split window between Nvim and R behaves like a split viewport in &lt;code>nvim&lt;/code>, meaning the usage of &lt;code>Ctrl-w w&lt;/code> followed by &lt;code>i&lt;/code> and &lt;code>Esc&lt;/code> is important for navigation (for details see vim usage above).&lt;/p>
&lt;pre>&lt;code class="language-sh">nvim myscript.R # or *.Rmd file
&lt;/code>&lt;/pre>
&lt;p>&lt;strong>3. Send R code from nvim to the R pane&lt;/strong>&lt;/p>
&lt;p>Single lines of code can be sent from nvim to the R console by pressing the
space bar. To send several lines at once, one can select them in nvim&amp;rsquo;s visual
mode and then press the space bar. Please note, the default command for sending
code lines in the nvim-r-plugin is &lt;code>\l&lt;/code>. This key binding has been remapped in
the provided &lt;code>.config/nvim/init.vim&lt;/code> file to the space bar. Most other key
bindings (shortcuts) still start with the &lt;code>\&lt;/code> as LocalLeader, &lt;em>e.g.&lt;/em> &lt;code>\rh&lt;/code>
opens the help for a function/object where the cursor is located in nvim. More
details on this are given below.&lt;/p>
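&lt;p>For reference, a space bar remapping of this kind is typically defined with two lines like the following in &lt;code>.config/nvim/init.vim&lt;/code>. This is a sketch based on the mappings suggested in the Nvim-R documentation; the file provided by the install script may differ in detail.&lt;/p>
&lt;pre>&lt;code class="language-vim">" Send the selected lines (visual mode) or the current line (normal mode) to R
vmap &amp;lt;Space&amp;gt; &amp;lt;Plug&amp;gt;RDSendSelection
nmap &amp;lt;Space&amp;gt; &amp;lt;Plug&amp;gt;RDSendLine
&lt;/code>&lt;/pre>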
&lt;h3 id="important-keybindings-for-tmux-1">Important keybindings for tmux&lt;/h3>
&lt;p>See corresponding tmux section &lt;a href="https://hpcc.ucr.edu/manuals/hpc_cluster/terminalide/#important-keybindings-for-tmux">above&lt;/a>.&lt;/p>
&lt;h2 id="nvim-r-like-solutions-for-bash-python-and-other-languages">&lt;code>Nvim-R&lt;/code>-like solutions for Bash, Python and other languages&lt;/h2>
&lt;h3 id="basics-1">Basics&lt;/h3>
&lt;p>For languages other than R one can use the
&lt;a href="https://github.com/jalvesaq/vimcmdline">vimcmdline&lt;/a> plugin for nvim (or vim).
Supported languages include Bash, Python, Golang, Haskell, JavaScript, Julia,
Jupyter, Lisp, Macaulay2, Matlab, Prolog, Ruby, and Sage. The nvim terminal
also colorizes the output, as in the screenshot below, where different colors
are used for general output, positive and negative numbers, and the prompt
line.&lt;/p>
&lt;center>&lt;img title="vimcmdline" src="https://cloud.githubusercontent.com/assets/891655/7090493/5fba2426-df71-11e4-8eb8-f17668d9361a.png" >&lt;/center>
&lt;center>vimcmdline&lt;/center>
&lt;h3 id="install">Install&lt;/h3>
&lt;p>To install it, one needs to copy from the &lt;code>vimcmdline&lt;/code> repository the directories
&lt;code>ftplugin&lt;/code>, &lt;code>plugin&lt;/code> and &lt;code>syntax&lt;/code> and their files to &lt;code>~/.config/nvim/&lt;/code>. For
user accounts of UCR’s HPCC, the above install script &lt;code>Install_Nvim-R_Tmux&lt;/code> (old: &lt;code>install_nvimRtmux&lt;/code>) includes the
install of &lt;code>vimcmdline&lt;/code> (since 09-Jun-18).&lt;/p>
&lt;h3 id="usage">Usage&lt;/h3>
&lt;p>The usage of &lt;code>vimcmdline&lt;/code> is very similar to &lt;code>nvim-R&lt;/code>. To start a connected terminal session, one
opens with nvim a code file with the extension of a given language (&lt;em>e.g.&lt;/em> &lt;code>*.sh&lt;/code> for Bash or &lt;code>*.py&lt;/code> for Python),
while the corresponding interactive interpreter session is initiated
by pressing the key sequence &lt;code>\s&lt;/code> (corresponds to &lt;code>\rf&lt;/code> under &lt;code>nvim-R&lt;/code>). Subsequently, code lines can be sent
with the space bar. More details are available &lt;a href="https://github.com/jalvesaq/vimcmdline">here&lt;/a>.&lt;/p></description></item><item><title>Manuals: Parallel Evaluations in R</title><link>https://hpcc.ucr.edu/manuals/hpc_cluster/parallelr/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://hpcc.ucr.edu/manuals/hpc_cluster/parallelr/</guid><description>
&lt;h1 id="overview">Overview&lt;/h1>
&lt;p>R provides a variety of packages for parallel computations. One of the most
comprehensive parallel computing environments for R is &lt;a href="https://mllg.github.io/batchtools/articles/batchtools.html">&lt;code>batchtools&lt;/code>&lt;/a>
(formerly &lt;code>BatchJobs&lt;/code>). It supports both multi-core and multi-node computations with and
without schedulers. By making use of cluster template files, most schedulers
and queueing systems are also supported (e.g. Torque, Sun Grid Engine, Slurm).&lt;/p>
&lt;h2 id="r-code-of-this-section">R code of this section&lt;/h2>
&lt;p>To simplify the evaluation of the R code of this page, the corresponding text version
is available for download from &lt;a href="https://bit.ly/3m5QmMU">here&lt;/a>.&lt;/p>
&lt;h2 id="parallelization-with-batchtools">Parallelization with batchtools&lt;/h2>
&lt;p>The following introduces the usage of &lt;code>batchtools&lt;/code> for a computer cluster using SLURM as scheduler (workload manager).&lt;/p>
&lt;h2 id="set-up-working-directory-for-slurm">Set up working directory for SLURM&lt;/h2>
&lt;p>First log in to your cluster account, open R and execute the following lines. This will
create a test directory (here &lt;code>mytestdir&lt;/code>), redirect R into this directory and then download
the required files:&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://bit.ly/3Oh9dRO">&lt;code>slurm.tmpl&lt;/code>&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://bit.ly/3KPBwou">&lt;code>.batchtools.conf.R&lt;/code>&lt;/a>&lt;/li>
&lt;/ul>
&lt;pre>&lt;code class="language-r">dir.create(&amp;quot;mytestdir&amp;quot;)
setwd(&amp;quot;mytestdir&amp;quot;)
download.file(&amp;quot;https://bit.ly/3Oh9dRO&amp;quot;, &amp;quot;slurm.tmpl&amp;quot;)
download.file(&amp;quot;https://bit.ly/3KPBwou&amp;quot;, &amp;quot;.batchtools.conf.R&amp;quot;)
&lt;/code>&lt;/pre>
&lt;h2 id="load-package-and-define-some-custom-function">Load package and define some custom function&lt;/h2>
&lt;p>This is the test function (a toy example) that will be run on the cluster for demonstration
purposes. It subsets the &lt;code>iris&lt;/code> data frame by rows, and appends the host name and R version of each
node where the function was executed. The R version to be used on each node can be
specified in the &lt;code>slurm.tmpl&lt;/code> file (under &lt;code>module load&lt;/code>).&lt;/p>
&lt;pre>&lt;code class="language-r">library('RenvModule')
module('load','slurm') # Loads slurm among other modules
library(batchtools)
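myFct &amp;lt;- function(x) {
    # Subset iris by row and record the host and R version where the job ran
    result &amp;lt;- cbind(iris[x, 1:4],
        Node=system(&amp;quot;hostname&amp;quot;, intern=TRUE),
        Rversion=paste(R.Version()[6:7], collapse=&amp;quot;.&amp;quot;))
    result # return the assembled data frame
}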
&lt;/code>&lt;/pre>
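&lt;p>Before submitting anything to the cluster, it can help to call the function once in the current R session. The following sketch assumes the definitions above have been evaluated; the row indices &lt;code>1:3&lt;/code> are arbitrary.&lt;/p>
&lt;pre>&lt;code class="language-r">test &amp;lt;- myFct(1:3) # run the toy function locally on the first three rows of iris
dim(test)            # 3 rows; 4 iris columns plus Node and Rversion
colnames(test)       # column names including the two appended columns
&lt;/code>&lt;/pre>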
&lt;h2 id="submit-jobs-from-r-to-cluster">Submit jobs from R to cluster&lt;/h2>
&lt;p>The following creates a &lt;code>batchtools&lt;/code> registry, defines the number of jobs and resource requests, and then submits the jobs to the cluster
via SLURM.&lt;/p>
&lt;pre>&lt;code class="language-r">reg &amp;lt;- makeRegistry(file.dir=&amp;quot;myregdir&amp;quot;, conf.file=&amp;quot;.batchtools.conf.R&amp;quot;)
Njobs &amp;lt;- 1:4 # Define number of jobs (here 4)
ids &amp;lt;- batchMap(fun=myFct, x=Njobs)
done &amp;lt;- submitJobs(ids, reg=reg, resources=list(partition=&amp;quot;short&amp;quot;, walltime=60, ntasks=1, ncpus=1, memory=1024))
waitForJobs() # Wait until jobs are completed
&lt;/code>&lt;/pre>
&lt;h2 id="summarize-job-status">Summarize job status&lt;/h2>
&lt;p>After the jobs are completed, one can inspect their status as follows.&lt;/p>
&lt;pre>&lt;code class="language-r">getStatus() # Summarize job status
showLog(Njobs[1])
# killJobs(Njobs) # Possible from within R or outside with scancel
&lt;/code>&lt;/pre>
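&lt;p>If some jobs did not finish successfully, &lt;code>batchtools&lt;/code> offers helper functions to locate them. A short sketch, assuming the registry created above is still the default registry of the R session:&lt;/p>
&lt;pre>&lt;code class="language-r">findDone()         # IDs of jobs that finished successfully
findNotDone()      # IDs of jobs that have not (yet) finished
findErrors()       # IDs of jobs that terminated with an error
getErrorMessages() # error messages of failed jobs, if any
&lt;/code>&lt;/pre>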
&lt;h2 id="accessassemble-results">Access/assemble results&lt;/h2>
&lt;p>The results are stored as &lt;code>.rds&lt;/code> files in the registry directory (here &lt;code>myregdir&lt;/code>). One
can access them manually via &lt;code>readRDS&lt;/code> or use various convenience utilities provided
by the &lt;code>batchtools&lt;/code> package.&lt;/p>
&lt;pre>&lt;code class="language-r">readRDS(&amp;quot;myregdir/results/1.rds&amp;quot;) # reads from rds file first result chunk
loadResult(1)
lapply(Njobs, loadResult)
reduceResults(rbind) # Assemble result chunks in single data.frame
do.call(&amp;quot;rbind&amp;quot;, lapply(Njobs, loadResult))
&lt;/code>&lt;/pre>
&lt;h2 id="remove-registry-directory-from-file-system">Remove registry directory from file system&lt;/h2>
&lt;p>By default, existing registries will not be overwritten. If required, one can explicitly
clean and delete them with the following functions.&lt;/p>
&lt;pre>&lt;code class="language-r">clearRegistry() # Clear registry in R session
removeRegistry(wait=0, reg=reg) # Delete registry directory
# unlink(&amp;quot;myregdir&amp;quot;, recursive=TRUE) # Same as previous line
&lt;/code>&lt;/pre>
&lt;h2 id="load-registry-into-r">Load registry into R&lt;/h2>
&lt;p>Loading a registry can be useful when accessing the results at a later stage or
after moving them to a local system.&lt;/p>
&lt;pre>&lt;code class="language-r">from_file &amp;lt;- loadRegistry(&amp;quot;myregdir&amp;quot;, conf.file=&amp;quot;.batchtools.conf.R&amp;quot;)
reduceResults(rbind)
&lt;/code>&lt;/pre></description></item><item><title>Manuals: SSH Keys</title><link>https://hpcc.ucr.edu/manuals/hpc_cluster/sshkeys/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://hpcc.ucr.edu/manuals/hpc_cluster/sshkeys/</guid><description>
&lt;blockquote>
&lt;p>The below links to detailed instructions. A shorter but more comprehensive summary for all major OSs is available &lt;a href="https://hpcc.ucr.edu/manuals/access/login/#ssh-keys">here&lt;/a>.&lt;/p>
&lt;/blockquote></description></item><item><title>Manuals: Visualization</title><link>https://hpcc.ucr.edu/manuals/hpc_cluster/visual/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://hpcc.ucr.edu/manuals/hpc_cluster/visual/</guid><description>
&lt;h2 id="compute-node">Compute Node&lt;/h2>
&lt;p>We support running graphical programs on the cluster using &lt;code>VNC&lt;/code>. For more information refer to &lt;a href="../../manuals/hpc_cluster/jobs/#desktop-environments">Desktop Environments&lt;/a>.&lt;/p>
&lt;h2 id="gpu-workstation">GPU Workstation&lt;/h2>
&lt;p>If a remote compute node does not fit your needs then we also have a GPU workstation specifically designed for rendering high resolution 3D graphics.&lt;/p>
&lt;h3 id="hardware">Hardware&lt;/h3>
&lt;ul>
&lt;li>Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz&lt;/li>
&lt;li>DDR4 256GB @ 2400 MHz&lt;/li>
&lt;li>NVIDIA Corporation GM204GL [Quadro M5000]&lt;/li>
&lt;li>1TB RAID 1 HDD&lt;/li>
&lt;/ul>
&lt;h3 id="software">Software&lt;/h3>
&lt;p>The GPU workstation is uniquely configured to be an extension of the HPCC cluster. Thus, all software available to the cluster is also available on the GPU workstation through &lt;a href="../../about/software/modules/">Environment Modules&lt;/a>.&lt;/p>
&lt;h3 id="access">Access&lt;/h3>
&lt;p>The GPU workstation is currently located in the Genomics building room 1208. Please check ahead of time that the machine is available by emailing &lt;a href="mailto:support@hpcc.ucr.edu">support@hpcc.ucr.edu&lt;/a>.
Once you have access to the GPU workstation, log in with your cluster credentials. If your username does not appear in the list, you may need to click &lt;code>Not listed?&lt;/code> at the bottom of the screen so that you are able to type in your username.&lt;/p>
&lt;h4 id="usage">Usage&lt;/h4>
&lt;p>There are 2 ways to use the GPU workstation:&lt;/p>
&lt;ol>
&lt;li>Local - Run processes directly on the GPU workstation hardware&lt;/li>
&lt;li>Remote - Run processes remotely on the GPU cluster hardware&lt;/li>
&lt;/ol>
&lt;p>&lt;strong>Local&lt;/strong>&lt;/p>
&lt;p>Local usage is very simple. Open a terminal and use the &lt;a href="../../manuals/hpc_cluster/start/#modules">Environment Modules&lt;/a> to load the desired software, then run your software from the terminal.
For example:&lt;/p>
&lt;pre>&lt;code class="language-bash">module load amira
Amira
&lt;/code>&lt;/pre>
&lt;p>&lt;strong>Remote&lt;/strong>&lt;/p>
&lt;p>Open a terminal and submit a job to reserve time on the remote GPU node. Once your job has started, connect to the remote GPU node via ssh and forward the graphics back to the GPU workstation.
For example:&lt;/p>
&lt;ol>
&lt;li>
&lt;p>Submit a job for March 28th, 2018 at 9:30am for a duration of 24 hours, 4 cpus, 100GB memory:&lt;/p>
&lt;pre>&lt;code class="language-bash">sbatch --begin=2018-03-28T09:30:00 --time=24:00:00 -p gpu --gres=gpu:1 --mem=100g --cpus-per-task=4 --wrap='echo ${CUDA_VISIBLE_DEVICES} &amp;gt; ~/.CUDA_VISIBLE_DEVICES; sleep infinity'
&lt;/code>&lt;/pre>
&lt;p>Read about &lt;a href="../../manuals/hpc_cluster/jobs/#gpu-jobs">GPU jobs&lt;/a> for more information regarding the above.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Run the VirtualGL client in order to receive 3D graphics from the remote GPU node:&lt;/p>
&lt;pre>&lt;code class="language-bash">vglclient &amp;amp;
&lt;/code>&lt;/pre>
&lt;/li>
&lt;li>
&lt;p>Wait for the job to start, and then check where your job is running:&lt;/p>
&lt;pre>&lt;code class="language-bash">GPU_NODE=$(squeue -h -p gpu -u $USER -o '%N'); echo $GPU_NODE
&lt;/code>&lt;/pre>
&lt;/li>
&lt;li>
&lt;p>The above command should result in a GPU node name, which you then need to SSH directly into with the following:&lt;/p>
&lt;pre>&lt;code class="language-bash">ssh -XY $GPU_NODE
&lt;/code>&lt;/pre>
&lt;/li>
&lt;li>
&lt;p>Once you have SSH&amp;rsquo;ed into the remote GPU node, set up the environment and run your software:&lt;/p>
&lt;pre>&lt;code class="language-bash">export NO_AT_BRIDGE=1
module load amira
vglrun -display :$(head -1 ~/.CUDA_VISIBLE_DEVICES) Amira
&lt;/code>&lt;/pre>
&lt;/li>
&lt;/ol></description></item><item><title>Manuals: Singularity Jobs</title><link>https://hpcc.ucr.edu/manuals/hpc_cluster/singularity/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://hpcc.ucr.edu/manuals/hpc_cluster/singularity/</guid><description>
&lt;h2 id="what-is-singularity">What is Singularity&lt;/h2>
&lt;p>In short, &lt;code>Singularity&lt;/code> is a program that allows a user to run code or commands within a customized environment.
We will refer to this customized environment as a &lt;code>container&lt;/code>.
This type of container system is common, the most popular one being &lt;a href="https://www.docker.com/">Docker&lt;/a>.
Since &lt;code>Docker&lt;/code> requires root access and HPC users are not typically granted these permissions, we use Singularity instead.
&lt;code>Docker&lt;/code> containers can be used via &lt;code>Singularity&lt;/code>, with varying compatibility.&lt;/p>
&lt;p>&lt;code>Singularity&lt;/code> is forking into 2 branches:&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://sylabs.io/">Singularity-CE&lt;/a> - Community Edition from Sylabs.io&lt;/li>
&lt;li>&lt;a href="https://apptainer.org/">Apptainer&lt;/a> - Original Sinularity open source project&lt;/li>
&lt;/ul>
&lt;p>We will be using &lt;code>Apptainer&lt;/code> when it is ready for production use.
In the meantime, &lt;code>singularity-ce&lt;/code> is available on the cluster.&lt;/p>
&lt;h2 id="limitations">Limitations&lt;/h2>
&lt;p>Currently we are not supporting Slurm jobs being submitted from within a container.
If you load the container &lt;code>centos/7.9&lt;/code> and try to submit a job from within it, the submission will fail.
Please contact support in order to work around this issue.&lt;/p>
&lt;p>Additionally, building Singularity containers on the cluster is not possible because the build steps require elevated permissions. If custom containers are required, you will need to build them on a machine where you have root/&lt;code>sudo&lt;/code> access (such as your local machine) or use a &lt;a href="https://cloud.sylabs.io/builder">Remote Builder&lt;/a>.&lt;/p>
&lt;h2 id="how-to-use-singularity">How to use Singularity&lt;/h2>
&lt;p>You can use Singularity by running &lt;code>module load singularity&lt;/code>.
You can run &lt;code>singularity&lt;/code> in an interactive mode by calling a shell, or you can run &lt;code>singularity&lt;/code> in a non-interactive mode and just pass it a script/program to run.
These 2 modes are very similar to job submission on the cluster; &lt;code>srun&lt;/code> is used for interactive, while &lt;code>sbatch&lt;/code> is used for non-interactive.&lt;/p>
&lt;h3 id="pulling-container-images">Pulling Container Images&lt;/h3>
&lt;p>The first step in using Singularity is to get a container to run inside of. Containers can be custom built, pulled from &lt;a href="https://hub.docker.com/">Docker Hub&lt;/a> or from &lt;a href="https://cloud.sylabs.io/">SyLabs Container Library&lt;/a>.&lt;/p>
&lt;p>For example, if you wanted to run your program within an Ubuntu environment, you could use one of the following commands to pull the Ubuntu 22.04 image:&lt;/p>
&lt;pre>&lt;code class="language-bash"># From Singularity Library:
singularity pull library://library/default/ubuntu:22.04
# From Docker Hub:
singularity pull docker://ubuntu:22.04
&lt;/code>&lt;/pre>
&lt;p>Note that the environment within these containers will be limited; most notably, you lose the ability to use the module system. This is expected, as the environment (and the operating system) within the container differs from the one running on our nodes. Even if you manage to mount the modules within the container, compatibility cannot be guaranteed, since the container may ship library versions and packages that the modules were not compiled against.&lt;/p>
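&lt;p>By default, the pulled image is written to the current directory under a name derived from the source (for instance &lt;code>ubuntu_22.04.sif&lt;/code>). As a small sketch, an output file name can also be given explicitly, and the image can be checked by running a single command inside it. The file name &lt;code>ubuntu.sif&lt;/code> is an arbitrary choice for illustration.&lt;/p>
&lt;pre>&lt;code class="language-bash">module load singularity
singularity pull ubuntu.sif docker://ubuntu:22.04 # write the image to ubuntu.sif
singularity exec ubuntu.sif cat /etc/os-release   # run one command inside the container
&lt;/code>&lt;/pre>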
&lt;blockquote>
&lt;p>NOTE: If you get an error similar to &amp;ldquo;unexpected HTTP status: 401&amp;rdquo;, make sure your &lt;a href="https://cloud.sylabs.io/dashboard#projects">project&lt;/a> on the Container Builder website is set to &amp;ldquo;Public&amp;rdquo;.&lt;/p>
&lt;/blockquote>
&lt;h4 id="hpcc-provided-images">HPCC Provided Images&lt;/h4>
&lt;p>In an attempt to preserve some legacy software, we created a CentOS 7 image that integrates with the old CentOS 7 modules. Access to the CentOS 7 container can be granted by running &lt;code>module load centos/7.9&lt;/code>. This will set the &lt;code>CENTOS7_SING&lt;/code> environment variable, which is the location of the CentOS 7 container image. Usage examples of the CentOS 7 image are in the below sections.&lt;/p>
&lt;h3 id="building-container-images">Building Container Images&lt;/h3>
&lt;p>In order to build a custom image, you must use a machine you have &lt;code>sudo&lt;/code> access on or use a &lt;a href="https://cloud.sylabs.io/builder">Remote Builder&lt;/a>.&lt;/p>
&lt;h4 id="localsudo-machine">Local/Sudo Machine&lt;/h4>
&lt;p>Installing Singularity is outside of the scope for this tutorial. Please see the &lt;a href="https://docs.sylabs.io/guides/3.9/admin-guide/installation.html">Installing SingularityCE&lt;/a> steps.&lt;/p>
&lt;p>Once Singularity is installed, you must create a definition (def) file. More details on creating a definition file can be found on the Singularity &lt;a href="https://docs.sylabs.io/guides/3.9/user-guide/definition_files.html">The Definition File&lt;/a> documentation, but a simple definition file of a Debian container that installs Python3 is the following:&lt;/p>
&lt;pre>&lt;code># Header: pull the base image from Docker Hub
Bootstrap: docker
From: debian:12

# Commands run inside the container at build time
%post
    apt-get update -y
    apt-get install -y python3
&lt;/code>&lt;/pre>
&lt;p>If the above file is named &amp;ldquo;debian.def&amp;rdquo;, an image can be built using &lt;code>singularity build debian.sif debian.def&lt;/code> (run it with &lt;code>sudo&lt;/code> if your installation requires root privileges to build). This creates an image called &lt;code>debian.sif&lt;/code> that can be run as described in the sections below.&lt;/p>
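&lt;p>As a slightly fuller sketch, a definition file can also set environment variables and a default command. The file name &amp;ldquo;debian-python.def&amp;rdquo; and the &lt;code>LC_ALL&lt;/code> setting below are illustrative assumptions, not something HPCC provides; &lt;code>%environment&lt;/code> and &lt;code>%runscript&lt;/code> are standard definition file sections:&lt;/p>
&lt;pre>&lt;code># debian-python.def (hypothetical file name, for illustration only)
Bootstrap: docker
From: debian:12

# Commands run inside the container at build time
%post
    apt-get update -y
    apt-get install -y python3

# Variables exported every time the container starts (example value)
%environment
    export LC_ALL=C

# Default command used by "singularity run"; extra arguments are passed to python3
%runscript
    exec python3 &amp;quot;$@&amp;quot;
&lt;/code>&lt;/pre>
&lt;p>Assuming the image was built as &lt;code>debian-python.sif&lt;/code>, &lt;code>singularity run debian-python.sif --version&lt;/code> should invoke &lt;code>python3 --version&lt;/code> inside the container.&lt;/p>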
&lt;h4 id="remote-builder">Remote Builder&lt;/h4>
&lt;p>After signing up for the &lt;a href="https://cloud.sylabs.io/builder">remote builder&lt;/a>, log in by running &lt;code>singularity remote login&lt;/code> and following the steps it prints.&lt;/p>
&lt;p>After logging in, you must create a definition file. We can use the same &amp;ldquo;debian.def&amp;rdquo; file from the &amp;ldquo;Local/Sudo Machine&amp;rdquo; section. With the definition file, build the container image using &lt;code>singularity build --remote debian.sif debian.def&lt;/code>. After the image has been built and downloaded, you can run it as described in the sections below.&lt;/p>
&lt;h3 id="interactive-singularity">Interactive Singularity&lt;/h3>
&lt;p>When running Singularity, you need to provide the path to a &lt;code>singularity&lt;/code> image file.
For example, this would be the most basic way to get a shell within your container:&lt;/p>
&lt;pre>&lt;code class="language-bash">module load singularity
singularity pull library://library/default/ubuntu:22.04
singularity shell ubuntu_22.04.sif
cat /etc/os-release # Inside Container
&amp;gt; PRETTY_NAME=&amp;quot;Ubuntu 22.04.4 LTS&amp;quot;
&amp;gt; NAME=&amp;quot;Ubuntu&amp;quot;
&amp;gt; VERSION_ID=&amp;quot;22.04&amp;quot;
&lt;/code>&lt;/pre>
&lt;p>To run the CentOS 7 container:&lt;/p>
&lt;pre>&lt;code class="language-bash">module load centos
singularity shell $CENTOS7_SING
&lt;/code>&lt;/pre>
&lt;p>Additionally, there is a special shortcut for the &lt;code>centos&lt;/code> module that allows us to run the above more simply, as:&lt;/p>
&lt;pre>&lt;code class="language-bash">module load centos
centos.sh
&lt;/code>&lt;/pre>
&lt;p>Although running containers on a head node is technically possible, compute resources there are limited. Use the following commands to run a container in an interactive job on a compute node instead:&lt;/p>
&lt;p>Ubuntu:&lt;/p>
&lt;pre>&lt;code class="language-bash">module load singularity
singularity pull library://library/default/ubuntu:22.04
srun -p epyc --mem=1g -c 4 --time=2:00:00 --pty singularity shell ubuntu_22.04.sif
hostname # Inside container
&amp;gt; r21
cat /etc/os-release # Inside container
&amp;gt; PRETTY_NAME=&amp;quot;Ubuntu 22.04.4 LTS&amp;quot;
&amp;gt; NAME=&amp;quot;Ubuntu&amp;quot;
&amp;gt; VERSION_ID=&amp;quot;22.04&amp;quot;
&lt;/code>&lt;/pre>
&lt;p>CentOS 7:&lt;/p>
&lt;pre>&lt;code class="language-bash">module load centos
srun -p epyc --mem=1g -c 4 --time=2:00:00 --pty centos.sh
cat /etc/os-release # Inside container
&amp;gt; PRETTY_NAME=&amp;quot;CentOS Linux 7 (Core)&amp;quot;
&amp;gt; NAME=&amp;quot;CentOS Linux&amp;quot;
&amp;gt; VERSION_ID=&amp;quot;7&amp;quot;
&lt;/code>&lt;/pre>
&lt;h3 id="non-interactive-singularity">Non-Interactive Singularity&lt;/h3>
&lt;p>When running Singularity non-interactively, the same basic rules apply. We need a path to our &lt;code>singularity&lt;/code> image file as well as a command to run.&lt;/p>
&lt;h4 id="basics">Basics&lt;/h4>
&lt;p>For example, here is the basic syntax:&lt;/p>
&lt;pre>&lt;code class="language-bash">module load singularity
singularity exec /path/to/singularity/image someCommand
&lt;/code>&lt;/pre>
&lt;p>Using &lt;code>ubuntu_22.04.sif&lt;/code> as an example, you can execute an arbitrary command like so:&lt;/p>
&lt;pre>&lt;code class="language-bash">module load singularity
singularity pull library://library/default/ubuntu:22.04
singularity exec ubuntu_22.04.sif cat /etc/os-release
&lt;/code>&lt;/pre>
&lt;p>And using our CentOS 7 image:&lt;/p>
&lt;pre>&lt;code class="language-bash">module load singularity
singularity exec $CENTOS7_SING cat /etc/redhat-release
&lt;/code>&lt;/pre>
&lt;h4 id="shortcuts">Shortcuts&lt;/h4>
&lt;p>Using the &lt;code>centos.sh&lt;/code> shortcut that we provide for CentOS 7:&lt;/p>
&lt;pre>&lt;code class="language-bash">module load centos
centos.sh &amp;quot;cat /etc/redhat-release&amp;quot;
&lt;/code>&lt;/pre>
&lt;p>Here is a more complex example with modules:&lt;/p>
&lt;pre>&lt;code class="language-bash">module load centos
centos.sh &amp;quot;module load samtools; samtools --help&amp;quot;
&lt;/code>&lt;/pre>
&lt;h4 id="jobs">Jobs&lt;/h4>
&lt;p>Here is an example job submitted using an Ubuntu container:&lt;/p>
&lt;pre>&lt;code class="language-bash">module load singularity
singularity pull library://library/default/ubuntu:22.04
sbatch -p epyc --wrap=&amp;quot;singularity exec ubuntu_22.04.sif cat /etc/os-release; whoami; date&amp;quot;
&lt;/code>&lt;/pre>
&lt;p>Here is an example submitted as a job using the CentOS 7 container:&lt;/p>
&lt;pre>&lt;code class="language-bash">module load centos
sbatch -p epyc --wrap=&amp;quot;centos.sh 'module load samtools; samtools --help'&amp;quot;
&lt;/code>&lt;/pre>
&lt;h4 id="variables">Variables&lt;/h4>
&lt;p>Here is an example with passing environment variables:&lt;/p>
&lt;pre>&lt;code class="language-bash">export SINGULARITYENV_SOMETHING='stuff'
centos.sh 'echo $SOMETHING'
&lt;/code>&lt;/pre>
&lt;blockquote>
&lt;p>Notice: Add the &lt;code>SINGULARITYENV_&lt;/code> prefix to any variable you want to pass into the CentOS container; inside the container it is available without the prefix.&lt;/p>
&lt;/blockquote>
&lt;h4 id="enable-gpus">Enable GPUs&lt;/h4>
&lt;p>First review how to submit a GPU job from &lt;a href="../../manuals/hpc_cluster/jobs/#gpu-jobs">here&lt;/a>.
Then request an interactive GPU job, or embed one of the following within your submission script.&lt;/p>
&lt;p>In order to enable GPUs within your container you need to add the &lt;code>--nv&lt;/code> option to the singularity command:&lt;/p>
&lt;pre>&lt;code class="language-bash">module load centos
singularity exec --nv $CENTOS7_SING cat /etc/redhat-release
&lt;/code>&lt;/pre>
&lt;p>However, when using the &lt;code>centos&lt;/code> shortcut, it is easier to set the following environment variable and then run &lt;code>centos.sh&lt;/code> as usual:&lt;/p>
&lt;pre>&lt;code class="language-bash">module load centos
export SINGULARITY_NV=1
centos.sh
&lt;/code>&lt;/pre>
&lt;h2 id="singularity-usecases">Singularity Usecases&lt;/h2>
&lt;p>In addition to using Singularity to run operating system containers (Debian, Ubuntu, CentOS, etc), it can also be used to run certain software on the cluster.&lt;/p>
&lt;p>The most prominent example of this is AlphaFold. If you are interested in using AlphaFold on the cluster, see the &lt;a href="https://hpcc.ucr.edu/manuals/hpc_cluster/selected_software/alphafold/">AlphaFold Usage on HPCC&lt;/a> page of our documentation. In addition to AlphaFold, we also offer &lt;code>freefem&lt;/code> and &lt;code>prymetime&lt;/code> through Singularity. They are available via &lt;code>module load freefem&lt;/code> and &lt;code>module load prymetime&lt;/code>, and can be run with &lt;code>singularity shell $FREEFEM_SING&lt;/code> and &lt;code>singularity shell $PRYMETIME_SING&lt;/code>, respectively.&lt;/p></description></item><item><title>Manuals: Data Transfer</title><link>https://hpcc.ucr.edu/manuals/hpc_cluster/data/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://hpcc.ucr.edu/manuals/hpc_cluster/data/</guid><description>
&lt;blockquote>
&lt;p>These pages describe how to use common data transfer software on the UCR HPCC cluster.&lt;/p>
&lt;/blockquote></description></item></channel></rss>