<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>HPCC – Selected Research Software Usage</title><link>https://hpcc.ucr.edu/manuals/hpc_cluster/selected_software/</link><description>Recent content in Selected Research Software Usage on HPCC</description><generator>Hugo -- gohugo.io</generator><atom:link href="https://hpcc.ucr.edu/manuals/hpc_cluster/selected_software/index.xml" rel="self" type="application/rss+xml"/><item><title>Manuals: AlphaFold Usage on HPCC</title><link>https://hpcc.ucr.edu/manuals/hpc_cluster/selected_software/alphafold/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://hpcc.ucr.edu/manuals/hpc_cluster/selected_software/alphafold/</guid><description>
&lt;h2 id="alphafold3">AlphaFold3&lt;/h2>
&lt;h3 id="loading-the-module">Loading the module&lt;/h3>
&lt;p>You can load AlphaFold3 using the following commands:&lt;/p>
&lt;pre>&lt;code class="language-bash">module load alphafold/3
singularity shell $ALPHAFOLD_SING
&lt;/code>&lt;/pre>
&lt;p>You can also run AlphaFold3 with a GPU. To do so, &lt;a href="https://hpcc.ucr.edu/manuals/hpc_cluster/jobs/#gpu-jobs">log into an A100 GPU node&lt;/a> and then use the following commands:&lt;/p>
&lt;pre>&lt;code class="language-bash">module load alphafold/3
singularity shell --nv $ALPHAFOLD_SING
&lt;/code>&lt;/pre>
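&lt;p>If you are not already on a GPU node, an interactive session on an A100 node can be requested with &lt;code>srun&lt;/code>. The partition, time, and resource values below are only a sketch; customize them as needed:&lt;/p>
&lt;pre>&lt;code class="language-bash"># Request 1x A100 GPU on the gpu partition (example values; adjust as needed)
srun -p gpu --gres=gpu:a100:1 -c 4 --mem=32g --time=2:00:00 --pty bash -l
&lt;/code>&lt;/pre>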
&lt;h3 id="using-alphafold-databases">Using AlphaFold databases&lt;/h3>
&lt;p>A handful of databases are available at &lt;code>$ALPHAFOLD_DB&lt;/code> (available after loading the &lt;code>alphafold/3&lt;/code> module).&lt;/p>
&lt;p>An example command is as follows:&lt;/p>
&lt;pre>&lt;code class="language-bash">module load alphafold/3
singularity shell --nv $ALPHAFOLD_SING
# Commands from here on are run inside of the Alphafold container
python3 /app/alphafold/run_alphafold.py \
--model_dir=$ALPHAFOLD_DB/model \
--db_dir=$ALPHAFOLD_DB \
--json_path=fold_input.json \
--output_dir=my_output_folder/
&lt;/code>&lt;/pre>
&lt;p>More information on using Alphafold3 can be found in the &lt;a href="https://github.com/google-deepmind/alphafold3">Alphafold3 GitHub repo&lt;/a>, including &lt;a href="https://github.com/google-deepmind/alphafold3/blob/main/docs/input.md">input documentation&lt;/a> and &lt;a href="https://github.com/google-deepmind/alphafold3/blob/main/docs/output.md">output documentation&lt;/a>.&lt;/p>
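&lt;p>For non-interactive runs, the same steps can be wrapped in a batch script and submitted with &lt;code>sbatch&lt;/code>. The script below is only a sketch (partition, resources, and file names are example values; customize them as needed):&lt;/p>
&lt;pre>&lt;code class="language-bash">#!/bin/bash
#SBATCH -p gpu                # GPU partition (example value)
#SBATCH --gres=gpu:a100:1     # 1x A100 GPU
#SBATCH -c 8
#SBATCH --mem=64g
#SBATCH --time=12:00:00

module load alphafold/3
# Run the container non-interactively with GPU support
singularity exec --nv $ALPHAFOLD_SING python3 /app/alphafold/run_alphafold.py \
  --model_dir=$ALPHAFOLD_DB/model \
  --db_dir=$ALPHAFOLD_DB \
  --json_path=fold_input.json \
  --output_dir=my_output_folder/
&lt;/code>&lt;/pre>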
&lt;h3 id="processing-large-datasets">Processing Large Datasets&lt;/h3>
&lt;p>Sometimes the dataset cannot fit within the memory of a single GPU. In this case you&amp;rsquo;ll need to use Unified Memory (&amp;ldquo;Combined&amp;rdquo; GPU and System memory). This does come with a drop in performance, but might be the only way to get large datasets processed.&lt;/p>
&lt;p>To use Unified Memory, you can add these additional flags to the alphafold command:&lt;/p>
&lt;pre>&lt;code class="language-bash">--env XLA_PYTHON_CLIENT_PREALLOCATE=false \
--env TF_FORCE_UNIFIED_MEMORY=true \
--env XLA_CLIENT_MEM_FRACTION=3.2
&lt;/code>&lt;/pre>
&lt;p>For example:&lt;/p>
&lt;pre>&lt;code class="language-bash">python3 /app/alphafold/run_alphafold.py \
--model_dir=$ALPHAFOLD_DB/model \
--db_dir=$ALPHAFOLD_DB \
--json_path=fold_input.json \
--env XLA_PYTHON_CLIENT_PREALLOCATE=false \
--env TF_FORCE_UNIFIED_MEMORY=true \
--env XLA_CLIENT_MEM_FRACTION=3.2 \
--output_dir=my_output_folder/
&lt;/code>&lt;/pre>
&lt;h2 id="alphafold2">AlphaFold2&lt;/h2>
&lt;p>AlphaFold2 is DeepMind&amp;rsquo;s deep-learning system for predicting a protein&amp;rsquo;s 3D structure from its amino acid sequence.&lt;/p>
&lt;h3 id="loading-the-module-1">Loading the module&lt;/h3>
&lt;p>You can load AlphaFold2 using the following commands:&lt;/p>
&lt;pre>&lt;code class="language-bash">module load alphafold/2
singularity shell $ALPHAFOLD_SING
&lt;/code>&lt;/pre>
&lt;p>You can also run AlphaFold2 with a GPU. To do so, &lt;a href="https://hpcc.ucr.edu/manuals/hpc_cluster/jobs/#gpu-jobs">log into a P100 GPU node&lt;/a> and then use the following commands:&lt;/p>
&lt;pre>&lt;code class="language-bash">module load alphafold/2
singularity shell --nv $ALPHAFOLD_SING
&lt;/code>&lt;/pre>
&lt;h3 id="using-alphafold-databases-1">Using Alphafold Databases&lt;/h3>
&lt;p>When running the alphafold command, you will be asked for certain databases. These databases can be found under the path $DATABASE_DIR/alphafold/&lt;version>. They can also be accessed using the &lt;code>$$ALPHAFOLD_DB&lt;/code> environment variable that is automatically set after loading the alphafold module.&lt;/p>
&lt;p>Here is an example of how to write your alphafold command using the monomer preset:&lt;/p>
&lt;pre>&lt;code class="language-bash">python3 /app/alphafold/run_alphafold.py \
--model_preset=monomer \
--db_preset=reduced_dbs \
--use_gpu_relax=True \
--data_dir=$ALPHAFOLD_DB \
--uniref90_database_path=$ALPHAFOLD_DB/uniref90/uniref90.fasta \
--mgnify_database_path=$ALPHAFOLD_DB/mgnify/mgy_clusters_2018_12.fa \
--template_mmcif_dir=$ALPHAFOLD_DB/pdb_mmcif/mmcif_files \
--max_template_date=2020-05-14 \
--obsolete_pdbs_path=$ALPHAFOLD_DB/pdb_mmcif/obsolete.dat \
--pdb_seqres_database_path=$ALPHAFOLD_DB/pdb_seqres/pdb_seqres \
--uniprot_database_path=$ALPHAFOLD_DB/uniprot/uniprot.fasta \
--small_bfd_database_path=$ALPHAFOLD_DB/small_bfd/bfd-first_non_consensus_sequences.fasta \
--pdb70_database_path=$ALPHAFOLD_DB/pdb70/pdb70 \
--fasta_paths=&amp;lt;path to fasta file here&amp;gt; \
--output_dir=&amp;lt;path to output directory&amp;gt;
&lt;/code>&lt;/pre>
&lt;p>and an example using the multimer preset:&lt;/p>
&lt;pre>&lt;code class="language-bash">python3 /app/alphafold/run_alphafold.py \
--model_preset=multimer \
--db_preset=reduced_dbs \
--use_gpu_relax=True \
--data_dir=$ALPHAFOLD_DB \
--uniref90_database_path=$ALPHAFOLD_DB/uniref90/uniref90.fasta \
--mgnify_database_path=$ALPHAFOLD_DB/mgnify/mgy_clusters_2018_12.fa \
--template_mmcif_dir=$ALPHAFOLD_DB/pdb_mmcif/mmcif_files \
--max_template_date=2020-05-14 \
--obsolete_pdbs_path=$ALPHAFOLD_DB/pdb_mmcif/obsolete.dat \
--small_bfd_database_path=$ALPHAFOLD_DB/small_bfd/bfd-first_non_consensus_sequences.fasta \
--uniprot_database_path=$ALPHAFOLD_DB/uniprot/uniprot.fasta \
--pdb_seqres_database_path=$ALPHAFOLD_DB/pdb_seqres/pdb_seqres \
--fasta_paths=&amp;lt;path to fasta file&amp;gt; \
--output_dir=&amp;lt;path to output directory&amp;gt;
&lt;/code>&lt;/pre>
&lt;p>Remember to fill in your FASTA path and output directory if you wish to use these templates.&lt;/p>
&lt;p>Additionally, these are not the only two methods of running AlphaFold, and different modes might require different sets of arguments to be passed to &lt;code>run_alphafold.py&lt;/code>. For more details regarding what parameters are available, as well as more examples, please refer to the &lt;a href="https://github.com/deepmind/alphafold">AlphaFold GitHub repo&lt;/a>.&lt;/p></description></item><item><title>Manuals: Galaxy Usage on HPCC</title><link>https://hpcc.ucr.edu/manuals/hpc_cluster/selected_software/galaxy/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://hpcc.ucr.edu/manuals/hpc_cluster/selected_software/galaxy/</guid><description>
&lt;h2 id="what-is-galaxy">What is Galaxy?&lt;/h2>
&lt;p>&lt;a href="https://galaxyproject.org/">Galaxy&lt;/a> is a free and open source scientific platform designed to make research accessible and reproducible. The HPCC has configured Galaxy to work on the cluster as an OnDemand application, allowing users to launch and maintain their very own Galaxy platform. The following sections below will explain how to use Galaxy on the cluster.&lt;/p>
&lt;h2 id="how-to-start-a-galaxy-session">How to start a Galaxy session?&lt;/h2>
&lt;p>To start a Galaxy session, you must access our OnDemand instance and log in. You can find more information on how to access our OnDemand instance &lt;a href="https://hpcc.ucr.edu/manuals/hpc_cluster/selected_software/ondemand/#accessing-ondemand">here&lt;/a>. Once you are logged in, select the &amp;ldquo;Galaxy&amp;rdquo; application from the &amp;ldquo;Interactive Apps&amp;rdquo; tab as you would any other OnDemand application. Once selected, you need to fill out the following fields:&lt;/p>
&lt;p>&lt;strong>Database Directory&lt;/strong>: This field is specific to Galaxy. When Galaxy is launched, a &amp;lsquo;.galaxy&amp;rsquo; directory needs to be created. This directory will contain all files related to your Galaxy instance, such as configuration files, databases, and tool files. Here, you specify where you want the &amp;lsquo;.galaxy&amp;rsquo; directory to reside. If no path is given, then by default the directory will be created in your &amp;ldquo;rhome&amp;rdquo; directory.&lt;/p>
&lt;p>&lt;strong>Number of cores&lt;/strong>: This field specifies the number of CPU cores you would like to allocate to your Galaxy session. The allocated cores will be used for all locally running jobs. The default is 4 cores.&lt;/p>
&lt;p>&lt;strong>Memory in GB&lt;/strong>: Similar to the &amp;ldquo;Number of cores&amp;rdquo; field, this entry specifies the amount of memory you would like to allocate to your Galaxy session. The allocated memory will be used for all locally running jobs. The default is 6 GB.&lt;/p>
&lt;p>&lt;strong>Job runtime&lt;/strong>: This field specifies the amount of time your Galaxy session will be active. The maximum runtime is determined by the HPCC&amp;rsquo;s queue policies, which can be found &lt;a href="https://hpcc.ucr.edu/manuals/hpc_cluster/queue/">here&lt;/a>.&lt;/p>
&lt;p>&lt;strong>Partition&lt;/strong>: This field specifies the partition which your Galaxy session will be dispatched to. Depending on which partition is specified, the maximum runtime may vary. Please refer to the HPCC&amp;rsquo;s queue policies for more information.&lt;/p>
&lt;p>&lt;strong>Additional Slurm Arguments&lt;/strong>: This field specifies any additional Slurm flags. If a GPU node is requested, then the &lt;code>--gres&lt;/code> flag will need to be added.&lt;/p>
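&lt;p>For example, to request 1x A100 GPU for the session, this field might contain the following (an example value; adjust the GPU type and count as needed):&lt;/p>
&lt;pre>&lt;code class="language-bash">--gres=gpu:a100:1
&lt;/code>&lt;/pre>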
&lt;p>&lt;strong>Galaxy tool runner&lt;/strong>: This field is specific to Galaxy. It specifies which &amp;ldquo;runner&amp;rdquo; to use for the Galaxy session; a &amp;ldquo;runner&amp;rdquo; determines how jobs are run during your Galaxy session. If &amp;ldquo;local&amp;rdquo; is chosen, then all jobs run from within Galaxy will be local to the machine on which Galaxy is running; if the session is terminated, then so are all actively running Galaxy jobs. If &amp;ldquo;cluster&amp;rdquo; is chosen, then all jobs will be submitted via Slurm. This includes workflows, file uploads, and tools. If the Galaxy session is terminated, these jobs will continue to run until they complete. Please refer to the following &lt;a href="#extra-galaxy-features">section&lt;/a> for more information on how the &amp;ldquo;cluster&amp;rdquo; job runner works.&lt;/p>
&lt;p>After you have filled in the fields, click the &amp;ldquo;Launch&amp;rdquo; button to have your Galaxy session queued up to run.&lt;/p>
&lt;h2 id="galaxy-on-ondemand">Galaxy on OnDemand&lt;/h2>
&lt;p>Once your Galaxy session is running, a user is automatically created for you to access your Galaxy instance with. This user has admin privileges within your Galaxy instance. This allows for you to have more control when it comes to using Galaxy via OnDemand on the cluster. With a Galaxy admin account you are able to do the following:&lt;/p>
&lt;ul>
&lt;li>Install tools from Galaxy&amp;rsquo;s tool shed repository, which can include tools not installed on the cluster.&lt;/li>
&lt;li>Upload files locally from your computer and from the cluster directly into Galaxy.&lt;/li>
&lt;li>Submit jobs to slurm from within your Galaxy session.&lt;/li>
&lt;li>Execute Galaxy-specific admin commands and/or features.&lt;/li>
&lt;/ul>
&lt;p>As mentioned previously, a &amp;lsquo;.galaxy&amp;rsquo; directory is created when a Galaxy session is launched for the first time. This directory is where all of your specific Galaxy files and dependencies are stored, such as tools installed via the tool shed, environment module tool files, other configuration files, and Galaxy’s database. You are able to freely modify any file within this directory and should see any changes once Galaxy is reloaded. If you delete this directory, it will be recreated from scratch the next time you launch Galaxy.&lt;/p>
&lt;p>The HPCC admins have also added some extra features to Galaxy that allow users to interact with the cluster from within Galaxy. The next section will go into detail on these features and how they work.&lt;/p>
&lt;h2 id="extra-galaxy-features">Extra Galaxy Features&lt;/h2>
&lt;p>When starting a Galaxy session, you have the option to select which &amp;lsquo;runner&amp;rsquo; to use for your Galaxy session. If the &amp;lsquo;cluster&amp;rsquo; runner is chosen, then all jobs started within Galaxy will be submitted to Slurm. When selecting this runner, you will see a cog icon added to the menu bar for your Galaxy session. Clicking this icon will open a drop-down menu where you can configure the parameters used for all jobs submitted via Slurm from within Galaxy. The current default parameters are 2 CPU cores and 2 GB of memory. The parameters set will be saved within your Galaxy database and will carry over into your next Galaxy session. Below you can see an example of how to modify the parameters for jobs submitted through Galaxy.&lt;/p>
&lt;p>&lt;img src="../../../img/slurm_ex.gif" alt="slurmgalaxyexample">
&lt;br>
The cluster&amp;rsquo;s environment module system has been configured to work with Galaxy via Galaxy&amp;rsquo;s tool configuration file system. You can interact with any module on the cluster through Galaxy via the &amp;lsquo;Environment modules&amp;rsquo; section in Galaxy&amp;rsquo;s Tool bar. Below you can see where this section is located.
&lt;br>
&lt;img src="../../../img/env_modules_ex.png" alt="envtoolsloc">&lt;/p>
&lt;p>When interacting with an environment module tool in Galaxy, you will notice that each tool follows the same format. This is because each environment module is unique, and trying to create a unique configuration file for each module would be time-consuming. Instead, all environment module tools are created from a template that allows each tool to still be run from within Galaxy, using its command-line-specific commands. If this setup isn&amp;rsquo;t to your liking, you can either install the tool directly from Galaxy&amp;rsquo;s tool shed repository or edit the specific environment module tool configuration file. All environment module tool configurations can be found under your &amp;lsquo;.galaxy&amp;rsquo; directory in the &amp;lsquo;tools&amp;rsquo; directory under &amp;lsquo;modules&amp;rsquo;. Below is an example image of how an environment module tool looks; the tool shown is fastqc.&lt;/p>
&lt;p>&lt;img src="../../../img/env_module_tool_ex.png" alt="envtoolex">&lt;/p>
&lt;p>An explanation on what each parameter does can be found under each environment module tool. This manual will briefly explain what each parameter does, using the fastqc tool as an example.&lt;/p>
&lt;p>&lt;strong>What upload type would you like to use&lt;/strong>: This field specifies the files you would like to use as input for the tool. Here you can either provide a ‘Dataset’, which is Galaxy&amp;rsquo;s custom format for files, or provide the full path of a file on the cluster as input. One benefit of this method is that you don&amp;rsquo;t need to upload a file directly to Galaxy in order for it to be used as input.&lt;/p>
&lt;p>&lt;strong>How to process file or files&lt;/strong>: This field specifies how you want the input files to be processed. Here you can choose to process files either individually or as a group. If you choose to process files individually, then the given command will run on each input file separately. Below you can see an example of how this would look using the fastqc tool via the command line.&lt;/p>
&lt;pre>&lt;code>fastqc -i file1;
fastqc -i file2;
fastqc -i file3;
...
&lt;/code>&lt;/pre>
&lt;p>If you choose to process files as a group, then all of the files will be passed as a single input to the command. Below you can see an example of how this would look using the fastqc tool via the command line.&lt;/p>
&lt;pre>&lt;code>fastqc -i file1 file2 file3 ...
&lt;/code>&lt;/pre>
&lt;p>&lt;strong>Output directory&lt;/strong>: This field specifies the path on the cluster to the directory where output files will be written. Each tool also has logic to parse the given output path and automatically upload each output file to Galaxy as a Dataset file.&lt;/p>
&lt;p>&lt;strong>Command&lt;/strong>: This field specifies the command the tool will use when it is executing. When entering the command, any inputs or outputs need to be specified as &amp;lsquo;$input&amp;rsquo; or &amp;lsquo;$output&amp;rsquo;. This is because HPCC admins have configured the tool to replace these variables with the given input files and the path to the desired output directory. Below you can see an example of how this would look like, again using the fastqc tool via the command line.&lt;/p>
&lt;pre>&lt;code>fastqc -i $input -o $output
&lt;/code>&lt;/pre>
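&lt;p>Conceptually, the tool replaces these placeholders with the resolved paths before the command runs, much like ordinary shell variable expansion. The snippet below is only a hypothetical illustration of that substitution, not the tool&amp;rsquo;s actual implementation:&lt;/p>
&lt;pre>&lt;code class="language-bash"># Hypothetical illustration of the $input/$output substitution
input="file1 file2 file3"     # resolved input files
output="my_output_folder/"    # resolved output directory
cmd="fastqc -i $input -o $output"
echo "$cmd"                   # fastqc -i file1 file2 file3 -o my_output_folder/
&lt;/code>&lt;/pre>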
&lt;p>If a tool installed from Galaxy’s tool shed repository corresponds to the same tool as an environment module tool, then the tool installed via the tool shed will take precedence over the environment module and act as the default. Most tools from Galaxy’s tool shed repository are unique to the tool they were made for, so they will have custom parameters that reflect this. This guide will not go over how to modify a tool configuration file; instead, please refer to Galaxy&amp;rsquo;s &lt;a href="https://docs.galaxyproject.org/en/latest/dev/schema.html">documentation&lt;/a> on the topic.&lt;/p>
&lt;h2 id="common-issues">Common Issues&lt;/h2>
&lt;p>While the HPCC admins have configured Galaxy on the cluster to be in a usable state, the Galaxy project is large and complex, and there are still some issues the HPCC admins are actively working to resolve. The following section details these issues. It is important to note that these issues do not hinder the usability of Galaxy on the cluster.&lt;/p>
&lt;p>&lt;strong>LocalProtocolError&lt;/strong>
&lt;br>
When a user uploads a file from their local computer to their running Galaxy session, the following debug message can be seen in the &amp;lsquo;output.log&amp;rsquo; file created for every OnDemand session.&lt;/p>
&lt;pre>&lt;code>h11._util.LocalProtocolError: Too much data for declared Content-Length
&lt;/code>&lt;/pre>
&lt;p>The local file will still be uploaded successfully to a user&amp;rsquo;s Galaxy session and can still be used normally within Galaxy.&lt;/p>
&lt;p>&lt;strong>Proxy Error 502&lt;/strong>
&lt;br>
When a user attempts to install tools via Galaxy&amp;rsquo;s tool shed repository or tries to resolve dependencies for Galaxy-specific tools, the following message will be displayed after 3 to 5 minutes.&lt;/p>
&lt;p>&lt;img src="../../../img/galaxy_proxy_error.png" alt="proxyerror">&lt;/p>
&lt;p>The &amp;lsquo;output.log&amp;rsquo; file shows no indication of this error, and while the message is displayed, the task will still run in the background until it is fully complete. This means that tools installed via Galaxy&amp;rsquo;s tool shed repository will install successfully and dependencies will still be resolved after a while. The progress of such tasks can be viewed in the &amp;lsquo;output.log&amp;rsquo; file for a user&amp;rsquo;s OnDemand session.&lt;/p>
&lt;p>This section will be updated continuously as issues are resolved or encountered. If you encounter any issues while using Galaxy on the cluster, please report them to our support email &lt;a href="mailto:support@hpcc.ucr.edu">support@hpcc.ucr.edu&lt;/a>.&lt;/p></description></item><item><title>Manuals: Open OnDemand Usage</title><link>https://hpcc.ucr.edu/manuals/hpc_cluster/selected_software/ondemand/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://hpcc.ucr.edu/manuals/hpc_cluster/selected_software/ondemand/</guid><description>
&lt;h2 id="what-is-ondemand">What is OnDemand?&lt;/h2>
&lt;p>&lt;a href="https://openondemand.org/">Open OnDemand&lt;/a> allows users to access our cluster resources purely through a web browser. No additional client software is required. OnDemand gives users the ability to launch &amp;ldquo;Interactive Apps&amp;rdquo; such as Jupyter, RStudio, Matlab, Mathematica, and VSCode and connect to them through your browser.&lt;/p>
&lt;p>Users also have the ability to upload/download files to/from the cluster, connect to the cluster via SSH, and create batch job templates.&lt;/p>
&lt;p>The sections below cover using OnDemand, as well as a few popular pieces of software.&lt;/p>
&lt;h2 id="accessing-ondemand">Accessing OnDemand&lt;/h2>
&lt;p>Our OnDemand instance is located here: &lt;a href="https://ondemand.hpcc.ucr.edu/">https://ondemand.hpcc.ucr.edu/&lt;/a>. Log in with your &lt;strong>cluster&lt;/strong> login details (UCR NetID and Password) and verify your login with Duo&amp;rsquo;s two-factor authentication.&lt;/p>
&lt;h2 id="jupyter-on-ondemand">Jupyter on OnDemand&lt;/h2>
&lt;p>After logging in, select &amp;ldquo;Jupyter Notebook&amp;rdquo; from the &amp;ldquo;Interactive Apps&amp;rdquo; tab from the menu bar.&lt;/p>
&lt;p>&lt;img src="../../../img/ondemand_jupyter1.png" alt="jupytermenu">&lt;/p>
&lt;p>From there, select the resources you need, time you want, partition to run the job on, and click &amp;ldquo;Launch&amp;rdquo;.&lt;/p>
&lt;p>&lt;img src="../../../img/ondemand_jupyter2.png" alt="jupyterparams">&lt;/p>
&lt;p>Your job will then be queued and eventually start running.&lt;/p>
&lt;p>&lt;img src="../../../img/ondemand_jupyter3.png" alt="jupyterqueue1">&lt;/p>
&lt;p>&lt;img src="../../../img/ondemand_jupyter4.png" alt="jupyterqueue2">&lt;/p>
&lt;p>Click &amp;ldquo;Connect to Jupyter&amp;rdquo; to open a new window containing Jupyter and start working!&lt;/p>
&lt;h3 id="using-remote-kernels-in-vscode">Using Remote Kernels in VSCode&lt;/h3>
&lt;p>VSCode allows you to run your code using a remote kernel. They provide some instructions &lt;a href="https://code.visualstudio.com/docs/datascience/jupyter-notebooks#_connect-to-a-remote-jupyter-server">here&lt;/a>. Using the OnDemand Jupyter requires a couple of additional steps.&lt;/p>
&lt;p>When you start a new Jupyter session on OnDemand, it should provide you with a command to set up an &lt;a href="https://hpcc.ucr.edu/manuals/hpc_cluster/jobs/#tunneling">SSH Tunnel&lt;/a>. This command should be run &lt;strong>on your local machine&lt;/strong> and &lt;em>not&lt;/em> on the cluster. Note that numbers and node name will likely be different!&lt;/p>
&lt;p>&lt;img src="../../../img/ondemand_jupyter5.png" alt="jupyterqueue2">&lt;/p>
&lt;p>At this point, you should be able to navigate to the provided URL along with the provided password to access your Jupyter session.&lt;/p>
&lt;p>To connect within VSCode you&amp;rsquo;ll need the &lt;a href="https://marketplace.visualstudio.com/items?itemName=ms-toolsai.jupyter">Jupyter&lt;/a> extension installed. Within a &lt;code>.ipynb&lt;/code> file, find the &amp;ldquo;Select Kernel&amp;rdquo; option in the top right of your screen, select &amp;ldquo;Existing Jupyter Server&amp;rdquo;, and paste the URL provided by OnDemand. When asked for a password, use the one provided by OnDemand.&lt;/p>
&lt;p>From there you should be able to select the kernel that you would like to run.&lt;/p>
&lt;h2 id="rstudio-on-ondemand">RStudio on OnDemand&lt;/h2>
&lt;p>The process of launching RStudio is almost identical to that of starting Jupyter, but selecting &amp;ldquo;RStudio Server&amp;rdquo; instead of &amp;ldquo;Jupyter Notebook&amp;rdquo; from the menu.&lt;/p>
&lt;p>&lt;img src="../../../img/ondemand_rstudio1.png" alt="rstudiomenu">&lt;/p>
&lt;p>Please see the Jupyter section for selecting resources and opening the RStudio window.&lt;/p>
&lt;h2 id="desktop-session-on-ondemand">Desktop Session on OnDemand&lt;/h2>
&lt;p>A Desktop session is a Virtual Desktop that is running on the cluster. It will allow you to run programs that require GUIs without going through the steps of forwarding X11 sessions.&lt;/p>
&lt;p>&lt;img src="../../../img/ondemand_desktop2.png" alt="rstudiomenu">&lt;/p>
&lt;p>Similar to Jupyter and RStudio, a Desktop Session can be started by selecting &amp;ldquo;HPCC Desktop&amp;rdquo; from the menu dropdown.&lt;/p>
&lt;p>&lt;img src="../../../img/ondemand_desktop1.png" alt="rstudiomenu">&lt;/p>
&lt;p>Please see the Jupyter section for selecting resources and opening the Desktop Window.&lt;/p>
&lt;h2 id="using-gpus-on-ondemand">Using GPUs on OnDemand&lt;/h2>
&lt;p>In many of the interactive session launch pages, the &amp;ldquo;Additional Slurm Arguments&amp;rdquo; option is available.&lt;/p>
&lt;p>&lt;img src="../../../img/ondemand_use_gpu.png" alt="GPU Menu1">&lt;/p>
&lt;p>To select a GPU, you can use the same &lt;code>--gres&lt;/code> argument as you would with the &lt;code>srun&lt;/code> command or in &lt;code>sbatch&lt;/code> scripts.&lt;/p>
&lt;p>For example, to get 1x A100 GPU for a job, be sure to select the &lt;code>gpu&lt;/code> partition and enter &lt;code>--gres=gpu:a100:1&lt;/code> in the Additional Slurm Arguments box. You might also want to load one of the &amp;ldquo;cuda&amp;rdquo; modules as well in the &amp;ldquo;Additional Modules&amp;rdquo; box.&lt;/p>
&lt;h2 id="troubleshooting-jobs">Troubleshooting Jobs&lt;/h2>
&lt;h3 id="rstudio-crashes">RStudio Crashes&lt;/h3>
&lt;p>&lt;img src="../../../img/ondemand_r_crash.png" alt="jupyterqueue2">&lt;/p>
&lt;p>If your RStudio session crashes with an error similar to the one pictured above, first try increasing the memory allocated to your job. If your R program attempts to allocate too much memory, it will be killed by Slurm, causing such an error.&lt;/p>
&lt;p>To confirm whether or not this is the problem you are encountering:&lt;/p>
&lt;ol>
&lt;li>Copy the Job ID from OnDemand&lt;/li>
&lt;li>Delete the job (This will remove the dialog, so make sure you copy the JobID first)&lt;/li>
&lt;li>Using a terminal, run &lt;code>sacct -j ####&lt;/code>&lt;/li>
&lt;/ol>
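&lt;p>A minimal sketch of step 3, using &lt;code>sacct&lt;/code>&amp;rsquo;s &lt;code>--format&lt;/code> option to show the fields most relevant to memory problems (the Job ID here is a placeholder):&lt;/p>
&lt;pre>&lt;code class="language-bash"># Show the state, exit code, and peak memory usage of each step of job 1234567
sacct -j 1234567 --format=JobID,JobName,State,ExitCode,MaxRSS
&lt;/code>&lt;/pre>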
&lt;p>If one of the job steps exited with the reason &amp;ldquo;OUT_OF_MEM&amp;rdquo;, then you need to allocate more memory to RStudio.&lt;/p>
&lt;p>&lt;img src="../../../img/ondemand_r_crash2.png" alt="jupyterqueue2">&lt;/p></description></item><item><title>Manuals: PyTorch In Jupyter</title><link>https://hpcc.ucr.edu/manuals/hpc_cluster/selected_software/pytorch_in_jupyter/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://hpcc.ucr.edu/manuals/hpc_cluster/selected_software/pytorch_in_jupyter/</guid><description>
&lt;h2 id="pytorch-in-a-jupyter-notebook">PyTorch in a Jupyter Notebook&lt;/h2>
&lt;p>There are many ways to run PyTorch within Jupyter, though some methods are needlessly complicated or are more prone to errors. If you intend to use PyTorch within Jupyter, the following steps should get you up and running.&lt;/p>
&lt;h2 id="setting-up-the-environment">Setting Up The Environment&lt;/h2>
&lt;p>Creating a new Conda environment is necessary as we do not provide PyTorch through our global Python installation.&lt;/p>
&lt;pre>&lt;code class="language-bash">conda create -n pytorch_env
conda activate pytorch_env
&lt;/code>&lt;/pre>
&lt;p>After activating the conda environment, install python and ipykernel.&lt;/p>
&lt;blockquote>
&lt;p>Warning: The &lt;a href="https://pytorch.org/get-started/locally/">PyTorch instructions&lt;/a> provide a method of installing through Conda. &lt;strong>Do not use this method&lt;/strong> as the CUDA packages installed through Conda can conflict with the system installation.&lt;/p>
&lt;/blockquote>
&lt;pre>&lt;code class="language-bash">conda install python=3 ipykernel
&lt;/code>&lt;/pre>
&lt;blockquote>
&lt;p>At this point run &lt;code>which pip&lt;/code>, it should return a path ending in something similar to &lt;code>...../.conda/envs/pytorch_env/bin/pip&lt;/code>. If the output path begins with &lt;code>/opt/linux/...&lt;/code> then the environment has not been set up correctly.&lt;/p>
&lt;/blockquote>
&lt;p>With &lt;code>ipykernel&lt;/code> installed, add the environment as a Jupyter Kernel.&lt;/p>
&lt;pre>&lt;code class="language-bash">python -m ipykernel install --user --name pytorch_env --display-name &amp;quot;PyTorch Env&amp;quot;
&lt;/code>&lt;/pre>
&lt;blockquote>
&lt;p>For more info on Jupyter Kernels, see the &lt;a href="https://hpcc.ucr.edu/manuals/hpc_cluster/package_manage/#virtual-environment">Package Management&lt;/a> page.&lt;/p>
&lt;/blockquote>
&lt;p>PyTorch can now be installed in the Conda environment using pip:&lt;/p>
&lt;pre>&lt;code class="language-bash">pip3 install torch torchvision torchaudio
&lt;/code>&lt;/pre>
&lt;p>From this point, the PyTorch kernel should be ready to use within Jupyter. Because of its requirement for a GPU, it will need to be run using OnDemand or through an interactive session.&lt;/p>
&lt;h2 id="running-on-jupyter">Running on Jupyter&lt;/h2>
&lt;p>The below steps will focus on running it through OnDemand.&lt;/p>
&lt;p>From within OnDemand, start a new Jupyter Notebook using the highlighted options below.&lt;/p>
&lt;p>&lt;img src="../../../img/pytorch_jupyter.png" alt="pytorch jupyter">&lt;/p>
&lt;p>Once the session has started, be sure to use the &amp;ldquo;PyTorch Env&amp;rdquo; kernel that we created earlier when creating a new notebook or within existing notebooks.&lt;/p>
&lt;p>The following code can be used to verify that the GPU is properly working.&lt;/p>
&lt;pre>&lt;code class="language-Python">import torch
dev = torch.device(&amp;quot;cuda&amp;quot;) if torch.cuda.is_available() else torch.device(&amp;quot;cpu&amp;quot;)
print(dev)
&lt;/code>&lt;/pre>
&lt;p>If the output prints &amp;ldquo;cuda&amp;rdquo;, then you&amp;rsquo;re good to go. If it prints &amp;ldquo;cpu&amp;rdquo;, then please double check that all of the above steps are correct.&lt;/p>
&lt;p>&lt;img src="../../../img/pytorch_jupyter2.png" alt="pytorch jupyter">&lt;/p></description></item><item><title>Manuals: VSCode Usage on HPCC</title><link>https://hpcc.ucr.edu/manuals/hpc_cluster/selected_software/vscode/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://hpcc.ucr.edu/manuals/hpc_cluster/selected_software/vscode/</guid><description>
&lt;h2 id="using-vscode-on-the-cluster">Using VSCode on the Cluster&lt;/h2>
&lt;p>VSCode is a code editor that can run locally on your computer, or while connected to the cluster.&lt;/p>
&lt;p>When using VSCode on the cluster, please do not use Remote SSH, as it will launch the code server on a head node, causing unneeded load.&lt;/p>
&lt;p>Instead, we can use a feature of VSCode: &lt;a href="https://code.visualstudio.com/docs/remote/tunnels">Remote Tunnels&lt;/a>.&lt;/p>
&lt;h2 id="ondemand">OnDemand&lt;/h2>
&lt;p>A browser-based version of VSCode can be used through OnDemand for basic programming and cluster access. Not all extensions are available through the browser-based version of VSCode, so setting up a remote tunnel might be preferred in instances where those extensions are needed. More information on using OnDemand can be found on our &lt;a href="https://hpcc.ucr.edu/manuals/hpc_cluster/selected_software/ondemand/">Open OnDemand Usage&lt;/a> page.&lt;/p>
&lt;h2 id="setting-up-vscode-tunnels">Setting up VSCode Tunnels&lt;/h2>
&lt;p>Using a tunnel allows us to work on a compute node, rather than on a head node. This allows us to use more resources than we would normally be allowed to on a head node.&lt;/p>
&lt;h3 id="installing-the-remote-tunnels-extension">Installing the Remote Tunnels extension&lt;/h3>
&lt;p>On your local machine, install the &amp;ldquo;Remote - Tunnels&amp;rdquo; extension.&lt;/p>
&lt;p>&lt;img src="../../../img/vscode-ext-install.png" alt="vscodeinstall">&lt;/p>
&lt;h3 id="starting-vscode-tunnel-on-the-cluster">Starting VSCode Tunnel on the Cluster&lt;/h3>
&lt;p>Create an interactive session using srun&lt;/p>
&lt;pre>&lt;code class="language-sh">srun -p epyc -t 5:00:00 --pty -c 4 --mem=4g bash -l # Customize as needed
&lt;/code>&lt;/pre>
&lt;p>Load the VSCode module and start the tunnel&lt;/p>
&lt;pre>&lt;code class="language-sh">module load vscode
code tunnel
&lt;/code>&lt;/pre>
&lt;p>The program will provide you with a code and ask you to verify on GitHub.com. Follow the steps for authorization.
Once you get to the &amp;ldquo;Congratulations, you&amp;rsquo;re all set!&amp;rdquo; page, the terminal will update with a new line asking you to open another link.
At this point you have two ways to access your session: via a web browser, or using the extension that we previously installed. Make sure that you keep
the server running in the background, as it is what allows the connection to occur.&lt;/p>
&lt;h3 id="using-a-web-browser">Using A Web Browser&lt;/h3>
&lt;p>After authorizing VSCode, you can use the link given to access your session. The URL should be similar to &lt;code>https://vscode.dev/tunnel/...&lt;/code>.
The environment is very similar to the desktop program, though some features might be missing.&lt;/p>
&lt;h3 id="using-the-vscode-extension">Using the VSCode Extension&lt;/h3>
&lt;p>After installing the &amp;ldquo;Remote - Tunnels&amp;rdquo; extension on your local machine, connect to the tunnel session that was previously created using the green &amp;ldquo;&amp;gt;&amp;lt;&amp;rdquo;
icon in the bottom left of VSCode. Select the &amp;ldquo;Connect to Tunnel&amp;hellip;&amp;rdquo; option, then select the tunnel we created earlier.&lt;/p>
&lt;p>&lt;img src="../../../img/vscode-tunnel1.png" alt="vscodeinstall">&lt;/p>
&lt;p>&lt;img src="../../../img/vscode-tunnel2.png" alt="vscodeinstall">&lt;/p>
&lt;p>After VSCode connects, you should be able to open Files and Folders on the cluster as if it were your local machine.&lt;/p>
&lt;h3 id="using-the-built-in-terminal">Using the Built-In Terminal&lt;/h3>
&lt;p>One feature that VSCode integrates is an in-editor terminal. To activate it, you can use the keyboard shortcut &lt;code>Ctrl+`&lt;/code>, or go to &lt;code>View &amp;gt; Terminal&lt;/code> from the menu bar.&lt;/p>
&lt;p>By default, you might be dropped into a basic shell without some of the features that you are used to (e.g. with the prompt &lt;code>bash-4.4$&lt;/code> instead of &lt;code>username@node&lt;/code>). To fix this, you can type &lt;code>bash -l&lt;/code>, which should bring you to the terminal environment that you are used to. From here you can navigate and use the cluster as if it were any other terminal program.&lt;/p>
&lt;h3 id="cleaning-up">Cleaning Up&lt;/h3>
&lt;p>Once you have finished, make sure to close VSCode (locally or in your web browser). Then stop the tunnel from running on the cluster using &lt;code>Ctrl+C&lt;/code>.
Once the program has been stopped, you can exit the interactive srun session and close your terminal.&lt;/p></description></item></channel></rss>