
First Steps to using Alma

This section briefly presents the software tools available on the Alma HPC cluster, including code-versioning tools, the Alma GUI (AlmaOnDemand), and package managers, among others.

It is recommended to set these tools up carefully before running scripts on Alma. This will ensure that most of the tutorials that follow are accurate for your environment.

If you don't think any of this is going to be necessary, feel free to skip it.

1. Setting up an SSH key for Alma (credits to the VS Code docs and blissweb on StackOverflow)

To log into Alma seamlessly, without having to type in your password, you need to set up an SSH key. Here are the steps to do so:

  1. Generate a public/private key pair

Mac/Linux/WSL:

# command line
ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519-remote-ssh

Windows:

# command line
ssh-keygen -t ed25519 -f "$HOME\.ssh\id_ed25519-remote-ssh"

IMPORTANT! Two keys will be generated:

id_ed25519-remote-ssh (private key)

id_ed25519-remote-ssh.pub (public key)
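As an optional sanity check, you can print the fingerprint of the newly generated key (works on either the public or the private key file):

```shell
# Displays the key's bit length, fingerprint, comment and type
ssh-keygen -lf ~/.ssh/id_ed25519-remote-ssh.pub
```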

  2. Follow the 'Quick start: Using SSH keys' section of the following tutorial: https://code.visualstudio.com/docs/remote/troubleshooting#_quick-start-using-ssh-keys BUT set PUBKEYPATH to id_ed25519-remote-ssh.pub

  3. Modify your ~/.ssh/config file locally

Add the following information to your config file:

Host alma
HostName alma.icr.ac.uk
User USERNAME
IdentityFile ~/.ssh/id_ed25519-remote-ssh
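Once the key is installed on the server (steps below), this Host alias lets you connect with just:

```shell
# Shorthand enabled by the Host entry in ~/.ssh/config
ssh alma
```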

  4. Copy the public key

Open ~/.ssh/id_ed25519-remote-ssh.pub and copy the key, or print it on the command line:

# command line
cat ~/.ssh/id_ed25519-remote-ssh.pub

Copy the displayed key

  5. Log into Alma normally

  6. Modify your ~/.ssh/authorized_keys on the server. Use your preferred text editor; here is an example using vi:

    # command line
    vi ~/.ssh/authorized_keys
    
    Paste the key into the file and save
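As an alternative to steps 4-6, on Mac/Linux/WSL the ssh-copy-id utility appends your public key to the server's authorized_keys in one step:

```shell
# Replace USERNAME with your Alma username; you will be asked
# for your password one last time.
ssh-copy-id -i ~/.ssh/id_ed25519-remote-ssh.pub USERNAME@alma.icr.ac.uk
```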

2. GitHub account

If you don't have a GitHub account, you can create one here. Use your ICR email address for the GitHub account and, once you have one, raise a ticket with the Scientific Computing helpdesk to ask to be added to the ICR GitHub organisation.

3. Git with SSH keys

Many applications are integrated with git, and having it set up correctly from the outset will save problems down the line. You want it set up with SSH access from Alma to GitHub:

- First, check whether you already have SSH keys set up.
- Otherwise, generate a new one.
- Then add the SSH key to GitHub.
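A quick way to check, from a shell on Alma:

```shell
# List any existing public keys
ls ~/.ssh/*.pub
# Test the connection; GitHub replies with a greeting containing
# your username if the key is set up correctly.
ssh -T git@github.com
```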

For GitLab, the ~/.ssh/config file needs the GitLab credentials added as:

# Private GitLab instance
Host ralcraft.git.icr.ac.uk
  PreferredAuthentications publickey
  IdentityFile ~/.ssh/id_ecdsa

Note that it seems to prefer the ecdsa key type: ssh-keygen -t ecdsa -C comment. Add the contents of the .pub file to the SSH keys on GitLab (search 'ssh-key').

4. The Alma fileshare

Samba servers exist for easily mounting the remote filesystem on your machine, for both SCRATCH and RDS. This allows you to move files between your local machine and Alma and to edit files on Alma directly. If you prefer (or need to access your home directory), there are various file-transfer applications such as WinSCP.

Note for Mac users: press Command+K in Finder and enter the Samba server address to mount the remote system.

  • macOS
    For quick access to scratch and RDS, save the server links in your favourites. You can do so by going to 'Finder' > 'Go' > 'Connect to Server...' and typing in one of the following server addresses:
  • For scratch: smb://alma-fs
  • For RDS: smb://rds.icr.ac.uk/DATA

Save the server by pressing the '+' icon. Whenever you want to connect, repeat 'Finder'>'Go'>'Connect to server...' and choose the server you wish to connect to.

  • Windows
    Using File Explorer, right-click on 'This PC' and choose 'Map network drive...'. Enter one of these paths:
  • For scratch: \\alma-fs\SCRATCH
  • For RDS: \\rds.icr.ac.uk\DATA

5. Conda and Mamba

You should initialise mamba and conda in your shell profile. This will make sure that the conda and mamba commands are available in your shell.
Instructions are here: Conda and Mamba Initialise
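Once initialised, a typical workflow is to create a named environment per project. A minimal sketch (the environment name and packages are illustrative):

```shell
# Create and activate a project environment
mamba create -n myproject -y python=3.11 numpy
mamba activate myproject
# Confirm the environment's Python is in use
python -c "import numpy; print(numpy.__version__)"
```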

6. Python and R OnDemand

Make sure you can use the ondemand applications you will require here: Alma Open OnDemand.
RStudio uses the Alma installation of R and does not support per-project environments (unlike R run in scripts, which can use a mamba environment), but Jupyter notebooks can be used with a conda environment.
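To use a conda environment from a Jupyter notebook, one common approach is to register the environment as a kernel. A sketch assuming a hypothetical environment called myproject with ipykernel installed in it:

```shell
mamba activate myproject
# Register this environment as a selectable Jupyter kernel
python -m ipykernel install --user --name myproject --display-name "Python (myproject)"
```

The new kernel then appears in the notebook's kernel picker.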

7. Nextflow

Alma has a shared installation of Nextflow, but you can also access Nextflow through Python virtual environments or conda environments. Instructions are here: Nextflow
Nextflow is used, among other things, to build analysis pipelines. A community effort, nf-core, collects existing analysis pipelines into a curated set. nf-core pipelines are also available on Alma; the complexity lies in how the pipelines are run on the Slurm executor. Instructions are here for getting started and running an nf-core pipeline.
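As an illustration, an nf-core pipeline invocation typically looks like the following (the pipeline, profile and output directory are examples; check the Alma instructions for the recommended Slurm setup):

```shell
# Run the nf-core/rnaseq pipeline with its built-in test data,
# using Singularity containers for the tools
nextflow run nf-core/rnaseq -profile test,singularity --outdir results
```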

8. Docker and Singularity

An alternative to conda environments is to create a Docker image and run it on Alma through Singularity. Additionally, many bioinformatics tools are available as Docker images that can be run on Alma through Singularity. Instructions are here: Docker and Singularity
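As a sketch of the workflow, Singularity can pull a Docker image from a registry and run commands inside it (the ubuntu image is just an example):

```shell
# Pull converts the Docker image to a .sif file in the current directory
singularity pull docker://ubuntu:22.04
# Run a command inside the container
singularity exec ubuntu_22.04.sif cat /etc/os-release
```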


For any help or questions email scientific computing.


9. Connection to Alma and using Slurm

The Alma cluster is hosted at alma.icr.ac.uk.

You can access your home directory via ssh, as follows:

ssh alma.icr.ac.uk

Alma uses the Slurm queueing system to run jobs through scheduling. Users submit jobs, which are scheduled and allocated resources (CPU time, memory, etc.) by the resource manager.

Multiple partitions exist on Alma: interactive, compute, GPU, data-transfer and short, among others. The compute partition is the default. Depending on the task, one partition may be better suited than the others: for instance, use the data-transfer partition to move files, the GPU partition to run GPU-enabled software, and the interactive or short partitions for ephemeral tasks.

It is important to note that computation should not be run directly on the head node, but rather on a compute node. Below is an example of how to run an interactive session on a node for 2 hours with 10 GB of memory and 2 cores.

srun --pty --mem=10GB -c2 -t 02:00:00 -p interactive bash

Good practice is to set the requested resources (CPU time, memory and number of cores) accurately and not to over-estimate them, which improves your job's chance of starting quickly. The node you are allocated depends on the resources you request (usually node01 if you require more compute power, node24 if less).
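For non-interactive work, the same resources can be requested in a batch script submitted with sbatch. A minimal sketch (the job name and script contents are illustrative):

```shell
#!/bin/bash
#SBATCH --job-name=example
#SBATCH --partition=compute
#SBATCH --mem=10GB
#SBATCH --cpus-per-task=2
#SBATCH --time=02:00:00

# Commands below run on the allocated compute node
echo "Running on $(hostname)"
```

Submit with sbatch script.sh and check progress with squeue -u USERNAME.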

For more information, extensive internal documentation is available on Nexus.