Conda Virtual Environments: How to Build Using Miniforge

ARC recommends Miniforge as the preferred way to construct conda virtual environments (CVEs). (As noted elsewhere, Anaconda can no longer be used because of changes in Anaconda's terms of use that affect universities nationwide and beyond.)

Steps for Building a Conda Virtual Environment (CVE)

The steps are given concisely here. The sections under Details below, named after the steps here, provide additional detail where needed.

Build Steps

  1. Log onto the machine on which you wish to run code.

  2. Identify the partition (i.e., queue) to which you will submit your job.

  3. Request resources on a compute node of that partition. These resources will be used to build the CVE. Sample form of request, for TC or Owl:

salloc --account=<account> --partition=<partition_name> --nodes=<number of nodes> --ntasks-per-node=<number of tasks per node> --cpus-per-task=<number of cores per task> --time=<duration of resources>
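
For example, a minimal one-core request (the account and partition names here are placeholders; substitute your own) might look like:

salloc --account=myaccount --partition=normal_q --nodes=1 --ntasks-per-node=1 --cpus-per-task=1 --time=2:00:00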
  4. ssh to the compute node returned from the salloc resource request. Enter:

ssh XXX

where XXX is the compute node name that is returned from the salloc command.
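
For example, if salloc reports that you were granted node tc307 (a hypothetical node name), you would enter:

ssh tc307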

  5. Reset modules and load an appropriate Miniforge module. These modules are provided by ARC. Usually, it is best to load the latest Miniforge module which, at the time of this writing, is Miniforge3/24.1.2-0 on, e.g., Tinkercliffs (TC) and Owl. Enter:

module reset

Then enter:

module load Miniforge3/24.1.2-0
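
You can verify that the module loaded by entering:

module list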
  6. Create a CVE. A typical command is:

conda create -p ~/path/to/env/<name of CVE>

This creates an empty CVE. There are many ways to create a virtual environment (VE); if you use multiple methods to construct VEs, consider putting -mf- (or similar) in the names of VEs created with this procedure, to denote that they were built using Miniforge.
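
For example, following the -mf- naming suggestion (the path and environment name here are illustrative):

conda create -p ~/env/project1-mf-env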

  7. Activate the newly created CVE. Enter:

source activate ~/path/to/env/<name of CVE>
  8. Install the desired version of Python. Enter:

conda install python=XXX

where XXX is the Python version, e.g., 3.9 or 3.12.
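
For example:

conda install python=3.12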

  9. Add packages to the CVE. Enter:

conda install <package_name>

This command can be repeated any number of times to install different packages. If you cannot install one or more packages with conda install, see the Details section below on using pip install.
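
For example, to install a couple of common packages (the package names here are just illustrations):

conda install numpy
conda install scipy

You can also list several packages in a single command, e.g., conda install numpy scipy.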

  10. Deactivate the CVE. When you are done adding to the CVE, enter:

conda deactivate
  11. Leave the compute node. After you are done building the CVE, exit the compute node by typing:

exit
  12. Relinquish resources. From a head node, enter:

scancel XXX

where XXX is the Slurm job ID corresponding to the resource request.
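
For example, if salloc reported job 123456 (a hypothetical job ID), you would enter:

scancel 123456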

Note that if you later find you want additional packages in your VE, you merely repeat the steps starting with step 7 (activating the CVE).

Details

Precedence of conda versus pip for package installs

Sometimes when creating a CVE, you cannot do conda install <package_name> because the package is not available from conda.

In this case, you might have to pip install <package_name>.

For these cases, first install all the packages you can with conda install and then install all of the remaining packages with pip install. This increases the odds that the installs will be compatible.
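
For example, a sketch of this ordering, assuming hypothetical packages pkgA and pkgB are available from conda while pkgC is available only from PyPI:

conda install pkgA
conda install pkgB
pip install pkgC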

Log onto the machine on which you wish to run code

From a terminal, type ssh <username>@<clustername>.arc.vt.edu where <username> is your user name and <clustername> is the name of the cluster you are trying to log into. Examples of the latter are tinkercliffs2 and owl1.
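
For example, a user with username jdoe (hypothetical) logging onto Tinkercliffs would type:

ssh jdoe@tinkercliffs2.arc.vt.edu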

Identify the partition (i.e., queue) to which you will submit your job

To list the partitions on a cluster, type:

sinfo

or

sinfo --long

or

sinfo | awk 'NR > 1 { a[$1]++ } END { for (b in a) print b }'

The last command prints only the unique partition names, one per line.

Request resources on a compute node of that partition

To build a CVE, you will most likely need only one core of one compute node. For the sample form of resource request, for TC or Owl,

salloc --account=<account>  --partition=<partition_name>    --nodes=<number of nodes>  --ntasks-per-node=<number of tasks per node> --cpus-per-task=<number of cores per task> --time=<duration of resources>

one may take <number of nodes> as 1, <number of tasks per node> as 1, and <number of cores per task> as 1. A duration <duration of resources> of two hours, i.e., 2:00:00, will usually suffice.

When Slurm returns with your resources, note the names of the compute node(s) assigned to you and the Slurm job ID. The compute node names determine which node(s) to ssh into. The Slurm job ID is used to relinquish the resources when you are done with them, as the last step in this process.

Reset modules and load an appropriate Miniforge module

To find all occurrences of Miniforge, first write the list of all modules available on the cluster to a file by typing:

module avail >& list.of.all.modules

Then open the file list.of.all.modules and search for Miniforge to find the latest version. Use this full module name in the module load command.
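
Alternatively, if the module system accepts a search string (Lmod, for example, does), you can filter the listing directly without writing a file:

module avail Miniforge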

Create a CVE

There are many ways to create a virtual environment (VE), and if you use multiple methods, consider putting -mf- (or similar) in the VE's name to denote that it was built using Miniforge. Different methods of creating VEs result in different ways to activate them.

Add packages to the CVE

At any point, you can type conda list to list all of the packages in the VE.
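
You can also pass a package name to conda list to check whether a particular package is installed, e.g.:

conda list numpy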

Use of CVEs

You can only use a CVE on the cluster, and with the type of compute node, that was used to build it.
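
For example, a minimal sketch of a Slurm batch script that uses a CVE (the account, partition, environment path, and script name here are all placeholders) might look like:

#!/bin/bash
#SBATCH --account=myaccount
#SBATCH --partition=normal_q
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH --time=1:00:00

# Load the same Miniforge module that was used to build the CVE.
module reset
module load Miniforge3/24.1.2-0

# Activate the CVE, run the code, then deactivate.
source activate ~/path/to/env/project1-mf-env
python my_script.py
conda deactivate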