Conda Virtual Environments: How to Build Using Miniforge
ARC recommends Miniforge as the preferred way to construct conda virtual environments (CVEs). (As noted elsewhere, Anaconda can no longer be used because of changes in Anaconda’s terms of use that are affecting universities nationwide and beyond.)
Steps for Building a Conda Virtual Environment (CVE)
The steps are given concisely here. The sections under Details below, named after the steps listed here, provide additional detail if needed.
Steps for Build
Log onto the machine on which you wish to run code.
Identify the partition (i.e., queue) to which you will submit your job.
Request resources on a compute node of that partition. These resources will be used to build the CVE. Sample form of request, for TC or Owl:
salloc --account=<account> --partition=<partition_name> --nodes=<number of nodes> --ntasks-per-node=<number of tasks per node> --cpus-per-task=<number of cores per task> --time=<duration of resources>
ssh to the compute node returned from the salloc resource request. Enter:
ssh XXX
where XXX is the compute node name that is returned from the salloc command.
Reset modules and load an appropriate Miniforge module. These modules are provided by ARC. Usually it is best to load the latest Miniforge module; at the time of this writing, on Tinkercliffs (TC) and Owl, for example, that is Miniforge3/24.1.2-0. Enter
module reset
Then enter
module load Miniforge3/24.1.2-0
Create a CVE. A typical command is
conda create -p ~/path/to/env/<name of CVE>
This creates an empty CVE. There are many ways to create a virtual environment (VE); if you use more than one, consider putting -mf- (or similar) in the names of VEs created with this procedure to denote that they were built using Miniforge.
Activate the newly-created CVE. Enter:
source activate ~/path/to/env/<name of CVE>
Install the desired version of Python. Enter:
conda install python=XXX
where XXX is the Python version, e.g., 3.9 or 3.12.
Add packages to the CVE. Enter:
conda install <package_name>
This command can be repeated any number of times to install different packages. If you cannot install one or more packages with conda install, then see the Details section below for using pip install.
Deactivate the CVE. When you are done adding to the CVE, enter:
conda deactivate
Leave the compute node. After you are done building the CVE, leave the compute node by typing
exit
Relinquish resources. From the (or a) head node, enter:
scancel XXX
where XXX is the Slurm job ID corresponding to the resource request.
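Note that if you later want additional packages in your CVE, you need only repeat the steps above starting with activating the CVE (step 7), preceded by the resource request, ssh, and module load if you have already left the compute node.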
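The conda-related steps above, from loading the Miniforge module through deactivating the CVE, can be collected into a short checklist. The following is a minimal sketch, not an ARC-provided tool: the function name print_cve_build_steps, the environment path, and the package names are hypothetical. The function only prints the commands, so they can be reviewed before being typed on the compute node.

```shell
# Print, in order, the build commands from "module reset" through
# "conda deactivate". Nothing is executed; the output is a checklist.
print_cve_build_steps() {
    env_path=$1     # where the CVE will live (hypothetical path below)
    py_version=$2   # Python version, e.g. 3.12
    shift 2         # remaining arguments are package names
    echo "module reset"
    echo "module load Miniforge3/24.1.2-0"
    echo "conda create -p $env_path"
    echo "source activate $env_path"
    echo "conda install python=$py_version"
    for pkg in "$@"; do
        echo "conda install $pkg"
    done
    echo "conda deactivate"
}

# Example call with made-up values
print_cve_build_steps "$HOME/env/myproject-mf" 3.12 numpy scipy
```

The Miniforge module version is the one named in this document; substitute whatever version is current on your cluster.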
Details
Precedence of conda versus pip for package installs
Sometimes when creating a CVE, you cannot run conda install <package_name> because the package is not available from conda. In this case, you might have to use pip install <package_name> instead. For these cases, first install all the packages you can with conda install, and then install all of the remaining packages with pip install. This increases the odds that the installs will be compatible, because conda does not track packages installed by pip and a later conda install can overwrite or break them.
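The conda-first, pip-last ordering can be sketched as a small shell function. This is an illustration under stated assumptions, not ARC's procedure: the function name install_conda_then_pip is hypothetical, the CVE is assumed to be activated with Python already installed, and the choices of conda's --yes flag (to skip the confirmation prompt) and python -m pip (to use the pip belonging to the active environment) are assumptions of this sketch.

```shell
# Try every package with conda first; collect the ones conda cannot
# install and hand only those to pip at the very end.
install_conda_then_pip() {
    leftover=""
    for pkg in "$@"; do
        if ! conda install --yes "$pkg"; then
            leftover="$leftover $pkg"
        fi
    done
    if [ -n "$leftover" ]; then
        # Word splitting is intentional: one argument per package name
        python -m pip install $leftover
    fi
}

# Example (package names are placeholders):
#   install_conda_then_pip numpy scipy some-pip-only-package
```

Because pip runs only after all conda installs have finished, no later conda operation in the same pass can disturb the pip-installed packages.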
Log onto the machine on which you wish to run code
From a terminal, type ssh <username>@<clustername>.arc.vt.edu, where <username> is your user name and <clustername> is the name of the cluster you are trying to log into. Examples of the latter are tinkercliffs2 and owl1.
Identify the partition (i.e., queue) to which you will submit your job
To list the partitions on a cluster, type:
sinfo
or
sinfo --long
or
sinfo | awk -F " " 'NR > 1 { a[$1]++ } END { for (b in a) { print b } }'
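The awk one-liner above prints each partition name once, even though sinfo may list a partition on several rows (one per node state). A minimal sketch of the same idea as a reusable filter follows; the function name unique_partitions is hypothetical, and sorting the result is an addition of this sketch, since awk does not guarantee the order of array keys.

```shell
# Read sinfo-style text on stdin and print each distinct value of the
# first column. NR > 1 skips the header row; using the partition name
# as an array key removes duplicates.
unique_partitions() {
    awk 'NR > 1 { seen[$1]++ } END { for (name in seen) print name }' | sort
}

# On a cluster:
#   sinfo | unique_partitions
```

A trailing asterisk on a name in the output is how sinfo marks the default partition.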
Request resources on a compute node of that partition
To build a CVE, you will most likely need only one core of one compute node. For the sample form of resource request, for TC or Owl,
salloc --account=<account> --partition=<partition_name> --nodes=<number of nodes> --ntasks-per-node=<number of tasks per node> --cpus-per-task=<number of cores per task> --time=<duration of resources>
one may take <number of nodes> as 1, <number of tasks per node> as 1, and <number of cores per task> as 1. A duration <duration of resources> of two hours, i.e., 2:00:00, will usually suffice.
When Slurm returns with your resources, note the names of the compute node(s) given to you and the Slurm job ID. The names of the compute nodes determine which nodes to ssh into. The Slurm job ID is used to relinquish the resources when you are done with them, as the last step in this process.
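With the values suggested above filled in, the request might look like the output of the sketch below. The account and partition names are hypothetical placeholders, as is the helper name build_salloc_cmd; the helper only composes and prints the command so it can be checked before it is run.

```shell
# Hypothetical values; replace with your own allocation and partition.
ACCOUNT="myaccount"
PARTITION="normal_q"

# Compose the minimal request described above:
# 1 node, 1 task per node, 1 core per task, 2 hours.
build_salloc_cmd() {
    echo "salloc --account=$1 --partition=$2 --nodes=1 --ntasks-per-node=1 --cpus-per-task=1 --time=2:00:00"
}

build_salloc_cmd "$ACCOUNT" "$PARTITION"

# If the job ID scrolls out of view later, `squeue -u "$USER"` lists
# your jobs together with their IDs and assigned nodes.
```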
Reset modules and load an appropriate Miniforge module
To find all occurrences of Miniforge, first write the list of all modules available on the cluster to a file by typing:
module avail >& list.of.all.modules
Then open the file list.of.all.modules and search for Miniforge to find the latest version. Use its full name in the module load command.
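Instead of searching the file by eye, the lookup can be scripted. The following is a rough sketch with several assumptions: the function name latest_miniforge is hypothetical, the saved listing is assumed to contain whitespace-separated module names such as Miniforge3/24.1.2-0, and sort -V (version sort) is a GNU coreutils option that may be missing on non-Linux systems.

```shell
# Print the highest-versioned Miniforge module name found in a saved
# "module avail" listing. The search is case-insensitive.
latest_miniforge() {
    grep -o -i 'miniforge[^[:space:]]*' "$1" | sort -V | tail -n 1
}

# Typical use, after creating the listing as shown above:
#   module load "$(latest_miniforge list.of.all.modules)"
```

It is worth printing the result once before passing it to module load, in case the listing format on your cluster differs from what the sketch assumes.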
Create a CVE
There are many ways to create a virtual environment (VE); if you use more than one, consider putting -mf- (or similar) in the name of the VE to denote that it was built using Miniforge. Different methods of creating VEs result in different ways of activating them.
Add packages to the CVE
At any point, you can type conda list to list all of the packages in the VE.
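The output of conda list can also be checked from a script, for example to confirm that a package made it into the CVE. This is a minimal sketch assuming the CVE is currently activated; the function name cve_has_package is hypothetical.

```shell
# Succeed (exit status 0) if the named package appears in the first
# column of `conda list` for the active environment, fail otherwise.
cve_has_package() {
    conda list | awk -v pkg="$1" '$1 == pkg { found = 1 } END { exit !found }'
}

# Example:
#   if cve_has_package numpy; then echo "numpy is installed"; fi
```

The match is on the exact package name, so header lines (which begin with #) and packages whose names merely contain the search string are ignored.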
Use of CVEs
A CVE can be used only on the cluster, and on the type of compute node, on which it was built.