Conda
Why Miniconda and not Anaconda?
While Anaconda includes the conda
software and the conda-forge
channel, which are open-source and freely licensed, the defaults
channel is subject to paid licenses (see terms). To protect VT’s research community, ARC does not provide the Anaconda package. You may install Anaconda into your home directory, but we recommend removing the defaults
channel.
Use module spider miniconda
to search our module system for the most recent Miniconda available on the system you’re using.
Do not run conda init
Running conda init
is a convenience for managing conda virtual environments on a single computer, but it does not produce portable results. The principal action of conda init is to add lines like this to the user’s Bash startup script ~/.bashrc:
# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/apps/easybuild/software/tinkercliffs-rome/Miniconda3/23.10.0-1/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
eval "$__conda_setup"
else
if [ -f "/apps/easybuild/software/tinkercliffs-rome/Miniconda3/23.10.0-1/etc/profile.d/conda.sh" ]; then
. "/apps/easybuild/software/tinkercliffs-rome/Miniconda3/23.10.0-1/etc/profile.d/conda.sh"
else
export PATH="/apps/easybuild/software/tinkercliffs-rome/Miniconda3/23.10.0-1/bin:$PATH"
fi
fi
unset __conda_setup
# <<< conda initialize <<<
Notably, you can see that explicit references are made to paths which are specific to a particular node type on ARC systems, like /apps/easybuild/software/tinkercliffs-rome/...
. Such a path only exists on one node type on Tinkercliffs and will fail on a different type of Tinkercliffs node or any other cluster’s nodes. In short, conda init
produces non-portable results, so we recommend against using it.
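If conda init has already been run, the block shown above will be present in ~/.bashrc. The following is a minimal Python sketch, not an ARC-provided tool, that detects the block by the two marker comments conda writes and returns the file text without it. The function names are hypothetical, and the sketch assumes the marker lines have not been edited.

```python
CONDA_INIT_START = "# >>> conda initialize >>>"
CONDA_INIT_END = "# <<< conda initialize <<<"


def has_conda_init_block(text: str) -> bool:
    """Return True if both conda init marker comments are present, in order."""
    start = text.find(CONDA_INIT_START)
    return start != -1 and text.find(CONDA_INIT_END, start) != -1


def strip_conda_init_block(text: str) -> str:
    """Return text with the conda init block (markers included) removed."""
    if not has_conda_init_block(text):
        return text  # nothing to remove; leave the text untouched
    kept = []
    inside = False
    for line in text.splitlines(keepends=True):
        stripped = line.strip()
        if stripped == CONDA_INIT_START:
            inside = True
            continue
        if inside:
            if stripped == CONDA_INIT_END:
                inside = False
            continue
        kept.append(line)
    return "".join(kept)
```

To try it, read ~/.bashrc with pathlib.Path.read_text(), pass the text to strip_conda_init_block, and review the result before writing anything back.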
Use source activate
and do not use conda activate
Use of conda activate <envname>
requires the conda initialization from above, and so is not designed to work on systems where a single home directory is shared between several different nodes. Instead, use source activate <envname>
to activate conda virtual environments.
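The same rule applies inside batch jobs. As a rough sketch, the Python helper below renders the text of a Slurm batch script that loads a Miniconda module and then uses source activate. The helper name, the script layout, and the example command python my_script.py are assumptions for illustration; the partition, account, module, and environment path are the example values used elsewhere on this page.

```python
def render_batch_script(partition, account, module, env_path, command):
    """Build the text of a Slurm batch script that activates a conda env."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --partition={partition}",
        f"#SBATCH --account={account}",
        "#SBATCH --nodes=1",
        "",
        f"module load {module}",
        # Use 'source activate', as recommended above, instead of 'conda activate'
        f"source activate {env_path}",
        command,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_batch_script(
        partition="a100_normal_q",
        account="jdoeacct",
        module="Miniconda3/23.10.0-1",
        env_path="/home/jdoe2/env/a100_env",
        command="python my_script.py",  # hypothetical workload
    ))
```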
Create a virtual environment specifically for the type of node where it will be used
Each cluster has at least two different node types. Each node type has a different CPU micro-architecture and may have slightly different operating system or kernel versions, system configuration, and installed packages, each tuned for that node type's hardware. These system differences can make virtual environments non-portable between node types.
As a result, you should create and build a virtual environment on a node of the type where you will use the environment.
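One way to see these differences is to record a few system properties on the node where an environment is built and compare them with the node where it is later used. The sketch below uses only the Python standard library; the function names and the choice of fields are illustrative assumptions, and the CPU model lookup assumes a Linux /proc/cpuinfo file with a "model name" entry.

```python
import platform
import socket


def cpu_model(cpuinfo_path="/proc/cpuinfo"):
    """Return the first 'model name' entry from cpuinfo, or None if unavailable."""
    try:
        with open(cpuinfo_path) as fh:
            for line in fh:
                if line.lower().startswith("model name"):
                    return line.split(":", 1)[1].strip()
    except OSError:
        pass
    return None


def node_fingerprint() -> dict:
    """Collect a few properties that can differ between node types."""
    return {
        "hostname": socket.gethostname(),
        "machine": platform.machine(),   # hardware architecture name
        "kernel": platform.release(),    # kernel release string
        "cpu_model": cpu_model(),        # may be None on some systems
    }


if __name__ == "__main__":
    for key, value in node_fingerprint().items():
        print(f"{key}: {value}")
```

Running this on a login node and again inside an interactive job on a compute node shows whether the two differ.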
Example 1:
If you want to use conda on Tinkercliffs a100_normal_q
nodes, then you need to build the environment from a shell on those nodes.
The important commands for this are:
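command | purpose |
---|---|
interact --partition=a100_normal_q --nodes=1 --ntasks-per-node=4 --gres=gpu:1 --account=<account> | get an interactive command line shell on a compute node |
module spider miniconda | search for the latest Miniconda module |
module load Miniconda3/23.10.0-1 | load a module |
conda create -p $HOME/envname | create a new conda environment at the provided path |
source activate $HOME/envname | activate the newly created environment |
conda install <package> | install packages into the environment |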
Note
$HOME “expands” in the shell to your home directory, e.g. /home/jdoe2. The envname in the commands above should be a short but meaningful name for the environment. Since environments are particular to a node type, we recommend referencing the node type in the name, for example tca100-science or tcnq for Tinkercliffs a100_normal_q nodes or Tinkercliffs normal_q nodes, respectively.
Use conda env list
to view conda environments and their absolute paths. Use conda list
to view the packages and versions in the currently activated environment.
[jdoe2@tinkercliffs2 ~]$ interact --partition=a100_normal_q --nodes=1 --ntasks-per-node=4 --gres=gpu:1 --account=jdoeacct
srun: job 2920919 queued and waiting for resources
srun: job 2920919 has been allocated resources
[jdoe2@tc-gpu001 ~]$ module spider miniconda
--------------------------------------------------------------------------------------------------------------------
Miniconda3: Miniconda3/23.10.0-1
--------------------------------------------------------------------------------------------------------------------
Description:
Miniconda is a free minimal installer for conda. It is a small, bootstrap version of Anaconda that includes only conda, Python, the packages they depend on, and a small number of other useful packages.
You will need to load all module(s) on any one of the lines below before the "Miniconda3/23.10.0-1" module is available to load.
apps site/tinkercliffs-rome/easybuild/arc.arcadm
apps site/tinkercliffs-rome/easybuild/setup
apps site/tinkercliffs/easybuild/arc.arcadm
apps site/tinkercliffs/easybuild/setup
Help:
Description
===========
Miniconda is a free minimal installer for conda. It is a small,
bootstrap version of Anaconda that includes only conda, Python, the packages they
depend on, and a small number of other useful packages.
More information
================
- Homepage: https://docs.conda.io/en/latest/miniconda.html
[jdoe2@tc-gpu001 ~]$ module load Miniconda3/23.10.0-1
[jdoe2@tc-gpu001 ~]$ conda create -p ~/env/a100_env python=3.11
...
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
# $ conda activate /home/jdoe2/env/a100_env
#
# To deactivate an active environment, use
#
# $ conda deactivate
[jdoe2@tc-gpu001 ~]$ source activate /home/jdoe2/env/a100_env/
(/home/jdoe2/env/a100_env) [jdoe2@tc-gpu001 ~]$ conda install matplotlib
Proceed ([y]/n)? y
Downloading and Extracting Packages:
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
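To inspect environments from a script rather than by eye, the output of conda env list can be requested as JSON. This is a minimal sketch, assuming conda is on PATH (for example after loading the Miniconda module) and that the JSON output carries the environment paths under an "envs" key. The function names are hypothetical.

```python
import json
import subprocess


def parse_env_list(json_text: str) -> list:
    """Extract environment paths from `conda env list --json` output."""
    return list(json.loads(json_text).get("envs", []))


def conda_env_paths() -> list:
    """Run conda and return environment paths; requires conda on PATH."""
    result = subprocess.run(
        ["conda", "env", "list", "--json"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_env_list(result.stdout)


if __name__ == "__main__":
    for path in conda_env_paths():
        print(path)
```

Keeping the parsing separate from the subprocess call makes the parsing easy to check without conda installed.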
Using kernels with an Environment
You can use a Jupyter kernel to use a virtual environment inside a Jupyter notebook. Each kernel can be used to run different cells according to its language/package requirements. For example, if you have a notebook that uses two different sets of packages where each set is installed in a different conda environment, then you can use Jupyter kernels to switch between those two sets of packages.
To start a kernel that is associated with a specific environment, activate the environment and install ipykernel inside that environment:
[jdoe2@tinkercliffs2 ~]$ module load Miniconda3
[jdoe2@tinkercliffs2 ~]$ source activate /home/jdoe2/env/a100_env/
(/home/jdoe2/env/a100_env) [jdoe2@tinkercliffs2 ~]$ conda install ipykernel
(/home/jdoe2/env/a100_env) [jdoe2@tinkercliffs2 ~]$ python -m ipykernel install --user --name a100_env --display-name "Python (a100_env)"
Installed kernelspec a100_env in /home/jdoe2/.local/share/jupyter/kernels/a100_env
Then, when launching the Jupyter interactive app from Open OnDemand, you can start a kernel in the environment created before. From the top menu, select Kernel -> Change kernel -> Python (a100_env), then execute your cell.
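If the kernel does not appear in the menu, it can help to confirm that the kernelspec was written where the install message said. The jupyter kernelspec list command reports this. As an alternative, the sketch below reads the kernel.json files directly. It assumes the per-user kernels directory shown in the output above (~/.local/share/jupyter/kernels), and the function name is hypothetical.

```python
import json
from pathlib import Path


def list_user_kernels(kernels_dir=None) -> dict:
    """Map kernel name -> display name for kernelspecs under kernels_dir."""
    if kernels_dir is None:
        kernels_dir = Path.home() / ".local/share/jupyter/kernels"
    root = Path(kernels_dir)
    kernels = {}
    if not root.is_dir():
        return kernels  # no per-user kernels installed yet
    for spec in sorted(root.glob("*/kernel.json")):
        data = json.loads(spec.read_text())
        # Fall back to the directory name if display_name is missing
        kernels[spec.parent.name] = data.get("display_name", spec.parent.name)
    return kernels


if __name__ == "__main__":
    for name, display in list_user_kernels().items():
        print(f"{name}: {display}")
```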
GPU - CUDA compatibility
While nvidia-smi will display a version of CUDA, this is just the base CUDA on the node and can be overridden by:
- loading a different CUDA module (find available versions with module spider cuda)
- activating a conda environment which has cudatoolkit installed (check with conda list cudatoolkit)
- installing a conda package built with a different CUDA version (run conda list tensorflow and check the build string)
A100 GPUs require CUDA 11.0 or greater
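Once a CUDA version string is known, whether from a module name, conda list, or a framework, the A100 requirement can be checked with a simple comparison. This is a minimal sketch. The function names are hypothetical, and the parser assumes a dotted numeric version such as 11.8 or 12.1.0.

```python
A100_MIN_CUDA = (11, 0)  # from the statement above: A100 needs CUDA 11.0 or greater


def parse_cuda_version(version: str) -> tuple:
    """Turn a dotted version string into a (major, minor) tuple."""
    parts = version.strip().split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return (major, minor)


def supports_a100(version: str) -> bool:
    """Return True if the given CUDA version meets the A100 minimum."""
    return parse_cuda_version(version) >= A100_MIN_CUDA
```

Comparing tuples rather than strings avoids the ordering mistake where "9.2" sorts after "11.0" as text.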
Check CUDA version in TensorFlow
import tensorflow as tf
sys_details = tf.sysconfig.get_build_info()
cuda_version = sys_details["cuda_version"]
print(cuda_version)
Check cuDNN version in TensorFlow
cudnn_version = sys_details["cudnn_version"]
print(cudnn_version)
Check CUDA version in PyTorch
import torch
print(torch.version.cuda)