Conda

Why Miniconda and not Anaconda?

While Anaconda includes the conda software and the conda-forge channel, which are open source and freely licensed, the defaults channel is subject to paid licenses (see terms). To protect VT’s research community, ARC does not provide the Anaconda package. You may install Anaconda into your home directory, but we recommend removing the defaults channel.

Use module spider miniconda to search our module system for the most recent Miniconda available on the system you’re using.

Do not run conda init

Running conda init is a convenience for managing conda virtual environments on a single computer, but it does not produce portable results. The principal action of conda init is to add lines like these to the user’s Bash startup script ~/.bashrc:

# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/apps/easybuild/software/tinkercliffs-rome/Miniconda3/23.10.0-1/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
    eval "$__conda_setup"
else
    if [ -f "/apps/easybuild/software/tinkercliffs-rome/Miniconda3/23.10.0-1/etc/profile.d/conda.sh" ]; then
        . "/apps/easybuild/software/tinkercliffs-rome/Miniconda3/23.10.0-1/etc/profile.d/conda.sh"
    else
        export PATH="/apps/easybuild/software/tinkercliffs-rome/Miniconda3/23.10.0-1/bin:$PATH"
    fi
fi
unset __conda_setup
# <<< conda initialize <<<

Notably, the block makes explicit references to paths which are specific to a particular node type on ARC systems, like /apps/easybuild/software/tinkercliffs-rome/.... Such a path exists on only one node type on Tinkercliffs, so the block will fail on a different type of Tinkercliffs node or on any other cluster’s nodes. In short, conda init produces non-portable results, and we recommend against using it.
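If you are unsure whether conda init has already modified your startup script, a short check like the following can help. This is a minimal sketch, not an ARC tool: the function names find_conda_init_block and hardcoded_paths are hypothetical, and the marker strings are taken from the listing above. It only reads ~/.bashrc and prints the absolute paths it finds inside the managed block.

```python
import re
from pathlib import Path

# Markers that 'conda init' writes around its managed block (from the listing above).
START = "# >>> conda initialize >>>"
END = "# <<< conda initialize <<<"


def find_conda_init_block(text):
    """Return the lines of the first complete conda-init block, or [] if none."""
    block = []
    inside = False
    for line in text.splitlines():
        if line.strip() == START:
            inside = True
        if inside:
            block.append(line)
            if line.strip() == END:
                return block
    return []  # no complete block found


def hardcoded_paths(block):
    """Collect absolute paths that appear right after a quote inside the block."""
    paths = set()
    for line in block:
        # A quote followed by '/', up to the next quote, '$' or ':'.
        paths.update(re.findall(r"""["'](/[^"'$:]+)""", line))
    return paths


if __name__ == "__main__":
    bashrc = Path.home() / ".bashrc"
    text = bashrc.read_text(errors="replace") if bashrc.exists() else ""
    for path in sorted(hardcoded_paths(find_conda_init_block(text))):
        print(path)
```

If the script prints any paths, your ~/.bashrc contains a conda init block tied to those locations, and removing that block is the usual way to undo it.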

Use source activate and do not use conda activate

Using conda activate <envname> requires the conda initialization described above, so it is not suited to systems where a single home directory is shared between several different node types. Instead, use source activate <envname> to activate conda virtual environments.

Create a virtual environment specifically for the type of node where it will be used

Each cluster has at least two different node types. Each node type has a different CPU microarchitecture and may have a slightly different operating system or kernel version, system configuration, and set of installed packages. Each is tuned for the features of that particular node type. These system differences can make virtual environments non-portable between node types.

As a result, you should create and build a virtual environment on a node of the type where you will use the environment.
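To confirm which kind of node you are on before building, you can record a few identifying details. The following is a minimal sketch under stated assumptions: node_fingerprint is a hypothetical helper, and it assumes the scheduler sets the SLURM_JOB_PARTITION environment variable inside your job, which Slurm normally does. Outside a job the partition is reported as "unknown".

```python
import os
import platform


def node_fingerprint(env=None):
    """Summarize the current node: host name, CPU architecture, kernel, partition."""
    env = os.environ if env is None else env
    return {
        "hostname": platform.node(),
        "machine": platform.machine(),    # e.g. x86_64
        "kernel": platform.release(),     # kernel release string
        # Assumption: set by Slurm inside a job; absent on login nodes.
        "partition": env.get("SLURM_JOB_PARTITION", "unknown"),
    }


if __name__ == "__main__":
    for key, value in node_fingerprint().items():
        print(f"{key}: {value}")
```

Saving this output alongside an environment makes it easier to tell later which node type the environment was built for.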

Example 1:

If you want to use conda on Tinkercliffs a100_normal_q nodes, then you need to build the environment from a shell on one of those nodes.

The important commands for this are:

command                          purpose
interact                         get an interactive command line shell on a compute node
module spider                    search for the latest Miniconda module
module load                      load a module
conda create -p $HOME/envname    create a new conda environment at the provided path
source activate $HOME/envname    activate the newly created environment
conda install ...                install packages into the environment

Note

$HOME “expands” in the shell to your home directory, e.g., /home/jdoe2. The envname above should be a short but meaningful name for the environment. Because environments are specific to a node type, we recommend referencing the node type in the name: for example, tca100-science for Tinkercliffs a100_normal_q nodes or tcnq for Tinkercliffs normal_q nodes.

Use conda env list to view conda environments and their absolute paths. Use conda list to view the packages and versions in the currently activated environment.
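The naming suggestion in the note can be made mechanical. The sketch below is illustrative only: env_prefix and setup_commands are hypothetical helpers, and the node tags (tca100, tcnq) are simply the examples from the note. It builds the environment path and prints the create and activate commands from the table above, without running them.

```python
from pathlib import Path


def env_prefix(node_tag, purpose="", home=None):
    """Build an environment path such as $HOME/tca100-science."""
    base = Path.home() if home is None else Path(home)
    name = f"{node_tag}-{purpose}" if purpose else node_tag
    return base / name


def setup_commands(prefix):
    """Return the create and activate commands for the given path as strings."""
    return [f"conda create -p {prefix}", f"source activate {prefix}"]


if __name__ == "__main__":
    # Prints the two commands for an a100 "science" environment in your home.
    for command in setup_commands(env_prefix("tca100", "science")):
        print(command)
```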

[jdoe2@tinkercliffs2 ~]$ interact --partition=a100_normal_q --nodes=1 --ntasks-per-node=4 --gres=gpu:1 --account=jdoeacct
srun: job 2920919 queued and waiting for resources
srun: job 2920919 has been allocated resources
[jdoe2@tc-gpu001 ~]$ module spider miniconda

--------------------------------------------------------------------------------------------------------------------
  Miniconda3: Miniconda3/23.10.0-1
--------------------------------------------------------------------------------------------------------------------
    Description:
      Miniconda is a free minimal installer for conda. It is a small, bootstrap version of Anaconda that includes only conda, Python, the packages they depend on, and a small number of other useful packages.


    You will need to load all module(s) on any one of the lines below before the "Miniconda3/23.10.0-1" module is available to load.

      apps  site/tinkercliffs-rome/easybuild/arc.arcadm
      apps  site/tinkercliffs-rome/easybuild/setup
      apps  site/tinkercliffs/easybuild/arc.arcadm
      apps  site/tinkercliffs/easybuild/setup

    Help:
      Description
      ===========
      Miniconda is a free minimal installer for conda. It is a small,
       bootstrap version of Anaconda that includes only conda, Python, the packages they
       depend on, and a small number of other useful packages.


      More information
      ================
       - Homepage: https://docs.conda.io/en/latest/miniconda.html

[jdoe2@tc-gpu001 ~]$ module load Miniconda3/23.10.0-1
[jdoe2@tc-gpu001 ~]$ conda create -p ~/env/a100_env python=3.11 
...

Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
#     $ conda activate /home/jdoe2/env/a100_env
#
# To deactivate an active environment, use
#
#     $ conda deactivate

[jdoe2@tc-gpu001 ~]$ source activate /home/jdoe2/env/a100_env/
(/home/jdoe2/env/a100_env) [jdoe2@tc-gpu001 ~]$ conda install matplotlib
Proceed ([y]/n)? y


Downloading and Extracting Packages:

Preparing transaction: done
Verifying transaction: done
Executing transaction: done

Using kernels with an Environment

You can use a Jupyter kernel to use a virtual environment inside a Jupyter notebook. Each kernel can be used to run different cells according to its language/package requirements. For example, if you have a notebook that uses two different sets of packages where each set is installed in a different conda environment, then you can use Jupyter kernels to switch between those two sets of packages.

To start a kernel that is associated with a specific environment, activate the environment and install ipykernel inside that environment:

[jdoe2@tinkercliffs2 ~]$ module load Miniconda3
[jdoe2@tinkercliffs2 ~]$ source activate /home/jdoe2/env/a100_env/
(/home/jdoe2/env/a100_env/) [jdoe2@tinkercliffs2 ~]$ conda install ipykernel
(/home/jdoe2/env/a100_env/) [jdoe2@tinkercliffs2 ~]$ python -m ipykernel install --user --name a100_env --display-name "Python (a100_env)"
Installed kernelspec a100_env in /home/jdoe2/.local/share/jupyter/kernels/a100_env

Then, when launching the Jupyter interactive app from Open OnDemand, you can start a kernel in the environment created earlier. From the top menu, select Kernel -> Change kernel -> Python (a100_env), then execute your cell.

GPU - CUDA compatibility

While nvidia-smi will display a version of CUDA, this is just the base CUDA on the node and can be overridden by:

  • loading a different CUDA module: module spider cuda

  • activating a conda environment which has cudatoolkit: conda list cudatoolkit

  • installing a conda package built with a different CUDA: conda list tensorflow -> check the build string

A100 GPUs require CUDA 11.0 or greater.
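The CUDA 11.0 minimum can be checked against whatever version string your framework reports. This is a minimal sketch, assuming a dotted "major.minor" string such as "11.8"; meets_a100_minimum is a hypothetical helper name, and other version formats would need their own parsing. A value of None, as reported by a build without CUDA support, is treated as not meeting the minimum.

```python
def meets_a100_minimum(cuda_version):
    """Return True if a dotted CUDA version string is 11.0 or greater."""
    if not cuda_version:
        return False  # no CUDA reported (e.g. a CPU-only build)
    parts = cuda_version.strip().split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return (major, minor) >= (11, 0)


if __name__ == "__main__":
    for version in ("10.2", "11.0", "12.1", None):
        print(version, meets_a100_minimum(version))
```

You can pass it the value obtained from the TensorFlow or PyTorch checks in this section.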

Check CUDA version in TensorFlow

import tensorflow as tf
sys_details = tf.sysconfig.get_build_info()
cuda_version = sys_details["cuda_version"]
print(cuda_version)

Check cuDNN version in TensorFlow

# Reuses sys_details from tf.sysconfig.get_build_info() in the previous snippet
cudnn_version = sys_details["cudnn_version"]
print(cudnn_version)

Check CUDA version in PyTorch

import torch
print(torch.version.cuda)  # None for a build without CUDA support