Video Tutorials
ARC provides a number of video tutorials on our channel on video.vt.edu. In particular, the following sequence walks a user through the fundamentals of ARC usage in less than an hour:
Login
How to access ARC clusters for the first time (and streamlining access for subsequent logins):
Accessing Software
The following videos walk you through accessing software that ARC has installed and through setting up your own packages:
Note that these videos require a VT Login to access. Also, each video has a table of contents that can be used to skip between sections; this can be accessed by clicking the “hamburger” (three horizontal bars) button at the top left of the video.
We are actively updating this page.
There are many types of videos linked on this page. Generally, they fall into two classes:
General overviews.
Narrow focus on how to accomplish one thing.
With respect to the second class, sometimes one video suffices for a topic. In other cases, multiple videos are required. An example of the latter is virtual environments (VEs): there are videos on why they are needed, how to use modules (in setting up general environments), how to build and use VEs of various types, and how to use VEs across programming languages. Rather than put all of these topics into one longer video, we split them across several videos so that you can go directly to the one you need. You might, for example, only want to know how to build a VE and use it in a Jupyter notebook.
We try to make very few assumptions about the experience of the user so that the videos are of value to both novices and experienced users. Users of all types are welcome on ARC resources and are encouraged to use them.
General Overview
These videos provide a general overview of the use of ARC systems.
Overview of ARC systems.
How to get help with and how to learn about ARC resources.
Connecting to ARC Clusters
There are multiple ways to connect to ARC clusters.
How to access ARC clusters using a terminal window on your laptop and ssh (and how to streamline access for subsequent logins). This approach uses the command line for working on clusters.
How to connect to ARC clusters using VS Code. This approach uses an IDE for working on clusters.
How to access ARC clusters using Open OnDemand (OOD). This approach uses an assortment of ways for working on clusters (e.g., command line, UIs).
Directory Structures and Mounts
This video describes the various mounts/directories for doing different types of work and for storing files.
How to Run Codes—Your Own or Commercial/Open Software
Whether you run your own code, e.g., built using some programming language, or a commercial/open source code such as Ansys, you will be interacting with the scheduler. The scheduler on ARC clusters is Slurm. The first video is an excellent and concise introduction to running interactive and batch jobs. The subsequent videos are motivated by cluster configurations that increase the throughput of jobs on the clusters and significantly reduce wait time. Specifying job resources more precisely requires concepts, such as constraints, that are not covered in the first video.
How to run both interactive and batch jobs.
Learn the different types of compute nodes on clusters and how to specify them for your jobs. Watch this before the later videos of this section.
A more detailed example of an interactive job that uses the material in the second video—that is, how to use constraints to specify compute node types.
A more detailed example of a batch job that uses the material in the second video—that is, how to use constraints to specify compute node types.
Examples of Slurm batch jobs where files (input files, code files, output files) use volatile resources (to speed up file input/output).
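As a rough illustration of how a constraint fits into a batch job request, the sketch below assembles the text of a Slurm batch script in Python. The `#SBATCH` options shown (`--account`, `--partition`, `--constraint`, `--nodes`, `--ntasks-per-node`, `--time`) are standard Slurm options, but the helper name `make_batch_script` and every value passed to it (account, partition, constraint, and command) are placeholders assumed for illustration, not ARC's actual settings. The videos above give the real values to use.

```python
def make_batch_script(account, partition, constraint, command,
                      nodes=1, ntasks_per_node=1, time_limit="00:10:00"):
    """Return the text of a Slurm batch script (hypothetical helper)."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --account={account}",        # allocation to charge
        f"#SBATCH --partition={partition}",    # group of nodes to run on
        f"#SBATCH --constraint={constraint}",  # restrict to one node type
        f"#SBATCH --nodes={nodes}",
        f"#SBATCH --ntasks-per-node={ntasks_per_node}",
        f"#SBATCH --time={time_limit}",        # wall-clock limit, HH:MM:SS
        "",
        command,                               # the work the job performs
    ]
    return "\n".join(lines) + "\n"


# All names below are placeholders, not real ARC accounts or node types.
script = make_batch_script(
    account="myaccount",
    partition="mypartition",
    constraint="mynodetype",
    command="python my_code.py",
)
print(script)
```

Saving the printed text to a file and passing that file to `sbatch` would submit it as a batch job.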
Self-Monitoring Your Activities and Code Execution to Understand How Your Code Is Performing
ARC clusters are communal resources. They work best when everyone knows the purposes of the different types of nodes (i.e., computers) on a cluster. These videos describe appropriate (and inappropriate) uses and, just as importantly, how to monitor your own use of the computing resources. Monitoring your (computational) jobs can also give you insights into your codes and their performance. If you find that your code in a batch job is not performing as expected/desired, you can terminate the job. Similarly, if you find that you are (inadvertently) running prohibited types of processes on head nodes, you can terminate those.
Appropriate use of login (head) nodes of ARC clusters.
How you can self-monitor your use of login (head) nodes and kill/terminate processes that should not be run on head nodes.
Working with Slurm batch jobs.
How you can self-monitor your Slurm batch jobs and understand your code’s performance.
How to terminate Slurm batch jobs.
File Compression and Archiving
File compression and file archiving are two separate activities. However, they can also be used in combination. These videos describe how to compress and archive files.
How to compress files.
How to create an archive (a single file) that contains many files, and how to compress this resulting file.
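To make the distinction between the two activities concrete, here is a minimal sketch using only Python's standard library (`gzip` and `tarfile`). It is not the method shown in the videos, which may use command-line tools instead. The function names and the throwaway files it creates are assumptions made for illustration.

```python
import gzip
import shutil
import tarfile
import tempfile
from pathlib import Path


def compress_file(src, dest):
    """Compression: one file in, one (usually smaller) file out."""
    with open(src, "rb") as f_in, gzip.open(dest, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)


def archive_directory(src_dir, dest, compress=False):
    """Archiving: bundle a directory tree into a single tar file.

    With compress=True the archive is also gzip-compressed.
    """
    mode = "w:gz" if compress else "w"
    with tarfile.open(dest, mode) as tar:
        tar.add(str(src_dir), arcname=Path(src_dir).name)


def list_archive(path):
    """Return the sorted member names stored in a tar archive."""
    with tarfile.open(path, "r:*") as tar:
        return sorted(tar.getnames())


# Small demonstration in a throwaway directory.
work = Path(tempfile.mkdtemp())
data = work / "results"
data.mkdir()
for i in range(3):
    (data / f"run{i}.txt").write_text("0.0\n" * 1000)

compress_file(data / "run0.txt", work / "run0.txt.gz")           # compress only
archive_directory(data, work / "results.tar")                    # archive only
archive_directory(data, work / "results.tar.gz", compress=True)  # both
members = list_archive(work / "results.tar.gz")
print(members)
```

The uncompressed archive is at least as large as the files it holds, because archiving alone only bundles them; the size reduction comes from the compression step.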
File Transfer: Transferring files onto and off of clusters
One can use these tools to transfer files between and within machines/computers; here, our focus is ARC clusters.
Overview of utilities/tools for copying files onto and off of ARC clusters.
How to use scp (secure copy protocol) for copying files onto and off of ARC clusters.
How to use sftp (secure file transfer protocol) for copying files onto and off of ARC clusters.
How to use rsync for copying files onto and off of ARC clusters, and copying files within a cluster’s storage.
How to use Globus for copying (very large) files between computers (e.g., onto and off of ARC clusters) with fault tolerance and a UI.
Synopsis: Globus is a very powerful tool. While the other file transfer tools can complete their work with a single command, Globus is more of a file transfer environment. Hence it is more complicated than the other tools and has a higher start-up cost for you. But do not be put off: it is not that complicated and it is VERY powerful. There is a reason why Globus is the de facto standard tool for large file transfers. Here we present a sequence of videos, which should be watched in order because some videos depend on content from previous videos in this series. We have broken the information into shorter videos so that users can “enter” this sequence where it is appropriate for them.
Globus overview. For those who do not know Globus or want a refresher in what it does.
Globus prerequisites. Complete these steps (which are required regardless of Globus use) before working with Globus; doing so will make your life much easier. (To be added: how to make a project area visible to Globus using ColdFront.)
Roadmap of Globus-specific videos. This short video shows the unified view used to present the primary Globus features in the remaining videos.
Unlinking previous Globus identity. This video is relevant only for people coming to VT from another institution where they had a Globus account. Because your Globus identity is generally tied to your institution, you must update your Globus account so that it is affiliated with your new institution (which is presumably Virginia Tech). This video shows how to “unlink” from your previous institution. You can then follow the videos below to establish your new VT-affiliated Globus identity; the steps are the same as those for creating a VT-affiliated Globus account for someone who has never had a Globus account.
Create a Globus account and log in. This is used by new Globus users and those Globus users who have come from another institution and have unlinked their previous Globus identity. We also ensure that you can see the main VT collection.
File transfer demonstration using Globus. This demo is a file transfer within one collection on one cluster. The explanation is generalized to illustrate how to perform file transfers between two clusters using different collections. This demo also shows how the directories Globus displays are the same as those under “/projects” on the ARC clusters.
Install Globus Connect Personal (GCP) on your laptop. GCP is used with Globus to enable file transfers between your local computer and ARC clusters.
File transfer demonstration using Globus and Globus Connect Personal (GCP). GCP is used with Globus to enable file transfers between your local computer and ARC clusters.
Set up Globus Guest Collections. Globus Guest Collections (GGCs) are areas of storage that are accessible to you and other people that you designate. These “others” can be colleagues at other institutions around the world, making for powerful file sharing.
Setting permissions on files within Globus collections. Just because some directory within a collection is visible to Globus does NOT mean that all of the files and directories within the collection are visible. You can still use Linux/Unix permissions to control Globus’ access to your files and directories. This video shows the issues and how you can control file and directory visibility.
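For the single-command tools listed earlier in this section (scp and rsync), the following sketch shows the general shape of a transfer command by building it in Python. The `user@host:path` form and the rsync `-a` and `-v` options are standard; the helper names, the user name, the host name `cluster.example.edu`, and all paths are placeholders assumed for illustration and are not ARC's real addresses.

```python
import shlex


def remote_path(user, host, path):
    """Format a remote location the way scp and rsync expect it."""
    return f"{user}@{host}:{path}"


def scp_upload(local_file, user, host, remote_dir):
    """Build the command that copies one local file to a remote directory."""
    return ["scp", local_file, remote_path(user, host, remote_dir)]


def rsync_upload(local_dir, user, host, remote_dir):
    """Build the command that copies a local directory to a remote one.

    -a recurses into directories and preserves permissions and timestamps;
    -v lists each file as it is transferred. A trailing slash on the source
    means "the contents of this directory"; without it, the directory itself
    is copied into the destination.
    """
    return ["rsync", "-av", local_dir, remote_path(user, host, remote_dir)]


# Placeholder user, host, and paths (assumptions, not real ARC values).
cmd = rsync_upload("results/", "myuser", "cluster.example.edu",
                   "/projects/mygroup/results/")
print(shlex.join(cmd))
```

Passing one of these lists to `subprocess.run(cmd, check=True)` would run the transfer from your local machine; swapping the source and destination arguments copies files off of the cluster instead.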
Environments
Environments are collections of software that tailor an otherwise “basic” computing ecosystem into one that supports your particular computing needs for a particular type of task. You may need different environments for your different tasks. There is a series of videos here on:
motivation,
how to use modules,
how to structure directories to house your virtual environments,
how to construct and use virtual environments (VEs) for
command-line execution of code and
use with applications like Jupyter notebooks.
Motivation: why we need environments.
Modules: a backbone of customizing your environments.
Ways to think about structuring the locations of virtual environments so that they are organized by the (cluster, compute node type) pair for which they are used.
How to create and use Conda virtual environments on Owl (and other) clusters.
How to create and use a Python pip-venv virtual environment (VE) on Owl (and other) clusters.
How to create and use Python Conda virtual environments with Jupyter notebooks (through OOD [Open OnDemand]).
How to create and use Julia virtual environments on clusters.
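As one possible way to combine two of the ideas above (a directory structure keyed by cluster and compute node type, and a Python venv-style VE), here is a minimal sketch using Python's standard `venv` module. The layout `root/cluster/node_type/name`, the helper name `create_ve`, and the example names are assumptions made for illustration; the videos describe the layouts and commands that ARC recommends.

```python
import tempfile
import venv
from pathlib import Path


def create_ve(root, cluster, node_type, name):
    """Create a Python VE at root/cluster/node_type/name and return its path.

    The directory layout is a hypothetical convention for this sketch.
    """
    ve_dir = Path(root) / cluster / node_type / name
    # with_pip=False keeps the sketch quick; use with_pip=True to have pip
    # installed into the new environment.
    venv.EnvBuilder(with_pip=False).create(ve_dir)
    return ve_dir


# Demonstration in a throwaway directory; the names are placeholders.
root = tempfile.mkdtemp()
ve_dir = create_ve(root, cluster="mycluster", node_type="mynodetype",
                   name="myproject")
print(ve_dir)
```

On Linux, such an environment is activated in a shell with `source <ve_dir>/bin/activate`, after which packages installed with pip go into that environment only.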