TinkerCliffs, ARC’s Flagship Cluster

Overview

TinkerCliffs has 353 nodes, 44,224 CPU cores, 133 TB of RAM, 112 NVIDIA A100 GPUs, and 56 NVIDIA H200 GPUs. The TinkerCliffs hardware is summarized in the table below.

| Node Type | Base Compute Nodes | Intel Nodes | High Memory Nodes | DGX A100 GPU Nodes | A100 GPU Nodes | H200 GPU Nodes | Total |
|---|---|---|---|---|---|---|---|
| Chip | AMD EPYC 7702 | Intel Xeon Platinum 9242 | AMD EPYC 7702 | AMD EPYC 7742 | AMD EPYC 7742 | Intel Xeon Platinum 8562Y+ | - |
| Architecture | Zen 2 | Cascade Lake | Zen 2 | Zen 2 | Zen 2 | Emerald Rapids | - |
| Slurm features | amd | intel, avx512 | amd | dgx-A100 | hpe-A100 | - | - |
| Nodes | 308 | 16 | 8 | 10 | 4 | 7 | 353 |
| GPUs | - | - | - | 8x NVIDIA A100-80G | 8x NVIDIA A100-80G | 8x NVIDIA H200-141G | 168 |
| Cores/Node | 128 | 96 | 128 | 128 | 128 | 64 | - |
| Memory (GB)/Node | 256 | 384 | 1,024 | 2,048 | 2,048 | 2,048 | - |
| Total Cores | 39,424 | 1,536 | 1,024 | 1,280 | 512 | 448 | 44,224 |
| Total Memory (GB) | 78,848 | 6,144 | 8,192 | 20,480 | 8,192 | 14,336 | 136,192 |
| Local Disk | 480GB SSD | 3.2TB NVMe | 480GB SSD | 30TB Gen4 NVMe | 11.7TB NVMe | 28TB NVMe | - |
| Interconnect | HDR-100 IB | HDR-100 IB | HDR-100 IB | 8x HDR-200 IB | 4x HDR-200 IB | 8x HDR-200 IB | - |

TinkerCliffs is hosted in the Steger Hall HPC datacenter on the Virginia Tech campus, so it is physically separated from the other ARC HPC systems, which are hosted in the AISB datacenter at the Corporate Research Center (CRC) in Blacksburg.

An IBM ESS GPFS file system provides /projects for group collaboration, and a VAST file system provides /scratch for high-performance input/output (I/O).

Get Started

TinkerCliffs can be accessed via one of its two login nodes using your VT credentials:

  • tinkercliffs1.arc.vt.edu

  • tinkercliffs2.arc.vt.edu
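
For example, a terminal login might look like the sketch below, where vtuser is a placeholder for your VT username:

```bash
# Connect to one of the TinkerCliffs login nodes with your VT credentials
ssh vtuser@tinkercliffs1.arc.vt.edu
```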

For testing purposes, all users are allotted an initial 240 core-hours for 90 days in the “personal” allocation. Researchers at the PI level can request resource allocations in the “free” tier (usage fully subsidized by VT) and can allocate 1,000,000 monthly Service Units among their projects.

To create an allocation, log in to the ARC allocation portal at https://coldfront.arc.vt.edu and:

  • Select or create a project

  • Click the “+ Request Resource Allocation” button

  • Choose the “Compute (Free) (Cluster)” allocation type

Usage needs in excess of 1,000,000 monthly Service Units can be purchased via the ARC Cost Center.

Partitions

Users submit jobs to partitions of the cluster depending on the type of resources (CPUs or GPUs) they need. Features are optional restrictions users can add to a job submission to restrict execution to nodes meeting specific requirements. If a job does not specify how much memory it needs, the DefMemPerCPU parameter sets the job’s memory based on the number of CPU cores requested. If a GPU job does not specify the number of CPU cores, the DefCpuPerGPU parameter sets the number of CPU cores based on the number of GPUs requested. Jobs are billed against the user’s allocation based on the CPU cores, memory, and GPU time they use. Consult the Slurm configuration to understand how to specify these parameters for your job; an example batch script follows the table below.

| Partition | normal_q | preemptable_q | a100_normal_q | a100_preemptable_q | h200_normal_q | h200_preemptable_q |
|---|---|---|---|---|---|---|
| Node Type | Base Compute, Intel, High Memory | Base Compute, Intel, High Memory | DGX A100 GPU, A100 GPU | DGX A100 GPU, A100 GPU | H200 GPU | H200 GPU |
| Features | amd,intel,avx512 | amd,intel,avx512 | hpe-A100,dgx-A100 | hpe-A100,dgx-A100 | - | - |
| Number of Nodes | 332 | 332 | 14 | 14 | 7 | 7 |
| DefMemPerCPU (MB) | 1944 | 1944 | 16056 | 16056 | 32112 | 32112 |
| DefCpuPerGPU | - | - | 8 | 8 | 4 | 4 |
| TRESBillingWeights | CPU=1.0,Mem=0.0625G | - | CPU=1.0,Mem=0.0625G,GRES/gpu=100.0 | - | CPU=1.0,Mem=0.0625G,GRES/gpu=150 | - |
| PreemptMode | OFF | ON | OFF | ON | OFF | ON |
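
As an illustration of how these settings apply to a job request, the following is a minimal sketch of a batch script requesting one A100 GPU on a100_normal_q; the allocation name yourallocation is a placeholder, and the comments simply restate the partition defaults from the table above:

```bash
#!/bin/bash
# Minimal GPU job sketch for TinkerCliffs (allocation name is a placeholder)
#SBATCH --account=yourallocation     # Slurm account tied to your ColdFront allocation
#SBATCH --partition=a100_normal_q    # A100 GPU partition
#SBATCH --gres=gpu:1                 # request one A100 GPU
#SBATCH --time=01:00:00              # walltime, within the base QoS limit
##SBATCH --constraint=dgx-A100       # optional: restrict the job to a specific node feature
# With no --cpus-per-task, DefCpuPerGPU=8 applies; with no --mem, memory defaults to
# DefMemPerCPU (16056 MB) per allocated CPU core.

nvidia-smi                           # report the GPU assigned to the job
```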

Quality of Service (QoS)

The QoS associated with a job affects the job in three key ways: scheduling priority, resource limits, and time limits. Each partition has a default QoS named partitionname_base with a default priority, resource limits, and time limits. Users can optionally select a different QoS to increase or decrease the priority, resource limits, and time limits; the goal is to offer flexible options that adjust to each job’s needs. The long QoS allows a job to run for an extended period (up to 14 days) but reduces the total amount of resources the job can allocate. The short QoS allows a job to use more resources but reduces the maximum time to 1 day. ARC staff reserve the right to modify the QoS settings at any time to ensure a fair and balanced utilization of resources among all users. An example of selecting a QoS appears after the table below.

| Partition | QoS | Priority | MaxWall | MaxTRESPerUser | MaxTRESPerAccount |
|---|---|---|---|---|---|
| normal_q | tc_normal_base | 1000 | 7 days | cpu=8397,mem=18276G | cpu=16794,mem=36552G |
| normal_q | tc_normal_long | 500 | 14 days | cpu=2100,mem=4569G | cpu=4199,mem=9138G |
| normal_q | tc_normal_short | 2000 | 1 day | cpu=12596,mem=27414G | cpu=25191,mem=54828G |
| preemptable_q | tc_preemptable_base | 0 | 30 days | cpu=1050,mem=2285G | cpu=2100,mem=4569G |
| a100_normal_q | tc_a100_normal_base | 1000 | 7 days | cpu=359,mem=5642G,gres/gpu=23 | cpu=717,mem=11284G,gres/gpu=45 |
| a100_normal_q | tc_a100_normal_long | 500 | 14 days | cpu=90,mem=1411G,gres/gpu=6 | cpu=180,mem=2821G,gres/gpu=12 |
| a100_normal_q | tc_a100_normal_short | 2000 | 1 day | cpu=538,mem=8463G,gres/gpu=34 | cpu=1076,mem=16926G,gres/gpu=68 |
| a100_preemptable_q | tc_a100_preemptable_base | 0 | 30 days | cpu=45,mem=706G,gres/gpu=3 | cpu=90,mem=1411G,gres/gpu=6 |
| h200_normal_q | tc_h200_normal_base | 1000 | 7 days | cpu=90,mem=2868G,gres/gpu=12 | cpu=180,mem=5735G,gres/gpu=23 |
| h200_normal_q | tc_h200_normal_long | 500 | 14 days | cpu=23,mem=717G,gres/gpu=3 | cpu=45,mem=1434G,gres/gpu=6 |
| h200_normal_q | tc_h200_normal_short | 2000 | 1 day | cpu=135,mem=4301G,gres/gpu=17 | cpu=269,mem=8602G,gres/gpu=34 |
| h200_preemptable_q | tc_h200_preemptable_base | 0 | 30 days | cpu=12,mem=359G,gres/gpu=2 | cpu=23,mem=717G,gres/gpu=3 |
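
For instance, a CPU job that needs more than the 7-day default walltime could request the long QoS, as in this sketch (the allocation name and executable are placeholders):

```bash
#!/bin/bash
# Sketch of selecting a non-default QoS (account name and executable are placeholders)
#SBATCH --account=yourallocation
#SBATCH --partition=normal_q
#SBATCH --qos=tc_normal_long     # 14-day MaxWall, reduced resource limits
#SBATCH --time=10-00:00:00       # 10 days, within the 14-day limit
#SBATCH --nodes=1
#SBATCH --ntasks=128

srun ./my_long_running_app
```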

Optimization

| Node Type | Base Compute Nodes | Intel Nodes | High Memory Nodes | DGX A100 GPU Nodes | A100 GPU Nodes | H200 GPU Nodes |
|---|---|---|---|---|---|---|
| CPU arch | Zen 2 | Cascade Lake | Zen 2 | Zen 2 | Zen 2 | Emerald Rapids |
| Compiler flags | -march=znver2 | -march=cascadelake | -march=znver2 | -march=znver2 | -march=znver2 | -march=native |
| GPU arch | - | - | - | NVIDIA A100 | NVIDIA A100 | NVIDIA H200 |
| Compute Capability | - | - | - | 8.0 | 8.0 | 9.0 |
| NVCC flags | - | - | - | -gencode=arch=compute_80,code=sm_80 | -gencode=arch=compute_80,code=sm_80 | -gencode=arch=compute_90,code=sm_90 |
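
As a sketch of how these flags might be applied when building code for the corresponding nodes (source and output file names are placeholders):

```bash
# CPU code targeting the AMD Zen 2 nodes
gcc -O3 -march=znver2 -o mycode mycode.c

# CUDA code targeting the A100 nodes (compute capability 8.0)
nvcc -O3 -gencode=arch=compute_80,code=sm_80 -o mykernel mykernel.cu
```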

See the tuning guides available at https://developer.amd.com and https://www.intel.com/content/www/us/en/developer/.

  • Cache locality really matters: process pinning can make a big difference in performance.

  • Hybrid programming often pays off: one MPI process per L3 cache with 4 threads is often optimal (see the sketch after this list).

  • Use the appropriate -march flag to optimize the compiled code and the appropriate -gencode flag when using the NVCC compiler.
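
The sketch below shows one way to express that hybrid layout with Slurm on a 128-core Zen 2 node, assuming 4 cores share each L3 cache and that the application (a placeholder name) is built with MPI and OpenMP:

```bash
#!/bin/bash
# Hybrid MPI+OpenMP sketch: one rank per 4-core L3 cache complex (placeholders throughout)
#SBATCH --account=yourallocation
#SBATCH --partition=normal_q
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=32     # 32 MPI ranks = one per L3 cache on a 128-core Zen 2 node
#SBATCH --cpus-per-task=4        # 4 OpenMP threads per rank
#SBATCH --time=04:00:00

export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
export OMP_PLACES=cores          # pin threads to physical cores
export OMP_PROC_BIND=close       # keep each rank's threads on adjacent cores

srun --cpu-bind=cores ./my_hybrid_app
```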