Globus
Current VT Status
Since 2023, Virginia Tech has established and maintained an institutional Globus Standard Subscription coordinated through ARC. In addition, ARC has established a Globus Data Transfer Node which provides access to /projects storage. Individuals are also able to create personal Globus accounts and use Globus Connect Personal software on ARC and other systems.
Make a /projects directory visible to Globus
The /projects directories can be made visible (“shared”) via Globus. The owner (usually the PI) of the /projects directory can enable sharing via ARC’s ColdFront allocation management site.
The PI logs in to ColdFront, navigates to the corresponding “Project (free) (Storage)” allocation, checks the box for “Share via Globus”, and then clicks “update”. The change takes effect immediately.
Any member of the associated group can then log in to Globus.org and access or manage files there. Under “File Manager”, search for “Virginia Tech” to find the “Virginia Tech ARC Globus Projects Directories” collection. Select it, and the shared directory should be visible. All /projects directories for which “Share via Globus” is selected will be listed, but permission to access these directories is still restricted to members of the group, just as on ARC clusters.
Filesets with "Lots of Small Files" (LoSF) are the worst-case scenario for most file systems and transfer tools. For stability and performance, it is vital that such LoSF filesets be packaged into archives via tools such as `tar`. Attempting transfers of LoSF filesets via Globus is known to cause very poor performance and faults such as `ENDPOINT_TOO_BUSY`.
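As a minimal sketch of the packaging step described above, the following builds a throwaway directory of small files and bundles it into a single compressed archive before any transfer. The directory name `results`, the file names, and the temporary location are hypothetical and chosen only for illustration; in practice you would point `tar` at your own LoSF directory.

```shell
# Hypothetical demo data: 200 tiny files in a scratch location.
workdir=$(mktemp -d)
mkdir -p "$workdir/results"
for i in $(seq 1 200); do
  printf 'sample %s\n' "$i" > "$workdir/results/part_$i.txt"
done

# Bundle the whole directory into one gzip-compressed archive.
# -c creates an archive, -z compresses with gzip, -f names the archive file,
# and -C switches into $workdir so paths inside the archive start at results/.
tar -czf "$workdir/results.tar.gz" -C "$workdir" results

# List the archive contents and count the packaged files as a sanity check.
file_count=$(tar -tzf "$workdir/results.tar.gz" | grep -c 'part_')
echo "archive contains $file_count data files"
```

Transferring the single `results.tar.gz` file and extracting it at the destination with `tar -xzf` avoids asking Globus to handle each small file individually.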
Globus Connect Personal (GCP)
GCP can be used to connect a device or storage location you own to the Globus network. For example, you can make your /home/<username> or /projects/<groupname> group-shared directory accessible to you when you log into the Globus.org web application. When you do this, it shows up in your “Collections”. Then you can browse, upload, download, and coordinate transfers among other collections you have in the Globus web application. This can be a very powerful and enabling way to manage data among multiple institutions. Detailed information on using GCP is available on Globus’s website.
Using GCP on ARC Systems
Here is an outline of the steps you’ll need to take to use GCP on an ARC cluster. These are derived from the more complete instructions provided by Globus.
Connecting GCP to Globus requires that you have a Globus account and that you can access the Globus web application, so the first step is to
Log in to https://globus.org in a web browser.
On the Tinkercliffs cluster, a software module for GCP is provided.
module load tinkercliffs-rome/GlobusConnectPersonal
By loading the module, the program globusconnectpersonal is made available to you, but it still needs to be configured.
Configuring
From the command line on an ARC system (e.g., a Tinkercliffs login node), load the module and then run the command globusconnectpersonal. If you have not already completed configuration, it will provide you with a URL and walk you through the next two steps.
Authenticate the GCP client with the Globus web application by copying the provided URL into your browser. This will prompt you for some setup information and then provide an “auth code”.
Copy the “auth code” from your browser and paste it into the command-line shell, which should have a prompt waiting for this code.
(optional) Edit the file ~/.globusonline/lta/config-paths to configure which directories GCP should use and whether or not to present them as writable in the Globus system.
Note
Any text editor can be used to modify the config file. If you don’t already have a preferred command-line text editor, then nano may be a good choice.
Here is an example config-paths file. It is a header-less CSV (comma-separated values) file. The three fields are:
the directory (i.e., “folder” or “path”) to connect
0 or 1, indicating whether or not “Globus sharing” is enabled. “0” is the only viable option while VT does not have an institutional subscription to Globus.
0 or 1, indicating whether the directory is “not writable” or “writable”, respectively, in the Globus interface.
Note
Writability in Globus also requires that the writing user actually have write permissions on the filesystem. Marking a directory as writable for GCP does not override the file and directory permissions on ARC systems.
~/,0,1
/projects/proj_name,0,0
Here, two directories (~ and /projects/proj_name) are being made available to GCP. The last field marks ~ as writable (1) and /projects/proj_name as not writable (0).
Note
~ is a shortcut for /home/<username>
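To make the three-field layout concrete, here is a rough sketch that reads a config-paths style file and prints what each line means, flagging any line that does not have exactly three fields with 0/1 flags. It works on a temporary copy holding the example entries above rather than on the real ~/.globusonline/lta/config-paths; the `report` variable and the output wording are illustrative choices, not anything GCP itself produces. It also assumes that paths contain no commas.

```shell
# Write the example entries to a scratch file instead of touching the real config.
config_file=$(mktemp)
cat > "$config_file" <<'EOF'
~/,0,1
/projects/proj_name,0,0
EOF

# Split each line on commas and describe the sharing and writability flags.
report=$(awk -F',' '
  # Anything other than "path,0|1,0|1" is reported as malformed.
  NF != 3 || $2 !~ /^[01]$/ || $3 !~ /^[01]$/ {
    printf "line %d: malformed\n", NR
    next
  }
  {
    printf "%s sharing=%s writable=%s\n", $1,
      ($2 == "1" ? "on" : "off"), ($3 == "1" ? "yes" : "no")
  }
' "$config_file")
echo "$report"
```

Pointing `config_file` at your own config-paths file gives a quick read-back of the settings before starting the client.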
Installing GCP on Linux
Note
These are derived from the more complete instructions provided by Globus.
Verify that you can log in to https://globus.org. If you do not already have a Globus account, you will need to create one.
Download and extract the latest GCP, then run the setup. The ls command is needed to determine the version number you have downloaded, which you must specify to cd to the correct directory:
# Download latest GCP
wget https://downloads.globus.org/globus-connect-personal/linux/stable/globusconnectpersonal-latest.tgz
# Extract the compressed tar file
tar xzf globusconnectpersonal-latest.tgz
# Determine the name and version of the extracted directory
ls -ld globusconnect*
# Change directory to the newly extracted one
cd globusconnectpersonal-__.__.__
# This will run the GCP setup if you have not already done so
./globusconnectpersonal
Authenticate the GCP client with the Globus website. The last step above should have provided a URL for you to copy-paste into a web browser. Navigating to that URL will connect the GCP you have installed with the Globus web app.
-----
https://auth.globus.org/v2/oauth2/authorize?client_id=4d6448ae-8ca0-40e4-aaa9-8ec8e8320621&redirect_uri=https%3A%2F%2Fauth.globus.org%2Fv2%2Fweb%2Fauth-code&scope=openid+profile+urn%3Aglobus%3Aauth%3Ascope%3Aauth.globus.org%3Aview_identity_set+urn%3Aglobus%3Aauth%3Ascope%3Atransfer.api.globus.org%3Agcp_install&state=_default&response_type=code&code_challenge=XXXXXXXXXXXX--YYYYYYYYYYYYYYYYYYYYYYYYYYYYY&code_challenge_method=S256&access_type=online&prefill_named_grant=tinkercliffs2
-----
Enter the auth code:
Complete the authentication. Review the details at the page loaded by that URL, configure as desired, and you will be provided with an “auth code” when complete. Copy that from your browser and paste it into the shell which has prompted for this and is awaiting your input.
-----
Enter the auth code: ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ
== starting endpoint setup
Input a value for the Endpoint Name: tc2
registered new endpoint, id: 5874dee8-edcf-11ed-9bb3-c9bb788c490e
setup completed successfully
Will now start globusconnectpersonal in GUI mode
Graphical environment not detected
To launch Globus Connect Personal in CLI mode, use
globusconnectpersonal -start
Or, if you want to force the use of the GUI, use
globusconnectpersonal -gui
Start the client to make your files available to you in the Globus web app.
globusconnectpersonal -start
(optional) Edit the configuration to add other directories and set permissions
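The start step above can be sketched as a small script that launches the client in the background and then asks it for its status. This assumes setup has already been completed and that `globusconnectpersonal` is on your PATH (for example via the module on Tinkercliffs, or by running from the extracted directory). The `gcp_available` variable and the five-second pause are illustrative assumptions only.

```shell
gcp_bin="globusconnectpersonal"

if command -v "$gcp_bin" >/dev/null 2>&1; then
  # Run the client in the background so the shell stays usable.
  "$gcp_bin" -start &
  # Give the client a moment to connect before querying it (arbitrary pause).
  sleep 5
  # Report whether the client is connected to the Globus service.
  "$gcp_bin" -status
  gcp_available=yes
else
  echo "globusconnectpersonal not found on PATH; load the module or cd into the extracted directory first" >&2
  gcp_available=no
fi
```

When the transfers are finished, `globusconnectpersonal -stop` shuts the client down so that it does not keep running on a shared login node.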