Data Transfer

This service is for users who need to transfer data to their home institute, company or organisation, and/or upload data to PSI (not all data locations allow uploads). Data can be transferred by two methods: SSH (scp/rsync) and Globus.

If you have questions or problems, please contact datatransfer@psi.ch and/or open a ServiceNow ticket.

PSI Account / MFA

Access to the data transfer services requires an active PSI or PSI "ext-" account with MFA (multi-factor authentication) enabled. If you don't have an account, please talk to your supervisor or contact person.

IMPORTANT: During authentication you have to provide your username and password. After entering the password, you will receive a push notification in the MFA application (your mobile phone needs Internet access for this to work!). You have to open the app and approve the login attempt within 60 seconds. For technical reasons, we cannot display a message about this procedure on the login screen.

The following directories are accessible with the data transfer service:

Directory Name   Mode         Example                            Comments
/sls             read-only    /sls/x10da/e15874                  Raw data from SLS
/das             read-only    /das/work/p15/p15874               Ra cluster data; note the subdirectory structure p{AB}/p{ABCDEF}
/das/work        read-write   -                                  Working area of the Ra cluster
/sf              read-only    /sf/alvra/p17502/{raw,res,work}    Raw data and working area for data taken at the SwissFEL facility
/muse            read-only    -                                  Muse data
/merlin          read-write   -                                  Merlin data
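
The p{AB}/p{ABCDEF} layout of the Ra cluster can be derived from the pgroup name. The following is a minimal sketch, assuming (as in the example /das/work/p15/p15874) that the subdirectory is the letter "p" followed by the first two digits of the pgroup. The function name pgroup_path is hypothetical and not part of the service.

```shell
# Build the Ra cluster working path for a pgroup.
# Assumption: the intermediate directory is the first three characters
# of the pgroup name ("p" plus two digits), as in /das/work/p15/p15874.
pgroup_path() {
    pgroup=$1
    prefix=$(printf '%s' "$pgroup" | cut -c1-3)
    printf '/das/work/%s/%s\n' "$prefix" "$pgroup"
}

pgroup_path p15874
```

The resulting path can be used wherever a <data-path> is required in the commands on this page.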

Overview

ssh and scp are simple but powerful tools to transfer data. They are usually available on all major operating systems and can be used out of the box. 

rsync usually needs to be installed separately. 

The hostname to use for data transfer is: datatransfer.psi.ch

In the following examples, replace <username>, <data-path>, <source-path>, <destination-path>, <local-data-path>, etc. with your own username and paths.

 

[ADVANCED TOPIC] Connection Multiplexing

Each ssh connection to datatransfer.psi.ch will require you to provide your username/password and to approve the login via the MFA application.

With SSH multiplexing, you can reuse an established, MFA-authenticated SSH connection for subsequent connections without having to authenticate (with MFA) again.

To configure this, add the following to your local SSH configuration file ~/.ssh/config:

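$ cat ~/.ssh/config
# Match both the short alias and the full hostname used in the commands on this page
Host datatransfer datatransfer.psi.ch
  # Reuse an existing master connection, or create one if none exists
  ControlMaster auto
  # Socket file for the shared connection (%r = remote user, %h = host, %p = port)
  ControlPath ~/.ssh/mux-%r@%h:%p
  # Keep an idle master connection open in the background for up to 86400 s (24 h)
  ControlPersist 86400
  HostName datatransfer.psi.ch
  User <username>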

Before transferring or querying data, you first have to initialise the "master" connection (authenticate with your password and MFA):

$ ssh datatransfer.psi.ch

Once the connection has been established, all subsequent ssh commands run over the master connection without additional authentication. You can now use any of the commands described on this page to query and transfer your data, for example:

$ ssh datatransfer.psi.ch "ls /test"

To check the state of the master connection:

$ ssh datatransfer.psi.ch -O check

To terminate the master connection:

$ ssh datatransfer.psi.ch -O exit

The lifetime of the master connection is determined by the ControlPersist value in your ~/.ssh/config file. Please note that very long-lived connections (more than a few days) will be terminated on our side!
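
The check-then-connect sequence described above could be wrapped in a small helper that opens the master connection only when none is running. This is a sketch, not an official tool: the function name ensure_master and the SSH_CMD override are hypothetical, and it assumes that your ~/.ssh/config applies the multiplexing settings to datatransfer.psi.ch.

```shell
# SSH_CMD is a hypothetical override that allows trying the logic with a
# stand-in command; by default the regular ssh client is used.
SSH_CMD=${SSH_CMD:-ssh}
HOST=datatransfer.psi.ch

# Open the master connection only if no master is running yet.
ensure_master() {
    if "$SSH_CMD" -O check "$HOST" >/dev/null 2>&1; then
        echo "master connection already running"
    else
        echo "opening master connection (password and MFA approval required)"
        # -N: run no remote command, -f: go to the background after authentication
        "$SSH_CMD" -N -f "$HOST"
    fi
}
```

Calling ensure_master once before a batch of ssh, scp or rsync commands means that at most one password/MFA prompt is needed for the whole batch, as long as the master connection stays alive.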

 

Prerequisites

In addition to the prerequisites listed at the beginning, the following applies:

  • SSH access to datatransfer.psi.ch from your machine/organisation.
    • If your organisation blocks outgoing SSH traffic, ask your network/security team to allow this access.

 

Commands

Find Data

Files can be listed as follows:

$ ssh <username>@datatransfer.psi.ch "ls /<data-path>"

or, for a recursive listing (may be slow):

$ ssh <username>@datatransfer.psi.ch "ls -R /<data-path>"

 

Retrieve Data

Data can be downloaded to your machine as follows:

  • scp

    $ scp -r <username>@datatransfer.psi.ch:/<data-path> <destination-path>

     

  • rsync

    $ rsync -av <username>@datatransfer.psi.ch:/<data-path> <destination-path>
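
For large downloads it can help to preview the transfer before copying anything. The following sketch wraps the rsync call above and optionally adds --dry-run (list what would be transferred without copying) together with --partial (keep partially transferred files so that an interrupted transfer can continue). The function name download, the RSYNC_CMD override and the username jdoe in the usage note are hypothetical.

```shell
# RSYNC_CMD is a hypothetical override (e.g. RSYNC_CMD=echo prints the
# arguments instead of transferring); by default rsync itself is used.
RSYNC_CMD=${RSYNC_CMD:-rsync}

# Usage: download USERNAME REMOTE_PATH LOCAL_DIR [dry-run]
download() {
    if [ "${4:-}" = "dry-run" ]; then
        # Only list what would be transferred
        "$RSYNC_CMD" -av --partial --dry-run "$1@datatransfer.psi.ch:$2" "$3"
    else
        "$RSYNC_CMD" -av --partial "$1@datatransfer.psi.ch:$2" "$3"
    fi
}
```

For example, "download jdoe /sls/x10da/e15874 ./data dry-run" previews the transfer, and the same call without the last argument performs it. Note that rsync treats a trailing slash on the remote path as "copy the contents of this directory" rather than the directory itself.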

     

Upload Data 

Uploading data is only possible into certain directories. Please check the table above to see whether your storage area is writable (mode read-write).

Uploading data can be done via:

  • scp

    $ scp -r <local-data-path> <username>@datatransfer.psi.ch:/<destination-path>

  • rsync

    $ rsync -av <local-data-path> <username>@datatransfer.psi.ch:/<destination-path>
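
Before starting a long upload, the destination can be checked against the read-write areas from the directory table. This is a sketch under the assumption that /das/work and /merlin are the only writable areas, as listed in the table at the time of writing. The function name is_writable is hypothetical, and the check does not replace the permission checks on the server.

```shell
# Succeed if the given destination path lies in a read-write area.
# Assumption: the writable areas are exactly those marked read-write
# in the directory table (/das/work and /merlin).
is_writable() {
    case $1 in
        /das/work|/das/work/*|/merlin|/merlin/*) return 0 ;;
        *) return 1 ;;
    esac
}

if is_writable /das/work/p15/p15874; then
    echo "destination is in a read-write area"
fi
```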

 

Sometimes it is useful to tell rsync what the ownership of the transferred files should be. This is done with the --chown option:

$ rsync -rtv --chown=<username>:<pgroup> <local-data-path> <username>@datatransfer.psi.ch:/<destination-path>

Note that the -a option (archive, which tries to preserve the original ownership) is replaced here by -rt (recursive, preserve modification times).

 

Troubleshooting

  • Public-key based authentication is not possible on datatransfer.psi.ch!
  • In very urgent cases, if you have problems connecting to datatransfer.psi.ch, you can fall back to one of the backup servers datatransfer-01.psi.ch or datatransfer-02.psi.ch. However, do NOT use these hostnames in normal operation, as we might stop these servers at any time!

Data can also be queried, downloaded and uploaded via sftp. To do this, change to the local directory you want to download to or upload from, then connect to the data transfer server as follows:

sftp <username>@datatransfer.psi.ch:/<path>

After authenticating, you get an sftp prompt:

sftp>

You are now able to use any sftp shell command to navigate and to upload/download data.

Example:

myshell > Desktop % cd ~/Desktop/data-dir
myshell > data-dir % sftp datatransfer.psi.ch:/test
Connected to datatransfer.psi.ch.
Changing to: /test
sftp> ls
testdir-1 testdir-2 testfile-1 testfile-2 testfile-3
sftp> ls -l
drwxr-xr-x 2 root root 48 Jul 31 15:44 testdir-1
drwxr-xr-x 2 root root 48 Jul 31 15:43 testdir-2
-rw-r--r-- 1 root root 0 Jul 31 15:43 testfile-1
-rw-r--r-- 1 root root 0 Jul 31 15:43 testfile-2
-rw-r--r-- 1 root root 0 Jul 31 15:43 testfile-3
sftp> # download data
sftp> get testfile-1
Fetching /test/testfile-1 to testfile-1
sftp> put local-data-file
Uploading local-data-file to /test/local-data-file
sftp>

Overview

Globus is a web service for transferring files in an easy and managed way.

Globus has a number of built-in features, such as:

  • Automatic network optimisation
  • Parallel/multistream transfers (up to 4 transfers/streams)
  • Automatic retry in case of failure
  • Online task monitor
  • Summary email sent at the end of the transfer
  • Usually firewall-safe, as it uses only outgoing connections. If your firewall also blocks outgoing connections, special rules need to be set up; contact your local IT support.

In case of questions/problems please contact globus@psi.ch

Prerequisites

In addition to the prerequisites listed above, the following applies:

  • A GlobusID account (free), or an account recognized by Globus (like Google, XSEDE, US Universities), see a very detailed description
    • PSI staff can use the PSI account to login to Globus (select the PSI organization)
    • PSI "ext-" accounts can't be used to login to Globus
  • A Globus endpoint operated by your organisation, or the Globus Connect Personal client installed (available for Windows, macOS and Linux)
  • To access PSI's data collections you have to authenticate with your username, password and MFA against our OIDC server. At the time of writing, the OIDC server only accepts MFA verification if the user has push notifications via Microsoft Authenticator enabled (OTP tokens will not work!)

Query / Transfer Data

PSI provides multiple collections from which data can be transferred. Which collection to use depends on the PSI facility at which you collected your data.

To list all PSI data collections, log in to Globus, switch to the Collections tab and search for "Paul Scherrer Institute".

 

To access a collection you need to authenticate against our OIDC server. This authentication requires that your account is MFA-enabled (important: push notification needs to be your default MFA method!). After entering the password, you will receive a push notification in your Microsoft Authenticator app, which you have to approve in order to log in. Afterwards you can access the collection, browse to your data and initiate the data transfer.

Globus Browse Collection


For more information on how to use Globus, please refer to the Globus documentation.