Argonne National Laboratory

GM/CA @ APS

Remote Data Backups with Globus Online

Department of Energy Office of Science
GM/CA @ APS Sponsors:
National Institute of General Medical Sciences (NIGMS) and National Cancer Institute (NCI) of the National Institutes of Health (NIH)
 

 

Please also check our Globus Video Guide!

 

Globus Online for GM/CA @ APS Users

Globus Online is a free service sponsored by DOE, NIH, NSF, Argonne, and the University of Chicago (see the list of sponsors). It addresses the challenges researchers face in moving, sharing, and archiving large volumes of data among distributed sites. With Globus Online, you hand off data movement tasks to a hosted service that manages the entire operation: it monitors performance and errors, retries failed transfers, corrects problems automatically whenever possible, and reports status to keep you informed, so that you can focus on your research. In our tests, Globus Online was 2x faster than scp and rsync. This makes a big difference, reducing data transfer times from, e.g., 6 hours to 3 hours. In addition, a transfer can be started and monitored from any place with an Internet connection, e.g. the ANL Guest House, an airport, or home.
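As a rough illustration of the arithmetic behind the "6 hours to 3 hours" example, here is a minimal Python sketch. The dataset size and the scp/rsync rate are hypothetical values chosen only so that the baseline comes out at 6 hours; they are not measurements from GM/CA @ APS. The only figure taken from the text above is the 2x speedup.

```python
def transfer_hours(data_gb: float, rate_mb_per_s: float) -> float:
    """Return the hours needed to move data_gb gigabytes at rate_mb_per_s MB/s."""
    seconds = data_gb * 1000.0 / rate_mb_per_s  # 1 GB taken as 1000 MB
    return seconds / 3600.0


# Hypothetical numbers (assumptions, not measured values).
DATA_GB = 1080.0                        # size of the dataset to move
SCP_RATE_MB_S = 50.0                    # assumed sustained scp/rsync rate
GLOBUS_RATE_MB_S = 2 * SCP_RATE_MB_S    # the 2x speedup quoted above

baseline = transfer_hours(DATA_GB, SCP_RATE_MB_S)
faster = transfer_hours(DATA_GB, GLOBUS_RATE_MB_S)
print(f"scp/rsync: {baseline:.1f} h, Globus Online: {faster:.1f} h")
```

Substituting your own dataset size and an observed transfer rate gives a quick estimate of how long a backup is likely to take.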

Transferring Data Between GM/CA @ APS and Your Machine

  1. Register for a Globus Online account (one time for life).
     
  2. Sign in to Globus Online using your Globus Online username and password.

     
  3. If the globusconnect application is already installed on the computer that will receive the data (i.e. the computer is already registered as a Globus endpoint under your account), skip to Step 4. Otherwise, install globusconnect on that computer and obtain a unique Setup Key for it from Globus by clicking on the Globus Connect link.

    The above link brings you to a page that lists all of your configured endpoints (if any) and lets you create a new endpoint and install the Globus client on it:

    • Name your data-receiving computer as a new Globus endpoint. The endpoint name consists of your username as a prefix (added automatically) and the computer name you wish to assign. Then generate the computer's Setup Key by pressing the "Generate Setup Key" button and copy the key to the clipboard. You will need to provide this key to the Globus client after the installation.
    • Download the one-click Globus Connect Setup for the operating system of the data-receiving computer. The application is available for Mac, Linux, and Windows. Note where you saved the installer (e.g. on the Desktop or in the "Downloads" folder).
    • Install Globus Connect by running the downloaded Setup application. When Setup asks for a setup key, paste the previously copied Setup Key from the clipboard.

     
  4. Start the globusconnect application on the data-receiving computer. For example, on Linux, starting globusconnect is as simple as typing "./globusconnect" in the directory where it is installed.

    On Windows, the application can be started via the Start -> Programs -> Globus Connect menu. Normally, no administrative privileges are required.
    NOTE: The globusconnect client should be started each time you want to transfer files to or from your data-receiving computer. This application makes the computer visible among the available endpoints on the Globus Online web page. The client has a graphical user interface.

    If for whatever reason you cannot run "globusconnect" as a GUI (for example, you are transferring data to your home institution and accessing a computer there remotely), you can start it in command-line mode; see these instructions.
     
  5. On the Globus Connect web page, select "Start Transfer" under "File Transfer", or from the drop-down menu in the top bar.

     
  6. On the "Start Transfer" page, you can view the list of available endpoints by clicking the button on the "Endpoint" drop-down box. You need to select endpoints for both the left and right panes.
     
    • For the left pane, choose the endpoint of the data-receiving computer. It is usually located at the top of the list and identified as "<your-username>#<your-globus-connect-name>". If you do not see the endpoint, then either globusconnect is not running on the computer or it is blocked by a firewall. As soon as you select the endpoint, you will see a list of folders on the data-receiving computer.
    • For the right pane, choose one of the GM/CA @ APS endpoints: gmca23idd#gridftp for the 23-IDD beamline, or gmca23idb#gridftp for the 23-IDB beamline.
    • You can type letters into the box (e.g. 'gm') to filter endpoints.

     
  7. Once you select the GM/CA endpoint, a login window will pop up.

    You can access the GM/CA endpoints on Globus Online simply by using your username and password for the respective GM/CA beamline. Keep in mind that your account with GM/CA @ APS is typically disabled a few days after the end of your beamtime; contact your host if unsure. Enter your GM/CA beamline username in the "Username" field and your GM/CA beamline password (not the SSH passphrase) in the "Passphrase" field, then click "Authenticate". You can ignore the other fields.
     
  8. You will see a listing of the contents of your home directory on the respective GM/CA beamline. Double-click on a directory to view its contents.
     
  9. Select a file or directory and click on the highlighted "arrow button" to initiate the transfer.

     
  10. To watch the file transfer progress or possibly cancel a transfer, choose "View Transfers" from the drop-down menu in the top bar of the Globus Online web page.

     
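The endpoint naming used in Step 6 ("<your-username>#<your-globus-connect-name>") and the type-to-filter box can be sketched in a few lines of Python. This is only an illustration of the naming convention, not the Globus implementation: the helper names `parse_endpoint` and `filter_endpoints`, the personal endpoint `jdoe#lab-workstation`, and the assumption that filtering is a case-insensitive substring match are all hypothetical. Only the two GM/CA endpoint names come from the steps above.

```python
from typing import List, NamedTuple


class Endpoint(NamedTuple):
    owner: str  # the part before '#', e.g. a username
    name: str   # the part after '#', e.g. the computer name


def parse_endpoint(text: str) -> Endpoint:
    """Split an '<owner>#<name>' string into its two parts."""
    owner, sep, name = text.partition("#")
    if not sep or not owner or not name:
        raise ValueError(f"expected '<owner>#<name>', got {text!r}")
    return Endpoint(owner, name)


def filter_endpoints(endpoints: List[str], typed: str) -> List[str]:
    """Keep endpoints containing the typed letters (assumed matching rule)."""
    needle = typed.lower()
    return [e for e in endpoints if needle in e.lower()]


endpoints = [
    "jdoe#lab-workstation",  # hypothetical personal endpoint
    "gmca23idd#gridftp",     # 23-IDD beamline
    "gmca23idb#gridftp",     # 23-IDB beamline
]

print(filter_endpoints(endpoints, "gm"))
print(parse_endpoint("gmca23idd#gridftp"))
```

Typing 'gm' in this sketch leaves only the two beamline endpoints, which mirrors the filtering tip in Step 6.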

NOTE: In its present form, Globus Online offers a directory synchronization option in the "Transfer Files" drop-down box, but no continuous synchronization. Although the lack of continuous synchronization is an inconvenience, it is outweighed by the speed, which in our tests was at least 2x faster than traditional scp or rsync (rsync typically runs over SSH, the same transport that scp uses). We recommend using Globus Online for transferring large amounts of data and possibly scp/rsync for post-transfer synchronization.
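For the post-transfer synchronization mentioned in the note, one possible approach is to assemble an rsync command from Python and inspect it before running it. The sketch below is an assumption-laden example, not a GM/CA recipe: the user name, host name (`beamline.example.org`), and both paths are hypothetical placeholders, and the helper `build_rsync_command` is introduced here only for illustration. The rsync options used are the standard `-a` (archive), `-v` (verbose), and `--partial` (keep partially transferred files).

```python
import shlex
import subprocess
from typing import List


def build_rsync_command(user: str, host: str, remote_dir: str, local_dir: str) -> List[str]:
    """Build an rsync command that pulls the contents of remote_dir into local_dir."""
    # A trailing slash on the source copies the directory contents,
    # not the directory itself.
    source = f"{user}@{host}:{remote_dir.rstrip('/')}/"
    return ["rsync", "-av", "--partial", source, local_dir]


# Hypothetical placeholders; replace them with your own account, host, and paths.
cmd = build_rsync_command("jdoe", "beamline.example.org", "/home/jdoe/data", "/data/backup/")
print(shlex.join(cmd))  # show the command for review

RUN = False  # set to True only after checking the printed command
if RUN:
    subprocess.run(cmd, check=True)
```

Printing the command first and running it only on request keeps the sketch safe to try on a machine where rsync or the remote account is not available.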

Additional Learning Resources

  • Video guide: "Remote data transfers with Globus GridFTP" presented by Raj Kettimuthu, MCS. This 12-minute video, courtesy of one of the core Globus developers, guides you through the steps of setting up Globus transfers with GM/CA servers.
  • In-depth video guide: "Globus for Research Data Management" by Rachana Ananthakrishnan, Globus, presented at the Argonne Training Program on Extreme-Scale Computing, Summer 2015 (53 minutes).

GM/CA @ APS is an Office of Science User Facility operated for the U.S. Department of Energy Office of Science by Argonne National Laboratory
