Argonne National Laboratory

GM/CA @ APS

Remote Data Backups with bbcp

Department of Energy Office of Science
GM/CA @ APS Sponsors:
National Institute of General Medical Sciences (NIGMS) and National Cancer Institute (NCI) of the National Institutes of Health (NIH)

Backing up data from GM/CA @ APS with bbcp

bbcp is a fast data transfer utility developed by Andy Hanushevsky at SLAC National Accelerator Laboratory at Stanford. It can achieve substantially higher data transfer rates over a WAN than traditional SCP/SFTP. A good discussion of bbcp versus other transfer utilities, including Globus, can be found here. Be aware that bbcp does not encrypt data; only authentication is encrypted (bbcp authenticates via ssh, which must be installed on the system). If this is a concern, use Globus or SCP/SFTP instead.
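
Since bbcp relies on ssh for authentication, both programs need to be on the PATH of the computer that starts the transfer. The following is a minimal sketch of such a check, assuming a POSIX shell. The function names `have_tool` and `check_bbcp_prereqs` are made up for illustration and are not part of bbcp.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the named program is found on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Hypothetical helper: prints "ok" when ssh and bbcp are both available,
# otherwise lists the missing programs and returns a non-zero status.
check_bbcp_prereqs() {
    missing=""
    for tool in ssh bbcp; do
        have_tool "$tool" || missing="$missing $tool"
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "ok"
}
```

Running `check_bbcp_prereqs` before a transfer should make a missing ssh or bbcp binary obvious before any connection is attempted.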

Step-by-step instructions

  1. Download the latest bbcp binary for your OS from SLAC. bbcp is a single-file command-line utility available for Linux, macOS and Solaris, but not for Windows. Since bbcp is open source, you may try to compile it for your OS if you wish.
  2. Read the bbcp manpage explaining the program options and the bbcp HOWTO at Caltech.
  3. Because of firewall considerations, bbcp must be started on a computer at your institution and instructed to connect to blXws5.gmca.aps.anl.gov (X=1 for 23ID-D and X=2 for 23ID-B).
  4. Use one of the options listed below:
    bbcp -P 3 -Z 50000:51000 -z -c 1 -r \
        username@blXws5.gmca.aps.anl.gov:sourcefolder destinationfolder

    bbcp -P 3 -Z 50000:51000 -z -c 1 -N io \
        'username@blXws5.gmca.aps.anl.gov:gtar -c -O sourcefolder' \
        'gtar -x -C destinationfolder'
         
    Here '-P 3' instructs bbcp to print progress every 3 seconds, '-Z 50000:51000' sets the range of TCP ports to use (this option is mandatory and the ports must not be changed), '-z' instructs bbcp to reverse connection initiation in order to bypass firewalls, '-c 1' turns on data stream compression, and '-r' enables recursive directory scanning.
    In the second example, the data stream is piped through gtar (tar can be used too) for additional speed.
    For more options, including those for optimizing speed and resuming interrupted transfers, please refer to the bbcp manpage and HOWTO.
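
The host naming rule from step 3 and the recursive-copy command from step 4 can be combined in a small wrapper. This is only a sketch, assuming a POSIX shell. The function names `gmca_host` and `gmca_bbcp_cmd` and their argument order are hypothetical, and the wrapper only prints the command line so it can be reviewed before it is run.

```shell
#!/bin/sh
# Hypothetical helper: map a beamline name to its workstation host,
# following the rule X=1 for 23ID-D and X=2 for 23ID-B.
gmca_host() {
    case "$1" in
        23ID-D) echo "bl1ws5.gmca.aps.anl.gov" ;;
        23ID-B) echo "bl2ws5.gmca.aps.anl.gov" ;;
        *) echo "unknown beamline: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: print (not run) the recursive-copy command.
# Arguments: beamline, username, source folder, destination folder.
# The port range 50000:51000 is kept fixed, as the instructions require.
gmca_bbcp_cmd() {
    host=$(gmca_host "$1") || return 1
    echo "bbcp -P 3 -Z 50000:51000 -z -c 1 -r $2@$host:$3 $4"
}
```

For example, `gmca_bbcp_cmd 23ID-D username sourcefolder destinationfolder` prints the first command from step 4 with X replaced by 1. The printed line is not quoted for folder names containing spaces, so such paths would need quoting by hand.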


GM/CA @ APS is an Office of Science User Facility operated for the U.S. Department of Energy Office of Science by Argonne National Laboratory
