How to download files with Aspera


Recipe metadata

identifier: RX.X

version: v1.0

Difficulty level

Reading Time

15 minutes

Recipe Type

Hands-on

Executable Code

Yes

Intended Audience

Principal Investigators

Data Managers

Data Scientists


Abstract

This recipe describes how to download files from an Aspera site; most of the guidance also applies to uploading. It is one of a group of related recipes, such as the FTP upload/download recipe, that aim to make uploading and downloading files more efficient.

Graphical overview of the Recipe

graph LR;
A[Data on source repository] -->B(Decide on needed location)
B --> C{Do you have an account on source repository?}
C -->|Yes| D[Decide how you are going to access the data]
C -->|No| E[Get an account if you are the applicable person]
D -->H{Install download/upload software}
H -->I{Work out the appropriate command line options}
I -->Z{Does your system have the applicable firewall holes?}
Z -->|Yes| F[Have the firewall holes]
Z -->|No| G[Get firewall holes]
F -->J{Start download}
J -->K{Monitor download and check download speed}
K -->L{Check download and create a mini data catalogue}

Capability & Maturity Table

| Capability | Initial Maturity Level | Final Maturity Level |
| --- | --- | --- |
| Interoperability | minimal | repeatable |

Get account permissions:

  • Apply for access
  • Pay attention to the conditions
  • Sign up if you are the appropriate person for this download/upload
  • Typically Aspera sites are locked down and need a username and password.
  • For some sites, only one username is allowed per organisation, so it is worth making sure that that person is technically capable of uploading or downloading data, and also understands the data a little.

Decide how you are going to access the data:

  • A web browser is great for initial browsing and for downloading occasional small files. It will automatically prompt you to install the Aspera browser plugin, which is needed to download any files.
  • For heavy-duty downloading, e.g. gigabytes or even terabytes of data, an Aspera command line client is needed.

Decide on Software needed and get it installed:
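
Once the software is installed, it is worth confirming that the tools are actually on your PATH before planning a large transfer. The following is a minimal sketch; `tool_status` is a hypothetical helper name, not part of any Aspera tooling.

```shell
# Hypothetical helper: report whether a command-line tool is on the PATH.
tool_status() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: not found"
    fi
}

tool_status ascp     # the Aspera command line client
tool_status screen   # useful for keeping long transfers running
```

If `ascp` is installed outside the PATH (as in a shared HPC application directory), call it by its full path instead.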

Get the firewall configured as required to allow downloading and/or uploading:

  • You may need to ask your organisation's IT network perimeter team to raise firewall exceptions to unblock ports 33001 and 22.
  • Try connecting first, in case the ports are already unblocked.
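
A quick way to try connecting is to test whether a TCP connection to the server port can be opened at all. The sketch below assumes `bash` and the coreutils `timeout` command are available; `check_tcp_port` is a hypothetical helper name and the host in the usage comment is a placeholder. It only tests TCP, and Aspera data transfer typically also relies on UDP, so a successful check here does not guarantee that a transfer will work.

```shell
# Hypothetical helper: succeed if a TCP connection to host:port can be opened.
# Usage: check_tcp_port HOST PORT [TIMEOUT_SECONDS]
check_tcp_port() {
    local host="$1"
    local port="$2"
    local wait_secs="${3:-5}"
    # bash's /dev/tcp pseudo-device opens a TCP connection; timeout bounds the wait.
    timeout "$wait_secs" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# Example (placeholder host name):
#   check_tcp_port aspera.example.org 33001 && echo "port 33001 reachable"
#   check_tcp_port aspera.example.org 22    && echo "port 22 reachable"
```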

Work out the appropriate command line options:

  • The Aspera command line client has a large variety of options for downloading and uploading; see the Aspera client documentation.
  • Considerations:
    • Use the -k {1,2,3} option to allow restarts without re-downloading all the data.
    • Run it under something like screen, so that it keeps running in the background on a server after you log out.
    • On the command line you can choose a preferred transfer rate. Please be careful not to hog the network bandwidth (we found that up to about 100 Mbps is okay).
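
The considerations above can be combined into one command. The sketch below only prints the command rather than running it, so that it can be reviewed first. `build_ascp_cmd` is a hypothetical helper, the server name is a placeholder, and the use of `-l` to cap the target rate (e.g. `100m` for 100 Mbps) is an assumption that should be checked against the documentation for your ascp version.

```shell
# Hypothetical helper: print (do not run) an ascp download command.
# Usage: build_ascp_cmd RESUME_LEVEL MAX_RATE SOURCE DEST
build_ascp_cmd() {
    local resume_level="$1"   # value for -k (1, 2 or 3)
    local max_rate="$2"       # value for -l, e.g. 100m (assumed to mean 100 Mbps)
    local source="$3"         # user@host:remote_dir
    local dest="$4"           # local destination directory
    printf 'ascp -k %s -l %s -P 33001 %s %s\n' \
        "$resume_level" "$max_rate" "$source" "$dest"
}

# Placeholder account and server, for illustration only.
build_ascp_cmd 1 100m user@aspera.example.org:dir_to_download ./
```

To keep the transfer running after logging out, the printed command could be started in a detached screen session, for example `screen -dmS aspera_download bash -c "<printed command>"`, and re-attached later with `screen -r aspera_download`.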

Download Example command line:

  • These are the download command options we used (both ports 33001 and 22 were unblocked):

```
# Set the password variable corresponding to your Aspera account.
export ASPERA_SCP_FILEPASS="mypassword"

# Example: download the files recursively from a specific directory on the Aspera server to the current directory.
$ /hpc/apps/current/aspera/v3.9.6.app/bin/ascp -k 1 -P 33001 -o FileCrypt=decrypt aspera.myacc@aspera-immport.niaid.nih.gov:dir_to_download ./
```

  • The ascp version we used:

```
$ ascp --version
IBM Aspera Desktop Client version 3.9.6.176292
ascp version 3.9.6.176292
...
```

Other suggestions:

  • Observe the download/upload speed, e.g. 100 Mbps, so that you can estimate the finish time.
  • Have some automated monitoring on the download process to notify you if it stops or finishes. Even an hourly du -sh from a cron job helps.
  • Also typically you are pulling down many directories and files. On completion, it may be worth doing a recursive file listing to a file e.g. ls -ltR > file_listing.txt to give you and your "customers" a simple file catalogue.
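
These suggestions can be sketched in a few lines of shell. The helper names `estimate_hours` and `write_catalogue` are hypothetical, the paths in the cron comment are placeholders, and the estimate assumes decimal units (1 GB = 8000 megabits) and a steady transfer rate.

```shell
# Hypothetical helper: estimate transfer time in hours.
# Usage: estimate_hours SIZE_IN_GB RATE_IN_MBPS
estimate_hours() {
    awk -v gb="$1" -v mbps="$2" \
        'BEGIN { printf "%.1f\n", gb * 8 * 1000 / mbps / 3600 }'
}

# Hypothetical helper: write a recursive file listing of DIR to OUTFILE.
# Usage: write_catalogue DIR OUTFILE
write_catalogue() {
    ( cd "$1" && ls -ltR ) > "$2"
}

# Example crontab entry for hourly size monitoring (placeholder paths):
#   0 * * * * du -sh /path/to/download_dir >> /path/to/download_size.log 2>&1

estimate_hours 1000 100   # 1 TB at 100 Mbps; prints 22.2
```

Comparing successive `du -sh` readings in the log also gives a rough check of the achieved transfer rate.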

Considerations for uploading

  • For uploading, much of the above applies. The main differences are:
    • Know which area to upload to.
    • Prepare your data for ease of loading, e.g. organise it into directories and compress it if needed.
  • Example command line for uploading
    • (Needed - no real example yet)
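
Until a real example is available, the following untested sketch illustrates the preparation step and the general shape of an upload. It assumes that ascp follows the usual source-then-destination argument order, with the remote location given as user@host:directory. Both helper names, the account, and the server are hypothetical, and the upload command is only printed, not run.

```shell
# Hypothetical helper: bundle a directory into one compressed archive before upload.
# Usage: make_upload_archive SOURCE_DIR ARCHIVE_FILE
make_upload_archive() {
    local src_dir="$1"
    local archive="$2"
    # -C keeps only the final directory name inside the archive, not the full path.
    tar -czf "$archive" -C "$(dirname "$src_dir")" "$(basename "$src_dir")"
}

# Hypothetical helper: print (do not run) an upload command. Untested on a real server.
# Usage: build_ascp_upload_cmd LOCAL_FILE REMOTE_TARGET
build_ascp_upload_cmd() {
    printf 'ascp -k 1 -P 33001 %s %s\n' "$1" "$2"
}

# Example with placeholder names:
#   make_upload_archive ./my_results ./my_results.tar.gz
#   build_ascp_upload_cmd ./my_results.tar.gz user@aspera.example.org:upload_area
```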

TO DO:

  • Aspera is commercial software
  • Is this still okay as part of the FAIR principles? Perhaps, as long as the institution hosting the server has paid for the licence.
  • Action: ask EBI, e.g. Tony or Fuqi (in the presentation).
  • Action (Philippe): ask Mark Wilkinson whether Aspera is compliant, and how it would work with his evaluator.
  • Look at the dockerised version of the client?
  • Write a new recipe for uploading, or more probably update this one.

Authors:

| Name | Affiliation | ORCID | CRediT role |
| --- | --- | --- | --- |
| Peter Woollard | GSK, metadata group in R&D Data and Computational Sciences | 0000-0002-7654-6902 | Writing - Original Draft |
| Philippe Rocca-Serra | Oxford University | | reviewer |

License:

This page is released under the Creative Commons Attribution 4.0 (CC BY 4.0) license.