GEOfastq
Install GEOfastq
To download and install GEOfastq:
Install Aspera Connect (optional)
GEOfastq can use Aspera Connect to download fastqs. It is faster than FTP for large single-file downloads (single-cell fastqs).
Download and install it according to the documentation. For me (Fedora 30), this works:
wget https://download.asperasoft.com/download/sw/connect/3.9.6/ibm-aspera-connect-3.9.6.173386-linux-g2.12-64.tar.gz
tar -zxvf ibm-aspera-connect-3.9.6.173386-linux-g2.12-64.tar.gz
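./ibm-aspera-connect-3.9.6.173386-linux-g2.12-64.sh
I also had to make sure ascp was on the PATH. For RStudio to find ascp on the PATH, I also had to add a PATH entry to a .Renviron.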
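For `ascp` to be callable, its directory has to be on the `PATH`. A minimal shell sketch, assuming Aspera Connect was installed to its usual per-user location `~/.aspera/connect/bin` (an assumption; set `ASPERA_BIN` if your installer reported a different directory). `add_to_path` is a hypothetical helper name used only for illustration:

```shell
# Assumption: Aspera Connect's per-user install location on Linux.
ASPERA_BIN="${ASPERA_BIN:-$HOME/.aspera/connect/bin}"

# Append a directory to PATH only if it is not already listed.
add_to_path() {
  case ":$PATH:" in
    *":$1:"*) ;;            # already present, nothing to do
    *) PATH="$PATH:$1" ;;
  esac
}

add_to_path "$ASPERA_BIN"
export PATH

# This only affects the current shell session. To persist it, the same
# directory can be appended to PATH in ~/.bashrc (for shells) and in
# ~/.Renviron (for RStudio).
```

The guard against duplicates keeps the `PATH` from growing each time the snippet is sourced.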
After restarting RStudio, confirm that things are set up properly:
# should include the directory that contains ascp
Sys.getenv('PATH')
# should print info about Aspera Connect
system2('ascp', '--version')
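The same check can be run from a terminal, which helps tell a missing install apart from a PATH problem that only affects RStudio. A small sketch; `check_tool` is a hypothetical helper name, not part of Aspera Connect or GEOfastq:

```shell
# Report whether a command is visible on the current shell's PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found at $(command -v "$1")"
  else
    echo "$1: not found on PATH" >&2
    return 1
  fi
}

check_tool ascp || echo "ascp is not visible to this shell"
```

If the terminal finds `ascp` but RStudio does not, the shell's `PATH` is fine and the R session's environment is what needs fixing.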
Install docker image
To install GEOfastq and Aspera Connect from a pre-built docker image:
# retrieve pre-built geofastq docker image
docker pull alexvpickering/geofastq
# run an interactive container; set the host side of `-v host:container`
# to the directory where you want data to persist
sudo docker run -it --rm \
  -v /srv:/srv \
  alexvpickering/geofastq /bin/bash
Usage
First crawl a study page on GEO to get study metadata and corresponding fastq.gz download links on ENA:
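The crawl itself happens in R. As background on what the ENA links look like, ENA stores fastq files under a directory derived from the run accession. A minimal shell sketch of that layout, assuming ENA's documented convention (the first six characters of the accession, then a zero-padded subdirectory built from any digits beyond the sixth). `ena_fastq_dir` is a hypothetical helper, not a GEOfastq function, and the accession shown is only an illustration:

```shell
# Print the ENA fastq directory for a run accession (SRR/ERR/DRR).
ena_fastq_dir() {
  acc="$1"
  prefix=$(printf '%s' "$acc" | cut -c1-6)
  digits=${acc#???}          # drop the three-letter prefix
  case "${#digits}" in
    6) sub="" ;;                                           # no subdirectory
    7) sub="00$(printf '%s' "$digits" | cut -c7)/" ;;      # 00 + last digit
    8) sub="0$(printf '%s' "$digits" | cut -c7-8)/" ;;     # 0 + last two digits
    9) sub="$(printf '%s' "$digits" | cut -c7-9)/" ;;      # last three digits
    *) echo "unexpected accession: $acc" >&2; return 1 ;;
  esac
  printf 'ftp.sra.ebi.ac.uk/vol1/fastq/%s/%s%s\n' "$prefix" "$sub" "$acc"
}

ena_fastq_dir SRR1016916   # ftp.sra.ebi.ac.uk/vol1/fastq/SRR101/006/SRR1016916
```

The fastq.gz files for a run sit inside that directory, so a link collected during the crawl should start with a path of this shape.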
Next, subset srp_meta to the samples that you want, then download:
That’s all folks! GOTO: kallisto?