Issue downloading data using manifest TSV file #33

Open
stephwon opened this issue Feb 20, 2025 · 2 comments

Comments


stephwon commented Feb 20, 2025

I'm trying to download HHS (WGS raw sequencing) data, a total of 3.8k samples, but none are getting downloaded.

This is the manifest file I am trying to use (attached as CSV because I was not able to attach a TSV):
hmp_manifest_358ba31e87.csv

portal_client -m path/to/hmp_manifest_358ba31e87.tsv -d /path/to/HHS
When I run the above command (paths redacted for privacy), I get the following:
[Image: screenshot of the portal_client output]

The result is the same whether I use a relative or an absolute path for the manifest file and the output destination.
I'm not sure why the dataset is not accessible for download.

How should I proceed so that I can get the dataset?

Thank you!

stephwon changed the title from "Issue downloading data" to "Issue downloading data using manifest TSV file" on Feb 20, 2025
@nsuvarnaiari

Hi @stephwon,

You can access the HMP data from our public AWS bucket for free.

The hmwgsqc v1, hmwgsqc v2, and hmasm data can be downloaded with the AWS CLI as follows:
hmwgsqc v1 - aws s3 cp s3://hmpdcc/hmp1/hhs/microbiome/wms/analysis/qc/qc_2012/ . --recursive --no-sign-request
hmwgsqc v2 - aws s3 cp s3://hmpdcc/hmp1/hhs/microbiome/wms/analysis/qc/qc_2017/ . --recursive --no-sign-request
hmasm - aws s3 cp s3://hmpdcc/hmp1/hhs/microbiome/wms/analysis/assemblies/single_sample/ssasm_2012/ . --recursive --no-sign-request

Please note that the filenames in AWS might differ slightly from those in the manifest. The SRS numbers in the filenames are the same in the manifest and in the AWS bucket.
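
Since the SRS numbers are the link between the manifest and the bucket, one way to download only the samples listed in a manifest is to pull the SRS accessions out of it and turn them into include filters for aws s3 cp. The following is a minimal sketch, not an official workflow. The function names are hypothetical, and it assumes the accessions appear in the manifest rows as "SRS" followed by digits. It only prints the command; nothing is downloaded until you run the printed command yourself.

```shell
# Sketch: limit a public-bucket download to the samples named in a manifest.
# Assumption: each SRS accession appears in the manifest as "SRS" + digits.
# The function names below are illustrative, not part of any existing tool.

# Print the unique SRS accessions found in a manifest file, one per line.
extract_srs_ids() {
    grep -oE 'SRS[0-9]+' "$1" | sort -u
}

# Print (do not run) an "aws s3 cp" command that copies only objects whose
# key contains one of the SRS accessions from the manifest.
# Arguments: manifest file, S3 source prefix, local destination directory.
build_download_command() {
    manifest=$1
    prefix=$2
    dest=$3
    # Exclude everything first; later --include filters take precedence.
    cmd="aws s3 cp $prefix $dest --recursive --no-sign-request --exclude '*'"
    for id in $(extract_srs_ids "$manifest"); do
        cmd="$cmd --include '*${id}*'"
    done
    printf '%s\n' "$cmd"
}
```

For example, build_download_command path/to/manifest.tsv s3://hmpdcc/hmp1/hhs/microbiome/wms/analysis/qc/qc_2017/ /path/to/HHS prints a single command. Appending --dryrun to it before running shows which objects would be copied without transferring anything. The filters are substring matches, so an accession that is a prefix of a longer one would also match the longer one.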

Thanks,
Suvvi

@stephwon

Hi @nsuvarnaiari,
Thank you for sharing the AWS CLI commands. I was wondering if all the hmwgsqc data needs to be downloaded using the AWS CLI, or if files are still accessible through the portal_client tool? Thanks again for your help!
