error in building database #79

Open
AmruthaJNC opened this issue Mar 30, 2023 · 19 comments

Comments

@AmruthaJNC

I was trying to create a database for Candida tropicalis using a genomic assembly FASTA file and an annotation file in GTF format. I created a config file named C_tropicalis_MYA3404.txt, which is located in the test_files folder. When I try to run the command:
perl make-SIFT-db-all.pl -config test_files/C_tropicalis_MYA3404.txt

I'm getting the following error:

entered mkdir /test_files/C_topicalis_MYA3404
No such file or directory at /home/mml/programs/scripts_to_build_SIFT_db/common-utils.pl line 80.

I'm quite new to this, so it might be a very obvious problem. Kindly let me know how I can fix this.

@pauline-ng
Owner

What happens when you do:

ls /home/mml/programs/scripts_to_build_SIFT_db/ ?

Where is your directory SIFT4G_Create_Genomic_DB ?

@AmruthaJNC
Author

I set the parent directory for the database as PARENT_DIR=/test_files/C_topicalis_MYA3404
Do I have to make a directory with the name you mentioned?
ls /home/mml/programs/scripts_to_build_SIFT_db/ gave me the following:
Screenshot from 2023-03-31 19-00-01

@pauline-ng
Owner

I think I see the issue.
Check your config file.

Where it says
/test_files/C_topicalis_MYA3404

replace it with
/home/mml//test_files/C_topicalis_MYA3404

You want a full path that you have write access to.
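
One way to check this before re-running is to read PARENT_DIR out of the config and test it. The following is a minimal sketch, assuming the config uses plain KEY=value lines (as the PARENT_DIR=... line quoted above suggests). `check_parent_dir` is a hypothetical helper written for illustration and is not part of SIFT4G_Create_Genomic_DB.

```shell
#!/bin/sh
# Hypothetical helper (not part of the SIFT4G scripts): verify that
# PARENT_DIR in a config file is an absolute path and that its nearest
# existing ancestor directory is writable by the current user.
check_parent_dir() {
    config="$1"
    # Take the last PARENT_DIR= line and strip the key.
    dir=$(grep '^PARENT_DIR=' "$config" | tail -n 1 | cut -d= -f2-)
    if [ -z "$dir" ]; then
        echo "PARENT_DIR not set in $config"
        return 1
    fi
    case "$dir" in
        /*) ;;  # absolute path, continue
        *) echo "PARENT_DIR is not an absolute path: $dir"; return 1 ;;
    esac
    # Walk up until we reach a path that already exists.
    probe="$dir"
    while [ ! -e "$probe" ]; do
        probe=$(dirname "$probe")
    done
    if [ -d "$probe" ] && [ -w "$probe" ]; then
        echo "OK: $dir (writable via $probe)"
    else
        echo "Cannot write under $probe"
        return 1
    fi
}
```

Calling `check_parent_dir test_files/C_tropicalis_MYA3404.txt` with the original setting would, for a normal user, most likely report that nothing can be written under `/`, which matches the mkdir failure above.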

@AmruthaJNC
Author

I changed the path and it went past that error.
This is what I'm getting now.
Screenshot from 2023-03-31 19-54-01

I'm sure it's because of my inexperience. Thank you for your time.

@AmruthaJNC
Author

AmruthaJNC commented Mar 31, 2023

Also, is the PROTEIN_DB path in the config file, under #running SIFT 4G, the same as the path to the database we are trying to create?

@pauline-ng
Owner

No. The protein database is a database you download from UniProt or NCBI.

I recommend you download the FASTA file for UniRef90.
Then set PROTEIN_DB to that path.

@AmruthaJNC
Author

I have added the path to the downloaded uniref90.fasta.gz file. It went past the earlier error. I'm getting this error now.
Screenshot from 2023-04-13 11-29-46

@pauline-ng
Owner

Please uncompress your fasta file, and then re-run.

gunzip uniref90.fasta.gz
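
If it is unclear whether the file PROTEIN_DB points to is still compressed, its first two bytes can be inspected: gzip files start with the magic bytes 1f 8b. This is a minimal sketch; `is_gzipped` is a hypothetical helper, and the file name in the usage comment is only a placeholder for your own download.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the file starts with the gzip
# magic bytes (1f 8b), fail otherwise.
is_gzipped() {
    # Dump the first two bytes as hex and strip spaces and newlines.
    magic=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "1f8b" ]
}

# Usage sketch (placeholder file name):
# if is_gzipped uniref90.fasta.gz; then
#     gunzip uniref90.fasta.gz   # leaves uniref90.fasta in place
# fi
```

After decompressing, PROTEIN_DB should point to the uncompressed file name rather than the .gz one.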

@AmruthaJNC
Author

I could run it up to the "processing database" step. It showed 100% completion and took around 24 hours to finish. In the terminal, I could not see the "populating database" part described in https://github.com/pauline-ng/SIFT4G_Create_Genomic_DB#annotate. Also, the folders SIFT_predictions and SingleRecords_with_scores are empty.

@pauline-ng
Owner

Please paste your config file so I can debug

@pauline-ng
Owner

Actually, can you run a test config? That would be a lot easier and faster to debug. It may be that it's already working (some directories are cleaned out at the end)

@AmruthaJNC
Author

I tried to run the Homo sapiens test config. I'm getting an error like this.
Screenshot from 2023-04-22 16-10-28
I don't remember seeing the "populating database" part when I tried with my config file.

@pauline-ng
Owner

Per the README instructions, install python3 which should be invoked by calling python.
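
To confirm what the `python` command resolves to on your machine, a check along these lines can help. This is a minimal sketch; `python_is_3` is a hypothetical helper and not something the SIFT4G scripts provide.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the given command
# (default: python) is on PATH and reports major version 3.
python_is_3() {
    cmd="${1:-python}"
    command -v "$cmd" >/dev/null 2>&1 || return 1
    major=$("$cmd" -c 'import sys; print(sys.version_info[0])' 2>/dev/null)
    [ "$major" = "3" ]
}

if python_is_3; then
    echo "python resolves to Python 3"
else
    echo "python is missing or is not Python 3"
fi
```

If the second message is printed, `python` either is not installed or points at Python 2, and the setup needs to be adjusted so that calling `python` starts Python 3.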

@AmruthaJNC
Author

AmruthaJNC commented Apr 25, 2023

I successfully completed the database creation with the Homo sapiens test config. When I tried the same with my config file, it ends like this.
Screenshot from 2023-04-25 10-51-35
The problem I mentioned earlier of empty folders persists. The SIFT_predictions folder for Homo sapiens is not empty, whereas it is empty in my case after creating the database. Also, the fasta folder has a single file in my case, whereas it has multiple files in the case of Homo sapiens.
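
A comparison like this is easier to share as text than as a screenshot. The following is a minimal sketch that prints how many files sit directly inside each subfolder of a database parent directory; `count_files_per_subdir` is a hypothetical helper, and the paths in the usage comment are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: for each subdirectory of the given directory,
# print "<name>: <number of regular files directly inside it>".
count_files_per_subdir() {
    for d in "$1"/*/; do
        [ -d "$d" ] || continue
        n=$(find "$d" -maxdepth 1 -type f | wc -l | tr -d ' ')
        echo "$(basename "$d"): $n"
    done
}

# Usage sketch (placeholder paths): run once on the test-config output
# and once on your own output, then compare the two listings.
# count_files_per_subdir /path/to/test_parent_dir
# count_files_per_subdir /path/to/your_parent_dir
```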

@pauline-ng
Owner

Did you exit the program and stop it from running?

The test config uses test files that should run in less than 15 minutes. But when I build a full genome, it can take 24-48 hours.

@AmruthaJNC
Author

I did not stop it partway through. It took almost 24 hours to reach this point.

@AmruthaJNC
Author

I just tried running the annotator with my VCF files. The database creation stopped midway, if I'm not wrong. But I'm getting this error when I run the annotator.
Screenshot from 2023-04-26 12-16-53

The VCF file was obtained after joint genotype calling using the GenotypeGVCFs tool from GATK.

@pauline-ng
Owner

Uncompress your .vcf.gz file

@AmruthaJNC
Author

I'm facing this error because the database was not created completely.
Screenshot from 2023-04-26 17-10-27
I'm not sure why the database creation stopped abruptly.
