Hi Pauline,
When I used SIFT4G_Create_Genomic_DB to build a SIFT database for sugarcane, I ran into a problem. I built the database one chromosome at a time; some chromosomes completed successfully, but others did not, and the failed runs printed no warning or error messages. I don't know why this is happening. Below are the outputs from a successful run and a failed run.
Output from a successful run:
** Checking query data and substitutions files **
processing queries: 100.00/100.00% *
** Searching database for candidate sequences **
processing database part 364 (size ~0.25 GB): 100.00/100.00% *
** Aligning queries with candidate sequences **
processing database part 91 (size ~1.00 GB): 100.00/100.00% *
** Selecting alignments with median threshold: 2.75 **
processing queries: 100.00/100.00% *
** Generating SIFT predictions with sequence identity: 100.00% **
processing queries: 100.00/100.00% *
Output from a failed run (it stops after the database search stage):
** Checking query data and substitutions files **
processing queries: 100.00/100.00% *
** Searching database for candidate sequences **
processing database part 364 (size ~0.25 GB): 100.00/100.00% *
UniRef90 was used as the protein database.
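To find out which chromosomes stopped early, the saved output of each run can be compared against the five stage banners shown in the successful output. The following is only a minimal sketch: it assumes each run's output has been captured as text, and the function names `last_stage_reached` and `describe` are hypothetical helpers, not part of SIFT 4G or SIFT4G_Create_Genomic_DB.

```python
# Stage banners in the order they appear in the successful output above.
STAGES = [
    "Checking query data and substitutions files",
    "Searching database for candidate sequences",
    "Aligning queries with candidate sequences",
    "Selecting alignments with median threshold",
    "Generating SIFT predictions with sequence identity",
]


def last_stage_reached(log_text):
    """Count how many stage banners occur, in order, in the captured output."""
    reached = 0
    pos = 0
    for stage in STAGES:
        idx = log_text.find("** " + stage, pos)
        if idx == -1:
            break
        reached += 1
        pos = idx + 1
    return reached


def describe(log_text):
    """Summarize a run as complete, or name the last stage it reached."""
    n = last_stage_reached(log_text)
    if n == len(STAGES):
        return "complete"
    if n == 0:
        return "no stage started"
    return "stopped after: " + STAGES[n - 1]
```

Applied to the failed output above, `describe` would report that the run stopped after the "Searching database for candidate sequences" stage, which is where the alignment stage should have started.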
The config file:
GENETIC_CODE_TABLE=1
GENETIC_CODE_TABLENAME=Standard
MITO_GENETIC_CODE_TABLE=11
MITO_GENETIC_CODE_TABLENAME=Plant Plastid Code
PARENT_DIR=/xtdisk/apod/xiehx/Deleterious_variants/SIFT/Saccharum/SIFT_Database/Chr8D
ORG=Saccharum_spontaneum
ORG_VERSION=Np-X
#Running SIFT 4G
SIFT4G_PATH=/gpfs/biosoft/app2/python2024/envs/sift4g/bin/sift4g
PROTEIN_DB=/xtdisk/apod/xiehx/Deleterious_variants/SIFT/Saccharum/SIFT_Database/config/uniref90.fasta
# Sub-directories, don't need to change
GENE_DOWNLOAD_DEST=gene-annotation-src
CHR_DOWNLOAD_DEST=chr-src
LOGFILE=Log.txt
ZLOGFILE=Log2.txt
FASTA_DIR=fasta
SUBST_DIR=subst
ALIGN_DIR=SIFT_alignments
SIFT_SCORE_DIR=SIFT_predictions
SINGLE_REC_BY_CHR_DIR=singleRecords
SINGLE_REC_WITH_SIFTSCORE_DIR=singleRecords_with_scores
DBSNP_DIR=dbSNP
# Doesn't need to change
FASTA_LOG=fasta.log
INVALID_LOG=invalid.log
PEPTIDE_LOG=peptide.log
ENS_PATTERN=ENS
SINGLE_RECORD_PATTERN=:change:_aa1valid_dbsnp.singleRecord
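Since a separate config file is used for each chromosome, one quick sanity check is to parse each file and confirm that the paths it names exist. This is a rough sketch under stated assumptions: it treats the config as plain `KEY=VALUE` lines with `#` comments, and `parse_config` and `missing_paths` are hypothetical helper names, not part of SIFT4G_Create_Genomic_DB.

```python
import os


def parse_config(text):
    """Parse KEY=VALUE lines; skip blanks, '#' comments, and lines without '='."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        config[key.strip()] = value.strip()
    return config


def missing_paths(config, keys=("PARENT_DIR", "SIFT4G_PATH", "PROTEIN_DB")):
    """Return the listed keys that are absent or whose path does not exist."""
    return [k for k in keys if k not in config or not os.path.exists(config[k])]
```

For example, `missing_paths(parse_config(open(path).read()))` on each chromosome's config (with `path` being wherever that file is stored) would list any of the three path entries that cannot be found on disk.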
Could you give me some suggestions and help?
Thank you very much for your advice and time!