Larger file not accepted #35

Open
marivalen opened this issue Oct 18, 2017 · 2 comments

Comments

@marivalen

Hi,

When I use the following file as input to vcf2db:
working.txt

the database is created as expected.

But when I use this file:
notworking.txt

I run into the following error:
sqlalchemy.exc.StatementError: (sqlalchemy.exc.InvalidRequestError) A value is required for bind parameter 'siphy_29way_logodds', in parameter group 4 [SQL: u'INSERT INTO variant_impacts (variant_id, gene, transcript, is_exonic, is_coding, is_lof, [...]

Both files consist of the exact same lines repeated over and over. The only difference is that the notworking file has one more line than the working file.

Do you have any idea why this error occurs, or what I could do differently?

Thank you,
Maria Marin

@brentp
Member

brentp commented Oct 20, 2017

I have tried debugging this but there is too much going on. You have gnomAD exome and WGS annotations from vcfanno, from annovar, and from VEP. This is just too much to dig through for an error I have not seen before. I suspect it's something with your annovar annotations, but I'm not sure. If you can create a VCF that demonstrates the issue with a much smaller set of annotations, that will help me to debug.
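
One way to build the smaller test VCF requested above is to strip all but a chosen set of INFO annotations. Here is a minimal sketch in plain Python. It assumes the snpEff and VEP annotations live under the INFO keys `ANN` and `CSQ` (their usual defaults; adjust to whatever the header declares). The name `keep_info_fields` is made up for illustration and is not part of vcf2db.

```python
import re

# Matches header lines such as: ##INFO=<ID=CSQ,Number=.,Type=String,...>
INFO_HEADER = re.compile(r"^##INFO=<ID=([^,>]+)")


def keep_info_fields(line, keep):
    """Return `line` with only the INFO keys in `keep`.

    ##INFO header lines for dropped keys are removed (empty string returned),
    other header lines pass through, and data lines get a filtered INFO column.
    """
    header = INFO_HEADER.match(line)
    if header:
        return line if header.group(1) in keep else ""
    if line.startswith("#"):
        return line
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 8:
        return line  # not a complete VCF data line; leave untouched
    kept = [entry for entry in fields[7].split(";")
            if entry.split("=", 1)[0] in keep]
    fields[7] = ";".join(kept) if kept else "."
    return "\t".join(fields) + "\n"
```

Applying this to every line of the input and writing out the non-empty results gives a VCF that carries, for example, only `ANN` and `CSQ`, which makes it easier to see which annotation source triggers the failure.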

@marivalen
Author

marivalen commented Oct 23, 2017

I tried removing different annotations to see what works and what does not.

When I remove the snpEff annotations from the variants but leave the rest, then the database is created as normal.

When I keep only the snpEff and VEP annotations, the same error as above eventually occurs. After a certain number of lines I get the error, even if every line is the same variant repeated over and over with exactly the same annotation.

Here is the working file:
working-1.txt

and the not working file:
notworking-1.txt
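
For reference, SQLAlchemy generally raises "A value is required for bind parameter ..." when several parameter dictionaries are sent to one INSERT and a later dictionary lacks a key the statement expects. The "parameter group" number appears to be the zero-based position of that dictionary in the batch. Whether this is what actually happens inside vcf2db here is not confirmed. The sketch below uses hypothetical helper names and made-up rows to show how such a gap could be detected and padded with NULLs before inserting.

```python
def find_incomplete_rows(rows):
    """Return (index, missing_keys) for each row lacking keys found in rows[0]."""
    if not rows:
        return []
    expected = set(rows[0])
    report = []
    for index, row in enumerate(rows):
        missing = expected - set(row)
        if missing:
            report.append((index, sorted(missing)))
    return report


def pad_rows(rows, fill=None):
    """Give every row the union of all keys, filling gaps with `fill`."""
    all_keys = set().union(*rows)
    return [{key: row.get(key, fill) for key in all_keys} for row in rows]


# Made-up batch: the fifth row (index 4) lacks one annotation column.
batch = [{"gene": "A", "siphy_29way_logodds": 1.2} for _ in range(4)]
batch.append({"gene": "A"})

print(find_incomplete_rows(batch))
print(pad_rows(batch)[4])
```

Padding with `None` makes every dictionary carry the same keys, so a batched insert would store NULL for the missing value instead of failing.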

Aside from this problem, I was wondering how your tool figures out which of the VEP transcripts is the most deleterious one. Right now I add all the additional information to every transcript, but I would like to add it only to the most deleterious one so that my variant_impacts table is not so big.
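
To make the question concrete, here is a rough sketch of what "most deleterious" could mean. This is not vcf2db's actual logic, which is not described in this thread. It assumes each transcript carries a VEP-style consequence string with terms joined by `&`, and it ranks them against a short, illustrative subset of consequence terms ordered from most to least severe. The names `SEVERITY` and `most_severe` are hypothetical.

```python
# Illustrative subset only, ordered from most to least severe.
SEVERITY = [
    "stop_gained",
    "frameshift_variant",
    "missense_variant",
    "synonymous_variant",
    "intron_variant",
    "intergenic_variant",
]
RANK = {term: position for position, term in enumerate(SEVERITY)}


def most_severe(transcripts):
    """Pick the transcript whose worst consequence term ranks highest.

    `transcripts` is a list of dicts with a 'consequence' key holding
    '&'-joined terms. Unknown terms rank below every listed term.
    Ties go to the transcript that appears first.
    """
    def rank(transcript):
        terms = transcript["consequence"].split("&")
        return min(RANK.get(term, len(SEVERITY)) for term in terms)

    return min(transcripts, key=rank)


example = [
    {"transcript": "T1", "consequence": "intron_variant"},
    {"transcript": "T2", "consequence": "missense_variant&splice_region_variant"},
    {"transcript": "T3", "consequence": "synonymous_variant"},
]
print(most_severe(example)["transcript"])
```

With a selection like this, the extra annotation would be attached to a single transcript per variant, which is the effect I am after for the variant_impacts table.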
