difference of the result between predict.py and website (HumanBase) #20

Open
zofieLin opened this issue Jun 19, 2020 · 1 comment

Comments

@zofieLin

Hi,

I used the script "predict.py" to run predictions on a VCF file, and I found some differences between those results and the results from the ExPecto website. For many variants the website returns no result, only a warning like "No significant predictions for rs188098026 found". However, "predict.py" does give me a result for these variants, so I wonder whether there is any difference between the two methods?
Also, I downloaded the full file of variation potential predictions for all 140 million mutations (~125G). Each tissue file (e.g. effects_pergene_mat_Whole_Blood.txt) contains 6003 columns, but I couldn't find any description of these columns or their names. Could you please tell me where I can find this information?
Thanks.

Zofie

@jzthree
Collaborator

jzthree commented Jun 20, 2020

The website only contained variants with >0.3 predicted log fold change in at least one tissue while the code provides all predictions regardless of their effect size.
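As a rough illustration of the filter described above, the sketch below keeps a variant only if its predicted log fold change exceeds 0.3 in at least one tissue. The function name, the variant labels, and the numbers are hypothetical, and comparing the absolute value (so that strong decreases also count) is an assumption, not something confirmed in this thread.

```python
def passes_website_filter(effects_by_tissue, threshold=0.3):
    """Return True if the effect exceeds the threshold in at least one tissue.

    Assumption: the magnitude (absolute value) of the predicted log fold
    change is what gets compared against the threshold.
    """
    return any(abs(effect) > threshold for effect in effects_by_tissue.values())


# Made-up predictions for three hypothetical variants in two tissues.
predictions = {
    "variant_a": {"Whole_Blood": 0.02, "Liver": -0.01},
    "variant_b": {"Whole_Blood": 0.45, "Liver": 0.10},
    "variant_c": {"Whole_Blood": -0.05, "Liver": -0.31},
}

# predict.py-style output would contain all three variants;
# a website-style listing would keep only those passing the filter.
shown = [name for name, effects in predictions.items()
         if passes_website_filter(effects)]
```

Under these assumptions, a variant like `variant_a` would still appear in the script output but would produce a "No significant predictions" message on the website.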

For your second question, every entry of the matrix in the file corresponds to a variant in the effects_coors.txt file (there are 6003 variants per gene). The order of variants and genes is the same as in the effects_coors.txt file.
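A minimal sketch of pairing the two files by position, in case it helps. It assumes both files are tab-delimited with one line per gene and the same number of fields per line, which this thread does not confirm; the function name and the toy coordinate strings are hypothetical.

```python
import csv
import io


def pair_effects_with_coordinates(effects_file, coords_file):
    """Yield (gene_index, variant_index, coordinate, effect) tuples.

    Assumption: line i of both files refers to the same gene, and field j
    on that line refers to the same variant. Iteration stops at the end of
    the shorter file.
    """
    effects_rows = csv.reader(effects_file, delimiter="\t")
    coords_rows = csv.reader(coords_file, delimiter="\t")
    for gene_index, (effects, coords) in enumerate(zip(effects_rows, coords_rows)):
        if len(effects) != len(coords):
            raise ValueError(f"row {gene_index}: field counts differ")
        for variant_index, (effect, coord) in enumerate(zip(effects, coords)):
            yield gene_index, variant_index, coord, float(effect)


# Toy stand-ins for a tissue matrix and the coordinate file:
# 2 genes with 3 variants each instead of 6003.
toy_effects = io.StringIO("0.1\t-0.2\t0.0\n0.4\t0.0\t-0.5\n")
toy_coords = io.StringIO(
    "chr1:100\tchr1:200\tchr1:300\nchr2:100\tchr2:200\tchr2:300\n"
)

records = list(pair_effects_with_coordinates(toy_effects, toy_coords))
```

Reading both files line by line this way avoids loading a whole tissue matrix into memory, which may matter given the ~125G download size mentioned above.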

Jian
