Per the documentation provided here (Resource 531: Description of genetic data types), there are two versions of the sample QC file:
NOTE: there are currently 2 versions of this file in circulation. The newer
version is described below and contains column headers on the first row.
The older (deprecated) version lacks the column headers and has two
additional Affymetrix internal values prefixing the columns listed below
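One way to tell the two versions apart and bring the older one into the newer column layout is sketched below. This is only a heuristic under stated assumptions: the file is whitespace-delimited, a header row contains no numeric fields, and the older version has exactly two extra leading fields per row (as the note says). The function names `looks_like_header` and `normalize_sample_qc` are hypothetical and are not part of ukbrest.

```python
def _is_number(token):
    """Return True if the token parses as a float."""
    try:
        float(token)
        return True
    except ValueError:
        return False


def looks_like_header(first_line):
    """Heuristic (assumption): a header row has no numeric fields."""
    fields = first_line.split()
    return bool(fields) and not any(_is_number(f) for f in fields)


def normalize_sample_qc(lines, n_extra=2):
    """Yield each row as a list of fields.

    If the first line looks like a header (newer version), rows are passed
    through unchanged. Otherwise (older version), the first n_extra fields
    of every row are dropped.
    """
    it = iter(lines)
    first = next(it, None)
    if first is None:
        return
    # Newer files keep all fields; older files lose the leading extras.
    skip = 0 if looks_like_header(first) else n_extra
    yield first.split()[skip:]
    for line in it:
        if line.strip():
            yield line.split()[skip:]
```

For a real file, something like `rows = list(normalize_sample_qc(open(path)))` would give the normalized rows. Note that the older version still has no header after normalization, so the column names would have to be supplied from the documentation.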
At this time (2019-12-19), the version of the sample QC file that I am able to download from EGA here (UKB Genotyped 2018) appears to be the older version of the file, i.e., sans header, with additional Affymetrix internal values. When I attempt to import it using the procedure described here (Loading other types of data), I encounter this error message:
2019-12-19 16:54:18,381 - ukbrest - ERROR - Loading finished with an
unknown error. Activate debug to see full stack trace.
(psycopg2.IntegrityError) column "eid" contains null values
[SQL: '\n ALTER TABLE samplesqc ADD CONSTRAINT
pk_samplesqc PRIMARY KEY (eid);\n ']
Thu Dec 19 10:54:23 CST 2019
I believe this is caused by the older sample QC file not matching the *.sample file associated with our project, which we use to join our project phenotype data to the genotype/imputed data distributed by UKBiobank. That is, our sample file contains fewer samples than the older sample QC file contains records, so some records end up without the valid sample ID required for loading, which triggers the error.
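The row-count hypothesis can be checked before loading with a short script. This is a minimal sketch, assuming the *.sample file is in the Oxford format with two header lines (column names, then column types) and the sample QC file has one record per non-empty line. The helper names are hypothetical and not part of ukbrest.

```python
def count_sample_file_rows(path):
    """Count individuals in an Oxford-format .sample file.

    Assumes the first two non-empty lines are header lines.
    """
    with open(path) as fh:
        n = sum(1 for line in fh if line.strip())
    return max(n - 2, 0)


def count_qc_rows(path, has_header):
    """Count data records in the sample QC file."""
    with open(path) as fh:
        n = sum(1 for line in fh if line.strip())
    return n - 1 if has_header and n > 0 else n


def extra_qc_rows(sample_path, qc_path, has_header=False):
    """Return how many more records the QC file has than the .sample file.

    A positive value means some QC records have no matching sample,
    which would be consistent with the null "eid" error above.
    """
    return count_qc_rows(qc_path, has_header) - count_sample_file_rows(sample_path)
```

A result of zero would suggest the two files line up and the cause lies elsewhere.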
I believe the correct solution is to use the more current sample QC file, but I have not had any luck finding it. When I reached out to EGA, they indicated that the sample QC file they provided at the dataset given above is the only version of the file they have. When I contacted UKBiobank, they pointed me to the ukbgene documentation here (Resource 664), which indicates that the sample QC file should be retrieved from EGA.
@hakyim I apologize for the lagging response: to answer your question, no, we never resolved this issue.
FYI, we were able to make progress on our specific project because some (not all) of the information in the sample QC file can also be found in the phenotype data, so we were able to work around this roadblock.
Thank you William for the feedback! Sadly, it's really hard for me to work on this without access to the data. You told me that this is mainly because a file with an inconsistent format is being delivered by the UK Biobank people, right? Did you get any response from them, a newer sample QC file for instance?