ValueError: invalid literal for int() with base 10: '48169282.0': Error while type casting for column 'pos' #177

Open
IMingGarson opened this issue Jun 14, 2024 · 0 comments

Hi, I ran into this error when reading a VCF file:

ValueError: invalid literal for int() with base 10: '48169282.0': Error while type casting for column 'pos'

Version: snps 2.8.1
OS: MacOS Sonoma 14.3.1 with Apple M1 chip

Here is part of my DNA raw data downloaded from AncestryDNA:

# This data file generated by 23andMe at: Sat Jun 12 11:23:12 2024
#
# This file contains raw genotype data, including data that is not used in 23andMe reports.
#
# Each line corresponds to a single SNP.  For each SNP, we provide its identifier (an rsid
# or an internal id), its location on the reference human genome, and the genotype call
# oriented with respect to the plus strand on the reference human genome.  We are using reference
# human assembly build 37 (also known as Annotation Release 104).
#
# More information on these variants can be found at http://www.ncbi.nlm.nih.gov/SNP/
#
# rsid	chromosome	position	genotype
rs1	17	21102678	AG
rs2	7	62361768	AA
rs3	18	42763504	CT
...

And here is part of the VCF file produced by to_vcf():

##fileformat=VCFv4.2
##fileDate=20240613
...
#CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO	FORMAT	SAMPLE
1	48169282.0	rs112	C	A,T	.	.	.	GT	2/1
...

As you can see, "POS" was written as a float (48169282.0) rather than an integer, even though the reader declares the "pos" column as np.uint32 in
https://github.com/apriha/snps/blob/master/src/snps/io/reader.py:

TWO_ALLELE_DTYPES = {
    "rsid": object,
    "chrom": object,
    "pos": np.uint32,
    "allele1": object,
    "allele2": object,
}
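
One possible explanation, not confirmed against the snps source: pandas upcasts an integer column to float64 as soon as a missing value is introduced, and a float64 column is then written with a trailing ".0". The sketch below only illustrates that pandas behaviour. The DataFrame, the `reindex` step, and the label "rs_missing" are made up for the example and are not taken from snps.

```python
import numpy as np
import pandas as pd

# Two positions stored with the dtype that reader.py declares for "pos".
df = pd.DataFrame(
    {"pos": np.array([48169282, 21102678], dtype=np.uint32)},
    index=["rs112", "rs1"],
)

# Reindexing with a label that is not present inserts NaN, so pandas
# upcasts the column to float64 (plain NumPy integers cannot hold NaN).
widened = df.reindex(["rs112", "rs1", "rs_missing"])  # "rs_missing" is hypothetical

# Even after dropping the NaN row, the column stays float64 and is
# serialized with a trailing ".0".
float_text = widened.dropna().to_csv(sep="\t", header=False)

# Casting back to np.uint32 before writing restores integer formatting.
restored = widened.dropna().astype({"pos": np.uint32})
int_text = restored.to_csv(sep="\t", header=False)

print(widened["pos"].dtype)
print(float_text)
print(int_text)
```

If something like this happens between reading the raw data and calling to_vcf(), it would account for the float-formatted POS values.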

If I change "48169282.0" to "48169282", the error goes away, but I don't know whether a float POS is the expected data type in the output.
Any insight would be appreciated.
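
The manual edit can be automated as a stopgap. This is a minimal sketch using only the standard library. The function names `normalize_pos` and `fix_vcf` and the file paths in the usage note are hypothetical, and it assumes a tab-separated VCF with POS in the second column, as in the excerpt above. It works around the symptom and does not address why to_vcf() wrote floats.

```python
def normalize_pos(line: str) -> str:
    """Rewrite a float-formatted POS field (e.g. '48169282.0') as an integer."""
    if line.startswith("#") or not line.strip():
        return line  # meta/header lines and blank lines pass through unchanged
    newline = "\n" if line.endswith("\n") else ""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 2:
        return line  # not a data row
    try:
        value = float(fields[1])
    except ValueError:
        return line  # POS is not numeric; leave the row untouched
    if not value.is_integer():
        return line  # a genuinely fractional or NaN POS is left for manual review
    fields[1] = str(int(value))
    return "\t".join(fields) + newline


def fix_vcf(src_path: str, dst_path: str) -> None:
    """Copy a VCF file, normalizing POS on every data row."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            dst.write(normalize_pos(line))
```

For example, `fix_vcf("converted.vcf", "converted.fixed.vcf")` would write a copy whose POS column parses with int().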

Thank you.
