Missing chromosomal position in mthyl call #166

Open
devenderarora opened this issue Nov 9, 2020 · 5 comments

Comments

@devenderarora

Dear Sir,
I obtained data after running the pigx_bsseq pipeline. While analysing the methylation calls, I found positions that appear to be missing digits, for example:

"67","1",20716,20716,"+",22,18,4
"68","1",20809,20809,"+",15,12,3
"69","1",20821,20821,"+",14,14,0
"70","1",20850,20850,"+",16,15,1
"71","1",20944,20944,"+",17,10,7
"72","1",21128,21128,"+",21,19,2
"73","1",2176,2176,"+",10,10,0
"74","1",22386,22386,"+",10,0,10
"75","1",22402,22402,"+",10,0,10
"76","1",22410,22410,"+",10,1,9
"77","1",2268,2268,"+",11,10,1
"78","1",22954,22954,"+",11,8,3
"79","1",23020,23020,"+",15,9,6
"80","1",23176,23176,"+",11,0,11
"81","1",23331,23331,"+",12,0,12
"82","1",23333,23333,"+",12,0,12

Here, in row 77, the position 2268 should be something like 2268X (one more digit), and digits are missing like this all over the chromosome. We cross-checked some positions and found that a trailing "0" or "00" is missing. Is there a way to fix this issue, or do you have any suggestions? I would be grateful for your input in this regard.

Devender Arora

@alexg9010
Member

Hi @devenderarora,

Thank you for using the pigx-bsseq pipeline.

The format of the methylation call file that you are showing here should be:
(Rownumber), Chromosome, Start Base, End Base, Strand, Total Read Coverage, Number of Cytosine at Base, Number of Thymine at Base

That being said, row 77 indicates 11X coverage at chr1:2268-2268.
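As a minimal sketch of this layout, the rows can be parsed with Python's standard `csv` module. Python is used only for illustration here, and the field names in `FIELDS` are hypothetical labels for the columns listed above, not names used by the pipeline.

```python
import csv
import io

# Hypothetical labels for the eight fields described above.
FIELDS = ["row", "chrom", "start", "end", "strand", "coverage", "num_c", "num_t"]

# Three rows copied from the excerpt posted in this issue.
SAMPLE = '''"76","1",22410,22410,"+",10,1,9
"77","1",2268,2268,"+",11,10,1
"78","1",22954,22954,"+",11,8,3
'''

def parse_calls(text):
    """Parse methylation-call rows into dicts with integer positions and counts."""
    records = []
    for values in csv.reader(io.StringIO(text)):
        rec = dict(zip(FIELDS, values))
        for key in ("start", "end", "coverage", "num_c", "num_t"):
            rec[key] = int(rec[key])
        records.append(rec)
    return records

calls = parse_calls(SAMPLE)
row77 = calls[1]
print(f"chr{row77['chrom']}:{row77['start']}-{row77['end']} coverage={row77['coverage']}X")
# prints: chr1:2268-2268 coverage=11X
```

In these rows the C and T counts add up to the total coverage (10 + 1 = 11 for row 77), which gives a quick sanity check on the column order.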

I hope this makes things clearer for you.

Best, Alex

@devenderarora
Author

I think I did not present my query clearly. The data format itself is easy to understand.
But if you look at row 73 or row 77, the start position is 2176 or 2268 according to the output, whereas the surrounding rows show start and end positions in increasing order. After some digging, we found that the value should be 21760 instead of 2176.
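A rough way to locate such rows is to scan each chromosome and flag any position that is lower than the highest position seen so far. The sketch below is only an illustration in Python; the function name `find_out_of_order` and the tuple layout are hypothetical, and it assumes the file is meant to be sorted by position within each chromosome.

```python
def find_out_of_order(rows):
    """Return (row_id, start) pairs whose start is below the highest
    start seen so far on the same chromosome."""
    suspects = []
    highest = {}  # chromosome -> highest start position seen so far
    for row_id, chrom, start in rows:
        if chrom in highest and start < highest[chrom]:
            suspects.append((row_id, start))
        else:
            highest[chrom] = start
    return suspects

# (row number, chromosome, start) taken from the excerpt posted above.
ROWS = [
    (72, "1", 21128),
    (73, "1", 2176),
    (74, "1", 22386),
    (75, "1", 22402),
    (76, "1", 22410),
    (77, "1", 2268),
    (78, "1", 22954),
]

print(find_out_of_order(ROWS))
# prints: [(73, 2176), (77, 2268)]
```

This only reports where the ordering breaks. It cannot tell which digits are missing, so the flagged rows still need to be checked against the alignments.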

@alexg9010
Member

alexg9010 commented Nov 24, 2020

Ah, now I got your point.
You are right, the files should be sorted by chromosome first and then by position, so the two rows that you highlighted should be 21760 and 22680 respectively.
I suspect that within R these numbers were represented in scientific notation, like 2176e+1 for 21760 or 2176e+2 for 217600, and that the notation was not respected when we exported to bed/txt format.
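If positions did end up written in scientific notation somewhere along the way (this is only a suspicion and is not confirmed in this thread), a downstream reader could accept both spellings and convert them back to plain integers. The following Python sketch illustrates that idea; `parse_position` is a hypothetical helper, not part of the pipeline, methylKit, or MethylDackel.

```python
from decimal import Decimal

def parse_position(text):
    """Convert a position string, plain or in scientific notation, to an int.

    Decimal is used instead of float so that large positions are
    converted exactly.
    """
    value = Decimal(text.strip())
    if value != value.to_integral_value():
        raise ValueError(f"not an integer position: {text!r}")
    return int(value)

for raw in ["21760", "2.176e+04", "2176e1", "1e+05"]:
    print(raw, "->", parse_position(raw))
```

This does not recover digits that were already dropped from a file. It only avoids misreading a position that is still stored in exponent form.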

@devenderarora Could you please tell me which version of the pipeline you are using?
Also, I would need to know exactly which file we are talking about and whether methylation was called with methylKit or MethylDackel, so maybe you could paste the path of the file, starting from your output folder.

Best,
Alex

@alexg9010
Member

alexg9010 commented Nov 24, 2020 via email

@devenderarora
Author

Dear Alex,
Thank you for your response. We used version pigx_bsseq-0.0.10 for the analysis. The files are in the folder 06_methyl_calls/, and the file names end with CpG.txt and methylRaw.RDS.
