Peak calling with ATAC-seq data #8
Comments
Hi! We have an experimental branch that supports this. It is not very well tested, but you could give it a try. You will need to install Graph Peak Caller by cloning this repository, checking out the branch called Then you should be able to run Let me know if you run into any problems! |
Thank you for your quick reply. Do you mind giving some more specific instructions on how to check out the branch? I've tried to simply run |
Sure, try doing a The following should work:
If you already have Graph Peak Caller installed, you may need to uninstall it first with |
So, when I try to run the checkout command I get this error:
However, it seems to be working if I force the checkout ( Which method would you recommend to estimate the fragment length? Thanks again for the support, |
That warning means that you have made some changes to some of the files, but forcing the checkout should work fine (as you saw), as long as you are not afraid of losing your changes. I am not very familiar with ATAC-seq data/peak calling, and I think there is no single correct answer for what fragment length should be used, but in this thread it seems that people are using either |
So, I've tried to proceed as suggested in the guide. Aligned the reads to my graph, filtered them with the parameters provided, converted to json and split them by chromosome.
This process generates the files *.json, *.nobg, *.sequences, *.sequencesv2, linear_pathv2.interval and node_range for every chromosome.
In addition to that, I'd like to report a minor bug in a function that tries to add a list and a string, which causes an error. The problem is at line 101 of file Thank you again for your help, Andrea |
Seems like something is wrong (either with Graph Peak Caller or your graphs). Thanks for sharing the details! Would you be able to share either the *.nobg files or the *.vg files with me? Then I can try to figure out what is wrong. |
Hi, |
Sure, that's fine. You can email them to [email protected] or upload them somewhere and share the link |
Hello, just wanted to ask whether there has been any progress on this. Thank you in advance, |
Really sorry for the delay, I've received the graph you sent me and will try to reproduce the error as soon as possible, hopefully during the weekend. |
No need to apologise, thank you very much for your support! :) Andrea |
Hi! I've managed to have a look at the graph you sent me, and I think maybe the problem is that not all nodes in your graph are connected. For instance, both node How did you make this graph? Did you make it from a vcf and a linear reference using |
Hi! |
I see! Graph Peak Caller really only works with graphs that are directed, connected and non-cyclic. I'm not sure why Cactus would create a non-connected graph. Is the graph you sent me a single graph with multiple chromosomes/scaffolds or is it only one chromosome? |
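The acyclicity requirement mentioned above can be checked before running any peak caller. The following is only a minimal sketch, assuming the graph has already been exported to a plain list of node IDs and directed `(source, target)` edges; the function name `is_acyclic` and the toy graphs are hypothetical and not part of Graph Peak Caller or vg. It uses Kahn's algorithm: repeatedly remove nodes with no incoming edges, and if some nodes can never be removed, the graph contains a cycle.

```python
from collections import defaultdict, deque


def is_acyclic(nodes, edges):
    """Return True if the directed graph has no cycle (Kahn's algorithm).

    `nodes` is an iterable of node IDs and `edges` a list of
    (source, target) pairs. Both are assumed inputs for this sketch.
    """
    nodes = list(nodes)
    indegree = {node: 0 for node in nodes}
    successors = defaultdict(list)
    for src, dst in edges:
        successors[src].append(dst)
        indegree[dst] += 1

    # Start from all nodes that have no incoming edges.
    queue = deque(node for node in nodes if indegree[node] == 0)
    removed = 0
    while queue:
        node = queue.popleft()
        removed += 1
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)

    # Nodes on a cycle never reach indegree 0, so they are never removed.
    return removed == len(nodes)


# Toy example: a small "bubble" (two alternative paths from 1 to 4).
dag_result = is_acyclic([1, 2, 3, 4], [(1, 2), (1, 3), (2, 4), (3, 4)])
# Toy example: three nodes forming a loop.
cycle_result = is_acyclic([1, 2, 3], [(1, 2), (2, 3), (3, 1)])
print(dag_result, cycle_result)
```

Note that vg graphs are bidirected, so a real check would first need to decide how edge orientations are mapped to plain directed edges; that step is outside this sketch.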
No, the graph contains multiple chromosomes, joined with the vg ids command and then indexed later on. |
Okay, I understand. Graph Peak Caller can only call peaks on a graph representing a single chromosome, so the way to run Graph Peak Caller on multiple chromosomes is to run it on each chromosome graph separately, as described from Step 6 here: https://github.com/uio-bmi/graph_peak_caller/wiki/Graph-based-ChIP-seq-tutorial If at some point in your graph creation pipeline you had one graph per chromosome, this approach should work as long as those graphs have a converted node ID space that matches the joint graph that was used to make the index. Usually with vg, you would make one graph for each chromosome first, then run This, however, still requires that each and every one of your chromosome graphs is directed, connected and acyclic. |
Hello again,
The software asks me to change parameters with -m or -M, but I cannot find these settings anywhere in the tool's help. I've used the filtered reads to do this, but maybe I should've used the raw reads. Do you think that is the cause? How should I proceed? Thanks again for your help! |
It seems like you are on the right track now. I think you get this error because you don't specify Let me know if you face any problems. |
I've tried to run the
For every chromosome, I got files such as these:
What should I do? Thank you anyway for your help! |
Hi again, sorry, I was wondering whether there has been any development on this. |
Sorry for the late reply. Would you be able to share the above mentioned files and the |
Hi, |
Hi, |
Hi! Sorry for the late reply again. I got the data and managed to reproduce your error. I think the problem is still that your graph is not connected. For instance, node 133450827 has an edge going to 135833768, but neither of these two nodes has an edge to any other node, so they are isolated together. I guess there might be more such cases in your graph. I am not sure why your graph is disconnected (I don't know much about Cactus, but I thought it would create one graph for every chromosome). Do you know why? Anyway, Graph Peak Caller will not work properly with a graph that is disconnected. It might be possible to fix this one error you got, but there will likely be more problems. Also, I suspect you will have trouble interpreting your results in the end if your graph is disconnected. Thus, I would suggest that you try to find out whether this is what your graph is supposed to look like. I could try to help you with this if you need any help. Also let me know if you have any further questions. |
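One way to look for more isolated groups like the pair described above is to compute the weakly connected components of the graph, that is, to group nodes while ignoring edge direction. The following is only a rough sketch, assuming the nodes and edges have been exported to plain Python lists; the function name and the toy chain `1 -> 2 -> 3` are hypothetical, and only the two large node IDs come from the comment above.

```python
from collections import defaultdict, deque


def weakly_connected_components(nodes, edges):
    """Group nodes into components, treating every edge as undirected."""
    neighbours = defaultdict(set)
    for src, dst in edges:
        neighbours[src].add(dst)
        neighbours[dst].add(src)

    seen = set()
    components = []
    for start in nodes:
        if start in seen:
            continue
        # Breadth-first search from each node not yet assigned to a component.
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            node = queue.popleft()
            component.append(node)
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        components.append(sorted(component))
    return components


# Toy graph: a made-up chain plus the isolated pair mentioned in the thread.
nodes = [1, 2, 3, 133450827, 135833768]
edges = [(1, 2), (2, 3), (133450827, 135833768)]
components = weakly_connected_components(nodes, edges)
print(len(components), "components")
```

A connected chromosome graph should yield exactly one component, so any additional small components point at nodes worth inspecting.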
Hi, sorry for my late reply again. It is very strange, since the graph is supposed to be connected (and with alternative paths as well). Not sure how this is possible to be honest. |
Hello,
I've got a question. I would like to try your software to perform peak calling, but instead of using ChIP-seq data I would like to use it on ATAC-seq data.
Do you think it is possible? If so, are there any settings that should be used?
Thank you in advance,
Andrea