Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error in creating linear map when in ATAC-seq mode #9

Open
huangyizR opened this issue Sep 15, 2020 · 1 comment
Open

Error in creating linear map when in ATAC-seq mode #9

huangyizR opened this issue Sep 15, 2020 · 1 comment

Comments

@huangyizR
Copy link

Hi,
I am trying the graph_peak_caller in ATAC-seq mode as you suggested in #8 . I am following the pipeline in vg to construct the variation graph. I worked on the mouse genome, in which case 21 graphs for each chromosome was created. But I still expensed the same kinds of error as shown in #8, in the creating learn map step for most of the chromosomes (only Chr9, ChrX and ChrY work fine) :

[peter.huang@node070 peak]$ graph_peak_caller callpeaks -a True -g /secondary/projects/immunograph/vg/scripts/indexGCSA/graphs/chr1.nobg -s /secondary/projects/immunograph/vg/results/gam/10B_L000_Interleaves_chr1.json -n "" -f 150 -r 50 -p True -u $unique_reads -G 2500000000 2020-09-15 10:09:25,605, INFO: Sample files: ['/secondary/projects/immunograph/vg/results/gam/10B_L000_Interleaves_chr1.json'] 2020-09-15 10:09:25,605, INFO: Using graphs: ['/secondary/projects/immunograph/vg/scripts/indexGCSA/graphs/chr1.nobg'] 2020-09-15 10:09:25,605, INFO: Will use sequence graphs. ['/secondary/projects/immunograph/vg/scripts/indexGCSA/graphs/chr1.nobg.sequences'] 2020-09-15 10:09:25,605, INFO: Using graphs from data directory /secondary/projects/immunograph/vg/scripts/indexGCSA/graphs 2020-09-15 10:09:25,605, INFO: Will use [''] as extra experiments names for each run, based on graph file names.If only running on single graph, this should be empty. 2020-09-15 10:09:25,608, WARNING: Did not find linear map for for graph /secondary/projects/immunograph/vg/scripts/indexGCSA/graphs/chr1.nobg. Will create. 2020-09-15 10:09:25,608, WARNING: Did not find linear map for for graph /secondary/projects/immunograph/vg/scripts/indexGCSA/graphs/chr1.nobg. Will create. 2020-09-15 10:09:29,838, INFO: Getting topologically sorted nodes 2020-09-15 10:09:29,839, INFO: 0 nodes processed 2020-09-15 10:09:29,839, INFO: Finding starts and ends 2020-09-15 10:09:29,839, INFO: 0 nodes processed Traceback (most recent call last): File "/secondary/projects/triche/tools/anaconda2/envs/r_env/bin/graph_peak_caller", line 11, in <module> load_entry_point('graph-peak-caller', 'console_scripts', 'graph_peak_caller')() File "/secondary/projects/triche/Peter/tools/graph_peak_caller/graph_peak_caller/command_line_interface.py", line 39, in main run_argument_parser(sys.argv[1:]) File "/secondary/projects/triche/Peter/tools/graph_peak_caller/graph_peak_caller/command_line_interface.py", line 679, in run_argument_parser args.func(args) File "/secondary/projects/triche/Peter/tools/graph_peak_caller/graph_peak_caller/callpeaks_interface.py", line 182, in run_callpeaks2 create_linear_map(graph, linear_map_name) File "/secondary/projects/triche/Peter/tools/graph_peak_caller/graph_peak_caller/util.py", line 6, in create_linear_map linear_map = LinearMap.from_graph(ob_graph) File "/secondary/projects/triche/Peter/tools/graph_peak_caller/graph_peak_caller/control/linearmap.py", line 106, in from_graph starts = cls.find_starts(graph, node_ids) File "/secondary/projects/triche/Peter/tools/graph_peak_caller/graph_peak_caller/control/linearmap.py", line 134, in find_starts max_dists[j] = max(cur_dist, max_dists[j]) IndexError: index -21804899 is out of bounds for axis 0 with size 3

In #8, you suggest that the error may be caused by a disconnected graph. But because I was using vg, I doubt that is the case, although I am not so sure about it. I wonder how you manage to find the disconnection in graph as you comment in #8 :

" For instance, node 133450827 has an edge going to 135833768, but these two nodes don't have any other edges to any nodes, so they are isolated together."

And also, I wonder if you any further suggestions about this error. Thank you so much!

Best,
Peter

@ttriche

@cgroza
Copy link

cgroza commented Aug 8, 2024

This error was introduced after VG made slight changes in their JSON format output, representing nodes as strings instead of integers. There was a fix introduced. Did you try the latest version?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants