classifying results #19

Open
eesiribloom opened this issue Nov 5, 2024 · 3 comments

Comments

@eesiribloom

Is there a need to classify the output of CoRAL, or are all circular paths/amplicons in the output assumed to be ecDNA specifically?
Thanks again for the tool.

@jluebeck
Member

jluebeck commented Nov 5, 2024

Hi, this is a great question. Some degree of interpretation is still required for CoRAL outputs. We have not yet built up a diverse enough sample set to determine how well AmpliconClassifier would work on CoRAL results, or what modifications would be needed to make it compatible. This will change as we analyze more samples, but we won't be ready to connect the two tools for some time.

A big risk with calling all genome cycles ecDNA is that some might be derived from breakage-fusion-bridge (BFB) cycles. If you see a high degree of foldback inversions, reconstructions that contain palindromic series of segments, and step-wise changes in CN, then it is more likely to be BFB. Genome cycles decomposed with low CN may also not be ecDNA. High-CN decompositions that are closed in a head-to-tail fashion are more likely to be ecDNA. We're happy to help interpret results if needed.

Thanks,
Jens
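
The BFB signals described in the reply above (foldback inversions and palindromic runs of segments) can be roughly screened for in the `Segments=` field of a cycles.txt entry. The following is a minimal sketch, not part of CoRAL or AmpliconClassifier: `parse_segments` and `foldback_like_turns` are hypothetical helper names, and the `max_gap` window is an arbitrary assumption. It counts places where a path revisits the same segment in the opposite orientation within a short window, which is only a crude proxy for a foldback and says nothing about step-wise CN changes.

```python
def parse_segments(field):
    """Turn a Segments field such as '0+,25-,24-' into [(0, '+'), (25, '-'), (24, '-')]."""
    return [(int(tok[:-1]), tok[-1]) for tok in field.split(",")]


def foldback_like_turns(segs, max_gap=1):
    """Count segments that reappear in the opposite orientation shortly afterwards.

    max_gap is the number of intervening segments allowed between the two
    visits (an assumed window, not a CoRAL parameter). Segment id 0 is
    skipped because it marks the source nodes, not an amplified segment.
    """
    path = [(sid, strand) for sid, strand in segs if sid != 0]
    count = 0
    for i, (sid, strand) in enumerate(path):
        window = path[i + 1 : i + 2 + max_gap]
        if any(s == sid and st != strand for s, st in window):
            count += 1
    return count


# Segment strings taken from the cycles.txt example in this thread.
print(foldback_like_turns(parse_segments("0+,25-,24-,23-,24+,25+,0-")))  # 1
print(foldback_like_turns(parse_segments("0+,39+,40+,41+,42+,0-")))      # 0
```

A nonzero count only flags an entry for closer manual inspection; it does not classify the entry as BFB.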

@eesiribloom
Author

Thank you for your reply. Here is an example of a cycles.txt file from an amplicon. Note: this was generated with default parameters. When I increased --min_bp_support to 5.0 or 10.0, I saw a significant decrease in the number of paths output, which is expected, but I was no longer able to reconstruct any circular paths, so I went back to the defaults.

I would assume I'm looking for circular paths, paths that satisfy constraints (though I'm not entirely sure what this means), paths with high copy number, and subpaths with high support (is this read support?).

Do 0+ and 0- indicate a circular path?

For example, based on previous results from AA and decoil, cycle 1 and cycle 6 are of interest and overlap regions previously estimated to be contained within ecDNA, but I'm not sure whether these results necessarily support that.

Interval	1	chr3	130528734	130728734
Interval	2	chr8	42836869	43036869
Interval	3	chr12	23184602	28059521
Interval	4	chr12	29101551	29301551
Interval	5	chr12	31614458	32309448
Interval	6	chr12	72410272	72665271
Interval	7	chr19	29200032	29946017
Interval	8	chr19	33146201	34601276
Interval	9	chr19	39031534	40328962
Interval	10	chr20	19475071	19900922
Interval	11	chrX	1739217	3193027
List of cycle segments
Segment	1	chr3	130528734	130628733
Segment	2	chr3	130628734	130728734
Segment	3	chr8	42836869	42936868
Segment	4	chr8	42936869	43036869
Segment	5	chr12	23184602	23313736
Segment	6	chr12	23313737	23782045
Segment	7	chr12	23782046	23797060
Segment	8	chr12	23797061	23957745
Segment	9	chr12	23957746	23997864
Segment	10	chr12	23997865	25265269
Segment	11	chr12	25265270	25360453
Segment	12	chr12	25360454	25622905
Segment	13	chr12	25622906	25622918
Segment	14	chr12	25622919	25780115
Segment	15	chr12	25780116	25803970
Segment	16	chr12	25803971	25806645
Segment	17	chr12	25806646	25917588
Segment	18	chr12	25917589	27958246
Segment	19	chr12	27958247	28059521
Segment	20	chr12	29101551	29201551
Segment	21	chr12	29201552	29301551
Segment	22	chr12	31614458	31713292
Segment	23	chr12	31713293	31716434
Segment	24	chr12	31716435	32206875
Segment	25	chr12	32206876	32309448
Segment	26	chr12	72410272	72512816
Segment	27	chr12	72512817	72563536
Segment	28	chr12	72563537	72665271
Segment	29	chr19	29200032	29300031
Segment	30	chr19	29300032	29459277
Segment	31	chr19	29459278	29465259
Segment	32	chr19	29465260	29467530
Segment	33	chr19	29467531	29785292
Segment	34	chr19	29785293	29822449
Segment	35	chr19	29822450	29843120
Segment	36	chr19	29843121	29946017
Segment	37	chr19	33146201	34500466
Segment	38	chr19	34500467	34601276
Segment	39	chr19	39031534	39132453
Segment	40	chr19	39132454	39643471
Segment	41	chr19	39643472	40228962
Segment	42	chr19	40228963	40328962
Segment	43	chr20	19475071	19575070
Segment	44	chr20	19575071	19800051
Segment	45	chr20	19800052	19900922
Segment	46	chrX	1739217	1911898
Segment	47	chrX	1911899	1937067
Segment	48	chrX	1937068	2163657
Segment	49	chrX	2163658	2186393
Segment	50	chrX	2186394	2187092
Segment	51	chrX	2187093	2206100
Segment	52	chrX	2206101	2277483
Segment	53	chrX	2277484	2282342
Segment	54	chrX	2282343	2374471
Segment	55	chrX	2374472	2376727
Segment	56	chrX	2376728	2390794
Segment	57	chrX	2390795	3048708
Segment	58	chrX	3048709	3193027
List of longest subpath constraints
Path constraint	1	2-,13+,14+	Support=17	Satisfied
Path constraint	2	40-,35+,47+	Support=1	Satisfied
Path constraint	3	7-,32-,30-	Support=9	Satisfied
Path constraint	4	30+,32+,33+	Support=12	Satisfied
Path constraint	5	49+,51+,52+	Support=2	Satisfied
Path constraint	6	54+,56+,57+	Support=1	Satisfied
Path constraint	7	15+,16+,15-	Support=12	Unsatisfied
Path constraint	8	24-,23+,24+	Support=14	Satisfied
Path constraint	9	6+,7+,15+	Support=1	Unsatisfied
Path constraint	10	12+,13+,13+,14+	Support=58	Satisfied
Path constraint	11	6+,7+,8+	Support=2	Satisfied
Path constraint	12	15+,16+,17+	Support=96	Satisfied
Path constraint	13	22+,23+,24+	Support=15	Satisfied
Path constraint	14	49+,50+,51+	Support=8	Satisfied
Path constraint	15	54+,55+,56+	Support=2	Satisfied
Cycle=1;Copy_count=3.8574164389262715;Segments=0+,5+,6+,7+,8+,9+,10+,11+,12+,13+,14+,15+,16+,17+,27+,6+,7+,8+,9+,10+,11+,12+,13+,14+,15+,16+,17+,27+,6+,7+,8+,9+,10+,11+,12+,13+,13+,14+,15+,16+,17+,18+,10+,11+,12+,13+,14+,15+,16+,17+,27+,6+,7+,8+,9+,10+,11+,12+,13+,14+,15+,16+,17+,18+,19+,0-;Path_constraints_satisfied=10,11,12
Cycle=2;Copy_count=3.8001404858032872;Segments=0+,39+,40+,41+,42+,0-;Path_constraints_satisfied=
Cycle=3;Copy_count=3.165662498442714;Segments=0+,20+,21+,0-;Path_constraints_satisfied=
Cycle=4;Copy_count=3.04413374549155;Segments=0+,1+,2+,0-;Path_constraints_satisfied=
Cycle=5;Copy_count=3.0309324976212757;Segments=0+,22+,23+,24+,25+,0-;Path_constraints_satisfied=13
Cycle=6;Copy_count=2.948191433018197;Segments=0+,29+,30+,32+,33+,34+,35+,47+,34+,35+,47+,48+,49+,51+,52+,54+,56+,57+,58+,0-;Path_constraints_satisfied=4,5
Cycle=7;Copy_count=2.1959669554063765;Segments=0+,26+,27+,28+,0-;Path_constraints_satisfied=
Cycle=8;Copy_count=2.123415783042603;Segments=0+,37+,38+,0-;Path_constraints_satisfied=
Cycle=9;Copy_count=1.970303031811106;Segments=0+,4-,44+,57-,56-,54-,53-,52-,51-,50-,49-,48-,47-,35-,34-,47-,35-,40+,41+,37-,0-;Path_constraints_satisfied=2,6,14
Cycle=10;Copy_count=1.8285167303750316;Segments=0+,43+,44+,57-,56-,54-,52-,51-,49-,48-,47-,46-,0-;Path_constraints_satisfied=
Cycle=11;Copy_count=1.60212559198529;Segments=0+,4-,44+,45+,0-;Path_constraints_satisfied=
Cycle=12;Copy_count=1.1282921215430193;Segments=0+,20+,41+,42+,0-;Path_constraints_satisfied=
Cycle=13;Copy_count=0.9005756417726687;Segments=0+,5+,6+,7+,8+,9+,10+,11+,24-,23+,24+,11-,10-,9-,51-,49-,48-,47-,35-,34-,33-,32-,30-,29-,0-;Path_constraints_satisfied=8
Cycle=14;Copy_count=0.8945466985190293;Segments=0+,29+,30+,32+,33+,34+,35+,36+,0-;Path_constraints_satisfied=
Cycle=15;Copy_count=0.8572647184183353;Segments=0+,20+,41+,37-,0-;Path_constraints_satisfied=
Cycle=16;Copy_count=0.7190071710443533;Segments=0+,43+,44+,57-,56-,55-,54-,53-,52-,51-,49-,48-,47-,46-,0-;Path_constraints_satisfied=
Cycle=17;Copy_count=0.6402548518493347;Segments=0+,2-,13+,13+,14+,15+,16+,17+,18+,10+,11+,24-,23-,24+,11-,10-,9-,8-,7-,6-,27-,26-,0-;Path_constraints_satisfied=1
Cycle=18;Copy_count=0.5216167669073717;Segments=0+,25-,24-,23-,24+,25+,0-;Path_constraints_satisfied=
Cycle=19;Copy_count=0.21732915087771373;Segments=0+,5+,6+,7+,8+,9+,10+,11+,12+,13+,13+,14+,15+,16+,17+,18+,19+,0-;Path_constraints_satisfied=
Cycle=20;Copy_count=0.17478185137363725;Segments=0+,5+,6+,7+,8+,9+,10+,11+,12+,13+,13+,14+,15+,16+,17+,18+,10+,11+,24-,23+,24+,11-,10-,9-,8-,7-,32-,30-,56-,55-,54-,53-,52-,51-,50-,49-,48-,47-,35-,34-,47-,35-,34-,33-,32-,31-,30-,29-,0-;Path_constraints_satisfied=3,15
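
The "List of longest subpath constraints" section of the file above can be tabulated to see which constraints are unsatisfied and how much support each has. This is a minimal sketch under the assumption that each line follows the layout shown in the example (whitespace-separated fields after the literal `Path constraint` label); `parse_path_constraints` is a hypothetical helper, not a CoRAL function, and the code makes no claim about what `Support` measures.

```python
import re

# Layout assumed from the example above:
# Path constraint <id> <segments> Support=<n> Satisfied|Unsatisfied
CONSTRAINT_RE = re.compile(
    r"^Path constraint\s+(\d+)\s+(\S+)\s+Support=(\d+)\s+(Satisfied|Unsatisfied)\s*$"
)


def parse_path_constraints(lines):
    """Return one dict per 'Path constraint' line; other lines are ignored."""
    rows = []
    for line in lines:
        match = CONSTRAINT_RE.match(line)
        if match:
            rows.append(
                {
                    "id": int(match.group(1)),
                    "segments": match.group(2).split(","),
                    "support": int(match.group(3)),
                    "satisfied": match.group(4) == "Satisfied",
                }
            )
    return rows


# Three lines copied from the example file.
example = [
    "Path constraint\t7\t15+,16+,15-\tSupport=12\tUnsatisfied",
    "Path constraint\t9\t6+,7+,15+\tSupport=1\tUnsatisfied",
    "Path constraint\t12\t15+,16+,17+\tSupport=96\tSatisfied",
]
rows = parse_path_constraints(example)
unsatisfied = sorted((r for r in rows if not r["satisfied"]), key=lambda r: -r["support"])
for r in unsatisfied:
    print(r["id"], ",".join(r["segments"]), r["support"])
```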

@jluebeck
Member

Hi,

Thanks for providing this output file.

The 0+ / 0- notation at the ends of entries in the cycles section indicates what we termed "source" nodes. These connect to the linear genome adjacent to the listed amplified segments. Entries bracketed by source nodes are therefore not true genome cycles and are not directly indicative of ecDNA.

The path constraints listed above the cycles do not report cyclic paths and do not use the 0+ / 0- notation, which can be a bit confusing.

Because all entries for the decomposed paths in the Cycles section of the file are non-cyclic, I think it is fair to say that CoRAL did not identify ecDNA in this amplicon. It may exist but was not detected here.

Thanks and let me know if there are other questions,
Jens
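
The check described in the reply above, whether each decomposed entry is bracketed by the 0+ / 0- source nodes, can be automated for a whole cycles.txt file. This is a minimal sketch assuming the `Cycle=...;Copy_count=...;Segments=...;Path_constraints_satisfied=...` layout shown in this thread; `parse_cycle_line` is a hypothetical helper, not part of CoRAL, and the last example line is invented purely to show the opposite case.

```python
def parse_cycle_line(line):
    """Parse one 'Cycle=...' line into a dict and flag source-node bracketing."""
    fields = dict(part.split("=", 1) for part in line.strip().split(";"))
    segments = fields["Segments"].split(",")
    return {
        "id": int(fields["Cycle"]),
        "copy_count": float(fields["Copy_count"]),
        "segments": segments,
        # Per the reply above, entries that start at 0+ and end at 0-
        # connect to the linear genome and are not true genome cycles.
        "has_source_nodes": segments[0] == "0+" and segments[-1] == "0-",
    }


lines = [
    # Two entries copied from the example file.
    "Cycle=2;Copy_count=3.8001404858032872;Segments=0+,39+,40+,41+,42+,0-;Path_constraints_satisfied=",
    "Cycle=5;Copy_count=3.0309324976212757;Segments=0+,22+,23+,24+,25+,0-;Path_constraints_satisfied=13",
    # Hypothetical entry without source nodes, for illustration only.
    "Cycle=99;Copy_count=10.0;Segments=1+,2+;Path_constraints_satisfied=",
]
for entry in map(parse_cycle_line, lines):
    kind = "non-cyclic (source nodes)" if entry["has_source_nodes"] else "no source nodes"
    print(entry["id"], round(entry["copy_count"], 2), kind)
```

Applied to the file posted in this thread, every entry would be flagged as having source nodes, matching the conclusion that no cyclic path was reported for this amplicon.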

2 participants