Chess sim quits with ERROR "all regions need to span at least 20 bins " #23
Comments
Also, when I filtered my reads to be more than the 20X limit, 99.9% of my bedpe regions fail with the following error:
Maybe you could try using the normalized hic files generated by juicer. I also found that cool files can sometimes fail. Please refer to #16; however, I am also confused about the normalization procedures in the chess publication. Best wishes,
Hi @rikrdo89, to answer your first question, chess should only quit with this message if all regions you submitted in the bedpe span less than 20 bins. Is that the case? About the second: do I understand correctly that you get numeric output for some of the regions (0.01%)? That is, not all rows in the output file are nan?
Hi Nick. Yes, out of the 5000 regions I used (that span more than 20 bins) only one didn't have nan for the first two fields. The last field, however, also has a nan.
could you try to update fanc to 0.9.9 ( |
I updated fanc to 0.9.9 and I still have the same issue.
Do the regions in the bedpe need to be formatted in any particular way? I wonder if I need to do some sorting or maybe binning of the regions? Or is this a different issue?
The bedpe should look like the output of |
Correct. I generated the bedpe by running a loop caller and then formatting it so that chess would not immediately quit (removing chr, adding a unique identifier, etc.).
I think this is a reasonable application of chess, and the command you posted looks good to me. |
I generated a bedpe using
and it now works. I get a result table as expected. However, 37 out of 610 regions failed, and these regions are mainly at the beginning and at the end of chromosome 19. This makes me think that there may be a total read cutoff, which could be why my loop calls are failing. Or maybe it is not just the distance between the loop anchors (20 bins) but also the actual size of the patch (submatrix) enclosed by the anchors that needs to be greater than a certain number of bins? I will troubleshoot a little more. In the meantime, could you please recommend some optimal parameter for |
Also, could you please clarify whether the comparisons are done with the raw values of the hic or the normalized values? Or could I specify which set of values to use by inputting something like control.hic@5000@KR?
The bedpe generated by chess requires all matrices to have at least 20 x 20 "pixels". If the distance between your loop anchors (i.e. the difference between columns 2 and 3 in the bedpe file) is at least 20 * your bin size, the region is large enough. However, chess is more suitable for larger regions; we recommend spans >= 100 bins. Chess also imposes a maximum fraction of unmappable / masked bins in a matrix; by default this is 0.1 (see
You have to provide normalized matrices, e.g. Knight-Ruiz balanced. The comparisons are done on observed/expected transformed matrices. If your data is not in OE format, it is automatically transformed. This all goes through FAN-C. I believe that the normalized values are used automatically if present in the provided data, but I am not 100 percent sure here: @kaukrise might know better.
I cannot really tell you what parameters to use for |
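For anyone who wants to pre-check a bedpe before running chess, the size rule above can be sketched in a few lines of Python. This is only an illustration, not chess's own code: the helper names `span_in_bins` and `filter_bedpe` are hypothetical, the floor division is an assumption about rounding, and checking both intervals (columns 2-3 and 5-6) is an assumption as well. Note that a loop-caller row with narrow anchors fails the check even when the two anchors are far apart.

```python
def span_in_bins(start, end, bin_size):
    """Number of whole bins covered by [start, end). Floor division is an assumption."""
    return (end - start) // bin_size


def filter_bedpe(lines, bin_size, min_bins=20):
    """Split tab-separated bedpe rows into (kept, dropped) by region span.

    A row is kept only if both intervals (columns 2-3 and columns 5-6)
    span at least `min_bins` bins. Hypothetical helper, not part of chess.
    """
    kept, dropped = [], []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        start1, end1 = int(fields[1]), int(fields[2])
        start2, end2 = int(fields[4]), int(fields[5])
        large_enough = (
            span_in_bins(start1, end1, bin_size) >= min_bins
            and span_in_bins(start2, end2, bin_size) >= min_bins
        )
        (kept if large_enough else dropped).append(line)
    return kept, dropped


if __name__ == "__main__":
    rows = [
        # 100 kb region at 5 kb bins: 20 bins per side, large enough
        "19\t100000\t200000\t19\t100000\t200000\tregion1",
        # loop-style row: anchors are 300 kb apart, but each anchor is only 1 bin wide
        "19\t100000\t105000\t19\t400000\t405000\tloop1",
    ]
    kept, dropped = filter_bedpe(rows, bin_size=5000)
    print("kept:", kept)
    print("dropped:", dropped)
```

Raising `min_bins` to 100 applies the recommended span instead of the hard minimum.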
This is quite helpful... I see, so the problem must be the 20x20 pixels, i.e. the 20-bin minimum is actually the side length of the sub-matrix, not the minimum distance of the loop. I was filtering my loop coordinates based on the distance between column 5 and column 2. So maybe running For the balanced matrices, I can use a cool file with all the normalization and balancing I previously applied to it, but This is extremely helpful. Thanks Nick!
One more thing, it just occurred to me that I might be misunderstanding what you are trying to do: if you say that you want to compare loops, does that mean that you are trying to somehow feed off-diagonal regions to chess, for instance a small submatrix around a TAD corner peak? |
Yeah, that's what I initially thought. I was hoping it could make some comparisons of these loops as well... and I think it does to some extent as long as they are large regions, but I understand that chess was not made to make these comparisons. Thanks again Nick.
I have two hic files, and I want to compare the differences across these files for a given set of loops (bedpe). Whenever I use the hic files, even at the lowest resolution of 5k, I always get the 20 bin error. Is there any way to disable this?