straw and strawr don't dump same values #99

Open
ArielPaulson opened this issue Feb 16, 2022 · 1 comment

Comments

@ArielPaulson

I am dumping obs/exp data from a hic file with the command-line tool 'straw' and with the R library 'strawr', and the two do not give the same results.

The data are very similar overall and correlate highly, but the values are clearly not the same: upwards of 80% of non-NA rows differ at 4 decimal places. This holds across normalizations, bin sizes, and chromosomes; even unnormalized data (i.e. NONE oe) has this problem.

I am using 'straw' compiled from the latest GitHub release, and 'strawr' installed fresh just a few days ago on R-4.1.0 via install.packages().

I also compared against the output of juicer tools 'dump' and found that it was basically identical to strawr.

Here is a slice of rows from a table comparing all three methods on the same hic file (chr 1, VC, oe, 10kb):

PosA PosB dump strawr straw
40000 40000 0.463189 0.463189 0.463189
40000 45000 1.971135 1.971135 1.971135
40000 50000 2.149339 2.149339 2.149339
40000 55000 1.261088 1.261088 1.261088
40000 60000 0.776958 0.776958 0.624063
40000 65000 0.687151 0.687151 0.855503
40000 70000 0.394186 0.394187 0.333246
40000 80000 1.384906 1.384906 0.854544
40000 105000 1.731343 1.731343 1.358210
40000 110000 1.961904 1.961904 1.652741
40000 115000 0.312818 0.312818 0.240716
40000 120000 0.190295 0.190295 0.488769
40000 130000 0.333526 0.333526 0.338950
40000 135000 1.289947 1.289947 0.944601
40000 140000 0.450147 0.450147 0.437852
40000 145000 1.116514 1.116514 1.116514
40000 150000 0.638958 0.638958 0.459245
40000 165000 0.737243 0.737243 0.832634
40000 175000 0.632508 0.632508 0.678903
40000 190000 1.396106 1.396106 1.578750
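For anyone wanting to reproduce the comparison, here is a minimal sketch of how the "differ at 4 decimals" fraction could be computed. The helper name `fraction_different` is hypothetical, rounding is just one reasonable reading of "4 decimals of accuracy", and the rows are a small subset copied from the table above.

```python
import math

# A few rows copied from the table above: (PosA, PosB, dump, strawr, straw)
rows = [
    (40000, 40000, 0.463189, 0.463189, 0.463189),
    (40000, 60000, 0.776958, 0.776958, 0.624063),
    (40000, 70000, 0.394186, 0.394187, 0.333246),
    (40000, 145000, 1.116514, 1.116514, 1.116514),
]


def fraction_different(a, b, decimals=4):
    """Fraction of non-NaN pairs whose values differ after rounding.

    Hypothetical helper: rounding to `decimals` places is an assumption
    about how "different at 4 decimals" is measured.
    """
    pairs = [
        (x, y) for x, y in zip(a, b)
        if not (math.isnan(x) or math.isnan(y))
    ]
    if not pairs:
        return float("nan")
    n_diff = sum(
        1 for x, y in pairs if round(x, decimals) != round(y, decimals)
    )
    return n_diff / len(pairs)


dump = [r[2] for r in rows]
strawr = [r[3] for r in rows]
straw = [r[4] for r in rows]

dump_vs_strawr = fraction_different(dump, strawr)
strawr_vs_straw = fraction_different(strawr, straw)
print(f"dump vs strawr:  {dump_vs_strawr:.2f}")
print(f"strawr vs straw: {strawr_vs_straw:.2f}")
```

Rounding before comparing treats the last-digit difference between dump and strawr at 70000 (0.394186 vs 0.394187) as a match, so only genuine disagreements are counted.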

Not sure how to proceed.

Thanks,
Ariel

@sa501428
Member

sa501428 commented Feb 16, 2022

Straw now uses an improved expected model: it applies a rolling median to the previous expected vector. This reduces noise and gives more reliable O/E values. strawR has not yet been updated to use this expected model.
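To illustrate why a rolling median on the expected vector changes O/E values, here is a rough sketch. It is not straw's actual implementation: the window size, the edge handling, the function names, and the example numbers are all assumptions made up for illustration.

```python
from statistics import median


def rolling_median(values, window=5):
    """Return a smoothed copy of `values` using a centered rolling median.

    The window is truncated at both ends of the vector, so edge entries are
    the median of fewer values. Window size and edge handling are
    assumptions, not taken from straw.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    n = len(values)
    return [
        median(values[max(0, i - half):min(n, i + half + 1)])
        for i in range(n)
    ]


def observed_over_expected(observed, distance_in_bins, expected):
    """Divide one observed count by the expected value at its distance."""
    return observed / expected[distance_in_bins]


# Made-up expected vector indexed by distance in bins; index 3 is a noisy dip.
expected_raw = [10.0, 8.0, 9.0, 2.0, 6.0, 5.0, 4.0]
expected_smooth = rolling_median(expected_raw, window=3)

# The same observed count yields a different O/E under each expected vector.
oe_raw = observed_over_expected(4.0, 3, expected_raw)
oe_smooth = observed_over_expected(4.0, 3, expected_smooth)
print(oe_raw, oe_smooth)
```

Because the observed counts are unchanged and only the denominator is smoothed, O/E values from the two models would be expected to correlate highly while differing wherever the raw expected vector is noisy.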
