partition speed improvement #16
base: master
I could not compile with these changes; the errors were below. I think it is because the flowers9:index_t branch changed index_t to idx_t, but this branch still uses index_t. Sorry, I am not that experienced with git, so I am not sure if there is an elegant way to merge both branches. I manually edited the files in this branch, changing index_t to idx_t, and now it compiles...
Dear all, thanks for your interest in MECAT. We have updated MECAT to version 1.3 and fixed these issues by adding a new option, '-k', to specify the number of partition files. Please compile the new version and use '-k -1' to let mecat2cns write as many partition files as possible in one pass. Thanks.
When partitioning the *.can file, multiple passes are required - one per 10 output files (plus one to get the number of reads involved). As the *.can file can be many gigabytes, this slows down the partitioning process quite a bit. This fix increases the number of files that can be written to be closer to the system limit, rather than a fixed 10, likely reducing the number of passes to one (plus the one to get the number of reads).
My initial approach of using std::vector<PODArray > failed when other variables got overwritten. I didn't want to muck around in PODArray<> to figure out what the cause was, so I used new[]/delete[] instead.
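To illustrate the pass count described above, here is a minimal sketch, not the code in this PR. The helper names `passes_needed` and `max_files_per_pass`, the `reserved` descriptor count, and the 4096 cap are assumptions made for illustration. It reads the soft open-file limit with POSIX `getrlimit` and computes how many passes over the *.can file would be needed.

```cpp
#include <sys/resource.h>  // getrlimit, RLIMIT_NOFILE (POSIX)
#include <cstddef>

// Hypothetical helper: passes over the *.can file needed to write
// num_partitions output files when at most files_per_pass can be open
// at once. The leading 1 is the extra pass that counts the reads.
std::size_t passes_needed(std::size_t num_partitions, std::size_t files_per_pass) {
    if (files_per_pass == 0) return 0;  // no progress possible; caller must check
    return 1 + (num_partitions + files_per_pass - 1) / files_per_pass;  // ceil division
}

// Hypothetical helper: output files that can be open per pass, derived from
// the soft RLIMIT_NOFILE minus `reserved` descriptors (stdin/stdout/stderr,
// the input file, and so on). The reserve size is an assumption.
std::size_t max_files_per_pass(std::size_t reserved) {
    const std::size_t fallback = 10;   // the old fixed value
    const std::size_t cap = 4096;      // arbitrary cap for "unlimited" (assumption)
    struct rlimit lim;
    if (getrlimit(RLIMIT_NOFILE, &lim) != 0) return fallback;
    if (lim.rlim_cur == RLIM_INFINITY) return cap;
    std::size_t soft = static_cast<std::size_t>(lim.rlim_cur);
    if (soft <= reserved) return 1;    // always allow at least one output file
    return soft - reserved;
}
```

With 250 partition files, a fixed limit of 10 files per pass gives `passes_needed(250, 10) == 26`, while a limit of 1000 gives `passes_needed(250, 1000) == 2`: one pass to count the reads and one to write everything.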