Far too large pull - grid/slurm support, checkpointing, extra features #73

Open
wants to merge 57 commits into
base: master

flowers9

In brief, this pull provides basic support for grid/slurm (and possibly other remote queueing packages) with the -G and -S options (it supports job queueing and tracking of job completion). It does checkpointing for mecat2pw and mecat2cns, allowing restarting of failed jobs (but only for can runs, not m4). It also allows correction against (in mecat2pw) and of (in mecat2cns) a subset of the given reads with the -R option.

The -k option of mecat2cns has been changed to default to zero rather than 10, with the assumption that a quicker partitioning is better (and setting -k to zero is now the same as using a negative value, rather than creating an infinite loop).

It also changes index_t to idx_t (to avoid a Solaris namespace conflict) and arbitrarily changes the code style, in the code I needed to modify, to one I can read more easily.

There was a small bug fix to findErrors.C as well to prevent crashes from chunks with no matching reads.

Dave Flowers added 30 commits February 19, 2019 14:49
packages) with the -G and -S options.  It does checkpointing for
mecat2pw and mecat2cns, allowing restarting of failed jobs.  It also
allows correction against (in mecat2pw) and of (in mecat2cns) a subset
of the given reads with the -R option.

The -k option of mecat2cns has been changed to default to zero rather
than 10, with the assumption that a quicker partitioning is better (and
setting -k to zero is now the same as using a negative value, rather
than creating an infinite loop).
Some minor warnings (signed/unsigned comparisons and such)
were fixed, and packed_db has been reformatted in preparation
to allow larger read sets.
It was just the position of the entry in the array, which
is kinda pointless.
The rand() calls for non-AGCT basepairs in packed_db
got replaced by a deterministic function to allow
identical output from reruns.  Some more reformatting
while planning the upcoming change allowing for large
fasta files in mecat2cns.
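
To illustrate the idea of replacing rand() with something reproducible, here is a minimal sketch. It is not the code from this pull: the commit only says "a deterministic function", so the choice of deriving the substitute from the base's position, and the names `encode_base` and `encode_read`, are assumptions made for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: map a base to a 2-bit code (A=0, C=1, G=2, T=3).
// Anything outside ACGT gets a code derived from its position in the read,
// so the same input always produces the same output (unlike rand()).
inline std::uint8_t encode_base(char base, std::size_t pos) {
    switch (base) {
        case 'A': case 'a': return 0;
        case 'C': case 'c': return 1;
        case 'G': case 'g': return 2;
        case 'T': case 't': return 3;
        default: return static_cast<std::uint8_t>(pos & 3);  // assumed rule
    }
}

// Hypothetical helper: encode a whole read, one code per base.
inline std::vector<std::uint8_t> encode_read(const std::string& read) {
    std::vector<std::uint8_t> codes(read.size());
    for (std::size_t i = 0; i < read.size(); ++i) {
        codes[i] = encode_base(read[i], i);
    }
    return codes;
}
```

Encoding the same read twice yields identical codes, which is what makes output from reruns comparable.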
commented out methods that weren't used anywhere
in mecat2cns; also moved packed_db into mecat2cns,
since that's the only place it's used (the bits that
were kinda used in common, lookup_table and split_database,
shouldn't have been using it, as they were treating it
as subroutines with a fixed interface, not a class)
also removed unneeded asserts from dw
slightly worried about the memory footprint so currently using u4_t to hold
the read index which limits total reads to 2^32-1, rather than 2^63-1 for the
rest of the program.  Easy to change, but will up the memory footprint of the
reordering by taking 32 bytes per candidate rather than 16
if there are too many reads to reorder up front, check
for it and fall back to the older method (i.e., splitting
candidates by read id and reordering inside each partition)
mainly to make reordering optional for now, as it needs more work -
it's too memory intensive for something that's supposed to mainly
be used when memory is low.  Also made sure checks for minimum
coverage were always applied regardless of what processing options
were chosen.
Changed structures to help lower memory usage, some
refactoring to help add read sorting to also help
with memory usage
pulling recent changes into branch, since it's never going the other way
Also changed spun off processes to not bother listing
options if they're defaults
in the end, it would just take too much memory to hold the read-read pairings,
which doesn't work well when the whole point is to limit memory usage
mainly to reduce memory usage (no need to copy the list when I
can simply sort it instead)
also in the middle of some memory testing
Dave Flowers added 27 commits April 19, 2019 16:29
mostly - two of the subclasses still have small mallocs
…rings

however, this does appear to have slowed things down a bit, I suspect
mainly because of the clearing/recreating of strings, but that can
be addressed now that we're off static arrays
also created unified buffer for output instead of left/right buffers
but now getting free errors in the boost routines, for some reason
finally nailed all the bugs (I hope) I created by changing dw.cpp,
and the changes should speed up alignment creation as well as
reduce memory usage
also testing d_path as a deque rather than a vector
turns out dynamic allocation comes with a large cost - 33% slower, and
not appreciably less memory usage.  The other changes made a major speed
increase, though, 2.5-3x increase.
though it's not currently settable
renamed some variables, made end of band calculations a bit quicker
use actual error_rate, not .25, and correct align size, which should
be based directly off the extend size, not k_offset
changed a few vector<char> to vector<uint1> when they just held values from 0-4;
changed pthread mutexes to std::mutex, which requires c++11
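
For readers unfamiliar with the C++11 replacement for pthread mutexes, a minimal sketch follows. It is not taken from this pull; `ChunkCounter` and `count_with_threads` are hypothetical names used only to show the pattern of `std::mutex` with `std::lock_guard` in place of `pthread_mutex_lock`/`pthread_mutex_unlock`.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical shared counter protected by a std::mutex.
struct ChunkCounter {
    std::mutex lock;
    long processed = 0;

    void add(long n) {
        // lock_guard releases the mutex when it goes out of scope,
        // so there is no explicit unlock call to forget.
        std::lock_guard<std::mutex> guard(lock);
        processed += n;
    }
};

// Hypothetical driver: each worker bumps the shared counter per_thread times.
inline long count_with_threads(int num_threads, long per_thread) {
    ChunkCounter counter;
    std::vector<std::thread> workers;
    for (int i = 0; i < num_threads; ++i) {
        workers.emplace_back([&counter, per_thread] {
            for (long j = 0; j < per_thread; ++j) {
                counter.add(1);
            }
        });
    }
    for (auto& worker : workers) {
        worker.join();
    }
    return counter.processed;
}
```

Building this needs a C++11 (or later) compiler, and on some toolchains the `-pthread` flag.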
it's not just a right triangle, it's a bounded one
got rid of argument.*, which was no longer used, added and
removed various #includes to better reflect what was actually
needed, changed Align() to finish out the inner loop (k_min
to k_max) when it hit the termination condition and choose the best
of the terminating k values rather than the first one
The lto additions might not be portable, though (particularly the
change to src/mecat2cns/main.mk, as I had to specify the plugin
location for ar)
Got rid of non-standard basic type definitions in mecat2cns,
changed all asserts to assert()
vectorized using SSE2 commands and a touch of assembly
more conversion of idx_t to int64_t
improved both the vectorized and non-vectorized string
comparison in the inner loop; vectorization relies on sse2
gnu intrinsics and the bsfl/bsrl assembly commands

vectorized version is roughly 10% faster
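
As a rough illustration of the technique (not the code in this pull), the sketch below finds the length of the matching prefix of two byte strings 16 bytes at a time with SSE2. The function name `match_length` is hypothetical, and it uses the GCC/Clang builtin `__builtin_ctz` where the commit describes bsfl/bsrl assembly; on x86 that builtin typically compiles to a bit-scan instruction.

```cpp
#include <cstddef>
#include <emmintrin.h>  // SSE2 intrinsics

// Hypothetical helper: number of leading bytes that are equal in a and b,
// looking at no more than n bytes. x86 with SSE2 and GCC/Clang assumed.
inline std::size_t match_length(const char* a, const char* b, std::size_t n) {
    std::size_t i = 0;
    for (; i + 16 <= n; i += 16) {
        const __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a + i));
        const __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b + i));
        // One mask bit per byte; a set bit means the two bytes are equal.
        const unsigned equal =
            static_cast<unsigned>(_mm_movemask_epi8(_mm_cmpeq_epi8(va, vb)));
        if (equal != 0xFFFFu) {
            // The lowest clear bit marks the first mismatching byte.
            return i + static_cast<std::size_t>(__builtin_ctz(~equal & 0xFFFFu));
        }
    }
    // Scalar tail for the last (n % 16) bytes.
    for (; i < n && a[i] == b[i]; ++i) {
    }
    return i;
}
```

The scalar loop at the end doubles as the non-vectorized fallback for short inputs.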
some variables renamed to be more expressive, some int64_t
changed to int