Commit
Efficient sequence alignment path data structure (scikit-bio#2011)
* initial commit
* add path function to tabularMSA and create util file
* Add _util.py file
* Add basic to_cigar and from_cigar functions
* Start unit tests for PairAlignPath
* Add ability to handle = or X to from_cigar function
* Add code attribution for part taken from SO
* Remove print statement
* Initial version of handling match vs mismatch for to_cigar
* Split encoding into separate function
* Change input name
* Add test data
* Start on unit tests
* Update fix_arrays function
* Change from np.nan to 0 for append in fix_arrays
* Numpy version of run_length_encode
* Fix fix_arrays function
* Enable from_cigar to handle strings with or without ones
* Add error handling and tests for from_bits in PairAlign
* PairAlignPath fully covered
* Expand unit tests
* Test more than 8 seqs for from_bits
* To_indices tests
* Complete coverage for to_indices
* Full coverage
* Update init file
* Added non default gap character handling to from_tabular
* Add skbioobject to class
* Update fix_arrays
* Basic repr function
* Switch to starts from n_seqs
* Rewrite unit tests
* Remove large data files
* Simplify pairwise_align_score
* Docstring as raw strings and fix toctree
* Simplify pairwise align_score and rename multiple_align_scores
* Fix tabular from_path_seqs
* Improve docstrings
* Change to np.int64 to make Windows compatible
* Remove alignment score functions for future PR
* Allow for non numpy array like in from_bits
* Remove overrides decorator, see what happens
* Finish docstrings for PairAlignPath
* Remove unused import
* Improve error handling for to_cigar and remove subset
* Improve error handling for to_cigar
* Figuring out to_cigar
* Fix to_cigar functionality by converting to str
* Complete unit tests for to_cigar
* Finish unit test for from_cigar
* Unit tests for initialization
* Modify Pair to_bits to handle two gaps
* Functional version of to_cigar
* Enhanced to_cigar function
* Remove gap chars for to_cigar
* Enhancement comments
* Modify to_cigar logic
* Optimized version of to_cigar
* Update docstrings
* Update unittests
* Test RLE
* Add Tabular from_path_seqs test
* Remove float option for gap in to_indices
* Update CHANGELOG
* Address most recent comments
* Move mapping and switch to unsigned int for starts
* Rename mapping and codes
* Create class properties for states, starts, lengths, and shapes
* Switch ValueError to TypeError where appropriate
* Paired programming additions
* Update to/from_indices functionality to handle starts
* Update to/from_coordinates functionality to handle starts
* Lint tests
* Remove unused import
* Start on docstring examples
* AlignPath docstring
* Add examples to docstrings
* Add example text
* More examples
* Final examples
* Change gaps to starts
* Fix doctests
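
Several of the commits listed here concern run-length encoding with NumPy and converting a pairwise path to and from a CIGAR string (including `=`/`X` and counts of one being optional). The following is a minimal, illustrative sketch of that idea only; it is not the code from this commit. The function names, the integer state codes, and the `"MIDP"` ordering are assumptions made for the example.

```python
import re

import numpy as np

# Hypothetical state codes for a pairwise path (an assumption for this sketch):
# 0 = no gap, 1 = gap in the first sequence, 2 = gap in the second, 3 = both.
CODES = "MIDP"


def run_length_encode(states):
    """Return (values, lengths) for each run of equal items in a 1-D array."""
    states = np.asarray(states)
    if states.size == 0:
        return states, np.array([], dtype=np.int64)
    # A run starts at position 0 and wherever an item differs from its predecessor.
    starts = np.flatnonzero(np.r_[True, states[1:] != states[:-1]])
    lengths = np.diff(np.r_[starts, states.size])
    return states[starts], lengths


def to_cigar(states):
    """Encode a per-column state array as a CIGAR string."""
    values, lengths = run_length_encode(states)
    return "".join(f"{int(n)}{CODES[int(v)]}" for v, n in zip(values, lengths))


def from_cigar(cigar):
    """Decode a CIGAR string into a per-column state array.

    A missing count is read as 1, and '=' / 'X' are both treated as
    ungapped columns (state 0), the same as 'M'.
    """
    values, lengths = [], []
    for count, op in re.findall(r"(\d*)([MIDP=X])", cigar):
        lengths.append(int(count) if count else 1)
        values.append(0 if op in "=X" else CODES.index(op))
    return np.repeat(np.array(values, dtype=np.int64), lengths)


states = np.array([0, 0, 0, 1, 1, 0, 2])
cigar = to_cigar(states)
decoded = from_cigar("3M2IMD")
```

Storing runs rather than one state per column is what keeps such a path compact for long alignments: the encoded size grows with the number of gap openings and closings, not with the alignment length.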