LexicMap v0.3.0

@shenwei356 shenwei356 released this 22 May 06:58
· 192 commits to main since this release

v0.3.0 - 2024-05-14

  • lexicmap index:
    • Better seed coverage by filling sketching deserts.
    • Use longer intervals between contigs (1000 bp of N's; previously k-1).
    • Fix a concurrency bug between genome data writing and k-mer-value data collecting.
    • Change the format of k-mer-value index file, and fix the computation of index partitions.
    • Optionally save seed positions, which can be output with lexicmap utils seed-pos.
  • lexicmap search:
    • Improved seed-chaining algorithm.
    • Better support of long queries.
    • Add a new flag -w/--load-whole-seeds for loading the whole seed data into memory for faster search.
    • Parallelize alignment in each query, so it's faster for a single query.
    • Optionally output matched query and subject sequences.
    • 2-5X faster searching with a faster masking method.
    • Change output format.
    • Add output of query start and end positions.
    • Fix a target sequence extracting bug.
    • Keep indexes of genome data in memory.
  • lexicmap utils kmers:
    • Fix a minor bug: wrong number of k-mers for the second k-mer in each k-mer pair.
  • New commands:
    • lexicmap utils gen-masks for generating masks from the top N largest genomes.
    • lexicmap utils seed-pos for extracting seed positions via reference names.
    • lexicmap utils reindex-seeds for recreating indexes of k-mer-value (seeds) data.
    • lexicmap utils genomes for listing genome IDs in the index.
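
To illustrate the "intervals between contigs" item under lexicmap index, here is a minimal Python sketch of joining a genome's contigs with runs of N's and mapping a position on the joined sequence back to its contig. This is not LexicMap's actual code (LexicMap is written in Go); the function names join_contigs and locate are hypothetical, and only the 1000 bp default comes from the notes above.

```python
from bisect import bisect_right


def join_contigs(contigs, interval_len=1000):
    """Join contigs with runs of N's.

    Returns the joined sequence and the start offset of each contig in it.
    interval_len=1000 mirrors the interval length named in the release
    notes; everything else here is an illustrative assumption.
    """
    offsets = []
    pos = 0
    for contig in contigs:
        offsets.append(pos)
        pos += len(contig) + interval_len
    return ("N" * interval_len).join(contigs), offsets


def locate(pos, offsets, contigs):
    """Map a position on the joined sequence to (contig index, local position).

    Returns None when the position falls inside an N interval.
    """
    i = bisect_right(offsets, pos) - 1
    local = pos - offsets[i]
    if local >= len(contigs[i]):
        return None  # inside the spacer that follows contig i
    return i, local


if __name__ == "__main__":
    contigs = ["ACGT", "GGCC"]
    joined, offsets = join_contigs(contigs, interval_len=3)
    print(joined)                         # ACGTNNNGGCC
    print(locate(8, offsets, contigs))    # (1, 1)
    print(locate(5, offsets, contigs))    # None
```

Keeping the per-contig offsets alongside the joined sequence is what lets a hit on the joined coordinates be reported relative to the original contig.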