Add assign_clade method to CladeTime class #57

Merged 13 commits on Nov 13, 2024

Commits on Nov 6, 2024

  1. Add ncov_metadata property to Tree class

    Since it's possible to mix and match sequence_as_of and
    tree_as_of dates in cladetime, sequences and reference
    trees may have different ncov_metadata attributes
    (dataset version and Nextclade CLI version, for example).
    Add an ncov_metadata property to Tree that reflects
    metadata for the tree_as_of date (as opposed to
    CladeTime's ncov_metadata property, which reflects
    sequence_as_of).
    
    We'll use this new property to make sure we're using
    the correct nextclade dataset when assigning
    clades.
    bsweger committed Nov 6, 2024
    f07d85c
  2. Use "strain" as the id for filtering sequences

    Still in the NCBI mindset, earlier versions of sequence.filter
    used accession numbers to compare .fasta records to a set
    of sequence "ids". However, for the processed Nextstrain
    sequences, we need to use the "strain" column instead.
    bsweger committed Nov 6, 2024
    d516677
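
The move from accession numbers to strain names can be pictured with a minimal sketch. This is not cladetime's actual `sequence.filter` code: the helper name `filter_fasta_by_strain` is hypothetical, and it assumes the strain name is the first whitespace-delimited token of each FASTA header line.

```python
from typing import Iterable, Iterator


def filter_fasta_by_strain(lines: Iterable[str], strains: set[str]) -> Iterator[str]:
    """Yield the lines of FASTA records whose header names a wanted strain.

    Hypothetical helper: assumes the strain name is the first
    whitespace-delimited token after ">" in each header line.
    """
    keep = False
    for line in lines:
        if line.startswith(">"):
            # Header line: decide whether to keep this whole record.
            header = line[1:].strip()
            strain = header.split()[0] if header else ""
            keep = strain in strains
        if keep:
            yield line


# Invented sample records, for illustration only.
fasta = [
    ">USA/MA-1/2024",
    "ACGT",
    ">USA/NY-2/2024",
    "TTGA",
    "CCAT",
]
kept = list(filter_fasta_by_strain(fasta, {"USA/NY-2/2024"}))
```

Matching on the header's strain token rather than an accession number means the same set of ids can be used against both the sequence file and a metadata table keyed by "strain".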

Commits on Nov 7, 2024

  1. 477e01e
  2. b03f01e

Commits on Nov 8, 2024

  1. Fix circular import / change the signature of Tree

    We will need to instantiate a Tree object from CladeTime
    when assigning clades to sequences. Thus, Tree shouldn't take
    a CladeTime object as input, because that creates a circular
    dependency.
    bsweger committed Nov 8, 2024
    049ec1b
  2. Add collection date parameters to sequence.filter_metadata

    Adding these parameters allows additional filtering on
    sequence metadata for min and max collection dates. This
    is in support of clade assignments, where we'll only
    want to assign clades to a small subset of sequences based
    on their collection date. Behavior is unchanged if these
    new parameters aren't specified.
    bsweger committed Nov 8, 2024
    ba7ff4a
  3. Move date validation function out of cladetime.py

    This will allow re-use of that function when working with
    collection begin/end dates in sequence assignment
    
    Additional test cases for date commit
    bsweger committed Nov 8, 2024
    38c384d
  4. ed4dc43
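
The Nov 8 commits describe optional minimum and maximum collection dates on `sequence.filter_metadata`, backed by a shared date validation function. A rough sketch of that idea follows. It is not the library's implementation: the names `parse_date` and `filter_by_collection_date`, the `"date"` key, the YYYY-MM-DD format, and the inclusive bounds are all assumptions, and plain dictionaries stand in for whatever dataframe type the real function uses.

```python
from datetime import date, datetime
from typing import Optional


def parse_date(value: Optional[str]) -> Optional[date]:
    """Convert a YYYY-MM-DD string to a date; pass None through unchanged."""
    if value is None:
        return None
    return datetime.strptime(value, "%Y-%m-%d").date()


def filter_by_collection_date(rows, min_date=None, max_date=None):
    """Keep rows whose "date" falls within the optional inclusive bounds.

    Hypothetical helper: with both bounds omitted, every row is returned,
    mirroring the "behavior is unchanged" note in the commit message.
    """
    lo, hi = parse_date(min_date), parse_date(max_date)
    if lo is not None and hi is not None and lo > hi:
        raise ValueError("min_date must not be later than max_date")
    kept = []
    for row in rows:
        collected = parse_date(row["date"])
        if lo is not None and collected < lo:
            continue
        if hi is not None and collected > hi:
            continue
        kept.append(row)
    return kept


# Invented sample metadata, for illustration only.
metadata = [
    {"strain": "A", "date": "2024-09-30"},
    {"strain": "B", "date": "2024-10-15"},
    {"strain": "C", "date": "2024-11-01"},
]
october = filter_by_collection_date(metadata, "2024-10-01", "2024-10-31")
```

Keeping the date parsing in its own function is what lets the same check serve both the as-of dates and the collection begin/end dates.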

Commits on Nov 12, 2024

  1. Add assign_clades method to CladeTime

    This new method is how cladetime users (including people
    using the upcoming CLI) will do custom clade assignments.
    After validating dates, assign_clades calls out to existing
    functions, performing a kind of "mini pipeline" to return
    a LazyFrame with the results from Nextclade merged with
    metadata from the sequences being assigned.
    bsweger committed Nov 12, 2024
    fd6da4c
  2. Add tests for the new CladeTime assign_clades method

    This changeset represents new tests for the assign_clades
    method, as well as updates that reflect some refactoring
    that occurred along the way.
    bsweger committed Nov 12, 2024
    c148db3
  3. Update the return value of assign_clades

    This changeset returns a summarized version of the clade
    assignments as well as some metadata about the clade
    assignment process.
    bsweger committed Nov 12, 2024
    358f759
  4. 7c2aa5e
  5. Fix readthedocs build error

    bsweger committed Nov 12, 2024
    c9c09d1
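
The Nov 12 commits describe assign_clades as merging Nextclade results with sequence metadata and then returning a summarized version. One plausible shape for the merge-and-summarize step is sketched below. Everything here is an assumption made for illustration: the function name `merge_and_summarize`, the `"strain"` join key, the `"clade"` field, and counting by clade as the summary. The real method works with a LazyFrame rather than lists of dictionaries.

```python
from collections import Counter


def merge_and_summarize(assignments, metadata):
    """Join clade assignments to metadata on "strain", then count by clade.

    Hypothetical sketch: performs an inner join, so assignments with no
    matching metadata row are dropped.
    """
    meta_by_strain = {row["strain"]: row for row in metadata}
    merged = [
        {**meta_by_strain[a["strain"]], "clade": a["clade"]}
        for a in assignments
        if a["strain"] in meta_by_strain
    ]
    counts = Counter(row["clade"] for row in merged)
    return merged, dict(counts)


# Invented sample data, for illustration only.
assignments = [
    {"strain": "A", "clade": "24A"},
    {"strain": "B", "clade": "24A"},
    {"strain": "Z", "clade": "24B"},
]
metadata = [
    {"strain": "A", "date": "2024-10-01"},
    {"strain": "B", "date": "2024-10-02"},
]
merged, summary = merge_and_summarize(assignments, metadata)
```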