CrossEncoderModule with rerank API #389

Merged · 7 commits · Sep 12, 2024

Commits on Sep 11, 2024

  1. CrossEncoderModule with rerank API

    This module is closely related to EmbeddingModule.
    
    Cross-encoder models take Q and A pairs and are trained to return a relevance score for rank().
    The existing rerank APIs in EmbeddingModule had to encode Q and A separately
    and use cosine similarity as the score. The API is therefore the same, but the
    results should be better (though slower).
    
    Cross-encoder models do not support returning embedding vectors or sentence-similarity.
    
    Support for the existing tokenization and model_info endpoints was also added.
    
    Signed-off-by: Mark Sturdevant <[email protected]>
    markstur committed Sep 11, 2024
    5b0989f
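
    A rough sketch of the difference, using sentence-transformers directly (the
    model names and calls below are illustrative, not the module's actual code):

    ```python
    from sentence_transformers import CrossEncoder, SentenceTransformer, util

    query = "How do cross-encoders score documents?"
    docs = [
        "Cross-encoders score each query/document pair jointly.",
        "Bi-encoders embed the query and documents separately.",
    ]

    # Cross-encoder: the model sees each (query, doc) pair together and
    # returns a relevance score directly.
    cross_encoder = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")  # example model
    ce_scores = cross_encoder.predict([(query, d) for d in docs])

    # Bi-encoder (what EmbeddingModule's rerank did): encode the query and the
    # documents separately, then use cosine similarity as the score.
    bi_encoder = SentenceTransformer("all-MiniLM-L6-v2")  # example model
    q_emb = bi_encoder.encode(query, convert_to_tensor=True)
    d_emb = bi_encoder.encode(docs, convert_to_tensor=True)
    cos_scores = util.cos_sim(q_emb, d_emb)[0]
    ```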

Commits on Sep 12, 2024

  1. Cross-encoder improvements from code review

    * mostly removing unnecessary code
    * improving clarity in a few places
    
    Signed-off-by: Mark Sturdevant <[email protected]>
    markstur committed Sep 12, 2024
    Configuration menu
    Copy the full SHA
    7146ffe View commit details
    Browse the repository at this point in the history
  2. Cross-encoder docstring fix

    * The "Already borrowed" errors are fixed by using a tokenizer per thread,
      so some comments about not changing truncation params were misleading
      (we do change them for cross-encoder truncation).
    
    Signed-off-by: Mark Sturdevant <[email protected]>
    markstur committed Sep 12, 2024
    ac46993
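
    A minimal sketch of the per-thread tokenizer idea mentioned above; the
    caching approach and names are assumptions, not the module's actual code:

    ```python
    import threading

    from transformers import AutoTokenizer

    _local = threading.local()  # each thread gets its own tokenizer instance

    def get_tokenizer(model_path: str):
        # Rust-backed "fast" tokenizers can raise "Already borrowed" when one
        # thread mutates truncation/padding settings while another thread is
        # using the tokenizer, so keep a separate instance per thread.
        if not hasattr(_local, "tokenizer"):
            _local.tokenizer = AutoTokenizer.from_pretrained(model_path)
        return _local.tokenizer
    ```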
  3. Cross-encoder: use a configurable batch size

    The default is 32.
    It can be overridden with the embedding batch_size in config or the
    EMBEDDING_BATCH_SIZE env var.
    
    Signed-off-by: Mark Sturdevant <[email protected]>
    markstur committed Sep 12, 2024
    4e9c5aa
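
    A rough sketch of that resolution order (the helper name and the
    env-var-over-config precedence are assumptions):

    ```python
    import os

    DEFAULT_BATCH_SIZE = 32

    def resolve_batch_size(config: dict) -> int:
        # Precedence here (env var over config) is an assumption; the commit
        # only says both can override the default of 32.
        raw = os.getenv("EMBEDDING_BATCH_SIZE") or config.get("batch_size")
        try:
            size = int(raw)
        except (TypeError, ValueError):
            return DEFAULT_BATCH_SIZE
        return size if size > 0 else DEFAULT_BATCH_SIZE
    ```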
  4. Cross-encoder: Move truncation check and add tests

    * Moved the truncation check to a place that can determine
      the proper index for the error message (with batching).
    
    * Added a test to validate some results after truncation.
      This uses a tiny model, but works as a sanity check.
    
    Signed-off-by: Mark Sturdevant <[email protected]>
    markstur committed Sep 12, 2024
    211668a
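
    A sketch of the index arithmetic this implies (names are illustrative):
    with batching, a truncation error should report the position in the
    caller's original input list, not the position within the batch:

    ```python
    def find_truncated_indices(texts, batch_size, is_truncated_in_batch):
        # `is_truncated_in_batch` is a hypothetical helper that checks one
        # batch and returns the offending positions within that batch; convert
        # them back to absolute indices into the original `texts` list.
        truncated = []
        for start in range(0, len(texts), batch_size):
            batch = texts[start:start + batch_size]
            for batch_index in is_truncated_in_batch(batch):
                truncated.append(start + batch_index)
        return truncated
    ```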
  5. Cross-encoder: fix truncation test

    The part that really tests that a token is truncated was wrong.
    
    * It was backwards, and only passed because the scores are sorted by rank
    * The index is now used to get scores in the order of the inputs
    * Now correctly xx != xy, but xy == xyz (the extra z is truncated)
    
    Signed-off-by: Mark Sturdevant <[email protected]>
    markstur committed Sep 12, 2024
    2cb6183
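
    A minimal sketch of the corrected assertion pattern, with hypothetical
    result data (rerank results come back sorted by score, so they must be
    mapped back to input order before comparing):

    ```python
    # Hypothetical rerank results: sorted by score, each carrying the index of
    # the original input ("xx", "xy", "xyz").
    results = [
        {"index": 1, "score": 0.91},  # "xy"
        {"index": 2, "score": 0.91},  # "xyz" -> same score, "z" was truncated
        {"index": 0, "score": 0.42},  # "xx"
    ]

    def scores_in_input_order(results):
        # Use each result's index to restore the original input order.
        return [r["score"] for r in sorted(results, key=lambda r: r["index"])]

    score_xx, score_xy, score_xyz = scores_in_input_order(results)
    assert score_xx != score_xy   # "xx" and "xy" really score differently
    assert score_xy == score_xyz  # the extra "z" was truncated away
    ```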
  6. Cross-encoder: remove some unused code and tidy up some comments

    Signed-off-by: Mark Sturdevant <[email protected]>
    markstur committed Sep 12, 2024
    8fa67cc