text_similarity_reranker returns negative scores for some models #120201

kderusso · 2025-01-15T14:12:03Z

Elasticsearch Version

8.16.1

Installed Plugins

No response

Java Version

bundled

OS Version

Reproducible on cloud

Problem Description

Certain supported rerank models, including cross-encoder__ms-marco-minilm-l-6-v2, return negative scores when used in conjunction with the text_similarity_reranker. Negative scores are not allowed in the query phase, so we need to handle this better.

One potential solution is to linearly shift or otherwise normalize the returned scores so they always fall within a fixed, non-negative range.
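To make the two options concrete, here is a minimal Python sketch of what "linearly shifting" and "otherwise normalizing" could look like on the client side. It is not what Elasticsearch does internally, and the function names `shift_scores` and `sigmoid_scores` are hypothetical, chosen only for illustration.

```python
import math
from typing import List


def shift_scores(scores: List[float]) -> List[float]:
    """Linearly shift a batch of scores so the lowest one becomes 0.

    Relative order and score gaps are preserved. The offset depends on the
    batch, so shifted values are not comparable across different queries.
    """
    if not scores:
        return []
    lowest = min(scores)
    offset = -lowest if lowest < 0 else 0.0
    return [s + offset for s in scores]


def sigmoid_scores(scores: List[float]) -> List[float]:
    """Map each raw score into (0, 1) with a logistic function.

    This is applied per score, so it does not depend on the rest of the
    batch. The two branches avoid overflow in math.exp for large inputs.
    """
    result = []
    for s in scores:
        if s >= 0:
            result.append(1.0 / (1.0 + math.exp(-s)))
        else:
            e = math.exp(s)
            result.append(e / (1.0 + e))
    return result


# Example: the first value is the negative score from this report,
# the second is a made-up positive score for comparison.
raw = [-0.5682936, 2.0]
shifted = shift_scores(raw)
squashed = sigmoid_scores(raw)
```

Both transforms are monotonic, so the ranking produced by the reranker is unchanged. The trade-off is that the shift keeps score differences intact but varies per result window, while the sigmoid is stable per document but compresses large differences.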

Steps to Reproduce

PUT _inference/rerank/ms-marco-minilm-l-6-v2
{
  "service": "elasticsearch",
  "task_type": "rerank",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1,
    "model_id": "cross-encoder__ms-marco-minilm-l-6-v2"
  },
  "task_settings": {
    "return_documents": true
  }
}

PUT /my-index
{
  "mappings": {
    "properties": {
      "text": {
        "type": "text"
      }
    }
  }
}

POST /my-index/_doc/
{
  "text": "Dog training classes"
}

POST my-index/_search
{
  "retriever": {
    "text_similarity_reranker": {
      "retriever": {
        "standard": {
          "query": {
            "match": {
              "text": "dog"
            }
          }
        }
      },
      "field": "text",
      "inference_id": "ms-marco-minilm-l-6-v2",
      "inference_text": "dog",
      "rank_window_size": 100
    }
  }
}

The returned search result is:

{
  "took": 26,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": -0.5682936,
    "hits": [
      {
        "_index": "my-index",
        "_id": "s1xJapQBZfij0Ahq54L5",
        "_score": -0.5682936,
        "_source": {
          "text": "Dog training classes"
        }
      }
    ]
  }
}

The score of the document is < 0.
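Until this is handled server-side, a client could detect the condition by scanning a search response for negative `_score` values. The snippet below is a small sketch that works on the parsed JSON body as a plain dictionary. The helper name `negative_hits` is hypothetical, and the sample response is abbreviated from the one shown above.

```python
from typing import Any, Dict, List


def negative_hits(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the hits whose _score is below zero.

    Hits with a missing or null _score are treated as non-negative.
    """
    hits = response.get("hits", {}).get("hits", [])
    return [h for h in hits if (h.get("_score") or 0.0) < 0]


# Abbreviated copy of the search response from this report.
sample = {
    "hits": {
        "max_score": -0.5682936,
        "hits": [
            {
                "_index": "my-index",
                "_id": "s1xJapQBZfij0Ahq54L5",
                "_score": -0.5682936,
                "_source": {"text": "Dog training classes"},
            }
        ],
    }
}

flagged = negative_hits(sample)
```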

Logs (if relevant)

No response

elasticsearchmachine · 2025-01-15T14:12:26Z

Pinging @elastic/ml-core (Team:ML)

elasticsearchmachine · 2025-01-15T14:12:27Z

Pinging @elastic/es-search-relevance (Team:Search Relevance)

elasticsearchmachine · 2025-01-15T14:12:27Z

Pinging @elastic/search-eng (Team:SearchOrg)

elasticsearchmachine · 2025-01-15T14:12:27Z

Pinging @elastic/search-relevance (Team:Search - Relevance)

leemthompo · 2025-01-15T14:23:19Z

This docs issue looks related: Clarify negative scores returned by Elastic Rerank

kderusso added the :ml, :Search Relevance/Ranking, :SearchOrg/Relevance, and >bug labels · Jan 15, 2025
