Support query API property aliases in query profiles #33321

Open
LuisaWetzel opened this issue Feb 14, 2025 · 2 comments

@LuisaWetzel

Describe the bug
Rank-profile inputs set in a query profile are not applied.

To Reproduce
Steps to reproduce the behavior:

  1. Use this sample application schema:
schema doc {
    document doc {
        field subject type string {
            indexing: summary | attribute | index 
            index: enable-bm25
            attribute {
                fast-search
            }
        }
        field body type array<string> {
            indexing: summary | attribute | index
            index: enable-bm25
            attribute {
                fast-search
            }
        }
    }
    field body_embedding type tensor<bfloat16>(p{},x[1024]) {
        indexing: input body | embed e5 | attribute | index
        attribute {
            distance-metric: angular
        }
        index {
            hnsw {
                max-links-per-node: 16
                neighbors-to-explore-at-insert: 200
            }
        }
    }
    fieldset default {
        fields: subject, body
    }
    rank-profile my_rank_profile {
        inputs {
            query(q) tensor<bfloat16>(x[1024])             
            query(subjectWeight)  : 3            
        }
        function weighted_subject() {
            expression {
                nativeRank(subject) * query(subjectWeight)
            }
        }
        first-phase {
            expression {
                cos(distance(field,body_embedding)) + weighted_subject
            }
        }
        match-features {
            query(subjectWeight)    
            weighted_subject      
            firstPhase
        }
    }
}

Embedder in services.xml

<!-- See https://docs.vespa.ai/en/embedding.html#huggingface-embedder -->
<component id="e5" type="hugging-face-embedder">
    <transformer-model url="https://github.com/vespa-engine/sample-apps/raw/master/simple-semantic-search/model/e5-small-v2-int8.onnx"/>
    <tokenizer-model url="https://raw.githubusercontent.com/vespa-engine/sample-apps/master/simple-semantic-search/model/tokenizer.json"/>
    <prepend> <!-- E5 prompt instructions -->
        <query>query:</query>
        <document>passage:</document>
    </prepend>
</component>
  2. Create a query profile

Query-profile type:

<query-profile-type id="NearestNeighborTestTypes">
    <field name="yql" type="string"/>
    <field name="nn-input" type="string"/>
    <field name="input.query(q)" type="tensor(x[1024])"/>
    <field name="input.query(subjectWeight)" type="float"/>
    <field name="ranking.profile" type="string"/>
</query-profile-type>

Query-profile:

<query-profile id="NearestNeighborTestProfile" type="NearestNeighborTestTypes">
    <field name="yql">select * from sources * where %{.nn-input} or userQuery()</field>
    <field name="nn-input">({targetHits:10}nearestNeighbor(body_embedding,q))</field>
    <field name="input.query(q)">embed(e5,@query)</field>
    <field name="input.query(subjectWeight)">5</field>
    <field name="ranking.profile">my_rank_profile</field>
</query-profile>
  3. Deploy the application and see the error:
Error: invalid application package (status 400)
Invalid application:
Error reading query profile 'NearestNeighborTestProfile' of type 'NearestNeighborTestTypes':
Could not set 'input.query(q)' to 'embed(e5,@query)':
Can't find embedder 'e5'. Available embedder ids are 'default'.
  4. Move embed(e5,@query) from the query profile into the search request and redeploy:
<query-profile id="NearestNeighborTestProfile" type="NearestNeighborTestTypes">
    <field name="yql">select * from sources * where %{.nn-input} or userQuery()</field>
    <field name="nn-input">({targetHits:10}nearestNeighbor(body_embedding,q))</field>
    <field name="input.query(subjectWeight)">5</field>
    <field name="ranking.profile">my_rank_profile</field>
</query-profile>
  5. Query via a search request:
{
	"query": "Some",
	"queryProfile": "NearestNeighborTestProfile",
	"input.query(q)": "embed(e5,@query)"
}
  6. Observe that the query profile did not override subjectWeight to 5; matchfeatures still reports the rank-profile default of 3:
{
	"root": {
		"id": "toplevel",
		"relevance": 1.0,
		"fields": {
			"totalCount": 1
		},
		"coverage": {
			"coverage": 100,
			"documents": 1,
			"full": true,
			"nodes": 1,
			"results": 1,
			"resultsFull": 1
		},
		"children": [
			{
				"id": "id:default:doc:g=testing:test::1",
				"relevance": 2.057677356674378,
				"source": "text",
				"fields": {
					"matchfeatures": {
						"firstPhase": 2.057677356674378,
						"query(subjectWeight)": 3.0,
						"weighted_subject": 1.1455871507985373
					},
					"sddocname": "doc",
					"documentid": "id:default:doc:g=testing:test::1",
					"subject": "Some Subject",
					"body": [
						"Some body"
					]
				}
			}
		]
	}
}

Expected behavior
It should be possible to set the query embedding (embed(e5,@query)) via the query profile. Setting rank-profile inputs through the query profile should override the defaults declared in the rank profile, and the change should be observable in matchfeatures.

Environment
Dockerized Vespa

Vespa version
8.475.11

@bratseth
Member

Query profiles must use the full name, ranking.features; aliases like input are only supported in queries.

The assumption behind this is that brevity matters more in requests and structure more in query profiles, but even though it is mentioned in the doc I think it is too easy to miss, so we should probably add alias support in query profiles as well. Let's use this issue for that.
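As a rough sketch of what the full-name form could look like for the profile in this report: it assumes that ranking.features.query(subjectWeight) is the full name behind the input.query(subjectWeight) alias, and that the embed expression stays in the request as in the workaround above. The file locations in the comments follow the usual application package layout and are also an assumption here. Only the fields relevant to subjectWeight are shown.

```xml
<!-- Sketch only, e.g. search/query-profiles/types/NearestNeighborTestTypes.xml -->
<query-profile-type id="NearestNeighborTestTypes">
    <field name="yql" type="string"/>
    <field name="nn-input" type="string"/>
    <!-- full name instead of the input.query(...) alias -->
    <field name="ranking.features.query(subjectWeight)" type="float"/>
    <field name="ranking.profile" type="string"/>
</query-profile-type>

<!-- Sketch only, e.g. search/query-profiles/NearestNeighborTestProfile.xml -->
<query-profile id="NearestNeighborTestProfile" type="NearestNeighborTestTypes">
    <field name="yql">select * from sources * where %{.nn-input} or userQuery()</field>
    <field name="nn-input">({targetHits:10}nearestNeighbor(body_embedding,q))</field>
    <!-- overrides the default of 3 declared in my_rank_profile -->
    <field name="ranking.features.query(subjectWeight)">5</field>
    <field name="ranking.profile">my_rank_profile</field>
</query-profile>
```

With a profile along these lines, matchfeatures would be expected to report query(subjectWeight) as 5 rather than the rank-profile default.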

@bratseth changed the title from "Setting rankProfile inputs via queryProfile" to "Support query API property aliases in query profiles" on Feb 14, 2025
@LuisaWetzel
Author

I changed and tested it and it works, thank you.
I would appreciate alias support.

@hmusum added this to the "later" milestone on Feb 19, 2025
@hmusum self-assigned this on Feb 19, 2025