Phased ranking support for streaming mode #33283

Open
Alexander-Mark opened this issue Feb 8, 2025 · 1 comment
@Alexander-Mark

Is your feature request related to a problem? Please describe.
Streaming mode currently doesn't support phased ranking. This makes it tricky to run inference efficiently with more expensive models, e.g. ColBERT MaxSim.

Describe the solution you'd like
Streaming mode should support phased ranking in the same way as indexed mode or, if that isn't possible within the design, offer an alternative approach that achieves something similar.

Describe alternatives you've considered
Using conditional logic to determine whether to run inference:

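function myFunction() {
    # if(condition, valueIfTrue, valueIfFalse): the expensive branch is only used when the cheap score is at or below the cutoff
    expression: if (cheapExpression > cutoff, cheapExpression, expensiveExpression)
}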

Additional context
It's possible I've overlooked some existing features and the use case I'm describing is already doable within the current design.

@hmusum added this to the later milestone on Feb 12, 2025
@bratseth
Member

You can use global-phase instead of second-phase.

With indexed search, second-phase is usually preferable because it runs locally in parallel on each content node, and as fan-out increases this becomes important to achieve parallelism and avoid network saturation. However, with streaming, fan-out is close to 1 on average regardless of the size of the content cluster (since queries are only routed to the buckets having content for that user/group), so global-phase performs well.
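
As a rough illustration of the suggestion above, a rank profile along these lines might be a starting point. This is only a sketch: the profile name `phased` and the `rerank-count` value are placeholders, and `cheapExpression` and `expensiveExpression` are the names from the issue, assumed to be defined elsewhere as functions in the same profile.

```
# Sketch only: profile name and rerank-count are assumed values, not taken from the issue.
rank-profile phased {
    first-phase {
        # Cheap score, computed for every matched document on the content node
        expression: cheapExpression
    }
    global-phase {
        # Only the top hits after merging are reranked, in the container
        rerank-count: 100
        # Expensive score, e.g. a ColBERT MaxSim expression (assumed to be defined as a function)
        expression: expensiveExpression
    }
}
```

The `rerank-count` value bounds how many hits the expensive expression is evaluated for, so it would need tuning against the latency budget.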
