@neildsh neildsh commented Mar 12, 2025

Weighted RRF support

This change adds support for weighted RRF in hybrid search.

Weights may be negative. A negative sign does not reduce the score; it signals that scores for the corresponding component should be sorted in ascending order. The final WRRF score is then computed using the absolute value of the weight.

In this approach, the sign of the weight indicates the interpretation of the ranking itself rather than directly affecting the calculated score:
WRRF(d) = ∑ |w_i| × 1/(k + r_i'(d))

Where:

|w_i| is the absolute value of the weight for the i-th component
r_i'(d) is the rank of document d in the i-th component, where the sort order used to assign ranks depends on the sign of the weight:

  • If w_i > 0: r_i'(d) is the original rank (scores sorted in descending order, so the highest score gets rank 1)
  • If w_i < 0: r_i'(d) is the inverted rank (scores sorted in ascending order, so the lowest score gets rank 1)

k = 60
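
The formula above can be illustrated with a minimal Python sketch. This is not the SDK's implementation: the function name `weighted_rrf`, the dict-based input shape, and the sample scores are all hypothetical, chosen only to show how the sign of a weight selects the sort order while its absolute value scales the contribution. Two details are assumptions that the description does not specify: tied scores receive distinct consecutive ranks, and a weight of zero is treated as descending (it contributes nothing either way).

```python
from typing import Dict, List

K = 60  # rank constant k from the description above


def weighted_rrf(
    component_scores: List[Dict[str, float]],
    weights: List[float],
    k: int = K,
) -> Dict[str, float]:
    """Fuse per-component scores into one WRRF score per document.

    component_scores[i] maps a document id to its raw score in component i.
    A negative weights[i] means component i is ranked in ascending score order.
    """
    if len(component_scores) != len(weights):
        raise ValueError("need exactly one weight per component")

    fused: Dict[str, float] = {}
    for scores, weight in zip(component_scores, weights):
        # Sign picks the sort direction; it never makes a contribution negative.
        descending = weight >= 0  # assumption: zero weight sorts descending
        ordered = sorted(scores, key=lambda doc: scores[doc], reverse=descending)
        for rank, doc in enumerate(ordered, start=1):
            # Assumption: ties get distinct ranks in sorted() order.
            fused[doc] = fused.get(doc, 0.0) + abs(weight) / (k + rank)
    return fused


# Hypothetical data: relevance is "higher is better", distance is "lower is better".
relevance = {"a": 0.9, "b": 0.5, "c": 0.1}
distance = {"a": 3.0, "b": 1.0, "c": 2.0}

fused = weighted_rrf([relevance, distance], [2.0, -1.0])
ranking = sorted(fused, key=lambda doc: fused[doc], reverse=True)
```

Here document `b` has the smallest distance, so under the weight of -1.0 it receives rank 1 in that component and the largest contribution, 1/(60 + 1), even though its raw distance value is the lowest.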

Intuition Behind This Modification

This modification addresses a key scenario in information retrieval and ranking fusion: sometimes "lower is better" rather than "higher is better."

  • Handling Different Ranking Interpretations: Some systems naturally produce rankings where lower values are better (like error rates, distances, or pricing). The sign becomes a semantic indicator of how to interpret the ranking.
  • Unifying Disparate Metrics: This allows you to combine rankings based on completely different metrics (relevance scores, error rates, freshness, etc.) without having to pre-process them into a common format.
  • Integration of Minimization Metrics: You can directly incorporate metrics that should be minimized (like latency or error) alongside metrics that should be maximized (like relevance or user engagement).

Type of change

  • New feature (non-breaking change which adds functionality)

sboshra previously approved these changes Mar 13, 2025

@sboshra sboshra left a comment

:shipit:

Fix a couple of bugs for weighted RRF and add unit test infra for weighted RRF

Add more test coverage for weighted RRF

incorporate code review feedback

@sboshra sboshra left a comment

:shipit:

@adityasa adityasa left a comment

:shipit:

@neildsh neildsh added QUERY auto-merge Enables automation to merge PRs labels Mar 19, 2025
@microsoft-github-policy-service microsoft-github-policy-service bot merged commit 466708a into master Mar 19, 2025
27 checks passed
@microsoft-github-policy-service microsoft-github-policy-service bot deleted the users/ndeshpan/weightedRRF branch March 19, 2025 21:49
amanrao23 added a commit to Azure/azure-sdk-for-js that referenced this pull request May 9, 2025
…34222)

### Packages impacted by this PR
@azure/cosmos

### Issues associated with this PR
#34221


### Describe the problem that is addressed by this PR

1. ***Add support for weighted RRF in hybrid search.***
We allow weights to be negative but the negative sign is used to signal
that we should sort scores in ascending order for the corresponding
component. The final WRRF score is then computed using the absolute
value of the weight.
In this approach, the sign of the weight indicates the interpretation of
the ranking itself rather than directly affecting the calculated score:
WRRF(d) = ∑ |w_i| × 1/(k + r_i'(d))

2. ***Add support for the optimized query plan.***
Adds a QueryFeature that returns an optimized query plan, which
removes the need to rewrite orderByExpressions in the SDK.
A flag, `disableHybridSearchQueryPlanOptimization`, is added to disable
returning the optimized query plan, so that the query still works
as expected with older gateways.

### Are there test cases added in this PR? _(If not, why?)_
Yes

### Provide a list of related PRs _(if any)_
* Azure/azure-cosmos-dotnet-v3#5064
* Azure/azure-cosmos-dotnet-v3#5120
* Azure/azure-cosmos-dotnet-v3#5121

### Command used to generate this PR: _(Applicable only to SDK release
request PRs)_

### Checklists
- [ ] Added impacted package name to the issue description
- [ ] Does this PR need any fixes in the SDK Generator? _(If so,
create an Issue in the
[Autorest/typescript](https://github.com/Azure/autorest.typescript)
repository and link it here)_
- [ ] Added a changelog (if necessary)