Updating score assertions to use close_to instead of match for linear_retriever yml tests #128865

Merged

pmpailis
Contributor

@pmpailis pmpailis commented Jun 3, 2025

This PR changes the score validation in the linear retriever YAML tests from match to close_to, avoiding failures caused by floating-point rounding such as the following:

java.lang.AssertionError: Failure at [linear/10_linear_retriever:319]: field [hits.hits.3._score] doesn't match the expected value
Expected: <1.2>
     but: was <1.2000000476837158>
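
The mismatch in this log can be reproduced in isolation. The following is a minimal Java sketch, assuming the score is held as a 32-bit float and widened to a double before being compared with the expected value 1.2. The class and method names are illustrative and are not taken from the Elasticsearch code base.

```java
public class FloatScoreDemo {
    // Hypothetical helper: widen a 32-bit float score to a 64-bit double,
    // which exposes the rounding error of the nearest float to 1.2.
    static double widen(float score) {
        return (double) score;
    }

    public static void main(String[] args) {
        double actual = widen(1.2f);
        System.out.println(actual);        // prints 1.2000000476837158
        System.out.println(actual == 1.2); // prints false: exact comparison fails
    }
}
```

The printed value is the same one reported in the assertion failure, which is why an exact match on the score cannot pass reliably.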

Closes #128774

@pmpailis pmpailis added >test Issues or PRs that are addressing/adding tests :Search Relevance/Ranking Scoring, rescoring, rank evaluation. labels Jun 3, 2025
@elasticsearchmachine elasticsearchmachine added Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch v9.1.0 labels Jun 3, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

@pmpailis
Contributor Author

pmpailis commented Jun 4, 2025

run elasticsearch-ci/part-4

@pmpailis pmpailis requested a review from Copilot June 4, 2025 12:08
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR refactors the linear retriever YAML tests to use close_to assertions for floating-point scores (with a small tolerance) and re-enables the previously muted normalization test.

  • Switch hard match checks on hits.hits.*._score to close_to with error: 0.001
  • Remove the mute entry for the should normalize initial scores with l2_norm test
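
The tolerance check described in the list above can be sketched in Java as follows. This assumes close_to passes when the absolute difference between the actual and expected values is at most error. The helper name closeTo and the class name are hypothetical, and the real test framework's implementation may differ.

```java
public class CloseToSketch {
    // Assumed semantics of close_to: |actual - expected| <= error.
    static boolean closeTo(double actual, double expected, double error) {
        return Math.abs(actual - expected) <= error;
    }

    public static void main(String[] args) {
        double actual = 1.2000000476837158; // value reported in the CI failure
        System.out.println(closeTo(actual, 1.2, 0.001)); // prints true
        System.out.println(closeTo(1.25, 1.2, 0.001));   // prints false: off by 0.05
    }
}
```

With an error of 0.001, float rounding noise on the order of 1e-8 is absorbed, while a genuinely wrong score still fails the assertion.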

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| x-pack/plugin/rank-rrf/src/yamlRestTest/resources/rest-api-spec/test/linear/10_linear_retriever.yml | Updated all _score assertions to use close_to with a ±0.001 margin |
| muted-tests.yml | Unmuted the linear retriever normalization test by removing its entry |

Comments suppressed due to low confidence (1)

x-pack/plugin/rank-rrf/src/yamlRestTest/resources/rest-api-spec/test/linear/10_linear_retriever.yml:265

  • [nitpick] Inline map spacing is inconsistent here; consider removing the extra space before the closing brace to match the formatting of other close_to entries.
-  - close_to: { hits.hits.3._score: { value: 0.0, error: 0.001 } }

Member

@benwtrent benwtrent left a comment

IEEE 754 strikes again. I hate floating point.

@pmpailis pmpailis merged commit 2131323 into elastic:main Jun 4, 2025
18 checks passed
Development

Successfully merging this pull request may close these issues.

[CI] LinearRankClientYamlTestSuiteIT test {yaml=linear/10_linear_retriever/should normalize initial scores with l2_norm} failing
3 participants