Set default similarity for Cohere model to cosine #125370
Cohere embeddings are expected to be normalized to unit vectors, but due to floating point precision issues, our check ({@link DenseVectorFieldMapper#isNotUnitVector(float)}) often fails. This change fixes the bug by setting the default similarity for newly created Cohere inference endpoints to cosine, which does not require unit-length vectors. Closes elastic#122878
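To illustrate the failure mode, here is a minimal sketch of a unit-vector check and of why cosine sidesteps it. It is not the actual `DenseVectorFieldMapper` code: the class name, the helper methods, and the `EPS` tolerance are assumptions made for this example only.

```java
public class UnitVectorSketch {
    // Hypothetical tolerance, chosen for illustration; not the value used in Elasticsearch.
    static final float EPS = 1e-4f;

    // Squared magnitude accumulated in float arithmetic.
    static float squaredMagnitude(float[] v) {
        float sum = 0f;
        for (float x : v) {
            sum += x * x;
        }
        return sum;
    }

    // Same idea as isNotUnitVector(float): true when the squared magnitude is too far from 1.
    static boolean isNotUnitVector(float squaredMagnitude) {
        return Math.abs(squaredMagnitude - 1.0f) > EPS;
    }

    static float dot(float[] a, float[] b) {
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    // Cosine divides by both magnitudes, so it does not depend on the inputs being unit length.
    static float cosine(float[] a, float[] b) {
        double norm = Math.sqrt((double) squaredMagnitude(a) * squaredMagnitude(b));
        return (float) (dot(a, b) / norm);
    }

    public static void main(String[] args) {
        float[] unit = {0.6f, 0.8f};
        float[] slightlyOff = {0.6f, 0.81f}; // magnitude drifts just past the tolerance
        System.out.println(isNotUnitVector(squaredMagnitude(unit)));
        System.out.println(isNotUnitVector(squaredMagnitude(slightlyOff)));
        System.out.println(cosine(unit, slightlyOff));
    }
}
```

With dot_product, a vector like `slightlyOff` is rejected by the unit-length check; with cosine, the same vector is accepted because the score is normalized by the magnitudes.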
Pinging @elastic/ml-core (Team:ML)
Hi @jimczi, I've created a changelog YAML for you.
LGTM
There is a concern that a user might delete a Cohere inference endpoint whose similarity defaulted to dot_product and then recreate it, at which point the similarity defaults to cosine. If that inference endpoint is used in a semantic_text field, this could cause suboptimal relevance.
This problem is described in #124272; that work should now be prioritised.
We check the model configuration on every index request, so this will result in an error. Users who deleted their endpoint and re-created it will need to override the similarity explicitly to match the old one. Since the user forced the deletion, I am not particularly concerned by this regression. WDYT?
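As a rough sketch of the kind of per-request check described above (purely illustrative: the class, the method, and the error message are hypothetical and not taken from the Elasticsearch code base), a mismatch between the similarity stored with the field and the one reported by the re-created endpoint could be surfaced like this:

```java
public class SimilarityCompatibilitySketch {
    // Hypothetical check: fail when the endpoint's similarity differs from
    // the one the field was originally created with.
    static void checkCompatible(String fieldSimilarity, String endpointSimilarity) {
        if (!fieldSimilarity.equals(endpointSimilarity)) {
            throw new IllegalArgumentException(
                "Incompatible similarity: field expects [" + fieldSimilarity
                    + "] but endpoint uses [" + endpointSimilarity + "]");
        }
    }

    public static void main(String[] args) {
        // Endpoint re-created with an explicit override matching the old default: accepted.
        checkCompatible("dot_product", "dot_product");

        // Endpoint re-created with the new default: rejected at index time.
        try {
            checkCompatible("dot_product", "cosine");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under this sketch, the user would resolve the error by re-creating the endpoint with the similarity explicitly set to the old value.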
I see no reason not to merge this PR. It would be a better user experience to have the check performed up front, and I will follow up on that.
Cohere embeddings are expected to be normalized to unit vectors, but due to floating point precision issues, our check (DenseVectorFieldMapper#isNotUnitVector) often fails. This change fixes this bug by setting the default similarity for newly created Cohere inference endpoint to cosine.
Closes #122878