Rescore search results
Rescoring can improve precision by reordering only the top documents (for example, the top 100 to 500) returned by the initial retrieval phase (a query or kNN search) with a secondary, usually more costly, algorithm, instead of applying that algorithm to every document in the index.
A rescore request is executed on each shard before the shard returns its results to be sorted by the node handling the overall search request.
The rescore API has three options:
- query: a rescorer that executes a provided rescore_query on the top documents
- script: a rescorer that uses a script to modify the scores of the top documents
- learning_to_rank: a rescorer that uses an LTR model to re-rank the top documents
All rescorers have the window_size parameter, which controls how many top documents are considered for rescoring. The default is 10.
When implementing pagination, keep the window_size consistent across pages. Changing it while advancing through results (by using different from values) can cause the top hits to shift, leading to a confusing user experience.
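One way to keep window_size fixed while paging is to build every request body from a single helper. The Python sketch below is illustrative only: build_page_request, the message field, and the window of 100 are assumptions, and the function builds the request body without sending it anywhere.

```python
def build_page_request(page, page_size, window_size=100):
    """Build a search body for one page (hypothetical helper).

    Only `from` changes between pages; window_size stays the same.
    """
    return {
        "from": page * page_size,
        "size": page_size,
        "query": {"match": {"message": "the quick brown"}},
        "rescore": {
            "window_size": window_size,  # identical on every page
            "query": {
                "rescore_query": {
                    "match_phrase": {
                        "message": {"query": "the quick brown", "slop": 2}
                    }
                }
            },
        },
    }


# Three consecutive pages that share one rescore window.
pages = [build_page_request(page, page_size=10) for page in range(3)]
```

Because window_size is a default argument rather than something derived from the page number, it cannot drift as the from value advances.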
The query rescorer executes a second query only on the top documents returned from the previous phase. The number of documents examined on each shard can be controlled by the window_size parameter.
By default, the scores from the original query and the rescore query are combined linearly to produce the final _score for each document. The relative importance of the original query and of the rescore query can be controlled with query_weight and rescore_query_weight respectively. Both default to 1.
For example:
POST /_search
{
  "query" : {
    "match" : {
      "message" : {
        "operator" : "or",
        "query" : "the quick brown"
      }
    }
  },
  "rescore" : {
    "window_size" : 10,
    "query" : {
      "rescore_query" : {
        "match_phrase" : {
          "message" : {
            "query" : "the quick brown",
            "slop" : 2
          }
        }
      },
      "query_weight" : 0.7,
      "rescore_query_weight" : 1.2
    }
  }
}
An error will be thrown if an explicit sort (other than _score in descending order) is provided with a rescore query.
The way the scores are combined can be controlled with the score_mode:
Score Mode | Description |
---|---|
total | Add the original score and the rescore query score. The default. |
multiply | Multiply the original score by the rescore query score. Useful for function query rescores. |
avg | Average the original score and the rescore query score. |
max | Take the max of the original score and the rescore query score. |
min | Take the min of the original score and the rescore query score. |
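The score modes can be read as a small function of two numbers. The following Python sketch is not Elasticsearch code: combine_scores is a hypothetical helper, and it assumes query_weight and rescore_query_weight are applied to each score before the mode combines them. That matches the linear default described earlier, but for the other modes it is an assumption.

```python
def combine_scores(original, rescore, score_mode="total",
                   query_weight=1.0, rescore_query_weight=1.0):
    """Combine two scores under one of the documented score modes.

    Assumption: each score is multiplied by its weight first.
    """
    primary = original * query_weight
    secondary = rescore * rescore_query_weight
    modes = {
        "total": lambda a, b: a + b,
        "multiply": lambda a, b: a * b,
        "avg": lambda a, b: (a + b) / 2,
        "max": max,
        "min": min,
    }
    if score_mode not in modes:
        raise ValueError(f"unknown score_mode: {score_mode}")
    return modes[score_mode](primary, secondary)


# Weights taken from the request example; the two scores are made up.
total = combine_scores(2.0, 1.0, "total", 0.7, 1.2)        # 1.4 + 1.2
product = combine_scores(2.0, 1.0, "multiply", 0.7, 1.2)   # 1.4 * 1.2
```

With the default weights of 1, total reduces to a plain sum of the two scores.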
The script rescorer uses a script to rescore the top documents returned from the previous phase. The script has access to the original score as well as the values of document fields.
For example, the following script rescores documents based on each document's original query score and the value of the num_likes field:
POST /_search
{
  "query" : {
    "match" : {
      "message" : {
        "operator" : "or",
        "query" : "the quick brown"
      }
    }
  },
  "rescore" : {
    "window_size" : 10,
    "script" : {
      "script" : {
        "source": "doc['num_likes'].value * params.multiplier + _score",
        "params": {
          "multiplier": 0.1
        }
      }
    }
  }
}
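To see what the script in this request computes, the same arithmetic can be reproduced outside Elasticsearch. This is a rough Python sketch with made-up hits: script_rescore is a hypothetical helper, and it assumes the value returned by the script becomes the document's new score before the window is re-sorted.

```python
def script_rescore(hits, multiplier=0.1):
    """Apply num_likes * multiplier + _score to each hit, then re-sort.

    Mirrors the arithmetic of the script source; not an Elasticsearch API.
    """
    rescored = [
        {**hit, "_score": hit["num_likes"] * multiplier + hit["_score"]}
        for hit in hits
    ]
    return sorted(rescored, key=lambda hit: hit["_score"], reverse=True)


# Made-up hits: "b" starts lower but has more likes.
hits = [
    {"_id": "a", "_score": 1.5, "num_likes": 2},
    {"_id": "b", "_score": 1.2, "num_likes": 10},
]
reranked = script_rescore(hits)
```

In this data the likes bonus for "b" (10 * 0.1) outweighs its lower original score, so it moves ahead of "a".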
The learning_to_rank rescorer uses an LTR model to rescore the top documents. You must provide the model_id of a deployed model, as well as any named parameters required by the query templates for the features used by the model.
GET my-index/_search
{
  "query": {
    "multi_match": {
      "fields": ["title", "content"],
      "query": "the quick brown fox"
    }
  },
  "rescore": {
    "learning_to_rank": {
      "model_id": "ltr-model",
      "params": {
        "query_text": "the quick brown fox"
      }
    },
    "window_size": 100
  }
}
You can apply multiple rescoring operations in sequence. The first rescorer works on the top documents from the initial retrieval phase, while the second rescorer works on the output of the first rescorer, and so on. A common practice is to use a larger window for the first rescorer and smaller windows for more expensive subsequent rescorers.
POST /_search
{
  "query": {
    "match": {
      "message": {
        "operator": "or",
        "query": "the quick brown"
      }
    }
  },
  "rescore": [
    {
      "window_size": 10,
      "query": {
        "rescore_query": {
          "match_phrase": {
            "message": {
              "query": "the quick brown",
              "slop": 2
            }
          }
        },
        "query_weight": 0.7,
        "rescore_query_weight": 1.2
      }
    },
    {
      "window_size": 5,
      "query": {
        "score_mode": "multiply",
        "rescore_query": {
          "function_score": {
            "script_score": {
              "script": {
                "source": "Math.log10(doc.count.value + 2)"
              }
            }
          }
        }
      }
    }
  ]
}
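The chaining behaviour can be simulated in a few lines. The Python sketch below is a simplified model, not how Elasticsearch is implemented: apply_rescorers, the boost tables, and the hit scores are all invented for illustration, and it assumes rescored documents stay ahead of the untouched tail.

```python
def apply_rescorers(hits, rescorers):
    """Run rescorers in sequence over (doc_id, score) pairs.

    Each rescorer is (window_size, fn) and sees the previous stage's order.
    """
    ranked = sorted(hits, key=lambda hit: hit[1], reverse=True)
    for window_size, rescore_fn in rescorers:
        window = [
            (doc_id, rescore_fn(doc_id, score))
            for doc_id, score in ranked[:window_size]
        ]
        window.sort(key=lambda hit: hit[1], reverse=True)
        # Simplification: the tail keeps its old order and scores.
        ranked = window + ranked[window_size:]
    return ranked


phrase_boost = {"b": 0.2, "c": 2.0}   # made-up additive boosts
popularity = {"a": 3.0, "c": 1.5}     # made-up multiplicative factors

hits = [("a", 3.0), ("b", 2.5), ("c", 2.0), ("d", 1.0)]
final = apply_rescorers(hits, [
    (3, lambda doc_id, s: s + phrase_boost.get(doc_id, 0.0)),  # larger, cheaper
    (2, lambda doc_id, s: s * popularity.get(doc_id, 1.0)),    # smaller, costlier
])
```

The second stage only ever sees the two documents that the first stage ranked highest, which is why the cheaper rescorer is given the larger window.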