-
Notifications
You must be signed in to change notification settings - Fork 25.3k
ES|QL: Remove redundant sorts from execution plan #121156
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ES|QL: Remove redundant sorts from execution plan #121156
Conversation
Hi @luigidellaquila, I've created a changelog YAML for you. |
…ort' into esql/remove_redundant_sort
Pinging @elastic/es-analytical-engine (Team:Analytics) |
@@ -1839,10 +1839,9 @@ public void testCombineOrderByThroughFilter() { | |||
|
|||
/** | |||
* Expected | |||
* TopN[[Order[first_name{f}#170,ASC,LAST]],1000[INTEGER]] | |||
* \_MvExpand[first_name{f}#170] | |||
* \_TopN[[Order[emp_no{f}#169,ASC,LAST]],1000[INTEGER]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Notice that here you are losing a performance aspect: mv_expand
is applied on all documents. The idea with a sort (and the default pushed down limit
) in front of mv_expand
was to have fewer documents to expand from because the expansion is, theoretically, creating even more rows. Before this PR, each node was getting 1000 docs, mv_expanded them and a final sort performed. With this PR, each node is expanding everything, then sorting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but you were also discarding all the employees after the first 1000 (sorted by emp_no), that is not what the query says, so technically you got a faster execution, but a wrong result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's great to allow more queries - left a few comments regarding the rule implementation.
* | ||
* This rule finds and removes redundant SORTs, making the plan executable. | ||
*/ | ||
public class RemoveRedundantSort extends OptimizerRules.OptimizerRule<TopN> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes sense to combine this with PruneOrderByBeforeStats
since the logic is similar.
Remove -> Prune
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Costin, I didn't realize the two rules were so similar.
I'll merge them in a single one.
A detail worth noting is that PruneOrderByBeforeStats
currently considers Eval
as a sort-agnostic plan (and it's correct now, since we don't have window functions yet). I'll keep the same logic for now, and I'll add a comment so that we don't forget.
The good thing is that, with this logic, we allow SORT pruning after all the currently supported plans (apart from LIMIT, but it will become a TopN anyway), so now we no longer have unbounded sort.
p = lj.left(); | ||
// TODO do it also on the right-hand side? | ||
continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make the function recursive and let it run on both sides.
}); | ||
} | ||
|
||
private OrderBy findRedundantSort(TopN plan) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Return a list to perform only one modification (and traversals) on the tree.
protected LogicalPlan rule(TopN plan) { | ||
OrderBy redundant = findRedundantSort(plan); | ||
if (redundant == null) { | ||
return plan; | ||
} | ||
return plan.transformDown(p -> { | ||
if (p == redundant) { | ||
return redundant.child(); | ||
} | ||
return p; | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bottom up traversal would help collect all pruned sorts in one traversal instead of a top-down per sort:
- those that occur before stats
- those that occur before other sorts
|
||
@Override | ||
public void postOptimizationVerification(Failures failures) { | ||
failures.add(fail(this, "The query cannot be executed because it would require unbounded sort")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rephrase the error message to include an action item for the user: `Unbounded sort not supported yet, please add a limit"
// IMPORTANT | ||
// If we introduce window functions or order-sensitive aggs (eg. STREAMSTATS), | ||
// the previous sort could actually become relevant | ||
// so we have to be careful with plans that could use them, ie. the following | ||
|| unary instanceof Filter | ||
|| unary instanceof Eval | ||
|| unary instanceof Aggregate) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is safe now, as long as we don't have window functions (and, in general, functions/aggs that rely on the order of the input).
We can decide to keep it like this for now and review it when we introduce such capabilities, or we can be more paranoid about future regressions and discard such cases, but in this case we won't be able to completely avoid unbounded sorts.
Thanks for the feedback @alex-spies @costin, I added another round of changes that should address your comments |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is super nice. I have only minor suggestions. Thanks Luigi!
* </code> | ||
* <p> | ||
* | ||
* In all the other cases, eg. if the command does not implement this interface, or if dependsOnInputOrder() = true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
++ I think this is really, really important. The pruning of previous sorts must be opt-in, not opt-out.
Costin and I also discussed flipping the default the other way around, i.e. making it so pruning previous sorts is allowed unless this interface is implemented. While Costin would prefer that for verbosity reasons/because most commands are okay with this optimization, I believe this would be really dangerous as it would lead to wrongly optimized plans per default - and we're in the process of adding new commands right now by people who can't be aware of this gotcha in the optimizer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorting is expensive in a distributed system and commands and users should not depend on it.
At the moment, only Limit and the upcoming change point depend on it, all the other commands do not - hence my preference to opt in. That is the default is optimized for the majority of use-cases.
If it's opt in, the presence of the (marker) interface (SortAware) is enough - no need to implement a method, it's redundant.
If it's opt out, use the same marker interface but change its name to reflect the intent, e.g. SortIgnorant
, SortIgnorant
and simply use its presence.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorting is expensive in a distributed system and commands and users should not depend on it.
At the moment, only Limit and the upcoming change point depend on it, all the other commands do not - hence my preference to opt in. That is the default is optimized for the majority of use-cases.
@luigidellaquila and I discussed this and we both think that making the optimization opt-out is a bad idea.
Personally, I have a somewhat strong opinion here. I believe that we'd be really doing ourselves a disservice by making the optimization opt-out as this will, per default, silently break correct queries whenever we add a new command. One such command was just merged (#120998) today.
The key part here is the "silent". It's silent in the code by being opt-out, but more importantly it's silent to users when they get wrong query results. This increases the risk of rolling out buggy commands whose output doesn't make sense and finding the bug in production at a much later point in time.
Forgetting to ask for the opt-out marker when reviewing a PR with a new command is also something that's really easy to do, resp. the number of things that a reviewer needs to know and look out for is increased. And even for new commands that should be fine with removing upstream sorts, I think it's still better if the first implementation doesn't automatically enable the optimization, so that we can enable it later while simultaneously adding tests.
On the other hand, making the optimization opt-in only increases the verbosity a little bit because most plan nodes actually do opt in. I believe the trade offs here are therefore very much in favor of the opt-in solution. And if we want to keep verbosity down, there's still the possibility to create a new subclass of UnaryPlan
called StreamingPlan
, opt this one in, and have all our commands that operate in a row-based manner inherit from that.
If it's opt in, the presence of the (marker) interface (SortAware) is enough - no need to implement a method, it's redundant.
++, @luigidellaquila and I agreed on going this route.
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plan/logical/SortAware.java
Outdated
Show resolved
Hide resolved
...rc/main/java/org/elasticsearch/xpack/esql/optimizer/rules/logical/PruneRedundantOrderBy.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plan/logical/SortAware.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plan/logical/SortAware.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plan/logical/SortAware.java
Outdated
Show resolved
Hide resolved
var query = """ | ||
ROW x = [1,2,3], y = 1 | ||
| SORT y | ||
| MV_EXPAND x | ||
| WHERE x > 2 | ||
"""; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: Could we have a simple test like this but for LOOKUP JOIN? This test is a nice minimal example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: maybe it makes sense to move tests related to the new optimizer rule into a dedicated file. The LogicalPlanOptimizerTests file is already huge and it's hard to see what's tested and not.
* | \_EsRelation[test][_meta_field{f}#26, emp_no{f}#20, first_name{f}#21, ..] | ||
* \_EsRelation[languages_lookup][LOOKUP][language_code{f}#31] | ||
*/ | ||
public void testRedundantSortOnMvExpandJoinEnrichGrokDissect() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see a test with SORT + DROP/KEEP/RENAME + SORT. Could we add one if they already exist?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding one.
nit: maybe it makes sense to move tests related to the new optimizer rule into a dedicated file.
Let's do it with a separate PR. LogicalPlanOptimizerTests has quite some initialization logic, it will need some refactoring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small tweaks. LGTM
* </code> | ||
* <p> | ||
* | ||
* In all the other cases, eg. if the command does not implement this interface, or if dependsOnInputOrder() = true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorting is expensive in a distributed system and commands and users should not depend on it.
At the moment, only Limit and the upcoming change point depend on it, all the other commands do not - hence my preference to opt in. That is the default is optimized for the majority of use-cases.
If it's opt in, the presence of the (marker) interface (SortAware) is enough - no need to implement a method, it's redundant.
If it's opt out, use the same marker interface but change its name to reflect the intent, e.g. SortIgnorant
, SortIgnorant
and simply use its presence.
/** | ||
* breadth-first recursion to find redundant SORTs in the children tree. | ||
*/ | ||
private IdentityHashMap<OrderBy, Void> findRedundantSort(LogicalPlan plan) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hide the implementation by returning the keySet as a Collection<OrderBy>
on which call contains
if (redundant.containsKey(p)) { | ||
return ((OrderBy) p).child(); | ||
} | ||
return p; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-> collection.contains(p) ? ((UnaryPlan) p).child() ? p
Thanks @costin
I'd prefer to keep it safe and have the optimization happen only when we know it's safe. ES|QL is evolving fast, and the risk of breaking things is pretty high
👍 I renamed it |
|
Thanks folks! |
💔 Backport failed
You can use sqren/backport to manually backport by running |
Fixes: #120817
Fixes: #120803
When we have multiple SORT commands in a query, some of them could be redundant.
Eg.
in both cases, the first SORT is practically irrelevant, as the second SORT will reorder the results, and the commands in-between don't rely on the order of results.
In addition, since the second SORT "absorbes" the LIMIT, the first SORT will remain unbounded, and it is problematic, since ES|QL does not support unbounded sort.
This change adds an optimization rule that finds and removes redundant sorts.
A SORT is redundant if all the following conditions are met:
Eg. in
FROM idx | ... | SORT x | <command1> | ... | <commandN> | SORT y
,SORT x
can be considered redundant:SORT y
will reorder the results<command1>
,<commandN>
rely on the order of resultsTypes of commands that could rely on input order are:
At this stage we don't support window functions and STREAMSTATS, but in future we could, so to avoid regressions we consider EVAL and STATS as relevant for order for now.Today we don't support window functions and STREAMSTATS, and we already had rules that do the same thing and don't consider these cases (the old
PruneOrderByBeforeStats
rule), so we'll consider EVAL, FILTER and STATS safe for now.Given this limitation,AddDefaultTopN
is still in place to make a couple of tests pass, but as reported here, it is most likely incorrect and needs a double-check.Since in some cases we can still have unbounded sorts (eg.
| sort | mv_expand | where
), we also let OrderBy implementPostOptimizationVerificationAware
and fail validation in a more graceful way.