-
Notifications
You must be signed in to change notification settings - Fork 25.2k
ESQL: Enable physical plan verification #118114
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ESQL: Enable physical plan verification #118114
Conversation
Hi @bpintea, I've created a changelog YAML for you. |
Pinging @elastic/es-analytical-engine (Team:Analytics) |
Yes, will "fixes" it here, thanks. (I actually started with verifying plans from fragments after being replanned. But the spacial aggregations makes it tricky, so stripped that out for now and what remains could address this issue, true.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice - thanks for looking into this and cleaning this up, especially the AggregateMapper.
LGTM
if (p instanceof AggregateExec agg) { | ||
var ordinalAttributes = agg.ordinalAttributes(); | ||
missing.removeAll(Expressions.references(ordinalAttributes)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Much better.
List<Attribute> orginalAttributs = new ArrayList<>(groupings.size()); | ||
if (groupings().size() == 1) { | ||
// CATEGORIZE requires the standard hash aggregator as well. | ||
if (groupings.get(0).anyMatch(e -> e instanceof Categorize) == false) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (groupings.get(0).anyMatch(e -> e instanceof Categorize) == false) { | |
if (groupings.get(0).noneMatch(e -> e instanceof Categorize)) { |
I think this could also be merged with above if
to minimize nesting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
groupings
is a list of Expression
(not a stream), so won't have noneMatch()
. But I think refuting a predicate is quite common, adding a noneMatch()
would actually make sense. Will leave this as is for now, but will follow up.
I think this could also be merged with above if to minimize nesting.
I've only grafted this patch, so I can only guess that the intention might have been to allow for more exceptions (or easier comments). But maybe this can be now merged and split again when needed.
if (groupings.get(0).anyMatch(e -> e instanceof Categorize) == false) { | ||
var leaves = new LinkedList<>(); | ||
// TODO: this seems out of place | ||
aggregates.stream().filter(a -> groupings.contains(a) == false).forEach(a -> leaves.addAll(a.collectLeaves())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets use collect instead of addAll
in forEach
.
Since it is used for .contains
check below, should it collect to set instead of the list?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, turned it into a Set.
Lets use collect instead of addAll in forEach.
Yeh, that'd be an alternative, but with stream chaining and the line breaking, it wouldn't be more succint than current format, so I've left as is.
💚 Backport successful
|
This enables the physical plan verification. For it, a couple of changes needed to be applied/corrected: * AggregateMapper creates attributes with unique names; * AggregateExec's verification needs not consider ordinal attribute(s); * LookupJoinExec needs to merge attributes of same name at output, "winning" the right child; * ExchangeExec does no input referencing, since it only outputs all synthetic attributes, "sourced" from remote exchanges; * FieldExtractExec doesn't reference the attributes it "produces".
var addedFieldsNames = addedFields.stream().map(Attribute::name).toList(); | ||
lazyOutput.removeIf(a -> addedFieldsNames.contains(a.name())); | ||
lazyOutput.addAll(addedFields); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, this is very interesting. Thanks for catching this.
I'm wondering if enforcing unique names for PhysicalPlan
outputs is actually the right way to go. It is what we assumed so far, so I don't wanna depart from here, but there's also a reason to be more lenient:
In PhysicalPlan land, this output method was technically correct, because it represented exactly what the physical operators will do: they're gonna append the blocks for the added fields to the pages that are processed. The blocks corresponding to shadowed attributes/channels will not be stripped from incoming pages.
This means that the output layout technically has duplicate attribute names. This change just hides them and makes them inaccessible during the physical planning stage.
FWIW, EvalExec
has the exact same situation. Eval
performs shadowing, but EvalOperator
can only ever append blocks, so the output pages actually still contain the blocks corresponding to shadowed attributes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Late to the party, but had to take a look anyway because this will interact with LOOKUP JOIN a little.
Just wanted to say thank you again and LGTM!
) * ESQL: Enable physical plan verification (#118114) This enables the physical plan verification. For it, a couple of changes needed to be applied/corrected: * AggregateMapper creates attributes with unique names; * AggregateExec's verification needs not consider ordinal attribute(s); * LookupJoinExec needs to merge attributes of same name at output, "winning" the right child; * ExchangeExec does no input referencing, since it only outputs all synthetic attributes, "sourced" from remote exchanges; * FieldExtractExec doesn't reference the attributes it "produces". * ESQL: Disable remote enrich verification (#118534) This disables verifying the plans generated for remote ENRICHing. It also re-enables corresponding failing test. Related: #118531 Fixes #118307. (cherry picked from commit e7a4436)
…#118534) (elastic#118302) * ESQL: Enable physical plan verification (elastic#118114) This enables the physical plan verification. For it, a couple of changes needed to be applied/corrected: * AggregateMapper creates attributes with unique names; * AggregateExec's verification needs not consider ordinal attribute(s); * LookupJoinExec needs to merge attributes of same name at output, "winning" the right child; * ExchangeExec does no input referencing, since it only outputs all synthetic attributes, "sourced" from remote exchanges; * FieldExtractExec doesn't reference the attributes it "produces". * ESQL: Disable remote enrich verification (elastic#118534) This disables verifying the plans generated for remote ENRICHing. It also re-enables corresponding failing test. Related: elastic#118531 Fixes elastic#118307. (cherry picked from commit e7a4436)
…#118534) (elastic#118302) * ESQL: Enable physical plan verification (elastic#118114) This enables the physical plan verification. For it, a couple of changes needed to be applied/corrected: * AggregateMapper creates attributes with unique names; * AggregateExec's verification needs not consider ordinal attribute(s); * LookupJoinExec needs to merge attributes of same name at output, "winning" the right child; * ExchangeExec does no input referencing, since it only outputs all synthetic attributes, "sourced" from remote exchanges; * FieldExtractExec doesn't reference the attributes it "produces". * ESQL: Disable remote enrich verification (elastic#118534) This disables verifying the plans generated for remote ENRICHing. It also re-enables corresponding failing test. Related: elastic#118531 Fixes elastic#118307. (cherry picked from commit e7a4436)
This enables the physical plan verification.
For it, a couple of changes needed to be applied/corrected:
Fixes #105436.