ESQL: Limit memory usage of fold #118602

Merged — 30 commits merged into elastic:main on Jan 13, 2025

Conversation

nik9000
Member

@nik9000 nik9000 commented Dec 12, 2024

fold can be surprisingly heavy! The maximally efficient/paranoid thing would be to fold each expression one time, in the constant folding rule, and then store the result as a Literal. But this PR doesn't do that because it's a big change. Instead, it creates the infrastructure for tracking memory usage during folding and plugs it into as many places as possible. That's not perfect, but it's better.

This infrastructure limits the allocations of fold, similar to the CircuitBreaker infrastructure we use for values, but it's different in a critical way: you don't manually free any of the values. This is important because the plan itself isn't Releasable, which is required when using a real CircuitBreaker. We could have tried to make the plan releasable, but that'd be a huge change.

Right now there's a single limit of 5% of heap per query. We create the limit at the start of query planning and use it throughout planning.

There are about 40 places that don't yet use it. We should get them plugged in as quickly as we can manage. After that, we should look at the maximally efficient/paranoid thing I mentioned above: folding once during constant folding. That's an even bigger change, one I'm not equipped to make on my own.
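
To make the shape of this concrete, here's a minimal sketch of the tracking idea. This is not the actual class; the method name and exception type are assumptions. The point is a context that only counts allocations up toward a fixed limit and never frees anything:

/**
 * Sketch of an allocation tracker for folding. Unlike a CircuitBreaker,
 * nothing is ever released: folded values live as long as the plan does,
 * so the context only counts up toward a fixed limit.
 */
public class FoldContext {
    private final long limitBytes;
    private long allocatedBytes;

    public FoldContext(long limitBytes) {
        this.limitBytes = limitBytes;
    }

    /** Record an allocation made while folding, failing if the limit is exceeded. */
    public void trackAllocation(long bytes) {
        allocatedBytes += bytes;
        if (allocatedBytes > limitBytes) {
            // The real code would fail the query with a proper exception type;
            // IllegalStateException is just a stand-in here.
            throw new IllegalStateException("fold used more than [" + limitBytes + "] bytes");
        }
    }
}

Every fold implementation that allocates would call something like trackAllocation before building its result.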

@nik9000 nik9000 requested a review from a team as a code owner December 12, 2024 18:37
@elasticsearchmachine elasticsearchmachine added the Team:Analytics label (Meta label for analytical engine team (ESQL/Aggs/Geo)) Dec 12, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine
Collaborator

Hi @nik9000, I've created a changelog YAML for you.

@nik9000
Member Author

nik9000 commented Dec 13, 2024

This is sort of the next step in #118101. Or, I guess, #118101 was really an attempt to unblock as many of the unbounded uses as I could. After working more on this I realized I wasn't going to zap them all in pre-PRs. There are, like I said in the description, about 40 of them that all need looking at. Maybe ~10 PRs?

@nik9000
Member Author

nik9000 commented Dec 13, 2024

I shall defeat you, merge conflicts!

@nik9000
Member Author

nik9000 commented Dec 13, 2024

(image)

@nik9000
Member Author

nik9000 commented Dec 13, 2024

run elasticsearch-ci/part-1

Contributor

@ivancea ivancea left a comment

LGTM 😰 The bulk of the trivial changes are here at least; let's hope future PRs get smaller, haha

/**
* {@link Expression#fold} using a small amount of memory.
*/
public static FoldContext small() {
Contributor

Are we using this? It's unbounded

Member Author

I'll double check before merging. We really should only have one unbounded one.

Member Author

At some point I was playing around with the idea that you could know up front that a fold would always be "small" so you didn't have to limit it. But that's:

  1. Impossible to know
  2. Weird to want to do. If you know it's small then limit it.

Member Author

I've since made it bounded and, well, as big as the default.

Member Author

I can't read the limit from the user request here, because the whole point is to have a context I can build statically, without plumbing changes. But it has the default size!

Comment on lines 92 to 93
BigArrays.NON_RECYCLING_INSTANCE,
new BlockFactory(ctx.circuitBreakerView(source), BigArrays.NON_RECYCLING_INSTANCE)
Contributor

Those BigArrays aren't being used; are we sure the BlockFactory methods are enough? Some arrays in there are accounted for by the BigArrays, right? So we're missing part of them.

Member Author

OOOH. Probably not. Let me have a look at wiring those in too.

@nik9000
Member Author

nik9000 commented Dec 27, 2024

I'd love to get another set of eyes on this one before I merge it. I'll resolve conflicts now though.

Contributor

@luigidellaquila luigidellaquila left a comment

LGTM, thanks Nik.
I just left a comment

* {@link Expression#fold} using any amount of memory. Only safe for tests.
*/
public static FoldContext unbounded() {
return new FoldContext(Long.MAX_VALUE);
Contributor

Since we are still using it in a few places, wouldn't it make sense to default it to the same 5% of memory as the default in the query pragmas (or a slightly higher value), rather than making it really unbounded? It's a bit paranoid, maybe, but I guess the final goal here is safety.

Member Author

I'm hopeful we can remove the usages in a follow-up. Let's get this in and have a conversation about whether or not we should replace unbounded with small.

Contributor

Did we want to give this the 5% limit before merging this?

Contributor

@astefan astefan left a comment

I've looked at the ComputeService mostly; it's an intricate core piece of ES|QL, and every time I look at it I need to be extra careful in how I read the code, considering all the execution paths the code takes.

So, please take my comments there with some patience :-). My comment related to LocalExecutionPlanner might be wrong due to how the code is called from many places and how the folding context is passed around, but the one related to the query pragma for data nodes might be valid (?), or a learning opportunity for me regarding when the %-of-heap memory size value is computed.

@@ -161,13 +161,14 @@ public LocalExecutionPlanner(
     /**
      * turn the given plan into a list of drivers to execute
      */
-    public LocalExecutionPlan plan(PhysicalPlan localPhysicalPlan) {
+    public LocalExecutionPlan plan(FoldContext foldCtx, PhysicalPlan localPhysicalPlan) {
Contributor

Why do you need to pass a FoldContext as a parameter here?
From what I can tell, the constructor of LocalExecutionPlanner already has the needed pragma to build the folding context.

Also, it's unfortunate that we must build the PhysicalOperationProviders (which also takes a folding context) outside the LocalExecutionPlanner (for testing purposes, that is); otherwise PhysicalOperationProviders could have been built as part of the LocalExecutionPlanner and would have gotten the same folding context as the surrounding class.

Member Author

I had wanted to use a single context so we'd share the same limit across the whole process. I think at some point we'll no longer need FoldContext at the physical level at all, so maybe it's moot?

Member Author

I'm going to double check exactly how much sharing I get here.

Member Author

Right. We share with the LocalLogicalPlanOptimizer and LocalPhysicalPlanOptimizer.

clusterAlias,
searchContexts,
configuration,
new FoldContext(configuration.pragmas().foldLimit().getBytes()),
Contributor

From what I can tell, this foldLimit here is the one coming from the coordinator. And if it's 5% of the heap size, is this the heap of the coordinator or the heap of the data node?

Contributor

Oh, that's a very good point. ++

Member Author

Oh fun! I'll resolve the rest of this and then dig into this.

Comment on lines 93 to 94
Object lowerValue = lower.fold(FoldContext.unbounded());
Object upperValue = lower.fold(FoldContext.unbounded());
Contributor

This is wrong. Copy-pasta mistake probably.
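
Presumably the intended code is (a two-line sketch, assuming `upper` is the corresponding upper bound expression):

Object lowerValue = lower.fold(FoldContext.unbounded());
Object upperValue = upper.fold(FoldContext.unbounded());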

Contributor

Keen eyes!

Contributor

After merging main into this, this should become a non-issue because Range doesn't fold there anymore.

Member Author

Fun! It's interesting that tests didn't catch this. Let me make a test that fails....

Contributor

Indeed. There are many edge cases; we can create tests for as many as we can think of, but there will always be something left behind. I've come to believe it's impossible to cover everything.

Member Author

Er, Alex's test catches it already.

Member Author

Yeah. I'm not accusing at all. We try to cover it all. I try not to make mistakes too. But I figure if I've made a mistake once, it's good to have a test case for it.

Contributor

To be fair, I think we just never fold ranges in actual queries, at all; so it makes sense that no test caught this in the past. It doesn't help that Range was inherited from ql.

@@ -49,7 +49,7 @@ protected LogicalPlan rule(Aggregate aggregate) {
             && alias.child() instanceof AggregateFunction aggFunction
             && aggFunction.hasFilter()
             && aggFunction.filter() instanceof Literal literal
-            && Boolean.FALSE.equals(literal.fold())) {
+            && Boolean.FALSE.equals(literal.value())) {
Contributor

👍

Contributor

@alex-spies alex-spies left a comment

Heya, I'm not sure if you already had a chance to address my last batch of comments (before this one). I resolved all remarks that I think are done now, but there are still a couple that may be good to check.

The most important thing from my review, unit tests for the fold context, has been addressed, so this nearly LGTM.

However, two major points remain:

  • I think @astefan raised an important question about whether the memory limit on the data nodes is correctly determined, or if it maybe wrongly takes 5% of the coordinator node's memory. I'm also interested in this.
  • @luigidellaquila suggested defaulting the unbounded context to 5% of memory as well; do we want to address this in this PR?

* {@link Expression#fold} using any amount of memory. Only safe for tests.
*/
public static FoldContext unbounded() {
return new FoldContext(Long.MAX_VALUE);
Contributor

Did we want to give this the 5% limit before merging this?

Comment on lines 93 to 94
Object lowerValue = lower.fold(FoldContext.unbounded());
Object upperValue = lower.fold(FoldContext.unbounded());
Contributor

After merging main into this, this should become a non-issue because Range doesn't fold there anymore.

@nik9000
Member Author

nik9000 commented Jan 10, 2025

  • I think @astefan raised an important question about whether the memory limit on the data nodes is correctly determined, or if it maybe wrongly takes 5% of the coordinator node's memory. I'm also interested in this.

OK! I'd sort of assumed it was the data node without checking, and that was bad. It is, indeed, the data node. Well, it's 5% of the heap of whichever node reads the Setting, so it's 5% of whatever node calls foldLimit. That's because we use the Settings object which is, at its heart, a TreeMap, and, in our case, an empty one 99% of the time. (See the sketch after this comment for roughly what that resolution amounts to.)

  • @luigidellaquila suggested defaulting the unbounded context to 5% of memory as well; do we want to address this in this PR?

Yeah! I'll do it now.
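
A minimal, plain-Java sketch of why a "5%" memory-size value is always relative to the node that resolves it. This is an illustration, not the actual Settings plumbing: the percentage is applied to the local JVM's max heap at lookup time.

public final class FoldLimitSketch {
    /** Resolve a fraction-of-heap limit against whichever JVM runs this code. */
    static long resolveFoldLimitBytes(double fractionOfHeap) {
        long maxHeap = Runtime.getRuntime().maxMemory(); // heap of *this* node's JVM
        return (long) (maxHeap * fractionOfHeap);
    }

    public static void main(String[] args) {
        // On the coordinator this is 5% of the coordinator's heap;
        // on a data node it is 5% of that data node's heap.
        System.out.println("fold limit: " + resolveFoldLimitBytes(0.05) + " bytes");
    }
}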

Contributor

@astefan astefan left a comment

LGTM

@nik9000
Member Author

nik9000 commented Jan 10, 2025

I'm going to spend some time this weekend thinking about how we can test it.

@nik9000
Member Author

nik9000 commented Jan 10, 2025

I'm going to spend some time this weekend thinking about how we can test it.

Er, "it" here being the memory being 5% of each node's memory. That seems super important.

@@ -35,7 +35,7 @@ public boolean equals(Object obj) {
         if (obj == this) return true;
         if (obj == null || obj.getClass() != this.getClass()) return false;
         var that = (LogicalOptimizerContext) obj;
-        return Objects.equals(this.configuration, that.configuration);
+        return this.configuration.equals(that.configuration) && this.foldCtx.equals(that.foldCtx);
Contributor

Should we update the hashCode implementation, too?

Member Author

Yes. Super important. You should block the merge for a mistake like that. Sorry.

It's not that it's likely to break anything, but it's sort of a bomb set up for someone years in the future.
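
For reference, a sketch of keeping the two methods consistent (field names as in the diff; everything else is assumed):

@Override
public boolean equals(Object obj) {
    if (obj == this) return true;
    if (obj == null || obj.getClass() != this.getClass()) return false;
    var that = (LogicalOptimizerContext) obj;
    return this.configuration.equals(that.configuration) && this.foldCtx.equals(that.foldCtx);
}

@Override
public int hashCode() {
    // Hash the same fields that equals compares, so equal contexts hash equally.
    return java.util.Objects.hash(configuration, foldCtx);
}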

             if (attr instanceof FieldAttribute fieldAttribute) {
-                Geometry geometry = SpatialRelatesUtils.makeGeometryFromLiteral(foldable);
+                Geometry geometry = SpatialRelatesUtils.makeGeometryFromLiteral(ctx, foldable);
Contributor

Oh yes, absolutely. Sorry, I didn't mean this to be a suggestion for this PR; just wanted to share an observation and see if you agree :)

Contributor

@alex-spies alex-spies left a comment

Thanks @nik9000 ! I think this PR is in a great state and is a great addition!

I think it makes sense to add a test to ensure the 5% of memory is relative to the current node.

Replacing the unbounded context by a 5% context would be better, but if that blows the PR's scope I think it's fine to do in a follow-up. (And I think you wanted to use this 5% limit in csv tests, too, although I think it's sufficient to use a limit in ITs.)

In any case, this could already be merged as-is IMHO. Nice!

@nik9000
Member Author

nik9000 commented Jan 13, 2025

OK! Status. I've got the unbounded->5% change in. I scanned the diff and it looks sane, and it passed tests for me locally. That should be fine, but I wouldn't be surprised if we have to poke tests some. unbounded can easily be rebuilt in tests if we want it.

I think the right way to test the 5%-is-relative-to-each-node's-memory thing is.... oh boy, that's going to take some time. For a few days we'll have to rely on my brain.

@nik9000
Member Author

nik9000 commented Jan 13, 2025

Follow-ups:

  • Remove calls to small.
  • Test that the heap limit is relative to each node's memory in a sane integration test.

@nik9000 nik9000 enabled auto-merge (squash) January 13, 2025 12:54
@nik9000 nik9000 merged commit c990377 into elastic:main Jan 13, 2025
16 checks passed
nik9000 added a commit to nik9000/elasticsearch that referenced this pull request Jan 14, 2025
@nik9000
Member Author

nik9000 commented Jan 14, 2025

Backport: #120100

martijnvg pushed a commit to martijnvg/elasticsearch that referenced this pull request Jan 14, 2025
nik9000 added a commit that referenced this pull request Jan 15, 2025
Labels
:Analytics/ES|QL (AKA ESQL), >bug, Team:Analytics (Meta label for analytical engine team (ESQL/Aggs/Geo)), v8.18.0, v9.0.0