ESQL: Limit memory usage of fold #118602

Merged — 30 commits merged into elastic:main on Jan 13, 2025

Conversation

nik9000
Member

@nik9000 nik9000 commented Dec 12, 2024

fold can be surprisingly heavy! The maximally efficient/paranoid thing would be to fold each expression one time, in the constant folding rule, and then store the result as a Literal. But this PR doesn't do that because it's a big change. Instead, it creates the infrastructure for tracking memory usage during folding and plugs it into as many places as possible. That's not perfect, but it's better.

This infrastructure limits the allocations of fold, similar to the CircuitBreaker infrastructure we use for values, but it's different in a critical way: you don't manually free any of the values. This is important because the plan itself isn't Releasable, which is required when using a real CircuitBreaker. We could have tried to make the plan releasable, but that'd be a huge change.

Right now there's a single limit of 5% of heap per query. We create the limit at the start of query planning and use it throughout planning.

There are about 40 places that don't yet use it. We should get them plugged in as quickly as we can manage. After that, we should look at the maximally efficient/paranoid thing I mentioned above: folding once during constant folding. That's an even bigger change, one I'm not equipped to make on my own.
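
To make the shape of this concrete, here's a minimal sketch of the tracking idea. This is not the actual class; the method name and exception type are assumptions. The point is a context that only counts allocations up toward a fixed limit and never frees anything:

/**
 * Sketch of an allocation tracker for folding. Unlike a CircuitBreaker,
 * nothing is ever released: folded values live as long as the plan does,
 * so the context only counts up toward a fixed limit.
 */
public class FoldContext {
    private final long limitBytes;
    private long allocatedBytes;

    public FoldContext(long limitBytes) {
        this.limitBytes = limitBytes;
    }

    /** Record an allocation made while folding, failing if the limit is exceeded. */
    public void trackAllocation(long bytes) {
        allocatedBytes += bytes;
        if (allocatedBytes > limitBytes) {
            // The real code would fail the query with a proper exception type;
            // IllegalStateException is just a stand-in here.
            throw new IllegalStateException("fold used more than [" + limitBytes + "] bytes");
        }
    }
}

Every fold implementation that allocates would call something like trackAllocation before building its result.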

@nik9000 nik9000 requested a review from a team as a code owner December 12, 2024 18:37
@elasticsearchmachine elasticsearchmachine added the Team:Analytics label (Meta label for analytical engine team (ESQL/Aggs/Geo)) Dec 12, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine
Collaborator

Hi @nik9000, I've created a changelog YAML for you.

@nik9000
Member Author

nik9000 commented Dec 13, 2024

This is sort of the next step in #118101. Or, I guess, #118101 was really an attempt to unblock as many of the unbounded uses as I could. After working more on this I realized I wasn't going to zap them all in pre-PRs. There are, like I said in the description, about 40 of them that all need looking at. Maybe ~10 PRs?

@nik9000
Member Author

nik9000 commented Dec 13, 2024

I shall defeat you, merge conflicts!

@nik9000
Member Author

nik9000 commented Dec 13, 2024

(image)

@nik9000
Member Author

nik9000 commented Dec 13, 2024

run elasticsearch-ci/part-1

Contributor

@ivancea ivancea left a comment

LGTM 😰 The bulk of the trivial changes are here at least; let's hope future PRs get smaller, haha

/**
* {@link Expression#fold} using a small amount of memory.
*/
public static FoldContext small() {
Contributor

Are we using this? It's unbounded

Member Author

I'll double check before merging. We really should only have one unbounded one.

Member Author

At some point I was playing around with the idea that you could know up front that a fold would always be "small" so you didn't have to limit it. But that's:

  1. Impossible to know
  2. Weird to want to do. If you know it's small then limit it.

Member Author

I've since made it bounded and, well, as big as the default.

Member Author

I can't read the limit from the user request here, because the whole point is to have a context I can build statically, without plumbing changes. But it has the default size!

Comment on lines 92 to 93
BigArrays.NON_RECYCLING_INSTANCE,
new BlockFactory(ctx.circuitBreakerView(source), BigArrays.NON_RECYCLING_INSTANCE)
Contributor

Those BigArrays aren't being used; are we sure the BlockFactory methods are enough? Some arrays in there are accounted for by the BigArrays, right? So we're missing part of them.

Member Author

OOOH. Probably not. Let me have a look at wiring those in too.

@nik9000
Member Author

nik9000 commented Dec 27, 2024

I'd love to get another set of eyes on this one before I merge it. I'll resolve conflicts now though.

Contributor

@luigidellaquila luigidellaquila left a comment

LGTM, thanks Nik.
I just left a comment

* {@link Expression#fold} using any amount of memory. Only safe for tests.
*/
public static FoldContext unbounded() {
return new FoldContext(Long.MAX_VALUE);
Contributor

Since we are still using it in a few places, wouldn't it make sense to default it to the same 5% of memory as the default in the query pragmas (or a slightly higher value), rather than making it really unbounded? It's a bit paranoid, maybe, but I guess the final goal here is safety.

Member Author

I'm hopeful we can remove the usages in a follow-up. Let's get this in and have a conversation about whether or not we should replace unbounded with small.

Contributor

Did we want to give this the 5% limit before merging this?

Contributor

@astefan astefan left a comment

I've looked at the ComputeService mostly; it's an intricate core piece of ES|QL, and every time I look at it I need to be extra careful in how I read the code, considering all the execution paths the code takes.

So, please take my comments there with some patience :-). My comment related to LocalExecutionPlanner might be wrong due to how the code is called from many places and how the folding context is passed around, but the one related to the query pragma for data nodes might be valid (?), or a learning opportunity for me regarding when the %-of-heap memory size value is computed.

@@ -161,13 +161,14 @@ public LocalExecutionPlanner(
     /**
      * turn the given plan into a list of drivers to execute
      */
-    public LocalExecutionPlan plan(PhysicalPlan localPhysicalPlan) {
+    public LocalExecutionPlan plan(FoldContext foldCtx, PhysicalPlan localPhysicalPlan) {
Contributor

Why do you need to pass a FoldContext as a parameter here?
From what I can tell, the constructor of LocalExecutionPlanner already has the needed pragma to build the folding context.

Also, it's unfortunate that we must build the PhysicalOperationProviders (which also takes a folding context) outside the LocalExecutionPlanner (for testing purposes, that is); otherwise PhysicalOperationProviders could have been built as part of the LocalExecutionPlanner and would have gotten the same folding context as the surrounding class.

Member Author

I had wanted to use a single context so we'd share the same limit across the whole process. I think at some point we'll no longer need FoldContext at the physical level at all, so maybe it's moot?

Member Author

I'm going to double check exactly how much sharing I get here.

Member Author

Right. We share with the LocalLogicalPlanOptimizer and LocalPhysicalPlanOptimizer.

clusterAlias,
searchContexts,
configuration,
new FoldContext(configuration.pragmas().foldLimit().getBytes()),
Contributor

From what I can tell, this foldLimit here is the one coming from the coordinator. And if it's 5% of the heap size, is this the heap of the coordinator or the heap of the data node?

Contributor

Oh, that's a very good point. ++

Member Author

Oh fun! I'll resolve the rest of this and then dig into this.

Comment on lines 93 to 94
Object lowerValue = lower.fold(FoldContext.unbounded());
Object upperValue = lower.fold(FoldContext.unbounded());
Contributor

This is wrong. Copy-pasta mistake probably.
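
Presumably the intended code is (a two-line sketch, assuming `upper` is the corresponding upper bound expression):

Object lowerValue = lower.fold(FoldContext.unbounded());
Object upperValue = upper.fold(FoldContext.unbounded());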

Contributor

Keen eyes!

Contributor

After merging main into this, this should become a non-issue because Range doesn't fold there anymore.

Member Author

Fun! It's interesting that tests didn't catch this. Let me make a test that fails....

Contributor

Indeed. There are many edge cases; we can create tests for as many as we can think of, but there will always be something left behind. I've come to believe it's impossible to cover everything.

Member Author

Er, Alex's test catches it already.

Member Author

Yeah. I'm not accusing at all. We try to cover it all. I try not to make mistakes too. But I figure if I've made a mistake once, it's good to have a test case for it.

Contributor

To be fair, I think we just never fold ranges in actual queries, at all; so it makes sense that no test caught this in the past. It doesn't help that Range was inherited from ql.

@@ -49,7 +49,7 @@ protected LogicalPlan rule(Aggregate aggregate) {
             && alias.child() instanceof AggregateFunction aggFunction
             && aggFunction.hasFilter()
             && aggFunction.filter() instanceof Literal literal
-            && Boolean.FALSE.equals(literal.fold())) {
+            && Boolean.FALSE.equals(literal.value())) {
Contributor

👍

Contributor

@alex-spies alex-spies left a comment

Heya, I'm not sure if you already had a chance to address my last batch of comments (before this one). I resolved all remarks that I think are done now, but there are still a couple that may be good to check.

The most important thing from my review, unit tests for the fold context, has been addressed, so this nearly LGTM.

However, two major points remain:

  • I think @astefan raised an important question about whether the memory limit on the data nodes is correctly determined, or if it maybe wrongly takes 5% of the coordinator node's memory. I'm also interested in this.
  • @luigidellaquila suggested defaulting the unbounded context to 5% of memory as well; do we want to address this in this PR?

* {@link Expression#fold} using any amount of memory. Only safe for tests.
*/
public static FoldContext unbounded() {
return new FoldContext(Long.MAX_VALUE);
Contributor

Did we want to give this the 5% limit before merging this?

Comment on lines 93 to 94
Object lowerValue = lower.fold(FoldContext.unbounded());
Object upperValue = lower.fold(FoldContext.unbounded());
Contributor

After merging main into this, this should become a non-issue because Range doesn't fold there anymore.

@nik9000
Member Author

nik9000 commented Jan 10, 2025

  • I think @astefan raised an important question about whether the memory limit on the data nodes is correctly determined, or if it maybe wrongly takes 5% of the coordinator node's memory. I'm also interested in this.

OK! I'd sort of assumed it was the data node without checking, and that was bad. It is, indeed, the data node. Well, it's 5% of the heap of whichever node reads the Setting, so it's 5% of whatever node calls foldLimit. That's because we use the Settings object which is, at its heart, a TreeMap, and, in our case, an empty one 99% of the time. (See the sketch after this comment for roughly what that resolution amounts to.)

  • @luigidellaquila suggested defaulting the unbounded context to 5% of memory as well; do we want to address this in this PR?

Yeah! I'll do it now.
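
A minimal, plain-Java sketch of why a "5%" memory-size value is always relative to the node that resolves it. This is an illustration, not the actual Settings plumbing: the percentage is applied to the local JVM's max heap at lookup time.

public final class FoldLimitSketch {
    /** Resolve a fraction-of-heap limit against whichever JVM runs this code. */
    static long resolveFoldLimitBytes(double fractionOfHeap) {
        long maxHeap = Runtime.getRuntime().maxMemory(); // heap of *this* node's JVM
        return (long) (maxHeap * fractionOfHeap);
    }

    public static void main(String[] args) {
        // On the coordinator this is 5% of the coordinator's heap;
        // on a data node it is 5% of that data node's heap.
        System.out.println("fold limit: " + resolveFoldLimitBytes(0.05) + " bytes");
    }
}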

Contributor

@astefan astefan left a comment

LGTM

@nik9000
Member Author

nik9000 commented Jan 10, 2025

I'm going to spend some time this weekend thinking about how we can test it.

@nik9000
Member Author

nik9000 commented Jan 10, 2025

I'm going to spend some time this weekend thinking about how we can test it.

Er, "it" here being the memory being 5% of each node's memory. That seems super important.

@@ -35,7 +35,7 @@ public boolean equals(Object obj) {
         if (obj == this) return true;
         if (obj == null || obj.getClass() != this.getClass()) return false;
         var that = (LogicalOptimizerContext) obj;
-        return Objects.equals(this.configuration, that.configuration);
+        return this.configuration.equals(that.configuration) && this.foldCtx.equals(that.foldCtx);
Contributor

Should we update the hashCode implementation, too?

Member Author

Yes. Super important. You should block the merge for a mistake like that. Sorry.

It's not that it's likely to break anything, but it's sort of a bomb set up for someone years in the future.
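
For reference, a sketch of keeping the two methods consistent (field names as in the diff; everything else is assumed):

@Override
public boolean equals(Object obj) {
    if (obj == this) return true;
    if (obj == null || obj.getClass() != this.getClass()) return false;
    var that = (LogicalOptimizerContext) obj;
    return this.configuration.equals(that.configuration) && this.foldCtx.equals(that.foldCtx);
}

@Override
public int hashCode() {
    // Hash the same fields that equals compares, so equal contexts hash equally.
    return java.util.Objects.hash(configuration, foldCtx);
}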

             if (attr instanceof FieldAttribute fieldAttribute) {
-                Geometry geometry = SpatialRelatesUtils.makeGeometryFromLiteral(foldable);
+                Geometry geometry = SpatialRelatesUtils.makeGeometryFromLiteral(ctx, foldable);
Contributor

Oh yes, absolutely. Sorry, I didn't mean this to be a suggestion for this PR; just wanted to share an observation and see if you agree :)

Contributor

@alex-spies alex-spies left a comment

Thanks @nik9000 ! I think this PR is in a great state and is a great addition!

I think it makes sense to add a test to ensure the 5% of memory is relative to the current node.

Replacing the unbounded context by a 5% context would be better, but if that blows the PR's scope I think it's fine to do in a follow-up. (And I think you wanted to use this 5% limit in csv tests, too, although I think it's sufficient to use a limit in ITs.)

In any case, this could already be merged as-is IMHO. Nice!

@nik9000
Member Author

nik9000 commented Jan 13, 2025

OK! Status. I've got the unbounded->5% change in. I scanned the diff and it looks sane, and it passed tests for me locally. That should be fine, but I wouldn't be surprised if we have to poke tests some. unbounded can easily be rebuilt in tests if we want it.

I think the right way to test the 5%-is-relative-to-each-node's-memory thing is.... oh boy, that's going to take some time. For a few days we'll have to rely on my brain.

@nik9000
Member Author

nik9000 commented Jan 13, 2025

Follow-ups:

  • Remove calls to small.
  • Test that the heap limit is relative to each node's memory in a sane integration test.

@nik9000 nik9000 enabled auto-merge (squash) January 13, 2025 12:54
@nik9000 nik9000 merged commit c990377 into elastic:main Jan 13, 2025
16 checks passed
nik9000 added a commit to nik9000/elasticsearch that referenced this pull request Jan 14, 2025
@nik9000
Member Author

nik9000 commented Jan 14, 2025

Backport: #120100

martijnvg pushed a commit to martijnvg/elasticsearch that referenced this pull request Jan 14, 2025
nik9000 added a commit that referenced this pull request Jan 15, 2025
Labels
:Analytics/ES|QL (AKA ESQL), >bug, Team:Analytics (Meta label for analytical engine team (ESQL/Aggs/Geo)), v8.18.0, v9.0.0