Skip to content

Optimize ST_EXTENT_AGG for geo_shape and cartesian_shape #119889

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 39 commits into from
Jan 16, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
39 commits
Select commit Hold shift + click to select a range
ec4f4e7
Support extracting extent from doc values as int[] for both cartesian…
craigtaverner Jan 9, 2025
daedc53
Merge remote-tracking branch 'origin/main' into st_extent_optimize_ge…
craigtaverner Jan 9, 2025
7fdef3e
Merge branch 'main' into st_extent_optimize_geoshape
craigtaverner Jan 9, 2025
5c79f04
Update docs/changelog/119889.yaml
craigtaverner Jan 9, 2025
f7f9403
Fix changelog
craigtaverner Jan 9, 2025
898d1ba
Added capability to block mixed-cluster tests which fail
craigtaverner Jan 10, 2025
2ade43c
Merge branch 'main' into st_extent_optimize_geoshape
craigtaverner Jan 10, 2025
a32e45c
Revert "Added capability to block mixed-cluster tests which fail"
craigtaverner Jan 10, 2025
26d3708
Change ST_EXTENT_AGG tests on airport_city_boundary to avoid multi-cl…
craigtaverner Jan 10, 2025
ad3f846
Add back non-ENRICH versions to be sure we have multi-shape aggs
craigtaverner Jan 10, 2025
6a7e517
Merge branch 'main' into st_extent_optimize_geoshape
craigtaverner Jan 10, 2025
49e912a
Bring back capability because the mixed-cluster tests failed on the c…
craigtaverner Jan 10, 2025
2716c03
Reorder parameters in geo_point/geo_shape aggregators
craigtaverner Jan 10, 2025
f877c9e
Merge branch 'main' into st_extent_optimize_geoshape
craigtaverner Jan 10, 2025
a503549
Merge branch 'main' into st_extent_optimize_geoshape
craigtaverner Jan 13, 2025
5284203
Merge branch 'main' into st_extent_optimize_geoshape
craigtaverner Jan 13, 2025
9c45f72
Code review updates
craigtaverner Jan 14, 2025
acd58ea
Merge branch 'st_extent_optimize_geoshape' of github.com:craigtaverne…
craigtaverner Jan 14, 2025
44d0392
Merge branch 'main' into st_extent_optimize_geoshape
craigtaverner Jan 14, 2025
7c3957c
Merge branch 'main' into st_extent_optimize_geoshape
craigtaverner Jan 14, 2025
a05c435
One more code review comment
craigtaverner Jan 15, 2025
2295cd9
Merge branch 'st_extent_optimize_geoshape' of github.com:craigtaverne…
craigtaverner Jan 15, 2025
8adf8db
Two more code review comments
craigtaverner Jan 15, 2025
7b63b8a
Support optimization for cartesian shapes
craigtaverner Jan 15, 2025
b517801
Merge branch 'main' into st_extent_optimize_geoshape
craigtaverner Jan 15, 2025
eaa5b21
Merge branch 'main' into st_extent_optimize_geoshape
craigtaverner Jan 15, 2025
229aacb
Merge branch 'main' into st_extent_optimize_geoshape
craigtaverner Jan 15, 2025
7379c8f
Merge branch 'main' into st_extent_optimize_geoshape
craigtaverner Jan 15, 2025
9fb92b1
Updated SpatialGeometryFieldMapperTests and identified bug
craigtaverner Jan 15, 2025
86f350f
Merge branch 'main' into st_extent_optimize_geoshape
craigtaverner Jan 15, 2025
e13bf12
Merge branch 'st_extent_optimize_geoshape' of github.com:craigtaverne…
craigtaverner Jan 15, 2025
736285d
Merge branch 'main' into st_extent_optimize_geoshape
craigtaverner Jan 15, 2025
3332224
Support missing groupIds in intermediate state creation
craigtaverner Jan 16, 2025
394cc19
Merge branch 'st_extent_optimize_geoshape' of github.com:craigtaverne…
craigtaverner Jan 16, 2025
a66c67b
Merge branch 'main' into st_extent_optimize_geoshape
craigtaverner Jan 16, 2025
f0b7ab1
Added more cartesian tests with larger result sets
craigtaverner Jan 16, 2025
1100bf3
Merge branch 'main' into st_extent_optimize_geoshape
craigtaverner Jan 16, 2025
77d639b
Merge branch 'main' into st_extent_optimize_geoshape
craigtaverner Jan 16, 2025
ffb590f
Merge branch 'main' into st_extent_optimize_geoshape
craigtaverner Jan 16, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions docs/changelog/119889.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
pr: 119889
summary: Optimize ST_EXTENT_AGG for `geo_shape` and `cartesian_shape`
area: "ES|QL"
type: enhancement
issues: []
Original file line number Diff line number Diff line change
Expand Up @@ -116,6 +116,9 @@ public interface PointVisitor {
boolean isValid();

Rectangle getResult();

/** To allow for memory optimizations through object reuse, the visitor can be reset to its initial state. */
void reset();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding a line of documentation on what this is supposed to do.

}

/**
Expand All @@ -124,18 +127,14 @@ public interface PointVisitor {
*/
public static class CartesianPointVisitor implements PointVisitor {
private double minX = Double.POSITIVE_INFINITY;
private double minY = Double.POSITIVE_INFINITY;
private double maxX = Double.NEGATIVE_INFINITY;
private double maxY = Double.NEGATIVE_INFINITY;
private double minY = Double.POSITIVE_INFINITY;

public double getMinX() {
return minX;
}

public double getMinY() {
return minY;
}

public double getMaxX() {
return maxX;
}
Expand All @@ -144,12 +143,16 @@ public double getMaxY() {
return maxY;
}

public double getMinY() {
return minY;
}

@Override
public void visitPoint(double x, double y) {
minX = Math.min(minX, x);
minY = Math.min(minY, y);
maxX = Math.max(maxX, x);
maxY = Math.max(maxY, y);
minY = Math.min(minY, y);
}

@Override
Expand All @@ -160,9 +163,9 @@ public void visitRectangle(double minX, double maxX, double maxY, double minY) {
);
}
this.minX = Math.min(this.minX, minX);
this.minY = Math.min(this.minY, minY);
this.maxX = Math.max(this.maxX, maxX);
this.maxY = Math.max(this.maxY, maxY);
this.minY = Math.min(this.minY, minY);
}

@Override
Expand All @@ -174,6 +177,14 @@ public boolean isValid() {
public Rectangle getResult() {
return new Rectangle(minX, maxX, maxY, minY);
}

@Override
public void reset() {
minX = Double.POSITIVE_INFINITY;
maxX = Double.NEGATIVE_INFINITY;
maxY = Double.NEGATIVE_INFINITY;
minY = Double.POSITIVE_INFINITY;
}
}

/**
Expand All @@ -186,82 +197,117 @@ public Rectangle getResult() {
* </ul>
*/
public static class GeoPointVisitor implements PointVisitor {
protected double minY = Double.POSITIVE_INFINITY;
protected double maxY = Double.NEGATIVE_INFINITY;
protected double minNegX = Double.POSITIVE_INFINITY;
protected double maxNegX = Double.NEGATIVE_INFINITY;
protected double minPosX = Double.POSITIVE_INFINITY;
protected double maxPosX = Double.NEGATIVE_INFINITY;
protected double top = Double.NEGATIVE_INFINITY;
protected double bottom = Double.POSITIVE_INFINITY;
protected double negLeft = Double.POSITIVE_INFINITY;
protected double negRight = Double.NEGATIVE_INFINITY;
protected double posLeft = Double.POSITIVE_INFINITY;
protected double posRight = Double.NEGATIVE_INFINITY;

private final WrapLongitude wrapLongitude;

public GeoPointVisitor(WrapLongitude wrapLongitude) {
this.wrapLongitude = wrapLongitude;
}

public double getTop() {
return top;
}

public double getBottom() {
return bottom;
}

public double getNegLeft() {
return negLeft;
}

public double getNegRight() {
return negRight;
}

public double getPosLeft() {
return posLeft;
}

public double getPosRight() {
return posRight;
}

@Override
public void visitPoint(double x, double y) {
minY = Math.min(minY, y);
maxY = Math.max(maxY, y);
bottom = Math.min(bottom, y);
top = Math.max(top, y);
visitLongitude(x);
}

@Override
public void visitRectangle(double minX, double maxX, double maxY, double minY) {
this.minY = Math.min(this.minY, minY);
this.maxY = Math.max(this.maxY, maxY);
// TODO: Fix bug with rectangle crossing the dateline (see Extent.addRectangle for correct behaviour)
this.bottom = Math.min(this.bottom, minY);
this.top = Math.max(this.top, maxY);
visitLongitude(minX);
visitLongitude(maxX);
}

private void visitLongitude(double x) {
if (x >= 0) {
minPosX = Math.min(minPosX, x);
maxPosX = Math.max(maxPosX, x);
posLeft = Math.min(posLeft, x);
posRight = Math.max(posRight, x);
} else {
minNegX = Math.min(minNegX, x);
maxNegX = Math.max(maxNegX, x);
negLeft = Math.min(negLeft, x);
negRight = Math.max(negRight, x);
}
}

@Override
public boolean isValid() {
return minY != Double.POSITIVE_INFINITY;
return bottom != Double.POSITIVE_INFINITY;
}

@Override
public Rectangle getResult() {
return getResult(minNegX, minPosX, maxNegX, maxPosX, maxY, minY, wrapLongitude);
return getResult(top, bottom, negLeft, negRight, posLeft, posRight, wrapLongitude);
}

@Override
public void reset() {
bottom = Double.POSITIVE_INFINITY;
top = Double.NEGATIVE_INFINITY;
negLeft = Double.POSITIVE_INFINITY;
negRight = Double.NEGATIVE_INFINITY;
posLeft = Double.POSITIVE_INFINITY;
posRight = Double.NEGATIVE_INFINITY;
}

protected static Rectangle getResult(
double minNegX,
double minPosX,
double maxNegX,
double maxPosX,
double maxY,
double minY,
public static Rectangle getResult(
double top,
double bottom,
double negLeft,
double negRight,
double posLeft,
double posRight,
WrapLongitude wrapLongitude
) {
assert Double.isFinite(maxY);
if (Double.isInfinite(minPosX)) {
return new Rectangle(minNegX, maxNegX, maxY, minY);
} else if (Double.isInfinite(minNegX)) {
return new Rectangle(minPosX, maxPosX, maxY, minY);
assert Double.isFinite(top);
if (posRight == Double.NEGATIVE_INFINITY) {
return new Rectangle(negLeft, negRight, top, bottom);
} else if (negLeft == Double.POSITIVE_INFINITY) {
return new Rectangle(posLeft, posRight, top, bottom);
} else {
return switch (wrapLongitude) {
case NO_WRAP -> new Rectangle(minNegX, maxPosX, maxY, minY);
case WRAP -> maybeWrap(minNegX, minPosX, maxNegX, maxPosX, maxY, minY);
case NO_WRAP -> new Rectangle(negLeft, posRight, top, bottom);
case WRAP -> maybeWrap(top, bottom, negLeft, negRight, posLeft, posRight);
};
}
}

private static Rectangle maybeWrap(double minNegX, double minPosX, double maxNegX, double maxPosX, double maxY, double minY) {
double unwrappedWidth = maxPosX - minNegX;
double wrappedWidth = 360 + maxNegX - minPosX;
private static Rectangle maybeWrap(double top, double bottom, double negLeft, double negRight, double posLeft, double posRight) {
double unwrappedWidth = posRight - negLeft;
double wrappedWidth = 360 + negRight - posLeft;
return unwrappedWidth <= wrappedWidth
? new Rectangle(minNegX, maxPosX, maxY, minY)
: new Rectangle(minPosX, maxNegX, maxY, minY);
? new Rectangle(negLeft, posRight, top, bottom)
: new Rectangle(posLeft, negRight, top, bottom);
}
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,7 @@
import org.elasticsearch.index.IndexVersions;
import org.elasticsearch.index.analysis.NamedAnalyzer;
import org.elasticsearch.index.mapper.AbstractShapeGeometryFieldMapper;
import org.elasticsearch.index.mapper.BlockLoader;
import org.elasticsearch.index.mapper.DocumentParserContext;
import org.elasticsearch.index.mapper.DocumentParsingException;
import org.elasticsearch.index.mapper.FieldMapper;
Expand All @@ -46,7 +47,6 @@
import org.elasticsearch.legacygeo.builders.ShapeBuilder;
import org.elasticsearch.legacygeo.parsers.ShapeParser;
import org.elasticsearch.legacygeo.query.LegacyGeoShapeQueryProcessor;
import org.elasticsearch.lucene.spatial.CoordinateEncoder;
import org.elasticsearch.xcontent.XContentBuilder;
import org.elasticsearch.xcontent.XContentParser;
import org.locationtech.spatial4j.shape.Point;
Expand Down Expand Up @@ -84,6 +84,7 @@
* "field" : "POLYGON ((100.0 0.0, 101.0 0.0, 101.0 1.0, 100.0 1.0, 100.0 0.0))
*
* @deprecated use the field mapper in the spatial module
* TODO: Remove this class once we no longer need to supported reading 7.x indices that might have this field type
*/
@Deprecated
public class LegacyGeoShapeFieldMapper extends AbstractShapeGeometryFieldMapper<ShapeBuilder<?, ?, ?>> {
Expand Down Expand Up @@ -533,14 +534,9 @@ public PrefixTreeStrategy resolvePrefixTreeStrategy(String strategyName) {
}

@Override
protected boolean isBoundsExtractionSupported() {
// Extracting bounds for geo shapes is not implemented yet.
return false;
}

@Override
protected CoordinateEncoder coordinateEncoder() {
return CoordinateEncoder.GEO;
public BlockLoader blockLoader(BlockLoaderContext blContext) {
// Legacy geo-shapes do not support doc-values, we can only lead from source in ES|QL
return blockLoaderFromSource(blContext);
}
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -180,12 +180,6 @@ protected Object parseSourceValue(Object value) {
};
}

@Override
public BlockLoader blockLoader(BlockLoaderContext blContext) {
// Currently we can only load from source in ESQL
return blockLoaderFromSource(blContext);
}

protected BlockLoader blockLoaderFromSource(BlockLoaderContext blContext) {
ValueFetcher fetcher = valueFetcher(blContext.sourcePaths(name()), nullValue, GeometryFormatterFactory.WKB);
// TODO consider optimization using BlockSourceReader.lookupFromFieldNames(blContext.fieldNames(), name())
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -10,16 +10,12 @@

import org.apache.lucene.index.BinaryDocValues;
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.util.BytesRef;
import org.elasticsearch.common.Explicit;
import org.elasticsearch.common.geo.Orientation;
import org.elasticsearch.geometry.Rectangle;
import org.elasticsearch.geometry.utils.WellKnownBinary;
import org.elasticsearch.lucene.spatial.CoordinateEncoder;
import org.elasticsearch.lucene.spatial.Extent;
import org.elasticsearch.lucene.spatial.GeometryDocValueReader;

import java.io.IOException;
import java.nio.ByteOrder;
import java.util.Map;
import java.util.function.Function;

Expand Down Expand Up @@ -75,29 +71,27 @@ public Orientation orientation() {

@Override
protected Object nullValueAsSource(T nullValue) {
// we don't support null value fors shapes
// we don't support null value for shapes
return nullValue;
}

@Override
public BlockLoader blockLoader(BlockLoaderContext blContext) {
return blContext.fieldExtractPreference() == FieldExtractPreference.EXTRACT_SPATIAL_BOUNDS && isBoundsExtractionSupported()
? new BoundsBlockLoader(name(), coordinateEncoder())
: blockLoaderFromSource(blContext);
}

protected abstract boolean isBoundsExtractionSupported();

protected abstract CoordinateEncoder coordinateEncoder();

// Visible for testing
static class BoundsBlockLoader extends BlockDocValuesReader.DocValuesBlockLoader {
protected static class BoundsBlockLoader extends BlockDocValuesReader.DocValuesBlockLoader {
private final String fieldName;
private final CoordinateEncoder encoder;

BoundsBlockLoader(String fieldName, CoordinateEncoder encoder) {
protected BoundsBlockLoader(String fieldName) {
this.fieldName = fieldName;
this.encoder = encoder;
}

protected void writeExtent(BlockLoader.IntBuilder builder, Extent extent) {
// We store the 6 values as a single multi-valued field, in the same order as the fields in the Extent class
builder.beginPositionEntry();
builder.appendInt(extent.top);
builder.appendInt(extent.bottom);
builder.appendInt(extent.negLeft);
builder.appendInt(extent.negRight);
builder.appendInt(extent.posLeft);
builder.appendInt(extent.posRight);
builder.endPositionEntry();
}

@Override
Expand All @@ -107,7 +101,7 @@ public BlockLoader.AllReader reader(LeafReaderContext context) throws IOExceptio
public BlockLoader.Block read(BlockLoader.BlockFactory factory, BlockLoader.Docs docs) throws IOException {
var binaryDocValues = context.reader().getBinaryDocValues(fieldName);
var reader = new GeometryDocValueReader();
try (var builder = factory.bytesRefs(docs.count())) {
try (var builder = factory.ints(docs.count())) {
for (int i = 0; i < docs.count(); i++) {
read(binaryDocValues, docs.get(i), reader, builder);
}
Expand All @@ -119,27 +113,17 @@ public BlockLoader.Block read(BlockLoader.BlockFactory factory, BlockLoader.Docs
public void read(int docId, BlockLoader.StoredFields storedFields, BlockLoader.Builder builder) throws IOException {
var binaryDocValues = context.reader().getBinaryDocValues(fieldName);
var reader = new GeometryDocValueReader();
read(binaryDocValues, docId, reader, (BytesRefBuilder) builder);
read(binaryDocValues, docId, reader, (IntBuilder) builder);
}

private void read(BinaryDocValues binaryDocValues, int doc, GeometryDocValueReader reader, BytesRefBuilder builder)
private void read(BinaryDocValues binaryDocValues, int doc, GeometryDocValueReader reader, IntBuilder builder)
throws IOException {
if (binaryDocValues.advanceExact(doc) == false) {
builder.appendNull();
return;
}
reader.reset(binaryDocValues.binaryValue());
var extent = reader.getExtent();
// This is rather silly: an extent is already encoded as ints, but we convert it to Rectangle to
// preserve its properties as a WKB shape, only to convert it back to ints when we compute the
// aggregation. An obvious optimization would be to avoid this back-and-forth conversion.
var rectangle = new Rectangle(
encoder.decodeX(extent.minX()),
encoder.decodeX(extent.maxX()),
encoder.decodeY(extent.maxY()),
encoder.decodeY(extent.minY())
);
builder.appendBytesRef(new BytesRef(WellKnownBinary.toWKB(rectangle, ByteOrder.LITTLE_ENDIAN)));
writeExtent(builder, reader.getExtent());
}

@Override
Expand All @@ -151,7 +135,7 @@ public boolean canReuse(int startingDocID) {

@Override
public BlockLoader.Builder builder(BlockLoader.BlockFactory factory, int expectedCount) {
return factory.bytesRefs(expectedCount);
return factory.ints(expectedCount);
}
}
}
Expand Down
Loading