Merge pull request #233 from mashhurs/esql-support
ES|QL support:
- Introduces the `query_type` param, which accepts the `dsl` or `esql` option.
- Adds an ES|QL executor that executes the ES|QL query and parses/maps the response to events.
Validations:
- Make sure LS (8.17.4+) supports ES|QL (new elasticsearch-ruby client).
- Make sure the connected ES is 8.11 or newer.
- Make sure the query isn't empty and is meaningful, i.e. it starts with command syntax.
- If `query_type` is `esql`, make sure we accept meaningful inputs and do not allow DSL-related params such as `response_type`, `index`, etc.
- Inform if the query isn't using METADATA, which adds `_id`, `_version` to the response entries.
- Inform about ineffective params such as `size`, `search_api`, `target` if users configure them.
ES|QL results return field names in a dotted format. The plugin reproduces the nested structure (example: {a.b.c: 'val'} => {'a':{'b':{'c':'val'}}}).
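The dotted-to-nested mapping described in the last line can be sketched in a few lines of Ruby. This is a minimal illustration, not the plugin's implementation: `nest_dotted_keys` is a hypothetical helper name, and the sketch assumes no column is both a concrete value and the parent of another column.

```ruby
# Hypothetical helper (not the plugin's actual method): expands dotted
# column names into nested hashes. Assumes no column is both a concrete
# value and a parent of another column.
def nest_dotted_keys(flat_row)
  flat_row.each_with_object({}) do |(name, value), nested|
    *parents, leaf = name.split('.')
    # Walk the intermediate hashes, creating them as needed.
    target = parents.reduce(nested) { |node, part| node[part] ||= {} }
    target[leaf] = value
  end
end

row = { 'a.b.c' => 'val', 'status.code' => 200, 'status.desc' => 'Success' }
nested = nest_dotted_keys(row)
# nested == { 'a' => { 'b' => { 'c' => 'val' } },
#             'status' => { 'code' => 200, 'desc' => 'Success' } }
```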
docs/index.asciidoc (122 additions & 4 deletions)
@@ -230,6 +230,110 @@ The next scheduled run:
 * uses {ref}/point-in-time-api.html#point-in-time-api[Point in time (PIT)] + {ref}/paginate-search-results.html#search-after[Search after] to paginate through all the data, and
 * updates the value of the field at the end of the pagination.

+[id="plugins-{type}s-{plugin}-esql"]
+==== {esql} support
+
+.Technical Preview
+****
+The {esql} feature that allows using ES|QL queries with this plugin is in Technical Preview.
+Configuration options and implementation details are subject to change in minor releases without being preceded by deprecation warnings.
+****
+
+{es} Query Language ({esql}) provides a SQL-like interface for querying your {es} data.
+
+To use {esql}, this plugin needs to be installed in {ls} 8.17.4 or newer, and must be connected to {es} 8.11 or newer.
+
+To configure an {esql} query in the plugin, set the `query_type` to `esql` and provide your {esql} query in the `query` parameter.
+
+IMPORTANT: {esql} is evolving and may still have limitations with regard to result size or supported field types. We recommend understanding https://www.elastic.co/guide/en/elasticsearch/reference/current/esql-limitations.html[ES|QL current limitations] before using it in production environments.
+
+The following is a basic scheduled {esql} query that runs hourly:
+[source, ruby]
+    input {
+      elasticsearch {
+        id => hourly_cron_job
+        hosts => [ 'https://..']
+        api_key => '....'
+        query_type => 'esql'
+        query => '
+          FROM food-index
+            | WHERE spicy_level == "hot" AND @timestamp > NOW() - 1 hour
+            | LIMIT 500
+        '
+        schedule => '0 * * * *' # every hour at min 0
+      }
+    }
+
+Set `config.support_escapes: true` in `logstash.yml` if you need to escape special characters in the query.
+
+NOTE: With an {esql} query, {ls} doesn't generate `event.original`.
+For this case, the plugin emits two events that look like:
+[source, json]
+    [
+      {
+        "timestamp": "2025-04-10T12:00:00",
+        "user_id": 123,
+        "action": "login",
+        "status": {
+          "code": 200,
+          "desc": "Success"
+        }
+      },
+      {
+        "timestamp": "2025-04-10T12:05:00",
+        "user_id": 456,
+        "action": "purchase",
+        "status": {
+          "code": 403,
+          "desc": "Forbidden (unauthorized user)"
+        }
+      }
+    ]
+
+NOTE: If your index has a mapping with sub-objects where `status.code` and `status.desc` are actually dotted fields, they appear in {ls} events as a nested structure.
+
+[id="plugins-{type}s-{plugin}-esql-multifields"]
+===== Conflict on multi-fields
+
+An {esql} query fetches all parent fields and sub-fields if your {es} index has https://www.elastic.co/docs/reference/elasticsearch/mapping-reference/multi-fields[multi-fields] or https://www.elastic.co/docs/reference/elasticsearch/mapping-reference/subobjects[subobjects].
+Since {ls} events cannot contain a parent field's concrete value and sub-field values together, the plugin ignores the sub-fields with a warning and includes the parent.
+We recommend using the `RENAME` (or `DROP` to avoid warnings) keyword in your {esql} query to explicitly rename the fields so that the sub-fields are included in the event.
+
+This is a common occurrence if your template or mapping follows the pattern of always indexing strings as a "text" (`field`) + "keyword" (`field.keyword`) multi-field.
+In this case it's recommended to do `KEEP field` if the string is identical and there is only one subfield, as the engine will optimize and retrieve the keyword; otherwise you can do `KEEP field.keyword | RENAME field.keyword as field`.
+
+To illustrate the situation with an example, assume your mapping has a `time` field with `time.min` and `time.max` sub-fields, as follows:
+[source, ruby]
+    "properties": {
+        "time": { "type": "long" },
+        "time.min": { "type": "long" },
+        "time.max": { "type": "long" }
+    }
+
+The {esql} result will contain all three fields, but the plugin cannot map them into a {ls} event.
+To avoid this, you can use the `RENAME` keyword to rename the `time` parent field so that all three fields have unique names.
+[source, ruby]
+    ...
+    query => 'FROM my-index | RENAME time AS time.current'
+    ...
+
+For comprehensive {esql} syntax reference and best practices, see the https://www.elastic.co/guide/en/elasticsearch/reference/current/esql-syntax.html[{esql} documentation].
+
 [id="plugins-{type}s-{plugin}-options"]
 ==== Elasticsearch Input configuration options

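The multi-field conflict documented in this hunk can be illustrated with a small sketch. It is not the plugin's code: `split_conflicting_columns` is a hypothetical helper that assumes a column is ignored whenever one of its dotted ancestors is itself a column, which is how the docs describe the `time` / `time.min` / `time.max` case.

```ruby
# Hypothetical helper: splits column names into those that can be mapped
# into an event and those ignored because an ancestor is itself a column.
def split_conflicting_columns(columns)
  columns.partition do |name|
    parts = name.split('.')
    # 'time.min' has the ancestor 'time'; a top-level name has none.
    ancestors = (1...parts.size).map { |n| parts.first(n).join('.') }
    (ancestors & columns).empty?
  end
end

kept, ignored = split_conflicting_columns(%w[time time.min time.max])
# kept    == ["time"]
# ignored == ["time.min", "time.max"]

# After `RENAME time AS time.current` no column is a parent of another.
renamed_kept, renamed_ignored = split_conflicting_columns(%w[time.current time.min time.max])
# renamed_ignored is empty
```

Under this assumption the `RENAME` in the docs works because it removes the only column that is also a parent, so all three values fit under one nested `time` object.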
@@ -257,6 +361,7 @@ Please check out <<plugins-{type}s-{plugin}-obsolete-options>> for details.
@@ -498,22 +603,35 @@ environment variables e.g. `proxy => '${LS_PROXY:}'`.
 * Value type is <<string,string>>
 * Default value is `'{ "sort": [ "_doc" ] }'`

-The query to be executed. Read the {ref}/query-dsl.html[Elasticsearch query DSL
-documentation] for more information.
+The query to be executed.
+Accepted query shape is DSL or {esql} (when `query_type => 'esql'`).
+Read the {ref}/query-dsl.html[{es} query DSL documentation] or {ref}/esql.html[{esql} documentation] for more information.

 When <<plugins-{type}s-{plugin}-search_api>> resolves to `search_after` and the query does not specify `sort`,
 the default sort `'{ "sort": { "_shard_doc": "asc" } }'` will be added to the query. Please refer to the {ref}/paginate-search-results.html#search-after[Elasticsearch search_after] parameter to know more.

+[id="plugins-{type}s-{plugin}-query_type"]
+===== `query_type`
+
+* Value can be `dsl` or `esql`
+* Default value is `dsl`
+
+Defines the <<plugins-{type}s-{plugin}-query>> shape.
+When `dsl`, the query shape must be a valid {es} JSON-style string.
+When `esql`, the query shape must be a valid {esql} string and `index`, `size`, `slices`, `search_api`, `docinfo`, `docinfo_target`, `docinfo_fields`, `response_type` and `tracking_field` parameters are not allowed.
+
 [id="plugins-{type}s-{plugin}-response_type"]
 ===== `response_type`

-* Value can be any of: `hits`, `aggregations`
+* Value can be any of: `hits`, `aggregations`, `esql`
 * Default value is `hits`

 Which part of the result to transform into Logstash events when processing the
 response from the query.
+
 The default `hits` will generate one event per returned document (i.e. "hit").
-When set to `aggregations`, a single Logstash event will be generated with the
+
+When set to `aggregations`, a single {ls} event will be generated with the
 contents of the `aggregations` object of the query's response. In this case the
 `hits` object will be ignored. The parameter `size` will always be set to
 0 regardless of the default or user-defined value set in this plugin.
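The `query_type` rule documented above (DSL-only parameters are rejected when `query_type` is `esql`) amounts to intersecting the configured keys with a fixed list. The following is a rough sketch under that reading; `disallowed_esql_options` and `ESQL_DISALLOWED_OPTIONS` are hypothetical names, and the parameter list is copied from the `query_type` text.

```ruby
# Parameter names copied from the `query_type` documentation.
ESQL_DISALLOWED_OPTIONS = %w[
  index size slices search_api docinfo docinfo_target
  docinfo_fields response_type tracking_field
].freeze

# Hypothetical helper: returns the configured options that conflict with
# `query_type => 'esql'`, or an empty array for any other query type.
def disallowed_esql_options(params)
  return [] unless params['query_type'] == 'esql'
  params.keys & ESQL_DISALLOWED_OPTIONS
end

esql_config = { 'query_type' => 'esql', 'query' => 'FROM food-index', 'index' => 'food-index', 'size' => 10 }
dsl_config  = { 'query_type' => 'dsl', 'index' => 'food-index', 'size' => 10 }

conflicts = disallowed_esql_options(esql_config)
# conflicts == ["index", "size"]
```

A caller could raise a configuration error whenever the returned array is non-empty, which matches the "not allowed" wording of the docs.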
+  LS_ESQL_SUPPORT_VERSION = "8.17.4" # the version started using elasticsearch-ruby v8
+  ES_ESQL_SUPPORT_VERSION = "8.11.0"
+
   def initialize(params={})
     super(params)

@@ -302,10 +312,17 @@ def register
     fill_hosts_from_cloud_id
     setup_ssl_params!

-    @base_query = LogStash::Json.load(@query)
-    if @slices
-      @base_query.include?('slice') && fail(LogStash::ConfigurationError, "Elasticsearch Input Plugin's `query` option cannot specify specific `slice` when configured to manage parallel slices with `slices` option")
-      @slices < 1 && fail(LogStash::ConfigurationError, "Elasticsearch Input Plugin's `slices` option must be greater than zero, got `#{@slices}`")
+      raise(LogStash::ConfigurationError, "Configured #{not_allowed_options} params are not allowed while using ES|QL query") if not_allowed_options&.size > 1
+    else
+      @base_query = LogStash::Json.load(@query)
+      if @slices
+        @base_query.include?('slice') && fail(LogStash::ConfigurationError, "Elasticsearch Input Plugin's `query` option cannot specify specific `slice` when configured to manage parallel slices with `slices` option")
+        @slices < 1 && fail(LogStash::ConfigurationError, "Elasticsearch Input Plugin's `slices` option must be greater than zero, got `#{@slices}`")
+      end
     end

     @retries < 0 && fail(LogStash::ConfigurationError, "Elasticsearch Input Plugin's `retries` option must be equal or greater than zero, got `#{@retries}`")
@@ -341,11 +358,13 @@ def register

     test_connection!

+    validate_es_for_esql_support!
+
     setup_serverless

     setup_search_api

-    setup_query_executor
+    @query_executor = create_query_executor

     setup_cursor_tracker

@@ -363,16 +382,6 @@ def run(output_queue)
     end
   end

-  def get_query_object
-    if @cursor_tracker
-      query = @cursor_tracker.inject_cursor(@query)
-      @logger.debug("new query is #{query}")
-    else
-      query = @query
-    end
-    LogStash::Json.load(query)
-  end
-
   ##
   # This can be called externally from the query_executor
+      fail("Current version of Logstash does not include Elasticsearch client which supports ES|QL. Please upgrade Logstash to at least #{LS_ESQL_SUPPORT_VERSION}")
+    end
+  end
+
+  def validate_esql_query!
+    fail(LogStash::ConfigurationError, "`query` cannot be empty") if @query.strip.empty?
+    fail("Connected Elasticsearch #{es_version} version does not support ES|QL. ES|QL feature requires at least Elasticsearch #{ES_ESQL_SUPPORT_VERSION} version.") unless es_supports_esql
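The start-up validations shown in these fragments (minimum Logstash and Elasticsearch versions, a non-empty query that starts with command syntax, and the METADATA hint) can be sketched in isolation as follows. This is not the plugin's implementation. The helper names are hypothetical, the `SOURCE_COMMANDS` list is an assumption about which ES|QL commands may open a query, and the version comparison uses `Gem::Version` purely as one reasonable choice.

```ruby
LS_ESQL_SUPPORT_VERSION = '8.17.4'
ES_ESQL_SUPPORT_VERSION = '8.11.0'
# Assumption: the ES|QL source commands a query is allowed to start with.
SOURCE_COMMANDS = %w[FROM ROW SHOW].freeze

# Compares versions segment by segment, so '8.9.0' sorts below '8.11.0'
# (a plain string comparison would get this wrong).
def version_at_least?(actual, minimum)
  Gem::Version.new(actual) >= Gem::Version.new(minimum)
end

# Hypothetical helper: returns an error message, or nil when the query
# is non-empty and starts with a source command.
def esql_query_error(query)
  stripped = query.strip
  return '`query` cannot be empty' if stripped.empty?
  first_word = stripped.split(/\s+/).first.upcase
  unless SOURCE_COMMANDS.include?(first_word)
    return "`query` must start with one of: #{SOURCE_COMMANDS.join(', ')}"
  end
  nil
end

# Hypothetical helper: true when the query mentions METADATA, which is
# what adds `_id` and `_version` to the response entries.
def uses_metadata?(query)
  query.match?(/\bMETADATA\b/i)
end

ls_ok = version_at_least?('8.17.4', LS_ESQL_SUPPORT_VERSION)
es_ok = version_at_least?('8.9.0', ES_ESQL_SUPPORT_VERSION)
# ls_ok is true, es_ok is false
```

Keeping the checks as small pure functions makes them easy to unit test without a running cluster; in the plugin itself they are tied to `register` and the connected client.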