[FEATURE] Perform big data performance and regression tests

## Is your feature request related to a problem?
Issue from the customer: https://opensearch.slack.com/archives/C0526AVT84S/p1689036508739229
> For context, `my_alias` points to 800 indices, each index is sorted by `hw_id` and `snapshot_day`, each index has only one unique value of `snapshot_day` and each index has ~750 million records distributed on 10 shards of ~30GB each.
> 
> The cluster is in AWS, has 20 data nodes and 3 master nodes. All are `r6g.12xlarge.search`. 

SQL query runs ~8 seconds:
```sql
select * from my_alias where hw_id = 'abcd' and (snapshot_day between cast('2023-04-27' as date) and cast('2023-05-03' as date) or snapshot_day between cast('2023-06-30' as date) and cast('2023-07-05' as date)) limit 10000
```
DSL equivalent runs ~4:
```json
{
  "from": 0,
  "size": 10000,
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "hw_id": "abcd"
          }
        },
        {
          "bool": {
            "should": [
              {
                "range": {
                  "snapshot_day": {
                    "gte": "2023-04-27",
                    "lte": "2023-05-03"
                  }
                }
              },
              {
                "range": {
                  "snapshot_day": {
                    "gte": "2023-06-30",
                    "lte": "2023-07-05"
                  }
                }
              }
            ]
          }
        }
      ]
    }
  }
}
```

**NOTE**
Track down all slack discussion, the query was optimized and accelerated.

## What solution would you like?
* Get or generate a huge dataset
* Allocate a cluster for tests
* Make test framework (maybe reuse Jenkins)
* Run the test and investigate this specific issue
* Run more tests to find other bottlenecks
* Re-run all tests on all OpenSearch releases to detect degradation
* Update release workflow to repeat these tests before every code freeze

## What alternatives have you considered?
* Create `Best Practices` documentation section
* Automatically optimize some functions and replace them by literals, e.g. `DATE('...')` to `DATE '...'`, `PI()` to `3.1415` and so on to reduce or even completely avoid scripts in filters pushed down

## Do you have any additional context?
Opened #1847

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[FEATURE] Perform big data performance and regression tests #1862

Is your feature request related to a problem?

What solution would you like?

What alternatives have you considered?

Do you have any additional context?

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[FEATURE] Perform big data performance and regression tests #1862

Description

Is your feature request related to a problem?

What solution would you like?

What alternatives have you considered?

Do you have any additional context?

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions