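Elasticsearch query syntax notes

These notes are a bit disorganized. For now I am just writing down the points that I found easy to misunderstand.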

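1: Fuzzy matching with wildcard

Note: querying the message field returns nothing, because message is analyzed (tokenized) and the wildcard pattern is compared against each individual term. Query the message.keyword sub-field instead.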

{
  "query": {
    "wildcard": {
      "message.keyword": {
        "value": "*开始执行查询es的*"
      }
    }
  },
  "_source": {
    "includes": ["applicationName", "message"]
  }
}
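
The difference between message and message.keyword can be illustrated offline. The sketch below is only a rough model: the `analyze` helper is a hypothetical stand-in for an analyzer that emits lowercased word runs and one token per CJK character, the sample log line is made up, and Python's `fnmatch` plays the role of the wildcard matcher.

```python
import re
from fnmatch import fnmatchcase

def analyze(text):
    """Rough stand-in for an analyzed text field: lowercased ASCII word runs
    plus one token per CJK character (an assumption, not the real analyzer)."""
    return re.findall(r"[a-z0-9]+|[\u4e00-\u9fff]", text.lower())

def wildcard_on_text(text, pattern):
    """On an analyzed field the pattern is tested against each term separately."""
    return any(fnmatchcase(term, pattern) for term in analyze(text))

def wildcard_on_keyword(text, pattern):
    """A keyword field keeps the whole string as one term."""
    return fnmatchcase(text, pattern)

sample = "2024-06-12 开始执行查询es的数据"  # made-up log line
pattern = "*开始执行查询es的*"

print(wildcard_on_text(sample, pattern))     # no single term satisfies the pattern
print(wildcard_on_keyword(sample, pattern))  # the whole string satisfies it
```

No single term is long enough to contain the whole pattern, which is why the query has to target the keyword sub-field.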

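2: Query: after grouping, fetch the other fields of one record in each group

Within a given time range, select the documents whose applicationName is A服务 and whose requestUri contains the keyword "test1" or "user". Group them by requestUri, take the top 20 requestUri values by count in descending order, and for each requestUri also return the other fields of its most recent record (sorted by time descending).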

{
      "query": {
        "bool": {
          "filter": [
            {
              "range": {
                "@timestamp": {
                  "gte": "2024-06-12 15:01:01",
                 "lte": "2024-06-12 21:21:01",
                  "time_zone": "Asia/Shanghai",
                  "format": "yyyy-MM-dd HH:mm:ss"
                }
              }
            },
            {
              "term": {
                "applicationName.keyword": "A服务"
              }
            },
            {
              "terms": {
                "requestUri": ["test1","user"]
              }
            }
          ]
        }
      },
      "size": 0,
      "aggs": {
        "group_by_requestUri": {
          "terms": {
            "field": "requestUri.keyword",
            "size": 20,
            "order": [
              {
                "_count": "desc"
              }
            ]
          },
          "aggs": {
            "time_top_last": {
              "top_hits": {
                "sort": [
                  {
                    "@timestamp": {
                      "order": "desc"
                    }
                  }
                ],
                "size": 1,
                "_source": {
                  "includes": [
                    "userIp",
                    "ipCity",
                    "serverIp",
                    "systemId",
                    "requestID",
                    "userIdentity"
                  ]
                }
              }
            }
          }
        }
      }
    }

The query result is shown below (with the top 20 changed to top 2):

{
    "took": 2384,
    "timed_out": false,
    "_shards": {
        "total": 161,
        "successful": 161,
        "skipped": 115,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 10000,
            "relation": "gte"
        },
        "max_score": null,
        "hits": []
    },
    "aggregations": {
        "group_by_requestUri": {
            "doc_count_error_upper_bound": 90,
            "sum_other_doc_count": 5035,
            "buckets": [
                {
                    "key": "/user/v1/login",
                    "doc_count": 5737,
                    "time_top_last": {
                        "hits": {
                            "total": {
                                "value": 5737,
                                "relation": "eq"
                            },
                            "max_score": null,
                            "hits": [
                                {
                                    "_index": "ubsp-monitor-prod-2024.06.12",
                                    "_type": "_doc",
                                    "_id": "31qaDJABHhV04n43xF0e",
                                    "_score": null,
                                    "_ignored": [
                                        "message.keyword"
                                    ],
                                    "_source": {
                                        "systemId": "000",
                                        "requestID": "6e685bbf6f066ecb"
                                    },
                                    "sort": [
                                        1718198385770
                                    ]
                                }
                            ]
                        }
                    }
                },
                {
                    "key": "/ibconfig/v1/user/auth/check",
                    "doc_count": 4972,
                    "time_top_last": {
                        "hits": {
                            "total": {
                                "value": 4972,
                                "relation": "eq"
                            },
                            "max_score": null,
                            "hits": [
                                {
                                    "_index": "ubsp-gateway-2024.06.12",
                                    "_type": "_doc",
                                    "_id": "QPGbDJABJdDMneTabMrr",
                                    "_score": null,
                                    "_ignored": [
                                        "message.keyword"
                                    ],
                                    "_source": {
                                        "systemId": "097",
                                        "ipCity": "内网IP",
                                        "userIp": "10.178.141.51",
                                        "userIdentity": "AT-115-xxx"
                                    },
                                    "sort": [
                                        1718198425834
                                    ]
                                }
                            ]
                        }
                    }
                }
            ]
        }
    }
}
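
To consume a response of this shape in client code, each bucket can be flattened into one row. The following is a minimal sketch that assumes the response has already been parsed into a Python dict. The function name `latest_per_bucket` is hypothetical, the aggregation names are the ones used in the query above, and the sample is a trimmed copy of the first bucket.

```python
def latest_per_bucket(response,
                      agg_name="group_by_requestUri",
                      hits_name="time_top_last"):
    """Return one row per terms bucket: key, count, and the _source of its top hit."""
    rows = []
    for bucket in response["aggregations"][agg_name]["buckets"]:
        hits = bucket[hits_name]["hits"]["hits"]
        # top_hits was requested with size 1, so at most one hit is present.
        source = hits[0]["_source"] if hits else {}
        rows.append({"requestUri": bucket["key"],
                     "count": bucket["doc_count"],
                     **source})
    return rows

# Trimmed copy of the first bucket from the result above.
sample_response = {
    "aggregations": {
        "group_by_requestUri": {
            "buckets": [
                {
                    "key": "/user/v1/login",
                    "doc_count": 5737,
                    "time_top_last": {"hits": {"hits": [
                        {"_source": {"systemId": "000",
                                     "requestID": "6e685bbf6f066ecb"}}
                    ]}},
                }
            ]
        }
    }
}

rows = latest_per_bucket(sample_response)
print(rows)
```

Fields listed in _source.includes but absent from the document, such as userIp in the first bucket, are simply missing from the row, so callers should read them with `.get()`.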

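3: Sibling (parallel) aggregations

Find the documents within the time range whose service name is A服务, group them by systemId sorted by count in descending order, and keep the top 5. Inside each systemId bucket, group by requestUri and keep the 2 values with the highest counts, and at the same level also group by userIp and keep its top 2.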

{
      "query": {
        "bool": {
          "filter": [
            {
              "range": {
                "@timestamp": {
                  "gte": "2024-06-12 15:01:01",
                  "lte": "2024-06-12 21:21:01",
                  "time_zone": "Asia/Shanghai",
                  "format": "yyyy-MM-dd HH:mm:ss"
                }
              }
            },
            {
              "term": {
                "applicationName.keyword": "A服务"
              }
            }
          ]
        }
      },
      "size": 0,
      "aggs": {
        "group_by_systemId": {
          "terms": {
            "field": "systemId.keyword",
            "size": 5,
            "order": [
              {
                "_count": "desc"
              }
            ]
          },
          "aggs": {
            "group_by_requestUri": {
              "terms": {
                "field": "requestUri.keyword",
                "size": 2,
                "order": [
                  {
                    "_count": "desc"
                  }
                ]
              }
            },
            "group_by_userIp": {
              "terms": {
                "field": "userIp.keyword",
                "size": 2,
                "order": [
                  {
                    "_count": "desc"
                  }
                ]
              }
            }
          }
        }
      }
    }

Result:

{
    "took": 1855,
    "timed_out": false,
    "_shards": {
        "total": 161,
        "successful": 161,
        "skipped": 78,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 10000,
            "relation": "gte"
        },
        "max_score": null,
        "hits": []
    },
    "aggregations": {
        "group_by_systemId": {
            "doc_count_error_upper_bound": 851,
            "sum_other_doc_count": 46777,
            "buckets": [
                {
                    "key": "000",
                    "doc_count": 12535,
                    "group_by_requestUri": {
                        "doc_count_error_upper_bound": 148,
                        "sum_other_doc_count": 6559,
                        "buckets": [
                            {
                                "key": "/user/v1/login",
                                "doc_count": 5061
                            },
                            {
                                "key": "/user/v1/info",
                                "doc_count": 915
                            }
                        ]
                    },
                    "group_by_userIp": {
                        "doc_count_error_upper_bound": 0,
                        "sum_other_doc_count": 27,
                        "buckets": [
                            {
                                "key": "10.178.152.0",
                                "doc_count": 10848
                            },
                            {
                                "key": "192.168.250.250",
                                "doc_count": 539
                            }
                        ]
                    }
                },
                {
                    "key": "069",
                    "doc_count": 11476,
                    "group_by_requestUri": {
                        "doc_count_error_upper_bound": 152,
                        "sum_other_doc_count": 10285,
                        "buckets": [
                            {
                                "key": "/ocrmmanage/v1/system/org/path/16",
                                "doc_count": 898
                            },
                            {
                                "key": "/ocrmlz/v1/pc/email/queryorgname",
                                "doc_count": 293
                            }
                        ]
                    },
                    "group_by_userIp": {
                        "doc_count_error_upper_bound": 3,
                        "sum_other_doc_count": 1177,
                        "buckets": [
                            {
                                "key": "10.0.0.182",
                                "doc_count": 1654
                            },
                            {
                                "key": "192.168.236.165",
                                "doc_count": 1208
                            }
                        ]
                    }
                }
            ]
        }
    }
}
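
Because the two inner aggregations are siblings, each outer bucket carries both of them side by side. A minimal sketch of walking that structure, assuming the response is already a Python dict, is shown here. The function name `flatten_sibling_terms` is hypothetical, and the sample is a trimmed copy of the first systemId bucket.

```python
def flatten_sibling_terms(response,
                          outer="group_by_systemId",
                          inner=("group_by_requestUri", "group_by_userIp")):
    """Yield (outer key, inner aggregation name, inner key, doc_count) tuples."""
    rows = []
    for outer_bucket in response["aggregations"][outer]["buckets"]:
        for agg_name in inner:
            # Each sibling aggregation has its own independent bucket list.
            for inner_bucket in outer_bucket[agg_name]["buckets"]:
                rows.append((outer_bucket["key"], agg_name,
                             inner_bucket["key"], inner_bucket["doc_count"]))
    return rows

# Trimmed copy of the first systemId bucket from the result above.
sample_response = {
    "aggregations": {
        "group_by_systemId": {
            "buckets": [
                {
                    "key": "000",
                    "doc_count": 12535,
                    "group_by_requestUri": {"buckets": [
                        {"key": "/user/v1/login", "doc_count": 5061},
                        {"key": "/user/v1/info", "doc_count": 915},
                    ]},
                    "group_by_userIp": {"buckets": [
                        {"key": "10.178.152.0", "doc_count": 10848},
                        {"key": "192.168.250.250", "doc_count": 539},
                    ]},
                }
            ]
        }
    }
}

rows = flatten_sibling_terms(sample_response)
for row in rows:
    print(row)
```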

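4: Working around ES queries reporting only 10,000 records

In short: with "track_total_hits": true the response reports the exact number of hits; with false the total is not reported at all; when it is not set, the count is accurate only up to 10,000, and beyond that hits.total shows "value": 10000 with "relation": "gte". This setting only affects the reported count. How far you can page with from and size is governed separately by index.max_result_window.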
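
As a small illustration, the sketch below adds the flag to a request body and interprets hits.total in a parsed response. Both helper names (`with_exact_total`, `describe_total`) are hypothetical, and the code assumes the response is a Python dict with the hits.total object format shown in the results earlier in these notes.

```python
def with_exact_total(body):
    """Return a copy of a search body that asks for the exact hit count."""
    return {**body, "track_total_hits": True}

def describe_total(response):
    """Explain hits.total; it is absent when track_total_hits is false."""
    total = response["hits"].get("total")
    if total is None:
        return "total not tracked"
    if total["relation"] == "eq":
        return f"exactly {total['value']} hits"
    # relation "gte": the real count is at least this value.
    return f"at least {total['value']} hits"

body = with_exact_total({"query": {"match_all": {}}, "size": 0})
print(body)
print(describe_total({"hits": {"total": {"value": 10000, "relation": "gte"}}}))
print(describe_total({"hits": {"total": {"value": 5737, "relation": "eq"}}}))
```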

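Reference: CSDN blog post "ES查询时只能查询10000条数据解决方案" (on ES queries returning at most 10,000 records)

5: ES caps search results at 10,000 by default, so are aggregations over the query results also limited to 10,000 documents?

No.

Some default Elasticsearch settings (such as index.max_result_window) limit how many hits a search can return, but that does not mean aggregations run over only 10,000 documents. Aggregations are computed over every document that matches the query. In the result of section 3, for example, hits.total.value is capped at 10000 while the first bucket alone has a doc_count of 12535.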

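6: About doc_count_error_upper_bound and sum_other_doc_count

An example: suppose there are 100 kinds of fruit and size is 20, so only the top 20 are shown and the other 80 kinds are not returned as buckets.

sum_other_doc_count is the number of documents that matched the query but belong to buckets that were not returned, that is, the total document count of the remaining 80 kinds of fruit.

doc_count_error_upper_bound is an upper bound on the counting error that arises because each shard returns only its own top terms. It is the maximum number of documents that a term could have on the shards that did not report it. A non-zero value means a returned count may be too low, or that a term missing from the result (another kind of fruit, say) might in fact belong in the top 20, perhaps at position 19.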

"aggs": {
        "group_by_systemId": {
          "terms": {
            "field": "水果种类",
            "size": 20,
             "order": [
              {
                 "_count": "desc"
               }
             ]
          }        
        }
    }    
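
The two numbers can be reproduced with a toy simulation. The sketch below is a simplified model under stated assumptions, not Elasticsearch's actual code: each "shard" is a plain list of term values, every shard reports only its top `shard_size` terms, and the error bound adds up, per shard, the count of the last term that shard kept. The function name `merged_terms` and the fruit data are made up.

```python
from collections import Counter

def merged_terms(shards, size, shard_size):
    """Simplified model of a terms aggregation merged across shards."""
    merged = Counter()
    error_bound = 0
    total_docs = 0
    for shard in shards:
        counts = Counter(shard)
        total_docs += sum(counts.values())
        top = counts.most_common(shard_size)
        merged.update(dict(top))  # only the shard's top terms are reported
        if len(counts) > shard_size:
            # A term cut on this shard has at most as many docs as the last kept term.
            error_bound += top[-1][1]
    buckets = merged.most_common(size)
    other = total_docs - sum(count for _, count in buckets)
    return buckets, error_bound, other

shards = [
    ["apple"] * 5 + ["pear"] * 3 + ["kiwi"] * 2,
    ["kiwi"] * 4 + ["apple"] * 3 + ["pear"] * 1,
]
buckets, error_bound, other = merged_terms(shards, size=2, shard_size=2)
print(buckets)      # [('apple', 8), ('kiwi', 4)]
print(error_bound)  # 6
print(other)        # 6
```

Here kiwi really has 6 documents, but shard 1 dropped it, so the merged count is 4. The error bound of 6 (3 from each shard) signals that such undercounting is possible.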

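Reference: CSDN blog post "Elasticsearch核心技术与实战学习笔记" (study notes on Elasticsearch core techniques and practice)

7: When query_string wraps the text in \" quotes, is it a phrase query or a term (keyword) query?

It is a phrase query.

Suppose message holds many log entries. Pick any one of them: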

"message": "【API网关】 【请求错误】 【请求地址】【/monitor/health】 \n【响应状态】 : 401 【响应信息】:Full authentication is required to access this resource\n",

Because message is analyzed, it can never produce the single term "应信息】:Full authentication is required", so the term query below cannot match:

{
      "query": {
        "bool": {
          "filter": [
            {
              "term": {
                "message": "应信息】:Full authentication is required "
              }
            },
            {
              "range": {
                "@timestamp": {
                  "gte": "2024-06-12 15:01:01",
                  "lte": "2024-06-12 21:21:01",
                  "time_zone": "Asia/Shanghai",
                  "format": "yyyy-MM-dd HH:mm:ss"
                }
              }
            }
          ]
        }
      }
    }

The query above returns no results.

The following syntax, however, does return results:

Note: use message here, not message.keyword.

{
      "query": {
        "bool": {
          "filter": [
            {
              "query_string": {
                "query": "message:\"应信息】:Full authentication is required \"",
                "analyze_wildcard": true,
                "time_zone": "Asia/Shanghai"
              }
            },
            {
              "range": {
                "@timestamp": {
                  "gte": "2024-06-12 15:01:01",
                  "lte": "2024-06-12 21:21:01",
                  "time_zone": "Asia/Shanghai",
                  "format": "yyyy-MM-dd HH:mm:ss"
                }
              }
            }
          ]
        }
      }
    }
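
The contrast between the two queries can be mimicked without a cluster. This is only a rough sketch: `analyze` is a hypothetical approximation of the field's analyzer (lowercased word runs, one token per CJK character, punctuation dropped), `term_match` compares the raw value with the indexed terms, and `phrase_match` analyzes the quoted text and looks for its terms in consecutive positions.

```python
import re

def analyze(text):
    """Rough approximation of an analyzer (assumption, not the real one)."""
    return re.findall(r"[a-z0-9]+|[\u4e00-\u9fff]", text.lower())

def term_match(field_text, value):
    """term query: the value is not analyzed and must equal one indexed term."""
    return value in analyze(field_text)

def phrase_match(field_text, phrase):
    """Phrase query: the phrase is analyzed and its terms must be adjacent, in order."""
    doc, wanted = analyze(field_text), analyze(phrase)
    n = len(wanted)
    return n > 0 and any(doc[i:i + n] == wanted
                         for i in range(len(doc) - n + 1))

message = ("【API网关】 【请求错误】 【请求地址】【/monitor/health】 \n"
           "【响应状态】 : 401 【响应信息】:Full authentication is required "
           "to access this resource\n")
text = "应信息】:Full authentication is required "

print(term_match(message, text))    # the raw string is not an indexed term
print(phrase_match(message, text))  # its analyzed terms occur consecutively
```

The same model also suggests why message.keyword does not work here: that field holds the entire message as one term, so a phrase that covers only part of the message cannot equal it.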
