Permalink: https://www.knowledgedict.com/tutorial/elasticsearch-distinct-field-aggr.html

Elasticsearch distinct queries (aggregation, grouping, pagination, sum statistics, and more)

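How do you run deduplication-style queries on a specific field in Elasticsearch (es), covering operations such as aggregation, grouping, pagination, and sum-like statistics?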

1. Getting all distinct values
2. Pagination after deduplication
3. Sum and statistics aggregation

Getting all distinct values

To get the distinct values of a given field, use the terms aggregation, which is a bucket aggregation, as in this example:

GET {index}/_search
{
  "size": 0,
  "aggs": {
    "distinct_aggs": {
      "terms": {
        "field": "status"
      }
    }
  }
}

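The example above retrieves the distinct values of the status field in the given index. The top-level size is set to 0, so no documents are returned in hits: only the aggregation result matters here, not the document content. If it were set to 1, one document would be returned. Note that the terms aggregation itself returns only the 10 largest buckets unless its own size parameter is set, so a field with more than 10 distinct values needs a larger size inside the terms block. The response looks like this: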

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 3,
    "successful": 3,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 58439,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "distinct_aggs": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": 3,
          "doc_count": 46619
        },
        {
          "key": 2,
          "doc_count": 11810
        },
        {
          "key": 1,
          "doc_count": 10
        }
      ]
    }
  }
}
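
As a minimal sketch of consuming this response on the client side, the Python snippet below reads the distinct values out of the buckets array. The helper names distinct_values and has_unreturned_buckets are illustrative assumptions, and the response dict is a trimmed copy of the output above, assumed to be already parsed from JSON. A sum_other_doc_count greater than 0 means some documents fell into buckets that were not returned.

```python
def distinct_values(response, agg_name="distinct_aggs"):
    """Return (key, doc_count) pairs from a terms aggregation response."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return [(b["key"], b["doc_count"]) for b in buckets]


def has_unreturned_buckets(response, agg_name="distinct_aggs"):
    """True when documents exist in buckets beyond the returned ones."""
    return response["aggregations"][agg_name]["sum_other_doc_count"] > 0


# Trimmed copy of the response shown above (hypothetical client-side input).
response = {
    "aggregations": {
        "distinct_aggs": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {"key": 3, "doc_count": 46619},
                {"key": 2, "doc_count": 11810},
                {"key": 1, "doc_count": 10},
            ],
        }
    }
}

for key, count in distinct_values(response):
    print(key, count)

if has_unreturned_buckets(response):
    print("more distinct values exist; raise the terms size")
```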

Pagination after deduplication

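Pagination needs a sort order. Building on the example above, add the size parameter (the number of buckets to return) and the order parameter to the terms aggregation. The terms aggregation has no offset parameter, so the request returns the first size buckets in the given order, and pages are cut from that list. Sorting by bucket key is written _key in Elasticsearch 6.0 and later; _term is the older, deprecated name for the same sort: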

GET {index}/_search
{
  "size": 0,
  "aggs": {
    "distinct_aggs": {
      "terms": {
        "field": "item_id",
        "size": 1000,
        "order": {
          "_key": "asc"
        }
      }
    }
  }
}

The output is as follows:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 3,
    "successful": 3,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 58463,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "distinct_aggs": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": 1,
          "doc_count": 32
        },
        {
          "key": 2,
          "doc_count": 11811
        },
        {
          "key": 3,
          "doc_count": 46620
        },
        ...
      ]
    }
  }
}
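
One way to turn the sorted bucket list into pages is to slice it on the client. The sketch below assumes the buckets have already been parsed from the response; the helper name page_of_buckets and the page and page_size parameters are hypothetical, and the sample list holds only the three buckets visible above (the elided ones are left out).

```python
def page_of_buckets(buckets, page, page_size):
    """Return one page (1-based) of an already sorted bucket list."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    start = (page - 1) * page_size
    return buckets[start:start + page_size]


# The three buckets visible in the output above, in ascending key order.
buckets = [
    {"key": 1, "doc_count": 32},
    {"key": 2, "doc_count": 11811},
    {"key": 3, "doc_count": 46620},
]

first_page = page_of_buckets(buckets, page=1, page_size=2)
second_page = page_of_buckets(buckets, page=2, page_size=2)

print([b["key"] for b in first_page])
print([b["key"] for b in second_page])
```

This only works while the number of distinct values stays within the size requested from the terms aggregation; a page past the end of the list comes back empty.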

Sum and statistics aggregation

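Buckets can also be sorted, ascending or descending, by a computed statistic such as the sum of a given field. The statistic is defined as a sub-aggregation, and the order parameter refers to it by name. For example: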

GET {index}/_search
{
  "size": 0,
  "aggs": {
    "item_terms": {
      "terms": {
        "field": "item_id",
        "size": 1000,
        "order": [
          { "gmv_stat": "desc" },
          { "gmv_180d": "desc" }
        ]
      },
      "aggs": {
        "gmv_stat": {
          "sum": {
            "field": "gmv"
          }
        },
        "gmv_180d": {
          "sum": {
            "script": "doc['gmv_90d'].value*2"
          }
        }
      }
    }
  }
}

The response looks like this:

{
  ...
  "aggregations": {
    "item_terms": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 260,
      "buckets": [
        {
          "key": 23388,
          "doc_count": 18,
          "gmv_stat": {
            "value": 176220
          },
          "gmv_180d": {
            "value": 89732
          }
        },
        {
          "key": 96117,
          "doc_count": 16,
          "gmv_stat": {
            "value": 129306
          },
          "gmv_180d": {
            "value": 56988
          }
        },
        ...
      ]
    }
  }
}
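
When the request is built in application code, the link between the order entries and the sub-aggregation names is easy to break with a typo. The Python sketch below builds the same request body as a dict and checks that every order entry names a defined sub-aggregation or one of the built-in keys _count and _key. The function name terms_sorted_by_metrics and its parameters are hypothetical; only the resulting JSON structure comes from the example above.

```python
import json

BUILT_IN_ORDER_KEYS = ("_count", "_key")


def terms_sorted_by_metrics(field, metrics, order, size=1000):
    """Build a terms aggregation ordered by sub-aggregation results.

    metrics: dict mapping sub-aggregation name to its aggregation body
    order:   list of (name, "asc" or "desc") pairs, highest priority first
    """
    for name, direction in order:
        if name not in metrics and name not in BUILT_IN_ORDER_KEYS:
            raise ValueError(f"order refers to unknown sub-aggregation: {name}")
        if direction not in ("asc", "desc"):
            raise ValueError(f"invalid sort direction: {direction}")
    return {
        "size": 0,  # no hits, aggregation results only
        "aggs": {
            "item_terms": {
                "terms": {
                    "field": field,
                    "size": size,
                    "order": [{name: direction} for name, direction in order],
                },
                "aggs": metrics,
            }
        },
    }


body = terms_sorted_by_metrics(
    field="item_id",
    metrics={
        "gmv_stat": {"sum": {"field": "gmv"}},
        "gmv_180d": {"sum": {"script": "doc['gmv_90d'].value*2"}},
    },
    order=[("gmv_stat", "desc"), ("gmv_180d", "desc")],
)
print(json.dumps(body, indent=2))
```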
