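Aggregations make it possible to compute statistics, analysis, and calculations over document data. There are three common kinds (by default, the field being aggregated cannot be of type text):
Bucket aggregations: group documents into buckets.
Metric aggregations: compute values such as the maximum, minimum, or average.
Pipeline aggregations: aggregate over the results of other aggregations.
Field types that can take part in aggregations: keyword, numeric, date, boolean.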
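Pipeline aggregations are the least intuitive of the three, so here is a minimal plain-Java sketch of the idea, with no Elasticsearch involved. It first averages in-memory prices per category (a bucket plus metric step), then averages those per-bucket results, which is roughly what an avg_bucket pipeline aggregation does. The class PipelineSketch, its methods, and the Product type are hypothetical names made up for this illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class PipelineSketch {
    /** One in-memory "document" with a category and a price (illustrative fields). */
    static final class Product {
        final String category;
        final double price;

        Product(String category, double price) {
            this.category = category;
            this.price = price;
        }
    }

    /** Bucket + metric step: average price per category. */
    static Map<String, Double> avgPricePerCategory(List<Product> docs) {
        return docs.stream().collect(Collectors.groupingBy(
                (Product p) -> p.category,
                TreeMap::new,
                Collectors.averagingDouble((Product p) -> p.price)));
    }

    /** Pipeline step: operates on the bucket results, not on the documents. */
    static double avgOfBucketAverages(Map<String, Double> perBucket) {
        return perBucket.values().stream()
                .mapToDouble(Double::doubleValue)
                .average()
                .orElse(Double.NaN); // no buckets, no average
    }
}
```

Note that the second step never sees the original documents, only the values produced by the first step, which is the defining trait of a pipeline aggregation.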
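Implementing Bucket aggregations with DSL
Elasticsearch Bucket aggregations are a powerful tool for grouping documents into "buckets", similar to GROUP BY in SQL. Each bucket is associated with a condition, and documents that satisfy the condition are placed in the corresponding bucket.
Terms aggregation
Scenario: count the number of documents for each tag across blog posts.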
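    GET /blog/_search
    {
      "size": 0,                         // return only aggregation results, not the documents
      "aggs": {
        "tags": {
          "terms": {
            "field": "tags.keyword",     // use the keyword sub-field to avoid analysis
            "size": 10,                  // return the 10 most common tags
            "order": { "_count": "desc" }  // sort by document count, descending
          }
        }
      }
    }

Sample result:

    {
      "aggregations": {
        "tags": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 15,
          "buckets": [
            { "key": "elasticsearch", "doc_count": 25 },
            { "key": "java", "doc_count": 18 }
          ]
        }
      }
    }

The terms aggregation sorts its buckets by document count (_count) in descending order by default.
DSL example, sorting by document count:

    GET /products/_search
    {
      "size": 0,
      "aggs": {
        "by_category": {
          "terms": {
            "field": "category.keyword",
            "order": { "_count": "desc" }  // descending by document count (the default)
          }
        }
      }
    }

Sample result:

    {
      "aggregations": {
        "by_category": {
          "buckets": [
            { "key": "electronics", "doc_count": 120 },
            { "key": "clothing", "doc_count": 80 },
            { "key": "books", "doc_count": 50 }
          ]
        }
      }
    }

Scenario: aggregate by category only over products whose price is greater than 100.
DSL example:

    GET /products/_search
    {
      "query": {
        "range": { "price": { "gt": 100 } }
      },
      "size": 0,
      "aggs": {
        "by_category": {
          "terms": { "field": "category.keyword" }
        }
      }
    }

The sample result below does not come from the request above. It has the shape produced when the price condition is a filter sub-aggregation (named expensive_products, with a value_count sub-aggregation named count) inside the terms aggregation, as in the Java example. With the top-level query shown above, each bucket would contain only key and doc_count.

    {
      "aggregations": {
        "by_category": {
          "buckets": [
            {
              "key": "electronics",
              "doc_count": 100,
              "expensive_products": {
                "doc_count": 75,       // number of electronics products with price > 100
                "count": { "value": 75 }
              }
            }
          ]
        }
      }
    }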
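To make size, order, and sum_other_doc_count from the terms examples more concrete, here is a rough in-memory emulation in plain Java. It is only a sketch of the semantics over a single list (a real cluster computes terms across shards, so counts there can be approximate), and TermsSketch, Bucket, and the method names are hypothetical. Ties are broken by key here purely to keep the output deterministic.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

class TermsSketch {
    /** One bucket: a term and the number of documents that contain it. */
    static final class Bucket {
        final String key;
        final long docCount;

        Bucket(String key, long docCount) {
            this.key = key;
            this.docCount = docCount;
        }
    }

    /** Every bucket, sorted by docCount descending (ties by key ascending). */
    static List<Bucket> allBuckets(List<List<String>> tagsPerDoc) {
        Map<String, Long> counts = new HashMap<>();
        for (List<String> tags : tagsPerDoc) {
            // A document counts once per distinct tag it carries.
            for (String tag : new HashSet<>(tags)) {
                counts.merge(tag, 1L, Long::sum);
            }
        }
        List<Bucket> buckets = new ArrayList<>();
        for (Map.Entry<String, Long> e : counts.entrySet()) {
            buckets.add(new Bucket(e.getKey(), e.getValue()));
        }
        buckets.sort((a, b) -> a.docCount != b.docCount
                ? Long.compare(b.docCount, a.docCount)
                : a.key.compareTo(b.key));
        return buckets;
    }

    /** Emulates "size": keep only the first `size` buckets (size >= 0 assumed). */
    static List<Bucket> topBuckets(List<List<String>> tagsPerDoc, int size) {
        List<Bucket> all = allBuckets(tagsPerDoc);
        return new ArrayList<>(all.subList(0, Math.min(size, all.size())));
    }

    /** Emulates sum_other_doc_count: doc counts of the buckets that were cut off. */
    static long sumOtherDocCount(List<List<String>> tagsPerDoc, int size) {
        List<Bucket> all = allBuckets(tagsPerDoc);
        long sum = 0;
        for (int i = size; i < all.size(); i++) {
            sum += all.get(i).docCount;
        }
        return sum;
    }
}
```

Read this way, the sum_other_doc_count of 15 in the first sample result is the total of the document counts in the tag buckets that did not make it into the returned list.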
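aggs stands for aggregations and sits at the same level as query. When both are present, query limits the set of documents that the aggregation runs over.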
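Scoping with a top-level query and counting with a nested filter aggregation are easy to confuse, so the following plain-Java sketch contrasts the two on an in-memory list. It does not call Elasticsearch; ScopeSketch, Product, and the minPrice parameter are hypothetical names chosen for the illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class ScopeSketch {
    /** One in-memory "document" (illustrative fields). */
    static final class Product {
        final String category;
        final double price;

        Product(String category, double price) {
            this.category = category;
            this.price = price;
        }
    }

    /**
     * query + terms: documents are filtered first, so buckets only ever
     * see matching documents. Categories with no match produce no bucket.
     */
    static Map<String, Long> queryThenTerms(List<Product> docs, double minPrice) {
        Map<String, Long> counts = new TreeMap<>();
        for (Product p : docs) {
            if (p.price > minPrice) {
                counts.merge(p.category, 1L, Long::sum);
            }
        }
        return counts;
    }

    /**
     * terms + filter sub-aggregation: every document is bucketed, and each
     * bucket additionally counts its matching documents.
     * Returns category -> {total docs, docs with price > minPrice}.
     */
    static Map<String, long[]> termsThenFilter(List<Product> docs, double minPrice) {
        Map<String, long[]> counts = new TreeMap<>();
        for (Product p : docs) {
            long[] c = counts.computeIfAbsent(p.category, k -> new long[2]);
            c[0]++;
            if (p.price > minPrice) {
                c[1]++;
            }
        }
        return counts;
    }
}
```

The first method corresponds to a request with both query and aggs; the second corresponds to a terms aggregation with a filter sub-aggregation, where the bucket total and the filtered count are reported side by side.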
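Every aggregation needs three elements:
- aggregation name
- aggregation type
- aggregation field
Configurable properties of an aggregation include size (the number of results to return), order (how the results are sorted), and field (the field to aggregate on).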
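As a small illustration of the three required elements, this sketch assembles the aggs fragment of a request body from a name, a type, and a field using plain string formatting. AggsFragment.build is a hypothetical helper, not part of any Elasticsearch client, and it assumes its inputs need no JSON escaping.

```java
class AggsFragment {
    /**
     * Builds {"aggs":{NAME:{TYPE:{"field":FIELD}}}}.
     * Assumption: name, type, and field contain no characters that
     * would need escaping inside a JSON string.
     */
    static String build(String name, String type, String field) {
        return String.format(
                "{\"aggs\":{\"%s\":{\"%s\":{\"field\":\"%s\"}}}}",
                name, type, field);
    }
}
```

For example, AggsFragment.build("by_category", "terms", "category.keyword") yields the same structure as the by_category requests shown earlier, minus the optional size and order settings.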
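Implementing Metric aggregations with DSL
Compute the average price of all products:

    GET /products/_search
    {
      "size": 0,                       // do not return the documents themselves
      "aggs": {
        "avg_price": {
          "avg": { "field": "price" }
        }
      }
    }

Sample result:

    {
      "aggregations": {
        "avg_price": {
          "value": 125.5               // average price
        }
      }
    }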
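Combining nested aggregations with Metric aggregations
Group by category, then compute the average, highest, and lowest price within each category. The stats aggregation returns those values (plus count and sum) in a single object.

    GET /products/_search
    {
      "size": 0,
      "aggs": {
        "by_category": {
          "terms": { "field": "category.keyword" },
          "aggs": {
            "avg_price":   { "avg":   { "field": "price" } },
            "max_price":   { "max":   { "field": "price" } },
            "min_price":   { "min":   { "field": "price" } },
            "price_stats": { "stats": { "field": "price" } }
          }
        }
      }
    }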
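What the nested request computes per bucket can be mimicked in plain Java with DoubleSummaryStatistics from the standard library, which tracks the same five numbers as the stats aggregation (count, min, max, avg, sum). This is a sketch over in-memory data rather than a call to Elasticsearch; StatsSketch and Product are hypothetical names.

```java
import java.util.DoubleSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class StatsSketch {
    /** One in-memory "document" (illustrative fields). */
    static final class Product {
        final String category;
        final double price;

        Product(String category, double price) {
            this.category = category;
            this.price = price;
        }
    }

    /** Outer step groups by category; inner step accumulates price statistics. */
    static Map<String, DoubleSummaryStatistics> statsByCategory(List<Product> docs) {
        Map<String, DoubleSummaryStatistics> result = new TreeMap<>();
        for (Product p : docs) {
            result.computeIfAbsent(p.category, k -> new DoubleSummaryStatistics())
                  .accept(p.price);
        }
        return result;
    }
}
```

Each map entry plays the role of one bucket, and its getAverage(), getMax(), and getMin() values correspond to the avg_price, max_price, and min_price sub-aggregations in the request.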
Running aggregations from Java

    import org.elasticsearch.action.search.SearchRequest;
    import org.elasticsearch.action.search.SearchResponse;
    import org.elasticsearch.client.RequestOptions;
    import org.elasticsearch.client.RestHighLevelClient;
    import org.elasticsearch.index.query.QueryBuilders;
    import org.elasticsearch.search.aggregations.AggregationBuilders;
    import org.elasticsearch.search.aggregations.bucket.filter.Filter;
    import org.elasticsearch.search.aggregations.bucket.terms.Terms;
    import org.elasticsearch.search.builder.SearchSourceBuilder;

    import java.io.IOException;

    public class FilterAggregationExample {
        private final RestHighLevelClient client;

        public FilterAggregationExample(RestHighLevelClient client) {
            this.client = client;
        }

        public void filterAggregation() throws IOException {
            SearchRequest searchRequest = new SearchRequest("products");
            SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
            // Only aggregation results are needed, so return no hits ("size": 0).
            searchSourceBuilder.size(0);

            // Group by category; within each category, filter products with price > 100
            // and count them with a value_count sub-aggregation.
            searchSourceBuilder.aggregation(
                    AggregationBuilders.terms("by_category")
                            .field("category.keyword")
                            .subAggregation(
                                    AggregationBuilders.filter("expensive_products",
                                                    QueryBuilders.rangeQuery("price").gt(100))
                                            .subAggregation(
                                                    AggregationBuilders.valueCount("count").field("id"))));
            searchRequest.source(searchSourceBuilder);

            SearchResponse response = client.search(searchRequest, RequestOptions.DEFAULT);

            // Process the aggregation results: look each aggregation up by the name it was given.
            Terms byCategory = response.getAggregations().get("by_category");
            for (Terms.Bucket bucket : byCategory.getBuckets()) {
                String category = bucket.getKeyAsString();
                long totalCount = bucket.getDocCount();
                Filter expensiveProducts = bucket.getAggregations().get("expensive_products");
                long expensiveCount = expensiveProducts.getDocCount();
                System.out.println("Category: " + category
                        + ", Total: " + totalCount
                        + ", Expensive: " + expensiveCount);
            }
        }
    }