Elasticsearch 5.x: Introduction to Query Suggestions and Suggesters
Reference: http://www.cnblogs.com/leeSmall/p/9206646.html
Reference (key article): https://elasticsearch.cn/article/142
Reference (official documentation): https://www.elastic.co/guide/en/elasticsearch/reference/current/search-suggesters-completion.html
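I. Query Suggestions
1. What are query suggestions?
Query suggestions improve the search experience for users. They cover two main features: spell checking and suggested query terms (autocomplete).
[Figure: spell-checking example]
[Figure: suggested query terms (autocomplete) example]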
2. The query suggestion API in ES
Query suggestions also use the _search endpoint. Suggestion requests are defined under the suggest node of the DSL.
Example 1: a single suggestion
POST twitter/_search
{
  "query": {
    "match": { "message": "tring out Elasticsearch" }
  },
  "suggest": {
    "my-suggestion": {
      "text": "tring out Elasticsearch",
      "term": {
        "field": "message"
      }
    }
  }
}
In this request, suggest defines the suggestion requests, my-suggestion is the name chosen for one suggestion, text is the input text, term selects the term suggester, and field names the field from which suggestions are drawn.
Example 2: several suggestions in one request
POST _search
{
  "suggest": {
    "my-suggest-1": {
      "text": "tring out Elasticsearch",
      "term": { "field": "message" }
    },
    "my-suggest-2": {
      "text": "kmichy",
      "term": { "field": "user" }
    }
  }
}
Example 3: several suggestions can share one global text
POST _search
{
  "suggest": {
    "text": "tring out Elasticsearch",
    "my-suggest-1": {
      "term": { "field": "message" }
    },
    "my-suggest-2": {
      "term": { "field": "user" }
    }
  }
}
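To make the shared-text form of Example 3 concrete, here is a minimal Java sketch that assembles such a request body as a string. The class SuggestBodySketch and its methods are hypothetical helpers written for this illustration, not part of any Elasticsearch client, and the escaping only handles quotes and backslashes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds a suggest body in which all term suggestions share one global text.
final class SuggestBodySketch {

    // Wrap a value in double quotes, escaping backslashes and quotes only (an assumption
    // that is enough for plain search text; control characters are not handled).
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Produces {"suggest":{"text":...,"<name>":{"term":{"field":...}},...}}.
    static String build(String globalText, Map<String, String> fieldsByName) {
        StringBuilder sb = new StringBuilder("{\"suggest\":{\"text\":").append(quote(globalText));
        for (Map.Entry<String, String> e : fieldsByName.entrySet()) {
            sb.append(',').append(quote(e.getKey()))
              .append(":{\"term\":{\"field\":").append(quote(e.getValue())).append("}}");
        }
        return sb.append("}}").toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the suggestions in insertion order.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("my-suggest-1", "message");
        fields.put("my-suggest-2", "user");
        System.out.println(build("tring out Elasticsearch", fields));
    }
}
```

The printed string can be sent as the body of POST _search. A real application would more likely use a JSON library or the Java API builders shown in the Java API section.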
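II. Suggesters
1. Term suggester
The term suggester analyzes the input text into tokens and runs a fuzzy lookup for each token to produce term suggestions. By default, tokens that already exist in the index get no suggestions; for tokens that do not exist, the fuzzy matches are ranked and the top few are returned as suggestions.
Commonly used suggestion options: [figure not available]
Example 1:
POST twitter/_search
{
  "query": {
    "match": { "message": "tring out Elasticsearch" }
  },
  "suggest": {
    "my-suggestion": {
      "text": "tring out Elasticsearch",
      "term": {
        "field": "message"
      }
    }
  }
}
Here my-suggestion is the suggestion name, text is the input text, term selects the term suggester, and field names the field from which suggestions are drawn.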
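The behavior described for the term suggester (tokenize, skip tokens that exist in the index, fuzzy-match the rest, rank, truncate) can be sketched in plain Java. This is an illustration only: it uses Levenshtein distance over an in-memory term-frequency map, whereas Elasticsearch delegates the work to Lucene. The names TermSuggestSketch, maxEdits, and size are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for a term suggester over an in-memory dictionary.
final class TermSuggestSketch {

    // Classic dynamic-programming Levenshtein edit distance.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; prev = curr; curr = tmp;
        }
        return prev[b.length()];
    }

    // Returns, per input token, up to `size` suggestions within `maxEdits` edits.
    // Tokens that already exist in termFreq get an empty list.
    static Map<String, List<String>> suggest(String text, Map<String, Integer> termFreq,
                                             int maxEdits, int size) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (String token : text.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            List<String> options = new ArrayList<>();
            if (!termFreq.containsKey(token)) {
                for (String candidate : termFreq.keySet()) {
                    if (editDistance(token, candidate) <= maxEdits) options.add(candidate);
                }
                // Closest terms first; ties broken by higher frequency.
                options.sort(Comparator
                        .comparingInt((String c) -> editDistance(token, c))
                        .thenComparingInt(c -> -termFreq.get(c)));
            }
            result.put(token, new ArrayList<>(options.subList(0, Math.min(size, options.size()))));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> termFreq = new LinkedHashMap<>();
        termFreq.put("trying", 2);
        termFreq.put("out", 2);
        termFreq.put("elasticsearch", 2);
        System.out.println(suggest("tring out Elasticsearch", termFreq, 2, 5));
    }
}
```

With this dictionary, only the misspelled token receives options, which mirrors the shape of a term suggester response: one entry per token, each with its own options list.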
2. Phrase suggester
The phrase suggester builds on the term suggester and additionally considers the relationship between terms, such as whether they occur together in the indexed text, how close they are to each other, and how frequent they are.
Example:
POST twitter/_search
{
  "query": {
    "match": { "message": "tring out Elasticsearch" }
  },
  "suggest": {
    "my-suggestion": {
      "text": "tring out Elasticsearch",
      "phrase": {
        "field": "message"
      }
    }
  }
}
Result (the suggest node holds the suggestions):
{
  "took": 30,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
  "hits": {
    "total": 2,
    "max_score": 1.113083,
    "hits": [
      {
        "_index": "twitter",
        "_type": "tweet",
        "_id": "4",
        "_score": 1.113083,
        "_source": {
          "user": "kimchy",
          "postDate": "2018-07-23T07:29:57.653Z",
          "message": "trying out Elasticsearch"
        }
      },
      {
        "_index": "twitter",
        "_type": "tweet",
        "_id": "7",
        "_score": 0.98382175,
        "_source": {
          "user": "yuchen20",
          "postDate": "2018-07-23T08:12:05.604Z",
          "message": "trying out Elasticsearch"
        }
      }
    ]
  },
  "suggest": {
    "my-suggestion": [
      {
        "text": "tring out Elasticsearch",
        "offset": 0,
        "length": 23,
        "options": [
          { "text": "trying out elasticsearch", "score": 0.5118434 }
        ]
      }
    ]
  }
}
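The idea that a phrase suggester weighs how terms occur next to each other can be illustrated with a toy scorer. The sketch below counts adjacent word pairs (bigrams) in a few sample messages and prefers the candidate phrase whose pairs are seen most often. It is a deliberate simplification: Elasticsearch uses an n-gram language model with smoothing, and the class PhraseScoreSketch, its methods, and the sample data are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical toy scorer: ranks candidate phrases by bigram counts.
final class PhraseScoreSketch {

    private static String[] words(String text) {
        return text.trim().toLowerCase(Locale.ROOT).split("\\s+");
    }

    // Count every pair of adjacent words across the sample documents.
    static Map<String, Integer> bigramCounts(List<String> docs) {
        Map<String, Integer> counts = new HashMap<>();
        for (String doc : docs) {
            String[] w = words(doc);
            for (int i = 0; i + 1 < w.length; i++) {
                counts.merge(w[i] + " " + w[i + 1], 1, Integer::sum);
            }
        }
        return counts;
    }

    // Score a phrase as the sum of the counts of its adjacent word pairs.
    static int score(String phrase, Map<String, Integer> counts) {
        String[] w = words(phrase);
        int total = 0;
        for (int i = 0; i + 1 < w.length; i++) {
            total += counts.getOrDefault(w[i] + " " + w[i + 1], 0);
        }
        return total;
    }

    // Return the highest-scoring candidate (the first one wins ties), or null if there are none.
    static String best(List<String> candidates, Map<String, Integer> counts) {
        String best = null;
        int bestScore = -1;
        for (String c : candidates) {
            int s = score(c, counts);
            if (s > bestScore) {
                best = c;
                bestScore = s;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = bigramCounts(Arrays.asList(
                "trying out Elasticsearch", "trying out Elasticsearch", "string theory"));
        System.out.println(best(Arrays.asList(
                "string out elasticsearch", "trying out elasticsearch"), counts));
    }
}
```

Both candidates are one edit away from the misspelled input, so a per-term lookup alone cannot separate them; the pair counts are what favor the phrase that actually appears in the data.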
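3. Completion suggester (autocomplete)
The completion suggester is designed for the autocomplete scenario. Here a request goes to the backend each time the user types a character, so when users type quickly the backend must respond very fast. For this reason it uses a different data structure from the two suggesters above: instead of an inverted index, the analyzed input is encoded into an FST (finite state transducer) that is stored alongside the index. For an open index, ES loads the entire FST into memory, which makes prefix lookups extremely fast. However, an FST only supports prefix lookups, and this is the main limitation of the completion suggester.
Official documentation:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-suggesters-completion.html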
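To illustrate why prefix lookups over a sorted structure are fast, and why only prefixes can be matched, here is a small Java sketch. It uses a TreeMap as a stand-in for the FST, which is an assumption made for readability; the class PrefixCompletionSketch, its methods, and the sample entry "new order" are hypothetical and do not reflect how Elasticsearch stores completion data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory prefix index: a sorted map used in place of an FST.
final class PrefixCompletionSketch {

    // One suggestion input together with its ranking weight.
    private static final class Item {
        final String text;
        final int weight;
        Item(String text, int weight) { this.text = text; this.weight = weight; }
    }

    // Keys are lowercased inputs; sorting keeps all keys with a common prefix adjacent.
    private final TreeMap<String, Item> items = new TreeMap<>();

    // Register an input; if it is added twice, the higher weight is kept.
    void add(String input, int weight) {
        String key = input.toLowerCase(Locale.ROOT);
        Item existing = items.get(key);
        if (existing == null || existing.weight < weight) {
            items.put(key, new Item(input, weight));
        }
    }

    // Return up to `size` inputs starting with `prefix`, highest weight first.
    List<String> complete(String prefix, int size) {
        String p = prefix.toLowerCase(Locale.ROOT);
        List<Item> matches = new ArrayList<>();
        // Jump to the first key >= prefix and stop at the first key without the prefix.
        for (Map.Entry<String, Item> e : items.tailMap(p, true).entrySet()) {
            if (!e.getKey().startsWith(p)) break;
            matches.add(e.getValue());
        }
        // Stable sort: equal weights stay in lexicographic order.
        matches.sort((a, b) -> Integer.compare(b.weight, a.weight));
        List<String> out = new ArrayList<>();
        for (int i = 0; i < matches.size() && i < size; i++) out.add(matches.get(i).text);
        return out;
    }

    public static void main(String[] args) {
        PrefixCompletionSketch index = new PrefixCompletionSketch();
        index.add("Nevermind", 10);
        index.add("Nirvana", 3);
        index.add("new order", 30); // hypothetical extra entry
        System.out.println(index.complete("ne", 5));
        System.out.println(index.complete("nir", 5));
    }
}
```

A query for "vana" returns nothing in this sketch even though "Nirvana" contains it, which is the same prefix-only limitation noted above.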
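Example 1:
To use autocomplete, the field that supplies completion suggestions needs a dedicated mapping with the field type completion. Set up the mapping first:
PUT index
{
  "mappings": {
    "completion": {
      "properties": {
        "title": {
          "type": "text",
          "analyzer": "ik_smart"
        },
        "title_suggest": {
          "type": "completion",
          "analyzer": "ik_smart",
          "search_analyzer": "ik_smart"
        }
      }
    }
  }
}
The key field is title_suggest: it is the field used for completion at search time. Its type must be completion, and the analyzer is chosen to suit the data.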
Index the data:
POST /index/completion/_bulk
{ "index" : { } }
{ "title": "背景天安門廣場大學", "title_suggest": "背景天安門廣場大學" }
{ "index" : { } }
{ "title": "北京天安門", "title_suggest": "北京天安門" }
{ "index" : { } }
{ "title": "北京鳥巢", "title_suggest": "北京鳥巢" }
{ "index" : { } }
{ "title": "奧林匹克公園", "title_suggest": "奧林匹克公園" }
{ "index" : { } }
{ "title": "奧林匹克森林公園", "title_suggest": "奧林匹克森林公園" }
{ "index" : { } }
{ "title": "北京奧林匹克公園", "title_suggest": "北京奧林匹克公園" }
{ "index" : { } }
{ "title": "北京奧林匹克公園", "title_suggest": { "input": "我愛中國", "weight": 100 } }
At index time, a weight can be added to the suggest field to raise its ranking.
Search with completion:
POST /index/completion/_search
{
  "size": 0,
  "suggest": {
    "blog-suggest": {
      "prefix": "北京",
      "completion": {
        "field": "title_suggest"
      }
    }
  }
}
Result:
{
  "took": 3,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
  "hits": { "total": 0, "max_score": 0, "hits": [] },
  "suggest": {
    "blog-suggest": [
      {
        "text": "北京",
        "offset": 0,
        "length": 2,
        "options": [
          {
            "text": "北京天安門",
            "_index": "index",
            "_type": "completion",
            "_id": "AWSRo_hn9K_aupETR6FR",
            "_score": 1,
            "_source": { "title": "北京天安門", "title_suggest": "北京天安門" }
          },
          {
            "text": "北京奧林匹克公園",
            "_index": "index",
            "_type": "completion",
            "_id": "AWSRo_hn9K_aupETR6FV",
            "_score": 1,
            "_source": { "title": "北京奧林匹克公園", "title_suggest": "北京奧林匹克公園" }
          },
          {
            "text": "北京鳥巢",
            "_index": "index",
            "_type": "completion",
            "_id": "AWSRo_hn9K_aupETR6FS",
            "_score": 1,
            "_source": { "title": "北京鳥巢", "title_suggest": "北京鳥巢" }
          }
        ]
      }
    ]
  }
}
Example 2:
Create the mapping:
PUT music
{
  "mappings": {
    "docc": {
      "properties": {
        "suggest": { "type": "completion" },
        "title": { "type": "keyword" }
      }
    }
  }
}
input specifies the input terms; weight specifies the ranking value (optional).
PUT music/docc/1?refresh
{
  "suggest": {
    "input": [ "Nevermind", "Nirvana" ],
    "weight": 34
  }
}
Assign a different ranking value to each input:
PUT music/docc/1?refresh
{
  "suggest": [
    { "input": "Nevermind", "weight": 10 },
    { "input": "Nirvana", "weight": 3 }
  ]
}
Add a document with duplicate inputs:
PUT music/docc/2?refresh
{
  "suggest": {
    "input": [ "Nevermind", "Nirvana" ],
    "weight": 20
  }
}
Query suggestions by prefix:
POST music/_search?pretty
{
  "suggest": {
    "song-suggest": {
      "prefix": "nir",
      "completion": {
        "field": "suggest"
      }
    }
  }
}
Deduplicate suggestion results with "skip_duplicates": true. This option is supported in 6.x but not in 5.x.
POST music/_search?pretty
{
  "suggest": {
    "song-suggest": {
      "prefix": "nir",
      "completion": {
        "field": "suggest",
        "skip_duplicates": true
      }
    }
  }
}
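Since skip_duplicates is not available in 5.x, one possible workaround is to drop repeated suggestion texts on the client after the response arrives. The sketch below shows that idea with plain Java collections; the class SkipDuplicatesSketch is hypothetical, and it assumes the option texts are already in the order returned by the server (best score first).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical client-side substitute for skip_duplicates.
final class SkipDuplicatesSketch {

    // Keep the first occurrence of each text and preserve the original order.
    static List<String> skipDuplicates(List<String> optionTexts) {
        return new ArrayList<>(new LinkedHashSet<>(optionTexts));
    }

    public static void main(String[] args) {
        // Two documents contributed the same input, so the text appears twice.
        List<String> fromServer = Arrays.asList("Nirvana", "Nirvana");
        System.out.println(skipDuplicates(fromServer));
    }
}
```

Because duplicates are removed after the fact, the list can end up shorter than the requested size; asking the server for more options than needed is one way to compensate.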
Storing phrases in suggestion documents:
PUT music/docc/3?refresh
{
  "suggest": {
    "input": [ "lucene solr", "lucene so cool", "lucene elasticsearch" ],
    "weight": 20
  }
}
PUT music/docc/4?refresh
{
  "suggest": {
    "input": [ "lucene solr cool", "lucene elasticsearch" ],
    "weight": 10
  }
}
Query:
POST music/_search?pretty
{
  "suggest": {
    "song-suggest": {
      "prefix": "lucene s",
      "completion": {
        "field": "suggest"
      }
    }
  }
}
III. Java API
## Elasticsearch 5.x: Introduction to the Query Suggestion Java API and Suggesters
Reference: http://www.mamicode.com/info-detail-2347270.html

package com.youlan.es.util;

import java.util.concurrent.ExecutionException;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.rest.RestStatus;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.suggest.*;
import org.elasticsearch.search.suggest.completion.CompletionSuggestion;
import org.elasticsearch.search.suggest.phrase.PhraseSuggestion;
import org.elasticsearch.search.suggest.term.TermSuggestion;

public class SuggestDemo {

    private static Logger logger = LogManager.getRootLogger();

    // Spell check (English)
    public static void termSuggest(TransportClient client) {
        // 1. Create the search request
        // SearchRequest searchRequest = new SearchRequest();
        SearchRequest searchRequest = new SearchRequest("twitter");

        // 2. Build the request body with SearchSourceBuilder; its methods cover the various query types.
        SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
        sourceBuilder.size(0);

        // Term suggestion; the text is what the user typed into the search box: tring out Elticsearch
        SuggestionBuilder termSuggestionBuilder =
                SuggestBuilders.termSuggestion("message").text("tring out Elticsearch");
        SuggestBuilder suggestBuilder = new SuggestBuilder();
        suggestBuilder.addSuggestion("suggest_user", termSuggestionBuilder);
        sourceBuilder.suggest(suggestBuilder);
        searchRequest.source(sourceBuilder);

        try {
            // 3. Send the request
            SearchResponse searchResponse = client.search(searchRequest).get();

            // 4. Handle the response, checking the search status first
            if (RestStatus.OK.equals(searchResponse.status())) {
                // Read the suggestions
                Suggest suggest = searchResponse.getSuggest();
                TermSuggestion termSuggestion = suggest.getSuggestion("suggest_user");
                for (TermSuggestion.Entry entry : termSuggestion.getEntries()) {
                    logger.info("text: " + entry.getText().string());
                    for (TermSuggestion.Entry.Option option : entry) {
                        // The suggested text
                        String suggestText = option.getText().string();
                        logger.info(" suggest option : " + suggestText);
                    }
                }
            }
        } catch (InterruptedException | ExecutionException e) {
            logger.error(e);
        }
        /* Sample shape of a term suggestion response:
        "suggest": {
          "my-suggestion": [
            {"text": "tring", "offset": 0, "length": 5,
             "options": [{"text": "trying", "score": 0.8, "freq": 2}]},
            {"text": "out", "offset": 6, "length": 3, "options": []},
            {"text": "elasticsearch", "offset": 10, "length": 13, "options": []}
          ]
        }
        */
    }

    // Phrase suggestion
    public static void phraseSuggest(TransportClient client) {
        // 1. Create the search request
        SearchRequest searchRequest = new SearchRequest("twitter");

        // 2. Build the request body
        SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
        sourceBuilder.size(0);
        SuggestionBuilder phraseSuggestBuilder =
                SuggestBuilders.phraseSuggestion("message").text("tring out");
        SuggestBuilder suggestBuilder = new SuggestBuilder();
        suggestBuilder.addSuggestion("my-suggestion", phraseSuggestBuilder);
        sourceBuilder.suggest(suggestBuilder);
        searchRequest.source(sourceBuilder);

        try {
            // 3. Send the request
            SearchResponse searchResponse = client.search(searchRequest).get();

            // 4. Handle the response, checking the search status first
            if (RestStatus.OK.equals(searchResponse.status())) {
                // Read the suggestions
                Suggest suggest = searchResponse.getSuggest();
                PhraseSuggestion phraseSuggestion = suggest.getSuggestion("my-suggestion");
                for (PhraseSuggestion.Entry entry : phraseSuggestion) {
                    logger.info("text:" + entry.getText().string());
                    for (PhraseSuggestion.Entry.Option option : entry) {
                        String suggestText = option.getText().string();
                        logger.info(" suggest option :" + suggestText);
                    }
                }
            }
        } catch (InterruptedException e) {
            logger.error("Request failed: " + e);
        } catch (ExecutionException e) {
            logger.error(e);
        }
    }

    // Autocomplete
    public static void completionSuggester(TransportClient client) {
        // 1. Create the search request
        // SearchRequest searchRequest = new SearchRequest();
        SearchRequest searchRequest = new SearchRequest("music");

        // 2. Build the request body with SearchSourceBuilder; its methods cover the various query types.
        SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
        sourceBuilder.size(0);

        // Autocomplete suggestion, equivalent to:
        /*
        POST music/_search?pretty
        {
          "suggest": {
            "song-suggest": {
              "prefix": "lucene s",
              "completion": { "field": "suggest", "skip_duplicates": true }
            }
          }
        }
        */
        SuggestionBuilder completionSuggestionBuilder =
                SuggestBuilders.completionSuggestion("suggest").prefix("lucene s");
        // .skipDuplicates(true) removes duplicates (6.x only)
        SuggestBuilder suggestBuilder = new SuggestBuilder();
        suggestBuilder.addSuggestion("song-suggest", completionSuggestionBuilder);
        sourceBuilder.suggest(suggestBuilder);
        searchRequest.source(sourceBuilder);

        try {
            // 3. Send the request
            SearchResponse searchResponse = client.search(searchRequest).get();

            // 4. Handle the response, checking the search status first
            if (RestStatus.OK.equals(searchResponse.status())) {
                // Read the suggestions
                Suggest suggest = searchResponse.getSuggest();
                CompletionSuggestion completionSuggestion = suggest.getSuggestion("song-suggest");
                for (CompletionSuggestion.Entry entry : completionSuggestion.getEntries()) {
                    logger.info("text: " + entry.getText().string());
                    for (CompletionSuggestion.Entry.Option option : entry) {
                        String suggestText = option.getText().string();
                        logger.info(" suggest option : " + suggestText);
                    }
                }
            }
        } catch (InterruptedException | ExecutionException e) {
            logger.error(e);
        }
// Result:
// {
// "took": 7,
// "timed_out": false,
// "_shards": {
// "total": 5,
// "successful": 5,
// "skipped": 0,
// "failed": 0
// },
// "hits": {
// "total": 0,
// "max_score": 0,
// "hits": []
// },
// "suggest": {
// "song-suggest": [
// {
// "text": "lucene s",
// "offset": 0,
// "length": 8,
// "options": [
// {
// "text": "lucene so cool",
// "_index": "music",
// "_type": "docc",
// "_id": "3",
// "_score": 20,
// "_source": {
// "suggest": {
// "input": [
// "lucene solr",
// "lucene so cool",
// "lucene elasticsearch"
// ],
// "weight": 20
// }
// }
// },
// {
// "text": "lucene solr cool",
// "_index": "music",
// "_type": "docc",
// "_id": "4",
// "_score": 10,
// "_source": {
// "suggest": {
// "input": [
// "lucene solr cool",
// "lucene elasticsearch"
// ],
// "weight": 10
// }
// }
// }
// ]
// }
// ]
// }
// }
    }

    public static void main(String[] args) {
        // EsClient is the author's own helper class for obtaining a TransportClient; it is not shown in this article.
        EsClient esClient = new EsClient();
        try (TransportClient client = esClient.getConnection()) {
            logger.info("---------------- Spell check: termSuggest ----------------------");
            termSuggest(client);
            logger.info("------------------ Phrase suggestion: phraseSuggest --------------------");
            phraseSuggest(client);
            logger.info("------------------ Autocomplete: completionSuggester --------------------");
            completionSuggester(client);
        } catch (Exception e) {
            logger.error(e);
        }
    }
}