With structured queries covered, the next topic is Elasticsearch's full-text search.
The main full-text queries are:
match query, match phrase query, match phrase prefix query,
multi match query, common terms query, intervals query and simple query string.
Strictly speaking, the query string query covered earlier is also a full-text query;
it is the one that uses Lucene query syntax. We will go through these queries one by one.
match query
The match query came up briefly in the section on the term query: a term query does not analyze its input, a match query does.
Let's run one and then analyze the response:
GET bank/_search
{
"query": {
"match": {
"firstname": "血肉苦弱,機械飛升"
}
},
"profile": "true"
}
Response:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 0,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"profile" : {
"shards" : [
{
"id" : "[UhzKWPIsSgi8QaaJLHVmFg][bank][0]",
"searches" : [
{
"query" : [
{
"type" : "BooleanQuery",
"description" : "firstname:血 firstname:肉 firstname:苦 firstname:弱 firstname:機 firstname:械 firstname:飛 firstname:升",
"time_in_nanos" : 177915,
"breakdown" : {
"set_min_competitive_score_count" : 0,
"match_count" : 0,
"shallow_advance_count" : 0,
"set_min_competitive_score" : 0,
"next_doc" : 0,
"match" : 0,
"next_doc_count" : 0,
"score_count" : 0,
"compute_max_score_count" : 0,
"compute_max_score" : 0,
"advance" : 0,
"advance_count" : 0,
"score" : 0,
"build_scorer_count" : 2,
"create_weight" : 125337,
"shallow_advance" : 0,
"create_weight_count" : 1,
"build_scorer" : 52578
},
"children" : [
{
"type" : "TermQuery",
"description" : "firstname:血",
"time_in_nanos" : 42629,
"breakdown" : {
"set_min_competitive_score_count" : 0,
"match_count" : 0,
"shallow_advance_count" : 0,
"set_min_competitive_score" : 0,
"next_doc" : 0,
"match" : 0,
"next_doc_count" : 0,
"score_count" : 0,
"compute_max_score_count" : 0,
"compute_max_score" : 0,
"advance" : 0,
"advance_count" : 0,
"score" : 0,
"build_scorer_count" : 2,
"create_weight" : 40568,
"shallow_advance" : 0,
"create_weight_count" : 1,
"build_scorer" : 2061
}
},
{
"type" : "TermQuery",
"description" : "firstname:肉",
"time_in_nanos" : 8407,
"breakdown" : {
"set_min_competitive_score_count" : 0,
"match_count" : 0,
"shallow_advance_count" : 0,
"set_min_competitive_score" : 0,
"next_doc" : 0,
"match" : 0,
"next_doc_count" : 0,
"score_count" : 0,
"compute_max_score_count" : 0,
"compute_max_score" : 0,
"advance" : 0,
"advance_count" : 0,
"score" : 0,
"build_scorer_count" : 2,
"create_weight" : 7742,
"shallow_advance" : 0,
"create_weight_count" : 1,
"build_scorer" : 665
}
},
{
"type" : "TermQuery",
"description" : "firstname:苦",
"time_in_nanos" : 7289,
"breakdown" : {
"set_min_competitive_score_count" : 0,
"match_count" : 0,
"shallow_advance_count" : 0,
"set_min_competitive_score" : 0,
"next_doc" : 0,
"match" : 0,
"next_doc_count" : 0,
"score_count" : 0,
"compute_max_score_count" : 0,
"compute_max_score" : 0,
"advance" : 0,
"advance_count" : 0,
"score" : 0,
"build_scorer_count" : 2,
"create_weight" : 6628,
"shallow_advance" : 0,
"create_weight_count" : 1,
"build_scorer" : 661
}
},
{
"type" : "TermQuery",
"description" : "firstname:弱",
"time_in_nanos" : 7141,
"breakdown" : {
"set_min_competitive_score_count" : 0,
"match_count" : 0,
"shallow_advance_count" : 0,
"set_min_competitive_score" : 0,
"next_doc" : 0,
"match" : 0,
"next_doc_count" : 0,
"score_count" : 0,
"compute_max_score_count" : 0,
"compute_max_score" : 0,
"advance" : 0,
"advance_count" : 0,
"score" : 0,
"build_scorer_count" : 2,
"create_weight" : 6509,
"shallow_advance" : 0,
"create_weight_count" : 1,
"build_scorer" : 632
}
},
{
"type" : "TermQuery",
"description" : "firstname:機",
"time_in_nanos" : 6791,
"breakdown" : {
"set_min_competitive_score_count" : 0,
"match_count" : 0,
"shallow_advance_count" : 0,
"set_min_competitive_score" : 0,
"next_doc" : 0,
"match" : 0,
"next_doc_count" : 0,
"score_count" : 0,
"compute_max_score_count" : 0,
"compute_max_score" : 0,
"advance" : 0,
"advance_count" : 0,
"score" : 0,
"build_scorer_count" : 2,
"create_weight" : 6179,
"shallow_advance" : 0,
"create_weight_count" : 1,
"build_scorer" : 612
}
},
{
"type" : "TermQuery",
"description" : "firstname:械",
"time_in_nanos" : 6721,
"breakdown" : {
"set_min_competitive_score_count" : 0,
"match_count" : 0,
"shallow_advance_count" : 0,
"set_min_competitive_score" : 0,
"next_doc" : 0,
"match" : 0,
"next_doc_count" : 0,
"score_count" : 0,
"compute_max_score_count" : 0,
"compute_max_score" : 0,
"advance" : 0,
"advance_count" : 0,
"score" : 0,
"build_scorer_count" : 2,
"create_weight" : 6113,
"shallow_advance" : 0,
"create_weight_count" : 1,
"build_scorer" : 608
}
},
{
"type" : "TermQuery",
"description" : "firstname:飛",
"time_in_nanos" : 6541,
"breakdown" : {
"set_min_competitive_score_count" : 0,
"match_count" : 0,
"shallow_advance_count" : 0,
"set_min_competitive_score" : 0,
"next_doc" : 0,
"match" : 0,
"next_doc_count" : 0,
"score_count" : 0,
"compute_max_score_count" : 0,
"compute_max_score" : 0,
"advance" : 0,
"advance_count" : 0,
"score" : 0,
"build_scorer_count" : 2,
"create_weight" : 5947,
"shallow_advance" : 0,
"create_weight_count" : 1,
"build_scorer" : 594
}
},
{
"type" : "TermQuery",
"description" : "firstname:升",
"time_in_nanos" : 6719,
"breakdown" : {
"set_min_competitive_score_count" : 0,
"match_count" : 0,
"shallow_advance_count" : 0,
"set_min_competitive_score" : 0,
"next_doc" : 0,
"match" : 0,
"next_doc_count" : 0,
"score_count" : 0,
"compute_max_score_count" : 0,
"compute_max_score" : 0,
"advance" : 0,
"advance_count" : 0,
"score" : 0,
"build_scorer_count" : 2,
"create_weight" : 6094,
"shallow_advance" : 0,
"create_weight_count" : 1,
"build_scorer" : 625
}
}
]
}
],
"rewrite_time" : 21692,
"collector" : [
{
"name" : "SimpleTopScoreDocCollector",
"reason" : "search_top_hits",
"time_in_nanos" : 2376
}
]
}
],
"aggregations" : [ ]
}
]
}
}
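The response shows that match is built on term queries. The text "血肉苦弱,機械飛升" is first analyzed with the field's default analyzer, which here emits one token per character and drops the comma.
Each token then becomes its own TermQuery, and the term queries are combined in a single BooleanQuery (OR by default) whose hits are returned.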
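The rewrite described above can be sketched in a few lines of Python. This is only a toy model of what the profile output suggests, not Elasticsearch code: `match_as_bool` is a hypothetical helper, and the token list is typed in by hand rather than produced by a real analyzer.

```python
from typing import Dict, List


def match_as_bool(field: str, tokens: List[str], operator: str = "or") -> Dict:
    """Build the bool query that a match query on `field` roughly corresponds to."""
    # operator "or" -> optional clauses (should); operator "and" -> required clauses (must)
    clause = "must" if operator == "and" else "should"
    return {"bool": {clause: [{"term": {field: tok}} for tok in tokens]}}


# One token per character, as in the profile description (the comma is dropped).
tokens = list("血肉苦弱機械飛升")
query = match_as_bool("firstname", tokens)

print(len(query["bool"]["should"]))   # 8
print(query["bool"]["should"][0])     # {'term': {'firstname': '血'}}
```

Passing `operator="and"` turns every clause into a required one, which is the effect of the `operator` parameter discussed later in this section.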
There is also match_all. I won't show its response, because it simply returns every document:
GET bank/_search
{
"query": {
"match_all": {}
},
"profile": "true"
}
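The requests above use the shorthand form of match. The query accepts a number of further parameters; the commonly used ones are:
query: the text to search for.
fuzziness: the same meaning as in the fuzzy query, namely the maximum edit distance, i.e. how far a matching term is allowed to deviate from the search term.
operator: and or or; the default is or.
zero_terms_query: controls what happens when the analyzer removes every token from the query text, for example when a stop-word filter strips all of the words. With none (the default) no documents are returned; with all every document is returned, as with match_all.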
GET bank/_search
{
"query": {
"match": {
"firstname":{
"query": "Hattie",
"operator": "and",
"fuzziness":1,
"zero_terms_query": "none"
}
}
},
"profile": "true"
}
Response:
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 6.5042877,
"hits" : [
{
"_index" : "bank",
"_type" : "account",
"_id" : "6",
"_score" : 6.5042877,
"_source" : {
"account_number" : 6,
"balance" : 5686,
"firstname" : "Hattie",
"lastname" : "Bond",
"age" : 36,
"gender" : "M",
"address" : "671 Bristol Street",
"employer" : "Netagy",
"email" : "hattiebond@netagy.com",
"city" : "Dante",
"state" : "TN"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "983",
"_score" : 5.4202404,
"_source" : {
"account_number" : 983,
"balance" : 47205,
"firstname" : "Mattie",
"lastname" : "Eaton",
"age" : 24,
"gender" : "F",
"address" : "418 Allen Avenue",
"employer" : "Trasola",
"email" : "mattieeaton@trasola.com",
"city" : "Dupuyer",
"state" : "NJ"
}
}
]
},
"profile" : {
"shards" : [
{
"id" : "[pgvIy_S0QwiNETOTSEWFtw][bank][0]",
"searches" : [
{
"query" : [
{
"type" : "BooleanQuery",
"description" : "firstname:hattie (firstname:mattie)^0.8333333",
"time_in_nanos" : 122563,
"breakdown" : {
"set_min_competitive_score_count" : 0,
"match_count" : 2,
"shallow_advance_count" : 0,
"set_min_competitive_score" : 0,
"next_doc" : 2180,
"match" : 863,
"next_doc_count" : 2,
"score_count" : 2,
"compute_max_score_count" : 0,
"compute_max_score" : 0,
"advance" : 7855,
"advance_count" : 1,
"score" : 7241,
"build_scorer_count" : 3,
"create_weight" : 56686,
"shallow_advance" : 0,
"create_weight_count" : 1,
"build_scorer" : 47738
},
"children" : [
{
"type" : "TermQuery",
"description" : "firstname:hattie",
"time_in_nanos" : 33019,
"breakdown" : {
"set_min_competitive_score_count" : 0,
"match_count" : 0,
"shallow_advance_count" : 3,
"set_min_competitive_score" : 0,
"next_doc" : 0,
"match" : 0,
"next_doc_count" : 0,
"score_count" : 1,
"compute_max_score_count" : 3,
"compute_max_score" : 1634,
"advance" : 398,
"advance_count" : 2,
"score" : 3982,
"build_scorer_count" : 4,
"create_weight" : 13509,
"shallow_advance" : 1249,
"create_weight_count" : 1,
"build_scorer" : 12247
}
},
{
"type" : "BoostQuery",
"description" : "(firstname:mattie)^0.8333333",
"time_in_nanos" : 30033,
"breakdown" : {
"set_min_competitive_score_count" : 0,
"match_count" : 0,
"shallow_advance_count" : 3,
"set_min_competitive_score" : 0,
"next_doc" : 0,
"match" : 0,
"next_doc_count" : 0,
"score_count" : 1,
"compute_max_score_count" : 3,
"compute_max_score" : 360,
"advance" : 807,
"advance_count" : 2,
"score" : 595,
"build_scorer_count" : 4,
"create_weight" : 26055,
"shallow_advance" : 323,
"create_weight_count" : 1,
"build_scorer" : 1893
}
}
]
}
],
"rewrite_time" : 852484,
"collector" : [
{
"name" : "SimpleTopScoreDocCollector",
"reason" : "search_top_hits",
"time_in_nanos" : 16098
}
]
}
],
"aggregations" : [ ]
}
]
}
}
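The profile above shows the fuzzy expansion at work: with fuzziness 1, hattie also matches mattie. The sketch below computes the plain Levenshtein distance behind that. It is a simplified illustration; to my knowledge Elasticsearch by default also counts swapping two adjacent characters as a single edit, which this version does not model.

```python
def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance: insert, delete and substitute each cost 1."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of `a`
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute ca -> cb
        prev = curr
    return prev[-1]


print(edit_distance("hattie", "mattie"))  # 1
print(edit_distance("hattie", "hattie"))  # 0
```

The boost of 0.8333333 on firstname:mattie in the profile looks consistent with 1 - 1/6 (one edit over a six-character term), but treat that as an observation rather than a documented formula.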
match phrase query
The match phrase query performs phrase search. Unlike match, it does not return a document just because one of the analyzed terms occurs in it;
a document is returned only when it matches the phrase as a whole.
Here is an example:
GET bank/_search
{
"query": {
"match_phrase": {
"address": "Bristol Street"
}
},
"profile": "true"
}
or
GET bank/_search
{
"query": {
"match_phrase": {
"address": {
"query":"Bristol Street",
"analyzer": "ik_smart"
}
}
},
"profile": "true"
}
Both forms return the same result:
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 7.457467,
"hits" : [
{
"_index" : "bank",
"_type" : "account",
"_id" : "6",
"_score" : 7.457467,
"_source" : {
"account_number" : 6,
"balance" : 5686,
"firstname" : "Hattie",
"lastname" : "Bond",
"age" : 36,
"gender" : "M",
"address" : "671 Bristol Street",
"employer" : "Netagy",
"email" : "hattiebond@netagy.com",
"city" : "Dante",
"state" : "TN"
}
}
]
},
"profile" : {
"shards" : [
{
"id" : "[pgvIy_S0QwiNETOTSEWFtw][bank][0]",
"searches" : [
{
"query" : [
{
"type" : "PhraseQuery",
"description" : "address:\"bristol street\"",
"time_in_nanos" : 1695954,
"breakdown" : {
"set_min_competitive_score_count" : 0,
"match_count" : 1,
"shallow_advance_count" : 0,
"set_min_competitive_score" : 0,
"next_doc" : 19358,
"match" : 19600,
"next_doc_count" : 1,
"score_count" : 1,
"compute_max_score_count" : 0,
"compute_max_score" : 0,
"advance" : 51360,
"advance_count" : 1,
"score" : 14776,
"build_scorer_count" : 3,
"create_weight" : 99282,
"shallow_advance" : 0,
"create_weight_count" : 1,
"build_scorer" : 1491578
}
}
],
"rewrite_time" : 1718,
"collector" : [
{
"name" : "SimpleTopScoreDocCollector",
"reason" : "search_top_hits",
"time_in_nanos" : 30899
}
]
}
],
"aggregations" : [ ]
}
]
}
}
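How it works:
The phrase is analyzed into terms and each term is looked up in the field. Only documents that contain all of the terms in the same field are kept,
and of those only the ones in which the terms appear in the same order as in the phrase, at consecutive positions, are returned.
One setting worth mentioning is slop: the number of position moves that are allowed when lining the query terms up with the terms in a document's field.
The default is 0.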
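To make slop concrete, here is a small toy model in Python. It is a sketch under simplifying assumptions (whitespace tokens, no repeated-term subtleties, no scoring) and `phrase_matches` is a hypothetical helper, not how Lucene is actually implemented.

```python
from itertools import product
from typing import List


def phrase_matches(doc_tokens: List[str], phrase_tokens: List[str], slop: int = 0) -> bool:
    """Toy phrase check: True if the terms line up within `slop` position moves."""
    if not phrase_tokens:
        return False
    # Positions at which each phrase term occurs in the document.
    positions = [[i for i, tok in enumerate(doc_tokens) if tok == term]
                 for term in phrase_tokens]
    if any(not p for p in positions):
        return False  # at least one phrase term is missing altogether
    for combo in product(*positions):
        if len(set(combo)) < len(combo):
            continue  # every phrase term needs its own position
        # Shift each position back by the term's index in the phrase. An exact
        # phrase collapses to a single value; the spread is the slop required.
        shifted = [pos - idx for idx, pos in enumerate(combo)]
        if max(shifted) - min(shifted) <= slop:
            return True
    return False


doc = "671 bristol street".split()
print(phrase_matches(doc, ["bristol", "street"]))          # True
print(phrase_matches(doc, ["671", "street"]))              # False
print(phrase_matches(doc, ["671", "street"], slop=1))      # True
print(phrase_matches(doc, ["street", "bristol"], slop=2))  # True
```

In this model, swapping two adjacent terms costs two moves, which is why the reversed phrase only matches from slop 2 upwards.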
match phrase prefix
This is phrase prefix matching, which is similar to the phrase query. Usage:
GET bank/_search
{
"query": {
"match_phrase_prefix": {
"address": "Bristol"
}
},
"profile": "true"
}
or
GET bank/_search
{
"query": {
"match_phrase_prefix": {
"address": {
"query":"Bristol",
"analyzer": "ik_smart"
}
}
},
"profile": "true"
}
Response:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 6.5025153,
"hits" : [
{
"_index" : "bank",
"_type" : "account",
"_id" : "6",
"_score" : 6.5025153,
"_source" : {
"account_number" : 6,
"balance" : 5686,
"firstname" : "Hattie",
"lastname" : "Bond",
"age" : 36,
"gender" : "M",
"address" : "671 Bristol Street",
"employer" : "Netagy",
"email" : "hattiebond@netagy.com",
"city" : "Dante",
"state" : "TN"
}
}
]
},
"profile" : {
"shards" : [
{
"id" : "[pgvIy_S0QwiNETOTSEWFtw][bank][0]",
"searches" : [
{
"query" : [
{
"type" : "TermQuery",
"description" : "address:bristol",
"time_in_nanos" : 42851,
"breakdown" : {
"set_min_competitive_score_count" : 0,
"match_count" : 0,
"shallow_advance_count" : 0,
"set_min_competitive_score" : 0,
"next_doc" : 1042,
"match" : 0,
"next_doc_count" : 1,
"score_count" : 1,
"compute_max_score_count" : 0,
"compute_max_score" : 0,
"advance" : 1286,
"advance_count" : 1,
"score" : 15944,
"build_scorer_count" : 3,
"create_weight" : 15913,
"shallow_advance" : 0,
"create_weight_count" : 1,
"build_scorer" : 8666
}
}
],
"rewrite_time" : 87691,
"collector" : [
{
"name" : "SimpleTopScoreDocCollector",
"reason" : "search_top_hits",
"time_in_nanos" : 43088
}
]
}
],
"aggregations" : [ ]
}
]
}
}
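How it works: the query returns documents that contain the words of the provided text in the same order. The last token of the provided text
is treated as a prefix and matches any word that begins with it.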
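The rule above can be sketched as follows. This is a toy model with a hypothetical helper name and pre-tokenized, lowercased input. As far as I know, the real query expands the final prefix into a limited number of indexed terms (the `max_expansions` parameter, 50 by default), so unlike this sketch it can miss matches on large indices.

```python
from typing import List


def phrase_prefix_matches(doc_tokens: List[str], query_tokens: List[str]) -> bool:
    """Toy check: leading tokens match exactly and in order, the last one as a prefix."""
    if not query_tokens:
        return False
    *head, last = query_tokens
    n = len(query_tokens)
    # Slide a window of the query's length over the document tokens.
    for start in range(len(doc_tokens) - n + 1):
        window = doc_tokens[start:start + n]
        if window[:-1] == head and window[-1].startswith(last):
            return True
    return False


doc = "789 madison street".split()
print(phrase_prefix_matches(doc, ["789", "madison", "s"]))  # True
print(phrase_prefix_matches(doc, ["madison", "str"]))       # True
print(phrase_prefix_matches(doc, ["789", "s"]))             # False
```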
multi match query
The multi match query builds on the match query and lets you search several fields at once:
GET bank/_search
{
"query": {
"multi_match": {
"query": "Hattie",
"fields": ["firstname", "email"]
}
},
"profile": "true"
}
Response:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 6.5042877,
"hits" : [
{
"_index" : "bank",
"_type" : "account",
"_id" : "6",
"_score" : 6.5042877,
"_source" : {
"account_number" : 6,
"balance" : 5686,
"firstname" : "Hattie",
"lastname" : "Bond",
"age" : 36,
"gender" : "M",
"address" : "671 Bristol Street",
"employer" : "Netagy",
"email" : "hattiebond@netagy.com",
"city" : "Dante",
"state" : "TN"
}
}
]
},
"profile" : {
"shards" : [
{
"id" : "[pgvIy_S0QwiNETOTSEWFtw][bank][0]",
"searches" : [
{
"query" : [
{
"type" : "DisjunctionMaxQuery",
"description" : "(firstname:hattie | email:hattie)",
"time_in_nanos" : 133378,
"breakdown" : {
"set_min_competitive_score_count" : 0,
"match_count" : 0,
"shallow_advance_count" : 0,
"set_min_competitive_score" : 0,
"next_doc" : 1354,
"match" : 0,
"next_doc_count" : 1,
"score_count" : 1,
"compute_max_score_count" : 0,
"compute_max_score" : 0,
"advance" : 1375,
"advance_count" : 1,
"score" : 25980,
"build_scorer_count" : 3,
"create_weight" : 83277,
"shallow_advance" : 0,
"create_weight_count" : 1,
"build_scorer" : 21392
},
"children" : [
{
"type" : "TermQuery",
"description" : "firstname:hattie",
"time_in_nanos" : 91743,
"breakdown" : {
"set_min_competitive_score_count" : 0,
"match_count" : 0,
"shallow_advance_count" : 0,
"set_min_competitive_score" : 0,
"next_doc" : 829,
"match" : 0,
"next_doc_count" : 1,
"score_count" : 1,
"compute_max_score_count" : 0,
"compute_max_score" : 0,
"advance" : 729,
"advance_count" : 1,
"score" : 25407,
"build_scorer_count" : 3,
"create_weight" : 51324,
"shallow_advance" : 0,
"create_weight_count" : 1,
"build_scorer" : 13454
}
},
{
"type" : "TermQuery",
"description" : "email:hattie",
"time_in_nanos" : 11182,
"breakdown" : {
"set_min_competitive_score_count" : 0,
"match_count" : 0,
"shallow_advance_count" : 0,
"set_min_competitive_score" : 0,
"next_doc" : 0,
"match" : 0,
"next_doc_count" : 0,
"score_count" : 0,
"compute_max_score_count" : 0,
"compute_max_score" : 0,
"advance" : 0,
"advance_count" : 0,
"score" : 0,
"build_scorer_count" : 2,
"create_weight" : 10752,
"shallow_advance" : 0,
"create_weight_count" : 1,
"build_scorer" : 430
}
}
]
}
],
"rewrite_time" : 3804,
"collector" : [
{
"name" : "SimpleTopScoreDocCollector",
"reason" : "search_top_hits",
"time_in_nanos" : 64512
}
]
}
],
"aggregations" : [ ]
}
]
}
}
The fields parameter also supports wildcards (*) and per-field boosting with the caret (^), as in this example:
GET bank/_search
{
"query": {
"multi_match": {
"query": "Hattie",
"fields": ["firstname^3", "*email"]
}
},
"profile": "true"
}
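In this query the firstname field counts three times as much as the fields matched by *email, so matches in firstname are favoured.
multi_match has another important parameter, type, which determines how the query is executed internally. The available types are:
best_fields: finds documents that match any field, but uses the _score of the best matching field. This is the default.
most_fields: finds documents that match any field and combines the _score from each field.
cross_fields: treats fields that share the same analyzer as if they were one big field and looks for each word in any of the fields.
phrase: runs a match_phrase query on each field and uses the _score of the best field.
phrase_prefix: runs a match_phrase_prefix query on each field and uses the _score of the best field.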
best_fields
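This type is most useful when you search for several words that are best found in the same field. best_fields generates a match query for each field
and wraps them in a dis_max query in order to find the single best matching field.
I will not go into dis_max and tie_breaker in detail here. On ES 7.9 the dis_max results did not look the way I expected from the write-up I was following,
so here is the example I ran: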
GET bank/_search
{
"query": {
"multi_match": {
"query": "Bates Street",
"type": "best_fields",
"fields": ["lastname", "address"]
}
},
"profile": "true"
}
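My first expectation was that every hit would get the same score: tie_breaker is not set, so it defaults to 0,
and neither lastname nor address contains the whole phrase Bates Street.
The response shows otherwise, and it is in fact consistent with dis_max. best_fields runs a match query (not a phrase query) against each field, and with tie_breaker at 0 each document simply takes the score of its best matching field. Document 13 matches the rare term bates in lastname and scores 6.5042877, while the other hits only match the common term street in address and score 0.95495176.
Response: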
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 385,
"relation" : "eq"
},
"max_score" : 6.5042877,
"hits" : [
{
"_index" : "bank",
"_type" : "account",
"_id" : "13",
"_score" : 6.5042877,
"_source" : {
"account_number" : 13,
"balance" : 32838,
"firstname" : "Nanette",
"lastname" : "Bates",
"age" : 28,
"gender" : "F",
"address" : "789 Madison Street",
"employer" : "Quility",
"email" : "nanettebates@quility.com",
"city" : "Nogal",
"state" : "VA"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "6",
"_score" : 0.95495176,
"_source" : {
"account_number" : 6,
"balance" : 5686,
"firstname" : "Hattie",
"lastname" : "Bond",
"age" : 36,
"gender" : "M",
"address" : "671 Bristol Street",
"employer" : "Netagy",
"email" : "hattiebond@netagy.com",
"city" : "Dante",
"state" : "TN"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "32",
"_score" : 0.95495176,
"_source" : {
"account_number" : 32,
"balance" : 48086,
"firstname" : "Dillard",
"lastname" : "Mcpherson",
"age" : 34,
"gender" : "F",
"address" : "702 Quentin Street",
"employer" : "Quailcom",
"email" : "dillardmcpherson@quailcom.com",
"city" : "Veguita",
"state" : "IN"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "49",
"_score" : 0.95495176,
"_source" : {
"account_number" : 49,
"balance" : 29104,
"firstname" : "Fulton",
"lastname" : "Holt",
"age" : 23,
"gender" : "F",
"address" : "451 Humboldt Street",
"employer" : "Anocha",
"email" : "fultonholt@anocha.com",
"city" : "Sunriver",
"state" : "RI"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "51",
"_score" : 0.95495176,
"_source" : {
"account_number" : 51,
"balance" : 14097,
"firstname" : "Burton",
"lastname" : "Meyers",
"age" : 31,
"gender" : "F",
"address" : "334 River Street",
"employer" : "Bezal",
"email" : "burtonmeyers@bezal.com",
"city" : "Jacksonburg",
"state" : "MO"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "63",
"_score" : 0.95495176,
"_source" : {
"account_number" : 63,
"balance" : 6077,
"firstname" : "Hughes",
"lastname" : "Owens",
"age" : 30,
"gender" : "F",
"address" : "510 Sedgwick Street",
"employer" : "Valpreal",
"email" : "hughesowens@valpreal.com",
"city" : "Guilford",
"state" : "KS"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "87",
"_score" : 0.95495176,
"_source" : {
"account_number" : 87,
"balance" : 1133,
"firstname" : "Hewitt",
"lastname" : "Kidd",
"age" : 22,
"gender" : "M",
"address" : "446 Halleck Street",
"employer" : "Isologics",
"email" : "hewittkidd@isologics.com",
"city" : "Coalmont",
"state" : "ME"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "107",
"_score" : 0.95495176,
"_source" : {
"account_number" : 107,
"balance" : 48844,
"firstname" : "Randi",
"lastname" : "Rich",
"age" : 28,
"gender" : "M",
"address" : "694 Jefferson Street",
"employer" : "Netplax",
"email" : "randirich@netplax.com",
"city" : "Bellfountain",
"state" : "SC"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "138",
"_score" : 0.95495176,
"_source" : {
"account_number" : 138,
"balance" : 9006,
"firstname" : "Daniel",
"lastname" : "Arnold",
"age" : 39,
"gender" : "F",
"address" : "422 Malbone Street",
"employer" : "Ecstasia",
"email" : "danielarnold@ecstasia.com",
"city" : "Gardiner",
"state" : "MO"
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "140",
"_score" : 0.95495176,
"_source" : {
"account_number" : 140,
"balance" : 26696,
"firstname" : "Cotton",
"lastname" : "Christensen",
"age" : 32,
"gender" : "M",
"address" : "878 Schermerhorn Street",
"employer" : "Prowaste",
"email" : "cottonchristensen@prowaste.com",
"city" : "Mayfair",
"state" : "LA"
}
}
]
},
"profile" : {
"shards" : [
{
"id" : "[pgvIy_S0QwiNETOTSEWFtw][bank][0]",
"searches" : [
{
"query" : [
{
"type" : "DisjunctionMaxQuery",
"description" : "((address:bates address:street) | (lastname:bates lastname:street))",
"time_in_nanos" : 382062,
"breakdown" : {
"set_min_competitive_score_count" : 0,
"match_count" : 0,
"shallow_advance_count" : 0,
"set_min_competitive_score" : 0,
"next_doc" : 85039,
"match" : 0,
"next_doc_count" : 385,
"score_count" : 385,
"compute_max_score_count" : 0,
"compute_max_score" : 0,
"advance" : 17873,
"advance_count" : 1,
"score" : 83484,
"build_scorer_count" : 3,
"create_weight" : 68479,
"shallow_advance" : 0,
"create_weight_count" : 1,
"build_scorer" : 127187
},
"children" : [
{
"type" : "BooleanQuery",
"description" : "address:bates address:street",
"time_in_nanos" : 197939,
"breakdown" : {
"set_min_competitive_score_count" : 0,
"match_count" : 0,
"shallow_advance_count" : 7,
"set_min_competitive_score" : 0,
"next_doc" : 0,
"match" : 0,
"next_doc_count" : 0,
"score_count" : 385,
"compute_max_score_count" : 7,
"compute_max_score" : 29714,
"advance" : 31586,
"advance_count" : 386,
"score" : 26437,
"build_scorer_count" : 3,
"create_weight" : 46463,
"shallow_advance" : 7704,
"create_weight_count" : 1,
"build_scorer" : 56035
},
"children" : [
{
"type" : "TermQuery",
"description" : "address:bates",
"time_in_nanos" : 31596,
"breakdown" : {
"set_min_competitive_score_count" : 0,
"match_count" : 0,
"shallow_advance_count" : 0,
"set_min_competitive_score" : 0,
"next_doc" : 0,
"match" : 0,
"next_doc_count" : 0,
"score_count" : 0,
"compute_max_score_count" : 0,
"compute_max_score" : 0,
"advance" : 0,
"advance_count" : 0,
"score" : 0,
"build_scorer_count" : 2,
"create_weight" : 30651,
"shallow_advance" : 0,
"create_weight_count" : 1,
"build_scorer" : 945
}
},
{
"type" : "TermQuery",
"description" : "address:street",
"time_in_nanos" : 88777,
"breakdown" : {
"set_min_competitive_score_count" : 0,
"match_count" : 0,
"shallow_advance_count" : 7,
"set_min_competitive_score" : 0,
"next_doc" : 0,
"match" : 0,
"next_doc_count" : 0,
"score_count" : 385,
"compute_max_score_count" : 7,
"compute_max_score" : 29349,
"advance" : 18294,
"advance_count" : 386,
"score" : 13448,
"build_scorer_count" : 4,
"create_weight" : 8860,
"shallow_advance" : 7345,
"create_weight_count" : 1,
"build_scorer" : 11481
}
}
]
},
{
"type" : "BooleanQuery",
"description" : "lastname:bates lastname:street",
"time_in_nanos" : 36076,
"breakdown" : {
"set_min_competitive_score_count" : 0,
"match_count" : 0,
"shallow_advance_count" : 4,
"set_min_competitive_score" : 0,
"next_doc" : 0,
"match" : 0,
"next_doc_count" : 0,
"score_count" : 1,
"compute_max_score_count" : 4,
"compute_max_score" : 12292,
"advance" : 160,
"advance_count" : 2,
"score" : 4667,
"build_scorer_count" : 3,
"create_weight" : 13380,
"shallow_advance" : 493,
"create_weight_count" : 1,
"build_scorer" : 5084
},
"children" : [
{
"type" : "TermQuery",
"description" : "lastname:bates",
"time_in_nanos" : 22974,
"breakdown" : {
"set_min_competitive_score_count" : 0,
"match_count" : 0,
"shallow_advance_count" : 4,
"set_min_competitive_score" : 0,
"next_doc" : 0,
"match" : 0,
"next_doc_count" : 0,
"score_count" : 1,
"compute_max_score_count" : 4,
"compute_max_score" : 9812,
"advance" : 87,
"advance_count" : 2,
"score" : 4623,
"build_scorer_count" : 4,
"create_weight" : 6008,
"shallow_advance" : 315,
"create_weight_count" : 1,
"build_scorer" : 2129
}
},
{
"type" : "TermQuery",
"description" : "lastname:street",
"time_in_nanos" : 4158,
"breakdown" : {
"set_min_competitive_score_count" : 0,
"match_count" : 0,
"shallow_advance_count" : 0,
"set_min_competitive_score" : 0,
"next_doc" : 0,
"match" : 0,
"next_doc_count" : 0,
"score_count" : 0,
"compute_max_score_count" : 0,
"compute_max_score" : 0,
"advance" : 0,
"advance_count" : 0,
"score" : 0,
"build_scorer_count" : 2,
"create_weight" : 4021,
"shallow_advance" : 0,
"create_weight_count" : 1,
"build_scorer" : 137
}
}
]
}
]
}
],
"rewrite_time" : 8783,
"collector" : [
{
"name" : "SimpleTopScoreDocCollector",
"reason" : "search_top_hits",
"time_in_nanos" : 97660
}
]
}
],
"aggregations" : [ ]
}
]
}
}
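The way dis_max combines per-field scores can be written down in a few lines. This is a minimal sketch of the scoring rule only. The two per-field scores for document 13 are an assumption read off the response above (6.5042877 for lastname:bates, 0.95495176 for address:street), not values returned separately by Elasticsearch.

```python
from typing import Iterable


def dis_max_score(field_scores: Iterable[float], tie_breaker: float = 0.0) -> float:
    """Best field's score plus tie_breaker times the scores of the other matching fields."""
    scores = sorted(field_scores, reverse=True)
    if not scores:
        return 0.0
    return scores[0] + tie_breaker * sum(scores[1:])


# Assumed per-field scores for document 13 (see the lead-in above).
lastname_score, address_score = 6.5042877, 0.95495176

print(dis_max_score([lastname_score, address_score]))  # 6.5042877
print(dis_max_score([address_score]))                  # 0.95495176
# With a non-zero tie_breaker the weaker field contributes as well.
print(dis_max_score([lastname_score, address_score], tie_breaker=0.3))
```

With tie_breaker left at 0, a document that matches street in address and bates in lastname scores exactly like one that matches bates alone, which is what the response shows for document 13.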
most_fields
This type is most useful when several fields contain the same text analyzed in different ways:
GET bank/_search
{
"query": {
"multi_match": {
"query": "Bates Street",
"type": "most_fields",
"fields": ["lastname", "address"]
}
},
"profile": "true"
}
which is equivalent to:
GET bank/_search
{
"query": {
"bool": {
"should":[
{"match":{"lastname":"Bates Street"}},
{"match":{"address":"Bates Street"}}
]
}
},
"profile": "true"
}
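The difference between the two types only shows when some document matches in more than one field. The sketch below uses made-up per-field scores for two hypothetical documents (they are not taken from the bank index) to show how the ranking can flip.

```python
from typing import Callable, Dict, List

# Hypothetical per-field match scores for two documents.
field_scores: Dict[str, Dict[str, float]] = {
    "doc_a": {"lastname": 1.0, "address": 1.0},  # weaker match, but in both fields
    "doc_b": {"lastname": 1.5},                  # stronger match in one field only
}


def best_fields(scores: Dict[str, float]) -> float:
    return max(scores.values())  # dis_max with tie_breaker = 0


def most_fields(scores: Dict[str, float]) -> float:
    return sum(scores.values())  # bool/should: matching clauses add up


def ranking(score_fn: Callable[[Dict[str, float]], float]) -> List[str]:
    return sorted(field_scores, key=lambda d: score_fn(field_scores[d]), reverse=True)


print(ranking(best_fields))  # ['doc_b', 'doc_a']
print(ranking(most_fields))  # ['doc_a', 'doc_b']
```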
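I won't list the response here. On this data the hits are the same as with best_fields, so the example does not show the intended effect,
probably because the bank index has no text that is indexed in several different ways.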
cross_fields
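This type is particularly useful for structured documents in which several fields together should match. For example, when searching firstname and lastname for Nanette Bates, the best match has Nanette in one field and Bates in the other.
One simple way to handle this kind of query is to index firstname and lastname into a single fullname field. That, of course, can only be done at index time.
cross_fields tries to solve the problem at query time with a term-centric approach: it first analyzes the query string into individual terms and then looks for each term in any of the fields.
The query looks like this: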
GET bank/_search
{
"query": {
"multi_match": {
"query": "Nanette Bates",
"type": "cross_fields",
"fields": ["firstname", "lastname"],
"operator": "and"
}
},
"profile": "true"
}
Response:
{
"took" : 4,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 13.008575,
"hits" : [
{
"_index" : "bank",
"_type" : "account",
"_id" : "13",
"_score" : 13.008575,
"_source" : {
"account_number" : 13,
"balance" : 32838,
"firstname" : "Nanette",
"lastname" : "Bates",
"age" : 28,
"gender" : "F",
"address" : "789 Madison Street",
"employer" : "Quility",
"email" : "nanettebates@quility.com",
"city" : "Nogal",
"state" : "VA"
}
}
]
},
"profile" : {
"shards" : [
{
"id" : "[pgvIy_S0QwiNETOTSEWFtw][bank][0]",
"searches" : [
{
"query" : [
{
"type" : "BooleanQuery",
"description" : "+(firstname:nanette | lastname:nanette) +(firstname:bates | lastname:bates)",
"time_in_nanos" : 191743,
"breakdown" : {
"set_min_competitive_score_count" : 0,
"match_count" : 0,
"shallow_advance_count" : 0,
"set_min_competitive_score" : 0,
"next_doc" : 7352,
"match" : 0,
"next_doc_count" : 1,
"score_count" : 1,
"compute_max_score_count" : 0,
"compute_max_score" : 0,
"advance" : 3561,
"advance_count" : 1,
"score" : 2363,
"build_scorer_count" : 3,
"create_weight" : 86374,
"shallow_advance" : 0,
"create_weight_count" : 1,
"build_scorer" : 92093
},
"children" : [
{
"type" : "DisjunctionMaxQuery",
"description" : "(firstname:nanette | lastname:nanette)",
"time_in_nanos" : 111881,
"breakdown" : {
"set_min_competitive_score_count" : 0,
"match_count" : 0,
"shallow_advance_count" : 3,
"set_min_competitive_score" : 0,
"next_doc" : 0,
"match" : 0,
"next_doc_count" : 0,
"score_count" : 1,
"compute_max_score_count" : 2,
"compute_max_score" : 2698,
"advance" : 633,
"advance_count" : 2,
"score" : 1486,
"build_scorer_count" : 4,
"create_weight" : 58990,
"shallow_advance" : 797,
"create_weight_count" : 1,
"build_scorer" : 47277
},
"children" : [
{
"type" : "TermQuery",
"description" : "firstname:nanette",
"time_in_nanos" : 25371,
"breakdown" : {
"set_min_competitive_score_count" : 0,
"match_count" : 0,
"shallow_advance_count" : 3,
"set_min_competitive_score" : 0,
"next_doc" : 0,
"match" : 0,
"next_doc_count" : 0,
"score_count" : 1,
"compute_max_score_count" : 2,
"compute_max_score" : 2536,
"advance" : 506,
"advance_count" : 2,
"score" : 1419,
"build_scorer_count" : 3,
"create_weight" : 12138,
"shallow_advance" : 572,
"create_weight_count" : 1,
"build_scorer" : 8200
}
},
{
"type" : "TermQuery",
"description" : "lastname:nanette",
"time_in_nanos" : 1964,
"breakdown" : {
"set_min_competitive_score_count" : 0,
"match_count" : 0,
"shallow_advance_count" : 0,
"set_min_competitive_score" : 0,
"next_doc" : 0,
"match" : 0,
"next_doc_count" : 0,
"score_count" : 0,
"compute_max_score_count" : 0,
"compute_max_score" : 0,
"advance" : 0,
"advance_count" : 0,
"score" : 0,
"build_scorer_count" : 2,
"create_weight" : 1813,
"shallow_advance" : 0,
"create_weight_count" : 1,
"build_scorer" : 151
}
}
]
},
{
"type" : "DisjunctionMaxQuery",
"description" : "(firstname:bates | lastname:bates)",
"time_in_nanos" : 30512,
"breakdown" : {
"set_min_competitive_score_count" : 0,
"match_count" : 0,
"shallow_advance_count" : 3,
"set_min_competitive_score" : 0,
"next_doc" : 0,
"match" : 0,
"next_doc_count" : 0,
"score_count" : 1,
"compute_max_score_count" : 2,
"compute_max_score" : 432,
"advance" : 152,
"advance_count" : 1,
"score" : 156,
"build_scorer_count" : 3,
"create_weight" : 12599,
"shallow_advance" : 406,
"create_weight_count" : 1,
"build_scorer" : 16767
},
"children" : [
{
"type" : "TermQuery",
"description" : "firstname:bates",
"time_in_nanos" : 884,
"breakdown" : {
"set_min_competitive_score_count" : 0,
"match_count" : 0,
"shallow_advance_count" : 0,
"set_min_competitive_score" : 0,
"next_doc" : 0,
"match" : 0,
"next_doc_count" : 0,
"score_count" : 0,
"compute_max_score_count" : 0,
"compute_max_score" : 0,
"advance" : 0,
"advance_count" : 0,
"score" : 0,
"build_scorer_count" : 1,
"create_weight" : 756,
"shallow_advance" : 0,
"create_weight_count" : 1,
"build_scorer" : 128
}
},
{
"type" : "TermQuery",
"description" : "lastname:bates",
"time_in_nanos" : 4932,
"breakdown" : {
"set_min_competitive_score_count" : 0,
"match_count" : 0,
"shallow_advance_count" : 3,
"set_min_competitive_score" : 0,
"next_doc" : 0,
"match" : 0,
"next_doc_count" : 0,
"score_count" : 1,
"compute_max_score_count" : 2,
"compute_max_score" : 293,
"advance" : 97,
"advance_count" : 1,
"score" : 84,
"build_scorer_count" : 2,
"create_weight" : 1769,
"shallow_advance" : 178,
"create_weight_count" : 1,
"build_scorer" : 2511
}
}
]
}
]
}
],
"rewrite_time" : 459182,
"collector" : [
{
"name" : "SimpleTopScoreDocCollector",
"reason" : "search_top_hits",
"time_in_nanos" : 9741
}
]
}
],
"aggregations" : [ ]
}
]
}
}
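The profile description above, `+(firstname:nanette | lastname:nanette) +(firstname:bates | lastname:bates)`, is the term-centric form: every term must be found, but in any of the fields. The toy comparison below contrasts it with a field-centric check that requires all terms in the same field. Both helper functions are hypothetical and ignore scoring.

```python
from typing import Dict, List

Doc = Dict[str, List[str]]


def field_centric_and(doc: Doc, fields: List[str], terms: List[str]) -> bool:
    """All terms must occur in one and the same field."""
    return any(all(t in doc.get(f, []) for t in terms) for f in fields)


def term_centric_and(doc: Doc, fields: List[str], terms: List[str]) -> bool:
    """Each term must occur in at least one of the fields."""
    return all(any(t in doc.get(f, []) for f in fields) for t in terms)


doc = {"firstname": ["nanette"], "lastname": ["bates"]}
fields = ["firstname", "lastname"]
terms = ["nanette", "bates"]

print(field_centric_and(doc, fields, terms))  # False
print(term_centric_and(doc, fields, terms))   # True
```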
phrase and phrase_prefix
The phrase and phrase_prefix types behave like best_fields, except that they use a match_phrase or match_phrase_prefix query
instead of a match query.
The query looks like this:
GET bank/_search
{
"query": {
"multi_match": {
"query": "789 Madison S",
"type": "phrase_prefix",
"fields": ["address", "lastname"]
}
},
"profile": "true"
}
Response:
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 325.35672,
"hits" : [
{
"_index" : "bank",
"_type" : "account",
"_id" : "13",
"_score" : 325.35672,
"_source" : {
"account_number" : 13,
"balance" : 32838,
"firstname" : "Nanette",
"lastname" : "Bates",
"age" : 28,
"gender" : "F",
"address" : "789 Madison Street",
"employer" : "Quility",
"email" : "nanettebates@quility.com",
"city" : "Nogal",
"state" : "VA"
}
}
]
},
"profile" : {
"shards" : [
{
"id" : "[pgvIy_S0QwiNETOTSEWFtw][bank][0]",
"searches" : [
{
"query" : [
{
"type" : "DisjunctionMaxQuery",
"description" : """(address:"789 madison (schaefer schenck schenectady seton seaview sackett stuart story schermerhorn stockholm sumner sumpter sackman sedgwick stryker seigel sands saratoga street summit surf stillwell sunnyside schweikerts stewart stockton scott stratford school sheffield seeley square shale strickland seabring stuyvesant schroeders strong senator seagate strauss sandford sharon sapphire seba stone sullivan stoddard seacoast scholes)" | lastname:"789 madison (schmidt sanchez schneider santos serrano schroeder sherman sellers schultz shaffer shaw santana sims small sexton savage salazar salinas sheppard shepherd sharp simpson sanford snow sandoval singleton slater shepard scott santiago simon sloan saunders salas sharpe sears snider sampson short simmons smith skinner silva shields sargent shelton sanders shannon sawyer schwartz)")""",
"time_in_nanos" : 821485,
"breakdown" : {
"set_min_competitive_score_count" : 0,
"match_count" : 1,
"shallow_advance_count" : 0,
"set_min_competitive_score" : 0,
"next_doc" : 52218,
"match" : 36513,
"next_doc_count" : 1,
"score_count" : 1,
"compute_max_score_count" : 0,
"compute_max_score" : 0,
"advance" : 40671,
"advance_count" : 1,
"score" : 16465,
"build_scorer_count" : 3,
"create_weight" : 535511,
"shallow_advance" : 0,
"create_weight_count" : 1,
"build_scorer" : 140107
},
"children" : [
{
"type" : "MultiPhraseQuery",
"description" : "address:\"789 madison (schaefer schenck schenectady seton seaview sackett stuart story schermerhorn stockholm sumner sumpter sackman sedgwick stryker seigel sands saratoga street summit surf stillwell sunnyside schweikerts stewart stockton scott stratford school sheffield seeley square shale strickland seabring stuyvesant schroeders strong senator seagate strauss sandford sharon sapphire seba stone sullivan stoddard seacoast scholes)\"",
"time_in_nanos" : 532724,
"breakdown" : {
"set_min_competitive_score_count" : 0,
"match_count" : 1,
"shallow_advance_count" : 0,
"set_min_competitive_score" : 0,
"next_doc" : 52152,
"match" : 36439,
"next_doc_count" : 1,
"score_count" : 1,
"compute_max_score_count" : 0,
"compute_max_score" : 0,
"advance" : 40449,
"advance_count" : 1,
"score" : 16415,
"build_scorer_count" : 3,
"create_weight" : 270345,
"shallow_advance" : 0,
"create_weight_count" : 1,
"build_scorer" : 116924
}
},
{
"type" : "MultiPhraseQuery",
"description" : "lastname:\"789 madison (schmidt sanchez schneider santos serrano schroeder sherman sellers schultz shaffer shaw santana sims small sexton savage salazar salinas sheppard shepherd sharp simpson sanford snow sandoval singleton slater shepard scott santiago simon sloan saunders salas sharpe sears snider sampson short simmons smith skinner silva shields sargent shelton sanders shannon sawyer schwartz)\"",
"time_in_nanos" : 255124,
"breakdown" : {
"set_min_competitive_score_count" : 0,
"match_count" : 0,
"shallow_advance_count" : 0,
"set_min_competitive_score" : 0,
"next_doc" : 0,
"match" : 0,
"next_doc_count" : 0,
"score_count" : 0,
"compute_max_score_count" : 0,
"compute_max_score" : 0,
"advance" : 0,
"advance_count" : 0,
"score" : 0,
"build_scorer_count" : 2,
"create_weight" : 252146,
"shallow_advance" : 0,
"create_weight_count" : 1,
"build_scorer" : 2978
}
}
]
}
],
"rewrite_time" : 329671,
"collector" : [
{
"name" : "SimpleTopScoreDocCollector",
"reason" : "search_top_hits",
"time_in_nanos" : 22719
}
]
}
],
"aggregations" : [ ]
}
]
}
}
Common Terms Query
I could not get the common terms query to work on the ES 7.9 setup used here. It has been deprecated since Elasticsearch 7.3.0 in favour of the plain match query,
so it is not covered in detail.
Intervals Query
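The intervals query returns documents based on the order and proximity of the matching terms. It uses matching rules that are built from a small set of definitions;
these rules are then applied to the terms of the specified field.
The definitions produce sequences of minimal intervals that span terms in the body of text. These intervals can be further combined and filtered by parent sources.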
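As a rough illustration, the snippet below builds a request body for an intervals query with a single match rule and prints it as JSON, ready to paste after `GET bank/_search`. The choice of the `address` field and the phrase are assumptions made for this example, and the request has not been run against a cluster here.

```python
import json

# Assumed example: `address` is a text field of the bank index used above.
body = {
    "query": {
        "intervals": {
            "address": {
                "match": {
                    "query": "madison street",
                    "max_gaps": 0,    # no other terms allowed between the two words
                    "ordered": True,  # the words must appear in this order
                }
            }
        }
    }
}

print(json.dumps(body, indent=2))
```

With max_gaps at 0 and ordered set to true, this rule behaves much like a phrase query; loosening either setting is how the proximity and order constraints are relaxed.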