01. Data preparation
Index 12 documents into the my_index index in Elasticsearch:
PUT /my_index/_doc/1
{
"title": "文雅酒店",
"content": "青島",
"price": 556
}
PUT /my_index/_doc/2
{
"title": "金都嘉怡假日酒店",
"content": "北京",
"price": 337
}
PUT /my_index/_doc/3
{
"title": "金都欣欣酒店",
"content": "天津",
"price": 200
}
PUT /my_index/_doc/4
{
"title": "金都酒店",
"content": "上海",
"price": 300
}
PUT /my_index/_doc/5
{
"title": "自如酒店",
"content": "南京",
"price": 400
}
PUT /my_index/_doc/6
{
"title": "如家酒店",
"content": "杭州",
"price": 500
}
PUT /my_index/_doc/7
{
"title": "非常酒店",
"content": "合肥",
"price": 600
}
PUT /my_index/_doc/8
{
"title": "金都酒店",
"content": "淮北",
"price": 700
}
PUT /my_index/_doc/9
{
"title": "金都酒店",
"content": "淮南",
"price": 900
}
PUT /my_index/_doc/10
{
"title": "麗舍酒店",
"content": "阜陽",
"price": 1000
}
PUT /my_index/_doc/11
{
"title": "文軒酒店",
"content": "蚌埠",
"price": 1020
}
PUT /my_index/_doc/12
{
"title": "大理酒店",
"content": "長沙",
"price": 1100
}
02. How do you query all documents in Elasticsearch?
Query all documents:
GET /my_index/_search
The response shows that the index holds 12 documents (hits.total.value = 12), yet the hits array contains only 10. How can the remaining documents be retrieved?
{
"took" : 688,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 12,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"title" : "金都嘉怡假日酒店",
"content" : "北京",
"price" : 337
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "3",
"_score" : 1.0,
"_source" : {
"title" : "金都欣欣酒店",
"content" : "天津",
"price" : 200
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"title" : "文雅酒店",
"content" : "青島",
"price" : 556
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "4",
"_score" : 1.0,
"_source" : {
"title" : "金都酒店",
"content" : "上海",
"price" : 300
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "5",
"_score" : 1.0,
"_source" : {
"title" : "自如酒店",
"content" : "南京",
"price" : 400
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "6",
"_score" : 1.0,
"_source" : {
"title" : "如家酒店",
"content" : "杭州",
"price" : 500
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "7",
"_score" : 1.0,
"_source" : {
"title" : "非常酒店",
"content" : "合肥",
"price" : 600
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "8",
"_score" : 1.0,
"_source" : {
"title" : "金都酒店",
"content" : "淮北",
"price" : 700
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "9",
"_score" : 1.0,
"_source" : {
"title" : "金都酒店",
"content" : "淮南",
"price" : 900
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "10",
"_score" : 1.0,
"_source" : {
"title" : "麗舍酒店",
"content" : "阜陽",
"price" : 1000
}
}
]
}
}
03. How do you specify the number of search results in Elasticsearch?
Elasticsearch accepts the from and size parameters:
from: the number of initial results to skip; defaults to 0
size: the number of results to return; defaults to 10
Because from and size default to 0 and 10, a search that specifies neither returns only the first 10 results. That is why hits.total.value is 12 while the hits array holds only 10 documents.
To return more results, set the size parameter:
GET /my_index/_search
{
"size": 15
}
The index holds 12 documents in total, so size=15 returns all of them:
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 12,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"title" : "金都嘉怡假日酒店",
"content" : "北京",
"price" : 337
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "3",
"_score" : 1.0,
"_source" : {
"title" : "金都欣欣酒店",
"content" : "天津",
"price" : 200
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"title" : "文雅酒店",
"content" : "青島",
"price" : 556
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "4",
"_score" : 1.0,
"_source" : {
"title" : "金都酒店",
"content" : "上海",
"price" : 300
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "5",
"_score" : 1.0,
"_source" : {
"title" : "自如酒店",
"content" : "南京",
"price" : 400
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "6",
"_score" : 1.0,
"_source" : {
"title" : "如家酒店",
"content" : "杭州",
"price" : 500
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "7",
"_score" : 1.0,
"_source" : {
"title" : "非常酒店",
"content" : "合肥",
"price" : 600
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "8",
"_score" : 1.0,
"_source" : {
"title" : "金都酒店",
"content" : "淮北",
"price" : 700
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "9",
"_score" : 1.0,
"_source" : {
"title" : "金都酒店",
"content" : "淮南",
"price" : 900
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "10",
"_score" : 1.0,
"_source" : {
"title" : "麗舍酒店",
"content" : "阜陽",
"price" : 1000
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "11",
"_score" : 1.0,
"_source" : {
"title" : "文軒酒店",
"content" : "蚌埠",
"price" : 1020
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "12",
"_score" : 1.0,
"_source" : {
"title" : "大理酒店",
"content" : "長沙",
"price" : 1100
}
}
]
}
}
04. What pagination methods does Elasticsearch offer?
Pagination with the from and size parameters.
Pagination with a scroll query.
Pagination with search_after.
05. How do you paginate with from + size in Elasticsearch?
In Elasticsearch, the from and size parameters implement paged search: from specifies which result to start from, and size specifies how many documents to return. For example:
GET /my_index/_search
{
"query": {
"match": {
"title": "酒店"
}
},
"from": 0, // start from the 1st result
"size": 3 // return 3 results
}
The response is shown below: 12 documents match, and 3 are returned starting from the first:
{
"took" : 19,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 12,
"relation" : "eq"
},
"max_score" : 0.075949445,
"hits" : [
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.075949445,
"_source" : {
"title" : "文雅酒店",
"content" : "青島",
"price" : 556
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "4",
"_score" : 0.075949445,
"_source" : {
"title" : "金都酒店",
"content" : "上海",
"price" : 300
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "5",
"_score" : 0.075949445,
"_source" : {
"title" : "自如酒店",
"content" : "南京",
"price" : 400
}
}
]
}
}
In the request above, from sets the starting offset and size sets how many documents to return. For example, from=0 with size=10 returns results 1 through 10, and from=10 with size=10 returns results 11 through 20.
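The mapping from a page number to the from offset can be sketched in a few lines of Java. This is only an illustration: the class PageOffsets and the method fromOffset are hypothetical names, not part of any Elasticsearch API, and pages are assumed to be numbered from 1.

```java
// Illustrative helper; PageOffsets and fromOffset are hypothetical names.
final class PageOffsets {
    private PageOffsets() {}

    /** Converts a 1-based page number into the zero-based "from" offset. */
    static int fromOffset(int page, int pageSize) {
        if (page < 1 || pageSize < 1) {
            throw new IllegalArgumentException("page and pageSize must be >= 1");
        }
        return (page - 1) * pageSize;
    }

    public static void main(String[] args) {
        System.out.println(fromOffset(1, 10)); // prints 0  (results 1 to 10)
        System.out.println(fromOffset(2, 10)); // prints 10 (results 11 to 20)
    }
}
```

The returned value is what would be sent as from, with pageSize sent as size.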
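06. How do you paginate with search_after in Elasticsearch?
The search_after parameter is meant for paging through large result sets. Unlike scroll, it keeps no search context on the server: each request is an ordinary search that passes the sort values of the last hit from the previous page, so only one page of results is handled at a time and the cost of deep from + size pages is avoided.
search_after starts reading right after the specified sort values. It cannot jump to an arbitrary page; results can only be read one page after another. The results must also be sorted on a value that is unique per document (in practice, add a unique field as a tiebreaker); otherwise documents that share a sort value can be skipped or repeated across pages.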
POST /my_index/_search
{
"size": 3,
"query": {
"match": {
"title": "酒店"
}
},
"sort": [
{
"price": "asc"
}
],
"track_total_hits": true
}
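The request above searches my_index for documents whose title matches 酒店, returns 3 documents per page, and sorts them by price in ascending order. Each hit in the response carries a sort value that the next request uses. Setting track_total_hits to true makes Elasticsearch compute the exact total hit count.
The total hit count hits.total.value is 12, and 3 documents are returned: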
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 12,
"relation" : "eq"
},
"max_score" : null,
"hits" : [
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "3",
"_score" : null,
"_source" : {
"title" : "金都欣欣酒店",
"content" : "天津",
"price" : 200
},
"sort" : [
200
]
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "4",
"_score" : null,
"_source" : {
"title" : "金都酒店",
"content" : "上海",
"price" : 300
},
"sort" : [
300
]
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "2",
"_score" : null,
"_source" : {
"title" : "金都嘉怡假日酒店",
"content" : "北京",
"price" : 337
},
"sort" : [
337
]
}
]
}
}
Next, pass the sort value of the last hit (337) as search_after to fetch the next page:
POST /my_index/_search
{
"size": 3,
"query": {
"match": {
"title": "酒店"
}
},
"sort": [
{
"price": "asc"
}
],
"search_after": [337]
}
{
"took" : 4,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 12,
"relation" : "eq"
},
"max_score" : null,
"hits" : [
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "5",
"_score" : null,
"_source" : {
"title" : "自如酒店",
"content" : "南京",
"price" : 400
},
"sort" : [
400
]
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "6",
"_score" : null,
"_source" : {
"title" : "如家酒店",
"content" : "杭州",
"price" : 500
},
"sort" : [
500
]
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "1",
"_score" : null,
"_source" : {
"title" : "文雅酒店",
"content" : "青島",
"price" : 556
},
"sort" : [
556
]
}
]
}
}
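To make the role of the sort values concrete, here is a small in-memory model of search_after in Java. It is a sketch under stated assumptions, not the Elasticsearch implementation: each document is reduced to a hypothetical sort key {price, id}, and SearchAfterSketch and nextPage are illustrative names. The id acts as the tiebreaker that keeps every sort key unique.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// In-memory model of search_after; all names here are hypothetical.
final class SearchAfterSketch {
    // Sort by price, then by id as a tiebreaker so each key is unique.
    static final Comparator<int[]> ORDER =
            Comparator.comparingInt((int[] d) -> d[0]).thenComparingInt(d -> d[1]);

    /** Returns up to size keys that sort strictly after the cursor (null = first page). */
    static List<int[]> nextPage(List<int[]> docs, int[] after, int size) {
        List<int[]> sorted = new ArrayList<>(docs);
        sorted.sort(ORDER);
        List<int[]> page = new ArrayList<>();
        for (int[] d : sorted) {
            if (after != null && ORDER.compare(d, after) <= 0) {
                continue; // already delivered on an earlier page
            }
            if (page.size() == size) {
                break;
            }
            page.add(d);
        }
        return page;
    }

    public static void main(String[] args) {
        // {price, id} pairs taken from the first six sample hotels.
        List<int[]> docs = List.of(new int[]{556, 1}, new int[]{337, 2}, new int[]{200, 3},
                new int[]{300, 4}, new int[]{400, 5}, new int[]{500, 6});
        int[] cursor = null;
        List<int[]> page;
        while (!(page = nextPage(docs, cursor, 3)).isEmpty()) {
            for (int[] d : page) {
                System.out.println("price=" + d[0] + " id=" + d[1]);
            }
            cursor = page.get(page.size() - 1); // sort values of the last hit
        }
    }
}
```

The cursor is the only state carried between calls, which mirrors how the sort array of the last hit is passed back as search_after.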
07. How do you paginate with scroll in Elasticsearch?
The Scroll API is meant for retrieving large numbers of documents from Elasticsearch. It keeps a search context alive across requests and returns a fixed number of results per request, so a large result set can be processed batch by batch instead of being loaded into memory at once.
The first request saves a snapshot of the index and returns a cursor (scroll_id) that records where the query stopped; the next request continues from the position the cursor records. This approach performs well for sequential reads and is typically used for bulk data export or reindexing. However, the scroll_id expires: if it expires between two requests, the second request fails with an error saying that the scroll_id cannot be found.
To enable a scroll, set the scroll parameter on the search request to the desired keep-alive time. The keep-alive is refreshed on every scroll request, so it only needs to be long enough to process the current batch, not all matching documents. This timeout matters because keeping the scroll context open consumes resources, which should be released as soon as they are no longer needed; the timeout lets Elasticsearch free them automatically once the scroll goes idle.
① Run the initial search to obtain a scroll_id. The scroll parameter sets how long the scroll context is kept alive (1 minute here), and size makes each request return 7 documents.
POST /my_index/_search?scroll=1m
{
"size": 7,
"query": {
"match": {
"title": "酒店"
}
}
}
The response to this request includes a _scroll_id to use in subsequent requests, for example:
{
"_scroll_id" : "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAACQVUWZFFwRElpblJROU9lZV9LeXI5MUpPQQ==",
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 12,
"relation" : "eq"
},
"max_score" : 0.06382885,
"hits" : [
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.06382885,
"_source" : {
"title" : "文雅酒店",
"content" : "青島",
"price" : 556
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "4",
"_score" : 0.06382885,
"_source" : {
"title" : "金都酒店",
"content" : "上海",
"price" : 300
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "5",
"_score" : 0.06382885,
"_source" : {
"title" : "自如酒店",
"content" : "南京",
"price" : 400
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "6",
"_score" : 0.06382885,
"_source" : {
"title" : "如家酒店",
"content" : "杭州",
"price" : 500
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "7",
"_score" : 0.06382885,
"_source" : {
"title" : "非常酒店",
"content" : "合肥",
"price" : 600
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "9",
"_score" : 0.06382885,
"_source" : {
"title" : "金都酒店",
"content" : "淮南",
"price" : 900
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "8",
"_score" : 0.06382885,
"_source" : {
"title" : "金都酒店",
"content" : "淮北",
"price" : 700
}
}
]
}
}
② Use the scroll_id to fetch the next page:
POST /_search/scroll
{
"scroll": "1m",
"scroll_id": "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAACQVUWZFFwRElpblJROU9lZV9LeXI5MUpPQQ=="
}
This request returns the next page of results together with a scroll_id, which may differ from the previous one; always use the most recently returned value:
{
"_scroll_id" : "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAACQVUWZFFwRElpblJROU9lZV9LeXI5MUpPQQ==",
"took" : 4,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 12,
"relation" : "eq"
},
"max_score" : 0.06382885,
"hits" : [
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "10",
"_score" : 0.06382885,
"_source" : {
"title" : "麗舍酒店",
"content" : "阜陽",
"price" : 1000,
"uploadTime" : 1678073241
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "11",
"_score" : 0.06382885,
"_source" : {
"title" : "文軒酒店",
"content" : "蚌埠",
"price" : 1020
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "12",
"_score" : 0.06382885,
"_source" : {
"title" : "大理酒店",
"content" : "長沙",
"price" : 1100
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "3",
"_score" : 0.05390298,
"_source" : {
"title" : "金都欣欣酒店",
"content" : "天津",
"price" : 200
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "2",
"_score" : 0.046648744,
"_source" : {
"title" : "金都嘉怡假日酒店",
"content" : "北京",
"price" : 337
}
}
]
}
}
③ Repeat step ② until all documents have been retrieved, which is signaled by an empty hits array:
POST /_search/scroll
{
"scroll": "1m",
"scroll_id": "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAACQVUWZFFwRElpblJROU9lZV9LeXI5MUpPQQ=="
}
{
"_scroll_id" : "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAACQVUWZFFwRElpblJROU9lZV9LeXI5MUpPQQ==",
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 12,
"relation" : "eq"
},
"max_score" : 0.06382885,
"hits" : [ ]
}
}
④ Once all documents have been retrieved, clear the scroll context with the clear_scroll API:
DELETE /_search/scroll
{
"scroll_id": [
"DXF1ZXJ5QW5kRmV0Y2gBAAAAAAACQVUWZFFwRElpblJROU9lZV9LeXI5MUpPQQ==",
"DXF1ZXJ5QW5kRmV0Y2gBAAAAAAACQVUWZFFwRElpblJROU9lZV9LeXI5MUpPQQ=="
]
}
Note that scroll queries hold resources on the Elasticsearch cluster, so keep their performance cost in mind. Scroll is also unsuitable for real-time queries, because it only sees the documents that existed when the scroll started.
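The snapshot behavior can be illustrated with a toy Java model. This is a sketch only: ScrollSketch is a hypothetical class that copies a list to stand in for the point-in-time view, and it does not use the Elasticsearch client.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a scroll context; this is not the Elasticsearch client API.
final class ScrollSketch {
    private final List<String> snapshot; // frozen view taken when the scroll starts
    private int position = 0;            // cursor into the snapshot

    ScrollSketch(List<String> liveIndex) {
        this.snapshot = new ArrayList<>(liveIndex);
    }

    /** Returns the next batch, or an empty list once the snapshot is exhausted. */
    List<String> next(int size) {
        int end = Math.min(position + size, snapshot.size());
        List<String> batch = new ArrayList<>(snapshot.subList(position, end));
        position = end;
        return batch;
    }

    public static void main(String[] args) {
        List<String> index = new ArrayList<>(List.of("doc1", "doc2", "doc3"));
        ScrollSketch scroll = new ScrollSketch(index);
        index.add("doc4"); // indexed after the scroll started, so never returned

        List<String> batch;
        while (!(batch = scroll.next(2)).isEmpty()) {
            System.out.println(batch);
        }
    }
}
```

The loop ends on the first empty batch, the same stop condition as an empty hits array in step ③.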
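08. What is deep pagination in Elasticsearch?
Deep pagination is the situation where a search has to skip a large number of documents to reach the requested ones. It typically arises when a result set is very large, for example millions of matching documents, and the requested page lies far from the start. The mechanism resembles LIMIT with an offset in MySQL: to return the 10001st result, the first 10000 must be fetched and discarded.
Deep pagination can cause performance problems in Elasticsearch. For every request, each shard has to collect and sort its top from + size hits, and the coordinating node then merges the candidates from all shards before discarding the first from of them, which consumes a lot of CPU and memory.
To avoid this, page with the Scroll API or search_after instead. Both continue from the position where the previous page ended rather than recollecting all the skipped documents.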
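A rough back-of-the-envelope calculation shows why the cost grows with depth. The Java sketch below assumes the usual two-phase model in which every shard returns its top from + size candidates to the coordinating node; DeepPagingCost and the shard count of 5 are illustrative assumptions, not values from this article.

```java
// Illustrative arithmetic only; DeepPagingCost is a hypothetical name.
final class DeepPagingCost {
    private DeepPagingCost() {}

    /** Candidates each shard must collect and sort for one request. */
    static long perShard(long from, long size) {
        return from + size;
    }

    /** Upper bound on candidates the coordinating node has to merge. */
    static long coordinating(int shards, long from, long size) {
        return shards * perShard(from, size);
    }

    public static void main(String[] args) {
        // Page 1 versus a page starting at offset 10000, on an assumed 5 shards.
        System.out.println(coordinating(5, 0, 10));     // prints 50
        System.out.println(coordinating(5, 10000, 10)); // prints 50050
    }
}
```

Both requests return 10 documents, yet the deep page makes the cluster handle roughly a thousand times more candidates.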
09. What is the maximum limit for paginated queries in Elasticsearch?
Deep pagination occurs when the requested page is very deep or the amount of data requested is very large. By default, Elasticsearch limits from + size to 10000; a request that exceeds this limit fails with an error.
GET /my_index/_search
{
"query": {
"match": {
"title": "酒店"
}
},
"from": 0,
"size": 10001
}
The query fails with: Result window is too large, from + size must be less than or equal to: [10000] but was [10001]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting.
In other words, from + size pagination can reach at most the first 10000 results by default.
10. How do you lift the pagination limit in Elasticsearch?
The index.max_result_window setting controls the maximum value of from + size for searches against an index; it defaults to 10000. Raise it if you need to page further, but note that a larger value may hurt Elasticsearch performance, because deep pages use more memory and CPU.
Option 1: update the setting on an existing index, for example from Kibana:
PUT /my_index/_settings
{
"index.max_result_window":200000
}
Option 2: set it when creating the index:
PUT /my_index
{
"settings": {
"index": {
"max_result_window": 200000
}
}
}
11. What is the maximum total hit count reported by an Elasticsearch query?
The hits.total.value field of the search response gives the total number of matching documents, but by default it is tracked accurately only up to 10000. That is fine for small data sets. Once more than 10000 documents match, hits.total.value stops growing and stays at 10000, with hits.total.relation set to gte, so the reported count is only a lower bound.
For example, when the index holds 30000 documents, a match-all query reports hits.total.value as 10000:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 10000,
"relation" : "gte"
},
"max_score" : null,
"hits" : [
// ...
]
}
}
12. How do you lift the limit on the total hit count in Elasticsearch?
The track_total_hits search parameter controls whether and how precisely the total hit count is computed. By default the count is accurate only up to 10000; to always get the exact number of matching documents, set track_total_hits to true.
track_total_hits accepts three kinds of values:
true: compute the exact total hit count.
false: do not compute the total hit count.
an integer n: count hits accurately only up to n.
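How the reported total depends on the threshold can be modeled in a few lines of Java. This is a simplified sketch, not Elasticsearch code: TotalHitsSketch and report are hypothetical names, Long.MAX_VALUE stands in for track_total_hits=true, and the behavior when the match count equals the threshold exactly is an assumption of this model.

```java
// Simplified model of hits.total; all names here are hypothetical.
final class TotalHitsSketch {
    static final long EXACT = Long.MAX_VALUE; // stands in for track_total_hits=true

    private TotalHitsSketch() {}

    /** Returns "value relation" as hits.total would report them in this model. */
    static String report(long matches, long threshold) {
        return matches > threshold
                ? threshold + " gte" // counting stopped at the threshold: lower bound
                : matches + " eq";   // every match was counted: exact
    }

    public static void main(String[] args) {
        System.out.println(report(12, 5));        // prints 5 gte
        System.out.println(report(12, 15));       // prints 12 eq
        System.out.println(report(30000, 10000)); // prints 10000 gte
        System.out.println(report(30000, EXACT)); // prints 30000 eq
    }
}
```

The first two calls correspond to the thresholds 5 and 15 applied to the 12 sample documents.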
① Compute the exact total hit count:
GET /my_index/_search
{
"query": {
"match": {
"title": "酒店"
}
},
"track_total_hits": true
}
The total hit count hits.total.value is 12, and the hits.hits array contains 10 documents (from=0, size=10):
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 12,
"relation" : "eq"
},
"max_score" : 0.06382885,
"hits" : [
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.06382885,
"_source" : {
"title" : "文雅酒店",
"content" : "青島",
"price" : 556
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "4",
"_score" : 0.06382885,
"_source" : {
"title" : "金都酒店",
"content" : "上海",
"price" : 300
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "5",
"_score" : 0.06382885,
"_source" : {
"title" : "自如酒店",
"content" : "南京",
"price" : 400
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "6",
"_score" : 0.06382885,
"_source" : {
"title" : "如家酒店",
"content" : "杭州",
"price" : 500
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "7",
"_score" : 0.06382885,
"_source" : {
"title" : "非常酒店",
"content" : "合肥",
"price" : 600
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "9",
"_score" : 0.06382885,
"_source" : {
"title" : "金都酒店",
"content" : "淮南",
"price" : 900
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "8",
"_score" : 0.06382885,
"_source" : {
"title" : "金都酒店",
"content" : "淮北",
"price" : 700
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "10",
"_score" : 0.06382885,
"_source" : {
"title" : "麗舍酒店",
"content" : "阜陽",
"price" : 1000,
"uploadTime" : 1678073241
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "11",
"_score" : 0.06382885,
"_source" : {
"title" : "文軒酒店",
"content" : "蚌埠",
"price" : 1020
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "12",
"_score" : 0.06382885,
"_source" : {
"title" : "大理酒店",
"content" : "長沙",
"price" : 1100
}
}
]
}
}
② Do not compute the total hit count:
GET /my_index/_search
{
"query": {
"match": {
"title": "酒店"
}
},
"track_total_hits": false
}
The response omits hits.total, and the hits.hits array contains 10 documents (from=0, size=10):
{
"took" : 8,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"max_score" : 0.06382885,
"hits" : [
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.06382885,
"_source" : {
"title" : "文雅酒店",
"content" : "青島",
"price" : 556
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "4",
"_score" : 0.06382885,
"_source" : {
"title" : "金都酒店",
"content" : "上海",
"price" : 300
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "5",
"_score" : 0.06382885,
"_source" : {
"title" : "自如酒店",
"content" : "南京",
"price" : 400
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "6",
"_score" : 0.06382885,
"_source" : {
"title" : "如家酒店",
"content" : "杭州",
"price" : 500
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "7",
"_score" : 0.06382885,
"_source" : {
"title" : "非常酒店",
"content" : "合肥",
"price" : 600
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "9",
"_score" : 0.06382885,
"_source" : {
"title" : "金都酒店",
"content" : "淮南",
"price" : 900
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "8",
"_score" : 0.06382885,
"_source" : {
"title" : "金都酒店",
"content" : "淮北",
"price" : 700
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "10",
"_score" : 0.06382885,
"_source" : {
"title" : "麗舍酒店",
"content" : "阜陽",
"price" : 1000,
"uploadTime" : 1678073241
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "11",
"_score" : 0.06382885,
"_source" : {
"title" : "文軒酒店",
"content" : "蚌埠",
"price" : 1020
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "12",
"_score" : 0.06382885,
"_source" : {
"title" : "大理酒店",
"content" : "長沙",
"price" : 1100
}
}
]
}
}
③ Count hits accurately only up to 5:
GET /my_index/_search
{
"query": {
"match": {
"title": "酒店"
}
},
"track_total_hits": 5
}
hits.total.value is 5 with relation gte, meaning that at least 5 documents match; the hits.hits array contains 10 documents (from=0, size=10):
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 5,
"relation" : "gte"
},
"max_score" : 0.06382885,
"hits" : [
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.06382885,
"_source" : {
"title" : "文雅酒店",
"content" : "青島",
"price" : 556
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "4",
"_score" : 0.06382885,
"_source" : {
"title" : "金都酒店",
"content" : "上海",
"price" : 300
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "5",
"_score" : 0.06382885,
"_source" : {
"title" : "自如酒店",
"content" : "南京",
"price" : 400
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "6",
"_score" : 0.06382885,
"_source" : {
"title" : "如家酒店",
"content" : "杭州",
"price" : 500
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "7",
"_score" : 0.06382885,
"_source" : {
"title" : "非常酒店",
"content" : "合肥",
"price" : 600
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "9",
"_score" : 0.06382885,
"_source" : {
"title" : "金都酒店",
"content" : "淮南",
"price" : 900
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "8",
"_score" : 0.06382885,
"_source" : {
"title" : "金都酒店",
"content" : "淮北",
"price" : 700
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "10",
"_score" : 0.06382885,
"_source" : {
"title" : "麗舍酒店",
"content" : "阜陽",
"price" : 1000,
"uploadTime" : 1678073241
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "11",
"_score" : 0.06382885,
"_source" : {
"title" : "文軒酒店",
"content" : "蚌埠",
"price" : 1020
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "12",
"_score" : 0.06382885,
"_source" : {
"title" : "大理酒店",
"content" : "長沙",
"price" : 1100
}
}
]
}
}
④ Count hits accurately up to 15:
GET /my_index/_search
{
"query": {
"match": {
"title": "酒店"
}
},
"track_total_hits": 15
}
hits.total.value is 12 with relation eq, because fewer than 15 documents match; the hits.hits array contains 10 documents (from=0, size=10):
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 12,
"relation" : "eq"
},
"max_score" : 0.06382885,
"hits" : [
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.06382885,
"_source" : {
"title" : "文雅酒店",
"content" : "青島",
"price" : 556
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "4",
"_score" : 0.06382885,
"_source" : {
"title" : "金都酒店",
"content" : "上海",
"price" : 300
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "5",
"_score" : 0.06382885,
"_source" : {
"title" : "自如酒店",
"content" : "南京",
"price" : 400
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "6",
"_score" : 0.06382885,
"_source" : {
"title" : "如家酒店",
"content" : "杭州",
"price" : 500
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "7",
"_score" : 0.06382885,
"_source" : {
"title" : "非常酒店",
"content" : "合肥",
"price" : 600
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "9",
"_score" : 0.06382885,
"_source" : {
"title" : "金都酒店",
"content" : "淮南",
"price" : 900
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "8",
"_score" : 0.06382885,
"_source" : {
"title" : "金都酒店",
"content" : "淮北",
"price" : 700
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "10",
"_score" : 0.06382885,
"_source" : {
"title" : "麗舍酒店",
"content" : "阜陽",
"price" : 1000,
"uploadTime" : 1678073241
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "11",
"_score" : 0.06382885,
"_source" : {
"title" : "文軒酒店",
"content" : "蚌埠",
"price" : 1020
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "12",
"_score" : 0.06382885,
"_source" : {
"title" : "大理酒店",
"content" : "長沙",
"price" : 1100
}
}
]
}
}
13. How can paginated queries in Elasticsearch be optimized?
Query as few fields as possible; fetch only the fields you need.
Query as little data as possible; fetch only the documents you need.
Use scroll or search_after to avoid deep from + size pagination.
Tune the index layout, such as the number of shards and replicas, to improve query performance.
14. How do you implement from + size pagination with Spring Boot and ES?
GET /my_index/_search
{
"query": {
"match": {
"title": "酒店"
}
},
"from": 0, // start from the 1st result
"size": 3 // return 3 results
}
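import java.io.IOException;

import lombok.extern.slf4j.Slf4j;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.MatchQueryBuilder;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

@Slf4j
@Service
public class ElasticSearchImpl {
    @Autowired
    private RestHighLevelClient restHighLevelClient;

    public void searchUser() throws IOException {
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
        // Match query on the title field
        MatchQueryBuilder matchQueryBuilder = new MatchQueryBuilder("title", "酒店");
        searchSourceBuilder.query(matchQueryBuilder);
        // Pagination: convert the 1-based page number into a from offset
        int page = 1;     // page 1
        int pageSize = 3; // 3 documents per page
        searchSourceBuilder.from((page - 1) * pageSize);
        searchSourceBuilder.size(pageSize);
        SearchRequest searchRequest = new SearchRequest(new String[]{"my_index"}, searchSourceBuilder);
        SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
        // Search results
        SearchHits searchHits = searchResponse.getHits();
        SearchHit[] hits = searchHits.getHits();
        for (SearchHit hit : hits) {
            // hits.hits._source: the original JSON of the matching document
            String sourceAsString = hit.getSourceAsString();
        }
        System.out.println(searchResponse);
    }
}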
15. How do you implement search_after pagination with Spring Boot and ES?
POST /my_index/_search
{
"size": 3,
"query": {
"match": {
"title": "酒店"
}
},
"sort": [
{
"price": "asc"
}
],
"track_total_hits": true
}
POST /my_index/_search
{
"size": 3,
"query": {
"match": {
"title": "酒店"
}
},
"sort": [
{
"price": "asc"
}
],
"search_after": [337]
}
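import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

import lombok.extern.slf4j.Slf4j;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.MatchQueryBuilder;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.sort.SortBuilders;
import org.elasticsearch.search.sort.SortOrder;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

@Slf4j
@Service
public class ElasticSearchImpl {
    @Autowired
    private RestHighLevelClient restHighLevelClient;

    public void searchUser() throws IOException {
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
        // Match query on the title field
        MatchQueryBuilder matchQueryBuilder = new MatchQueryBuilder("title", "酒店");
        searchSourceBuilder.query(matchQueryBuilder);
        // Compute the exact total hit count: track_total_hits
        searchSourceBuilder.trackTotalHits(true);
        // Return 3 documents per request
        searchSourceBuilder.size(3);
        // Sort field (price is unique in the sample data; otherwise add a tiebreaker field)
        searchSourceBuilder.sort(SortBuilders.fieldSort("price").order(SortOrder.ASC));
        SearchRequest searchRequest = new SearchRequest(new String[]{"my_index"}, searchSourceBuilder);
        SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
        List<Map<String, Object>> result = new ArrayList<>();
        while (searchResponse.getHits().getHits() != null && searchResponse.getHits().getHits().length > 0) {
            SearchHit[] hits = searchResponse.getHits().getHits();
            for (SearchHit hit : hits) {
                Map<String, Object> sourceAsMap = hit.getSourceAsMap();
                result.add(sourceAsMap);
            }
            // Take the sort values of the last hit; the next request starts right after them
            Object[] lastNum = hits[hits.length - 1].getSortValues();
            searchSourceBuilder.searchAfter(lastNum);
            searchRequest.source(searchSourceBuilder);
            // Run the next request
            searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
        }
        System.out.println(result);
    }
}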
16. How do you implement scroll pagination with Spring Boot and ES?
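import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

import lombok.extern.slf4j.Slf4j;
import org.elasticsearch.action.search.ClearScrollRequest;
import org.elasticsearch.action.search.ClearScrollResponse;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.action.search.SearchScrollRequest;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
// Package of TimeValue in earlier 7.x clients; newer releases may use org.elasticsearch.core.TimeValue
import org.elasticsearch.common.unit.TimeValue;
import org.elasticsearch.index.query.MatchQueryBuilder;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.sort.SortBuilders;
import org.elasticsearch.search.sort.SortOrder;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

@Slf4j
@Service
public class ElasticSearchImpl {
    @Autowired
    private RestHighLevelClient restHighLevelClient;

    public void search() throws IOException {
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
        // Match query on the title field
        MatchQueryBuilder matchQueryBuilder = new MatchQueryBuilder("title", "酒店");
        searchSourceBuilder.query(matchQueryBuilder);
        // Compute the exact total hit count: track_total_hits
        searchSourceBuilder.trackTotalHits(true);
        // Return 7 documents per request
        searchSourceBuilder.size(7);
        // Sort field
        searchSourceBuilder.sort(SortBuilders.fieldSort("price").order(SortOrder.ASC));
        SearchRequest searchRequest = new SearchRequest(new String[]{"my_index"}, searchSourceBuilder);
        // Keep-alive time of the scroll context
        searchRequest.scroll(TimeValue.timeValueMinutes(1L));
        SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
        // Get the scrollId
        String scrollId = searchResponse.getScrollId();
        SearchHit[] searchHits = searchResponse.getHits().getHits();
        List<Map<String, Object>> result = new ArrayList<>();
        for (SearchHit hit : searchHits) {
            result.add(hit.getSourceAsMap());
        }
        while (true) {
            // Fetch the next page with the scrollId
            SearchScrollRequest scrollRequest = new SearchScrollRequest(scrollId);
            // Refresh the keep-alive time of the scroll context
            scrollRequest.scroll(TimeValue.timeValueMinutes(1L));
            SearchResponse scrollResp = restHighLevelClient.scroll(scrollRequest, RequestOptions.DEFAULT);
            // The scroll ID can change between requests; always keep the latest one
            scrollId = scrollResp.getScrollId();
            SearchHit[] hits = scrollResp.getHits().getHits();
            if (hits != null && hits.length > 0) {
                for (SearchHit hit : hits) {
                    result.add(hit.getSourceAsMap());
                }
            } else {
                break;
            }
        }
        System.out.println(result);
        // Scrolling is finished: clear the scroll context to release its resources
        ClearScrollRequest clearScrollRequest = new ClearScrollRequest();
        clearScrollRequest.addScrollId(scrollId);
        ClearScrollResponse clearScrollResponse = restHighLevelClient.clearScroll(clearScrollRequest, RequestOptions.DEFAULT);
        boolean succeeded = clearScrollResponse.isSucceeded();
        System.out.println(succeeded);
        // The client is a Spring-managed bean shared by the application, so it is not closed here
    }
}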