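Preface
By default, an ordinary Elasticsearch search returns only 10 hits. Raising size increases that to at most 10,000, because from + size may not exceed index.max_result_window (10000 by default). What if the result set is larger than this paging limit, or the data in Elasticsearch needs to be exported?
Elasticsearch provides a mechanism called "scrolling" (the scroll API) for paging through large data sets. A scroll keeps a search context alive for a set period and uses a scroll ID to retrieve the results batch by batch. It resembles a cursor in a relational database, so it is also called a cursor query.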
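1. General steps of a scroll query
1.1 Send the initial search request, which returns the hits and a scroll ID
scroll=5m keeps the search context alive for 5 minutes. Each later scroll request resets this window, so it only needs to be long enough to process one batch, not the whole data set.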
POST /your_index/_search?scroll=5m
{
"size": 100, // number of hits returned per batch
"query": { ... } // query conditions
}
Response:
{
"_scroll_id": "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAD4WYm9laVYtZndUQlNsdDcwakFMNjU1QQ==",
"hits": {
"total": {
"value": 10000,
"relation": "eq"
},
"hits": [ ... ] // the retrieved documents
}
}
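For readers who are not using an Elasticsearch client library, the request above can be sketched with the JDK's built-in java.net.http API (Java 11+). This only builds the request and does not send it. The class name InitialScrollRequest, the host http://localhost:9200 and the match_all query are assumptions for illustration; the original leaves the query elided.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class InitialScrollRequest {

    /**
     * Builds POST {baseUrl}/{index}/_search?scroll={keepAlive}.
     * The match_all query is a placeholder; substitute your own query.
     */
    static HttpRequest build(String baseUrl, String index, String keepAlive, int size) {
        String body = "{\"size\":" + size + ",\"query\":{\"match_all\":{}}}";
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/" + index + "/_search?scroll=" + keepAlive))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build("http://localhost:9200", "your_index", "5m", 100);
        // Prints the method and target URL of the request that would be sent
        System.out.println(request.method() + " " + request.uri());
    }
}
```

The keep-alive goes in the URL as a query parameter on the first request, while the batch size and query go in the JSON body.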
1.2 Use the scroll ID to retrieve the next page of results
POST /_search/scroll
{
"scroll": "5m",
"scroll_id": "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAD4WYm9laVYtZndUQlNsdDcwakFMNjU1QQ=="
}
Example:
POST /_search/scroll
{
"scroll": "5m",
"scroll_id": "FGluY2x1ZGVfY29udGV4dF91dWlkDXF1ZXJ5QW5kRmV0Y2gBFDJPRXc0WWdCY1BLWlo1MTk4MmR3AAAAAAAAAXYWcWgwSW5CQUtScEd2T2QtRGtYaWliQQ=="
}
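1.3 Repeat until no more hits are returned
Elasticsearch returns the next batch of results together with a scroll ID, which may differ from the previous one. Repeat this step as needed, always sending the most recently returned scroll ID, until a response comes back with an empty hits array.
1.4 Clear the scroll context to release resources
When the scroll is finished, send a clear-scroll request to release the resources: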
DELETE /_search/scroll
{
"scroll_id": [
"DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAD4WYm9laVYtZndUQlNsdDcwakFMNjU1QQ=="
]
}
That is the basic process for paging through results with a scroll query. Every scroll request must carry the scroll ID returned by the previous request, so that Elasticsearch can locate the search context and return the correct results.
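The steps above can be sketched as a small loop. To keep it runnable without a cluster, the sketch below talks to a hypothetical ScrollBackend interface backed by an in-memory list instead of Elasticsearch. Page, ScrollBackend, InMemoryBackend and fetchAll are all names invented for this illustration, not part of any Elasticsearch API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ScrollLoopSketch {

    /** One batch of hits plus the scroll ID to send with the next request. */
    static final class Page {
        final String scrollId;
        final List<String> hits;

        Page(String scrollId, List<String> hits) {
            this.scrollId = scrollId;
            this.hits = hits;
        }
    }

    /** Hypothetical stand-in for the three REST calls described above. */
    interface ScrollBackend {
        Page search(int size);         // POST /your_index/_search?scroll=5m
        Page scroll(String scrollId);  // POST /_search/scroll
        void clear(String scrollId);   // DELETE /_search/scroll
    }

    /** Drains every batch, always sending the most recently returned scroll ID. */
    static List<String> fetchAll(ScrollBackend backend, int size) {
        List<String> all = new ArrayList<>();
        Page page = backend.search(size);
        String scrollId = page.scrollId;
        try {
            while (!page.hits.isEmpty()) {   // an empty batch means we are done
                all.addAll(page.hits);
                page = backend.scroll(scrollId);
                scrollId = page.scrollId;
            }
        } finally {
            backend.clear(scrollId);         // release the search context
        }
        return all;
    }

    /** In-memory backend so the sketch runs without a cluster. */
    static final class InMemoryBackend implements ScrollBackend {
        private final List<String> docs;
        private int size;
        private int offset;
        boolean cleared;

        InMemoryBackend(List<String> docs) {
            this.docs = docs;
        }

        public Page search(int size) {
            this.size = size;
            this.offset = 0;
            return next();
        }

        public Page scroll(String scrollId) {
            return next();
        }

        public void clear(String scrollId) {
            cleared = true;
        }

        private Page next() {
            int end = Math.min(offset + size, docs.size());
            List<String> hits = new ArrayList<>(docs.subList(offset, end));
            offset = end;
            return new Page("scroll-" + offset, hits);
        }
    }

    public static void main(String[] args) {
        InMemoryBackend backend = new InMemoryBackend(Arrays.asList("a", "b", "c", "d", "e"));
        System.out.println(fetchAll(backend, 2)); // prints [a, b, c, d, e]
        System.out.println(backend.cleared);      // prints true
    }
}
```

Clearing in a finally block is a design choice of this sketch: the search context is released even if processing a batch throws, rather than lingering until its keep-alive expires.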
2. Scroll queries with the Java Elasticsearch client
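// Imports assume the 7.x RestHighLevelClient and fastjson for JSON parsing
import com.alibaba.fastjson.JSON;
import com.alibaba.fastjson.JSONObject;
import org.apache.http.HttpHost;
import org.elasticsearch.action.search.ClearScrollRequest;
import org.elasticsearch.action.search.ClearScrollResponse;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.action.search.SearchScrollRequest;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.unit.TimeValue;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.builder.SearchSourceBuilder;

public class ScrollQueryExample {
    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Build the Elasticsearch HttpHost
        HttpHost httpHost1 = new HttpHost("192.168.1.1", 9200, "http");
        // Scroll keep-alive window, in minutes
        long scrollTime = 1L;
        // Documents returned per batch; must not exceed
        // index.max_result_window (10000 by default)
        int batchSize = 10000;
        // Index name
        String indexName = "your_index";
        try (RestHighLevelClient client = new RestHighLevelClient(RestClient.builder(httpHost1))) {
            // Build the search request; an empty bool query matches all documents
            SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
            searchSourceBuilder.query(QueryBuilders.boolQuery());
            searchSourceBuilder.size(batchSize);
            // Fields to return; an empty array returns the full _source
            String[] includes = {};
            searchSourceBuilder.fetchSource(includes, null);
            // Scroll search request
            SearchRequest searchRequest = new SearchRequest(indexName);
            searchRequest.source(searchSourceBuilder);
            // Set the scroll keep-alive window on the request
            searchRequest.scroll(TimeValue.timeValueMinutes(scrollTime));
            // Run the initial search
            SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
            // The initial search returns the scrollId used by the next scroll request
            String scrollId = searchResponse.getScrollId();
            // Hits from the initial search
            SearchHit[] searchHits = searchResponse.getHits().getHits();
            // Running count of processed documents
            int count = 0;
            // Process the first batch
            for (SearchHit hit : searchHits) {
                // Process a single document
                JSONObject dataJson = JSON.parseObject(hit.getSourceAsString());
                System.out.println("==== processing initial batch, current count: " + count++);
            }
            // Keep scrolling until a batch comes back empty
            while (searchHits != null && searchHits.length > 0) {
                SearchScrollRequest scrollRequest = new SearchScrollRequest(scrollId);
                scrollRequest.scroll(TimeValue.timeValueMinutes(scrollTime));
                searchResponse = client.scroll(scrollRequest, RequestOptions.DEFAULT);
                // Always carry the most recent scrollId forward
                scrollId = searchResponse.getScrollId();
                searchHits = searchResponse.getHits().getHits();
                for (SearchHit hit : searchHits) {
                    JSONObject dataJson = JSON.parseObject(hit.getSourceAsString());
                    System.out.println("==== scroll batch, current count: " + count++);
                }
            }
            // Clear the scroll context
            ClearScrollRequest clearScrollRequest = new ClearScrollRequest();
            clearScrollRequest.addScrollId(scrollId);
            ClearScrollResponse clearScrollResponse = client.clearScroll(clearScrollRequest, RequestOptions.DEFAULT);
            boolean succeeded = clearScrollResponse.isSucceeded();
            System.out.println("Scroll context cleared: " + succeeded);
            long end = System.currentTimeMillis();
            System.out.println("Total time: " + (end - start) / 1000 + " s");
        } catch (Exception e) {
            System.out.println("===error==" + e.getMessage());
            e.printStackTrace();
        }
    }
}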
3. Scroll queries with Spring Data Elasticsearch
// Assumes Spring Data Elasticsearch 4.x, where the scroll methods live on
// ElasticsearchRestTemplate; method names differ in other versions.
import java.util.Collections;
import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.domain.PageRequest;
import org.springframework.data.elasticsearch.core.ElasticsearchRestTemplate;
import org.springframework.data.elasticsearch.core.SearchHit;
import org.springframework.data.elasticsearch.core.SearchScrollHits;
import org.springframework.data.elasticsearch.core.mapping.IndexCoordinates;
import org.springframework.data.elasticsearch.core.query.NativeSearchQuery;
import org.springframework.data.elasticsearch.core.query.NativeSearchQueryBuilder;

public class ScrollSearchExample {
    @Autowired
    private ElasticsearchRestTemplate elasticsearchTemplate;

    public void performScrollSearch() {
        long scrollTimeMillis = 60_000L; // scroll keep-alive window (1 minute)
        int batchSize = 100; // number of documents returned per batch
        IndexCoordinates index = IndexCoordinates.of("your_index");
        QueryBuilder queryBuilder = QueryBuilders.matchQuery("field", "value");
        // Keep the built query; the builder itself cannot be passed to the template
        NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
                .withQuery(queryBuilder)
                .withPageable(PageRequest.of(0, batchSize))
                .build();
        SearchScrollHits<YourEntityClass> scroll = elasticsearchTemplate.searchScrollStart(
                scrollTimeMillis,
                searchQuery,
                YourEntityClass.class,
                index
        );
        String scrollId = scroll.getScrollId();
        try {
            // Process each batch, then fetch the next, until a batch is empty
            while (scroll.hasSearchHits()) {
                for (SearchHit<YourEntityClass> hit : scroll.getSearchHits()) {
                    YourEntityClass entity = hit.getContent();
                    // Process a single document
                }
                scroll = elasticsearchTemplate.searchScrollContinue(
                        scrollId, scrollTimeMillis, YourEntityClass.class, index);
                scrollId = scroll.getScrollId();
            }
        } finally {
            // Clear the scroll context
            elasticsearchTemplate.searchScrollClear(Collections.singletonList(scrollId));
        }
    }
}