
Elasticsearch Search Scroll API (Scroll Queries)

Posted 2 years ago · Author: riabai · Category: Toy Blog


Reference: "Elasticsearch Search Scroll API（滾動查詢）" on Jianshu (簡書)

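In Elasticsearch, conventional pagination uses the from + size model. In Elasticsearch itself, from is the offset of the first hit to return; when paging through Spring Data's PageRequest, it is derived from a zero-based page number as page * size. By default, a request is rejected with an exception once from + size, that is (page + 1) * size, exceeds 10000.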
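The limit can be checked with plain arithmetic. The sketch below is only an illustration: ResultWindowCheck and its methods are hypothetical helpers, not part of Elasticsearch or Spring Data, and it assumes the default window of 10000 and a zero-based page number.

```java
// Hypothetical helper for illustration; not an Elasticsearch or Spring Data class.
final class ResultWindowCheck {
    // Default value of the index.max_result_window setting.
    static final int DEFAULT_MAX_RESULT_WINDOW = 10_000;

    // Offset of the first hit on a zero-based page.
    static long from(int page, int size) {
        return (long) page * size;
    }

    // True when from + size stays within the window, so the request would be accepted.
    static boolean fitsWindow(int page, int size, int maxResultWindow) {
        return from(page, size) + size <= maxResultWindow;
    }

    public static void main(String[] args) {
        // Walk over the pages around the limit with 1000 documents per page.
        for (int page = 8; page <= 10; page++) {
            System.out.printf("page=%d from=%d accepted=%b%n",
                    page, from(page, 1000), fitsWindow(page, 1000, DEFAULT_MAX_RESULT_WINDOW));
        }
    }
}
```

With 1000 documents per page, page 9 still fits (9000 + 1000 = 10000), while page 10 asks for from + size = 11000 and is rejected.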

The loop below simulates paging through the results continuously:

public void search() {
    // Current page number (zero-based)
    int page = 0;
    // Running count of documents fetched so far
    long total = 0;

    while (true) {
        NativeSearchQuery nativeSearchQuery = new NativeSearchQueryBuilder()
                // Set pagination: 1000 documents per page
                .withPageable(PageRequest.of(page, 1000))
                .withSort(new FieldSortBuilder("commentCount").order(SortOrder.DESC))
                .build();

        SearchHits<Book> searchHits = elasticsearchRestTemplate.search(nativeSearchQuery, Book.class);
        if (!searchHits.hasSearchHits()) {
            break;
        }
        for (SearchHit<Book> searchHit : searchHits.getSearchHits()) {
            Book book = searchHit.getContent();
        }
        page++;
        System.out.println(page);
        System.out.println(total += searchHits.getSearchHits().size());
    }
}

Eventually, when page reaches 10, the following exception is thrown:


Caused by: ElasticsearchException[Elasticsearch exception [type=illegal_argument_exception, reason=Result window is too large, from + size must be less than or equal to: [10000] but was [11000]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting.]]

The exception message itself points to two ways of dealing with this problem:

1. max_result_window

Set the index-level setting index.max_result_window to a value greater than 10000. The corresponding RESTful API call is:

PUT book/_settings
{
    "index": {
        "max_result_window": 1000000
    }
}

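Raising index.max_result_window does lift the limit on how deep a query can page, but it is not a recommended approach. Once the data reaches millions or tens of millions of documents, from + size queries perform worse the deeper they page, each request takes longer, the user experience suffers badly, and CPU and memory consumption become substantial.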
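The following sketch gives a rough sense of why deep from + size paging gets expensive. It rests on an assumption the article does not spell out: each shard has to collect its own top from + size hits, and the coordinating node then merges that many entries from every shard. DeepPagingCost, its methods, and the shard count of 5 are hypothetical and only serve the arithmetic.

```java
// Hypothetical back-of-the-envelope model; not an Elasticsearch API.
final class DeepPagingCost {
    // Entries a single shard must collect and sort for one from + size request.
    static long perShard(long from, int size) {
        return from + size;
    }

    // Entries the coordinating node must merge across all shards.
    static long atCoordinator(int shards, long from, int size) {
        return shards * perShard(from, size);
    }

    public static void main(String[] args) {
        int shards = 5;   // assumed shard count
        int size = 1000;  // page size used in the article's examples
        long[] offsets = {0L, 10_000L, 1_000_000L};
        for (long from : offsets) {
            System.out.printf("from=%d perShard=%d coordinator=%d%n",
                    from, perShard(from, size), atCoordinator(shards, from, size));
        }
    }
}
```

Under these assumptions the work per request grows linearly with the offset, even though only 1000 documents are returned each time.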

2. Scroll API

If you need to retrieve a large amount of data, consider the Search Scroll API, which is a more efficient way to do so.

If you use the Java client directly, see the official API documentation:
https://www.elastic.co/guide/en/elasticsearch/client/java-rest/7.9/java-rest-high-search-scroll.html

Here we again use it through the Spring Boot integration; the core usage is much the same. The following likewise simulates a continuous paged query:

public void scrollSearch() {
    NativeSearchQuery nativeSearchQuery = new NativeSearchQueryBuilder()
            .withSort(new FieldSortBuilder("commentCount").order(SortOrder.DESC))
            .build();
    // Number of documents per page
    nativeSearchQuery.setMaxResults(1000);

    long scrollTimeInMillis = 60 * 1000;
    // Running count of documents fetched so far
    long total = 0;
    // First query
    SearchScrollHits<Book> searchScrollHits = elasticsearchRestTemplate.searchScrollStart(scrollTimeInMillis, nativeSearchQuery, Book.class, IndexCoordinates.of("book"));
    String scrollId = searchScrollHits.getScrollId();

    while (searchScrollHits.hasSearchHits()) {
        System.out.println(total += searchScrollHits.getSearchHits().size());

        for (SearchHit<Book> searchHit : searchScrollHits.getSearchHits()) {
            Book book = searchHit.getContent();
        }
        // Subsequent queries
        searchScrollHits = elasticsearchRestTemplate.searchScrollContinue(scrollId, scrollTimeInMillis, Book.class, IndexCoordinates.of("book"));
        scrollId = searchScrollHits.getScrollId();
    }

    List<String> scrollIds = new ArrayList<>();
    scrollIds.add(scrollId);
    // Clear the scroll
    elasticsearchRestTemplate.searchScrollClear(scrollIds);
}

A few points to note:

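setMaxResults(1000) sets the number of documents per page. The method is available in my setup with Elasticsearch 7.9; if an older version lacks it, use PageRequest.of(0, 1000) instead, and note that the page number must be 0.
The first query uses searchScrollStart() and subsequent queries use searchScrollContinue(); both return results that carry a scrollId.
Every query after the first must pass the scrollId. Think of it as a cursor that controls paging; it plays the same role as the page number in the from + size model.
scrollTimeInMillis is how long the scrollId in the result stays valid, in milliseconds; set it according to your situation.
When the query is finished, clear the scroll with searchScrollClear().
With from + size pagination you can request any reasonable page number and so jump between pages. The Scroll API cannot do that, because every query after the first depends on the scrollId returned by the previous one. Keep this in mind.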
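The last point can be made concrete with a small in-memory simulation. This is a sketch only: ForwardOnlyCursor is a hypothetical stand-in for a scroll context and does not talk to Elasticsearch. Its position field plays the role of the state behind a scrollId, and the only operation it offers is fetching the next page.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory stand-in for a scroll context; not an Elasticsearch class.
final class ForwardOnlyCursor {
    private final List<Integer> data;
    private final int pageSize;
    private int position = 0;  // state that a real scrollId would refer to
    private int requests = 0;  // number of pages fetched so far

    ForwardOnlyCursor(List<Integer> data, int pageSize) {
        this.data = data;
        this.pageSize = pageSize;
    }

    // Returns the next page and advances the cursor; there is no way to seek.
    List<Integer> next() {
        requests++;
        int end = Math.min(position + pageSize, data.size());
        List<Integer> page = new ArrayList<>(data.subList(position, end));
        position = end;
        return page;
    }

    int requests() {
        return requests;
    }

    // Reaching zero-based page k means fetching and discarding pages 0..k-1 first.
    static List<Integer> pageAt(ForwardOnlyCursor cursor, int k) {
        List<Integer> page = cursor.next();
        for (int i = 0; i < k; i++) {
            page = cursor.next();
        }
        return page;
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            data.add(i);
        }
        ForwardOnlyCursor cursor = new ForwardOnlyCursor(data, 3);
        System.out.println(pageAt(cursor, 2) + " after " + cursor.requests() + " requests");
    }
}
```

To read the third page, the cursor has to be advanced three times, whereas from + size could address that page in a single request.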

The original code may issue one extra, empty query at the end. A slightly modified version follows:

void searchScroll(){
        NativeSearchQuery query = new NativeSearchQuery(QueryBuilders.matchAllQuery());
        query.setMaxResults(1); // number of documents per page
        query.addSort(Sort.by(Sort.Direction.DESC,"age"));

        long scrollTimeInMillis=5_000;
        long currentTotal=0;
        int pageNo=1;
        List<String> scrollIdList = new ArrayList<>();

        // Scroll involves three methods: searchScrollStart (first query), searchScrollContinue (second through last query), searchScrollClear (after the query completes)

        // First query: searchScrollStart
        SearchScrollHits<People> searchScrollHits = this.elasticsearchRestTemplate.searchScrollStart(scrollTimeInMillis, query, People.class, IndexCoordinates.of("people_index"));
        String scrollId = searchScrollHits.getScrollId();
        scrollIdList.add(scrollId);
        System.out.println("scrollId:"+scrollId);

        long totalHits = searchScrollHits.getTotalHits();
        currentTotal=searchScrollHits.getSearchHits().size();
        System.out.println("totalHits:"+totalHits);

        List<People> list = searchScrollHits.get().map(SearchHit::getContent).collect(Collectors.toList());
        System.out.println("============pageNo:==========="+pageNo);
        for (People people : list) {
            System.out.println(people);
        }

        while (currentTotal<totalHits){
            SearchScrollHits<People> searchScrollHitsContinue = elasticsearchRestTemplate.searchScrollContinue(scrollId, scrollTimeInMillis, People.class, IndexCoordinates.of("people_index"));
            scrollId=searchScrollHitsContinue.getScrollId();
            scrollIdList.add(scrollId);
            pageNo++;
            if(searchScrollHitsContinue.hasSearchHits()){
                currentTotal+=searchScrollHitsContinue.getSearchHits().size();
                List<People> peopleList = searchScrollHitsContinue.get().map(SearchHit::getContent).collect(Collectors.toList());
                System.out.println("============pageNo:==========="+pageNo);
                for (People people : peopleList) {
                    System.out.println(people);
                }
            }else{
                System.out.println("============pageNo not hasSearchHits===========");
                break;
            }
        }
        System.out.println(scrollIdList);
        elasticsearchRestTemplate.searchScrollClear(scrollIdList);
    }
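The difference between the two loop styles can be counted with a simulated scroll. The sketch below is an illustration under stated assumptions: FakeScroll is a hypothetical in-memory stand-in for searchScrollStart()/searchScrollContinue(), and the reported total is assumed to be an exact count.

```java
// Hypothetical simulation for illustration; no Elasticsearch calls are made.
final class ScrollLoopComparison {
    // Simulated scroll that serves totalHits documents in pages of pageSize.
    static final class FakeScroll {
        final int totalHits;
        final int pageSize;
        int served = 0;
        int requests = 0;

        FakeScroll(int totalHits, int pageSize) {
            this.totalHits = totalHits;
            this.pageSize = pageSize;
        }

        // Returns the number of hits on the next page (0 once everything is served).
        int nextPageSize() {
            requests++;
            int n = Math.min(pageSize, totalHits - served);
            served += n;
            return n;
        }
    }

    // First style: keep going until a page comes back empty.
    static int requestsUntilEmptyPage(int totalHits, int pageSize) {
        FakeScroll scroll = new FakeScroll(totalHits, pageSize);
        int hits = scroll.nextPageSize();
        while (hits > 0) {
            hits = scroll.nextPageSize();
        }
        return scroll.requests;
    }

    // Second style: stop as soon as the running count reaches the reported total.
    static int requestsUntilTotalReached(int totalHits, int pageSize) {
        FakeScroll scroll = new FakeScroll(totalHits, pageSize);
        long current = scroll.nextPageSize();
        while (current < totalHits) {
            int hits = scroll.nextPageSize();
            if (hits == 0) {
                break;
            }
            current += hits;
        }
        return scroll.requests;
    }

    public static void main(String[] args) {
        System.out.println("until empty page:    " + requestsUntilEmptyPage(3, 1));
        System.out.println("until total reached: " + requestsUntilTotalReached(3, 1));
    }
}
```

For three documents at one per page, the first style needs a fourth request just to discover there is nothing left, while the second stops after three. The second style depends on getTotalHits() being exact; if total-hit tracking is capped in your setup, the loop would stop early, so that is worth verifying.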
