
[ElasticSearch] ES Autocomplete Queries and a Java API Implementation



Autocomplete means that as the user types characters into the search box, we suggest search terms related to that input.

1. Install the pinyin analyzer

To complete based on typed letters, documents must be analyzed into pinyin. A plugin for this is available on GitHub at https://github.com/medcl/elasticsearch-analysis-pinyin; download the release that matches your ES version.

Installation steps:

  • Unzip the package

  • Upload it to the Elasticsearch plugins directory on the virtual machine. If that directory is mounted as a Docker volume, the commands below show where the volume lives on the host:

docker volume ls
docker inspect volumeXXX
  • Restart the Elasticsearch container
docker restart es
  • Test

Used as is, however, the pinyin analyzer has obvious drawbacks: it does not split the text into terms, and the Chinese characters are lost.

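2. Custom analyzer

An ES analyzer is made up of three parts:

character filters: process the text before the tokenizer, for example by deleting or replacing characters

tokenizer: splits the text into terms according to some rule. For example, keyword does not split at all; ik_smart is another option

token filters: further process the terms emitted by the tokenizer, for example case conversion, synonym handling, or pinyin conversion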
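The three stages can be pictured with a toy pipeline in plain Java. This is only a sketch of the idea, not Elasticsearch code: the class name ToyAnalyzer and the specific rules (replace "&", split on whitespace, lowercase) are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

public class ToyAnalyzer {
    // Stage 1, character filter: rewrite the raw text before it is tokenized.
    static String charFilter(String text) {
        return text.replace("&", " and ");
    }

    // Stage 2, tokenizer: split the filtered text into terms on whitespace.
    static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    // Stage 3, token filter: post-process every term (here: lowercase it).
    static List<String> tokenFilter(List<String> terms) {
        return terms.stream()
                .map(t -> t.toLowerCase(Locale.ROOT))
                .collect(Collectors.toList());
    }

    // The analyzer is simply the three stages applied in order.
    static List<String> analyze(String text) {
        return tokenFilter(tokenize(charFilter(text)));
    }

    public static void main(String[] args) {
        System.out.println(analyze("Sony & Nintendo Switch")); // [sony, and, nintendo, switch]
    }
}
```

A real analyzer plugs in far richer components at each stage (ik_max_word as the tokenizer, pinyin as a token filter), but the order of the stages is the same.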

For example, a custom analyzer can be configured through settings when the index is created:

PUT /test
{
  "settings": {
    "analysis": {
      "analyzer": { // custom analyzers
        "my_analyzer": {  // analyzer name
          "tokenizer": "ik_max_word",
          "filter": "pinyin"
        }
      }
    }
  }
}
# Nothing needs to be replaced or deleted here, so no character filter is configured

Next, tune the properties of the token filter:

PUT /test
{
  "settings": {
    "analysis": {
      "analyzer": { // custom analyzers
        "my_analyzer": {  // analyzer name
          "tokenizer": "ik_max_word",
          "filter": "py"
        }
      },
      "filter": { // custom token filters
        "py": { // filter name
          "type": "pinyin", // filter type, here pinyin
          "keep_full_pinyin": false,  // do not emit the pinyin of each character as a separate term
          "keep_joined_full_pinyin": true,  // emit the joined full pinyin of the whole term
          "keep_original": true,  // keep the original Chinese text
          "limit_first_letter_length": 16,
          "remove_duplicated_term": true,
          "none_chinese_pinyin_tokenize": false
        }
      }
    }
  }
}
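To make the effect of these options concrete, here is a toy model in Java of which terms come out of the filter for a single input term. It is a sketch under stated assumptions: the class ToyPinyinFilter and its three-character pinyin table are invented for the demo, and the first-letter term is included because the plugin's keep_first_letter option is enabled by default according to its README.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class ToyPinyinFilter {
    // Tiny hand-written pinyin table, just enough for the demo (not the plugin's dictionary).
    static final Map<Character, String> PINYIN = Map.of('獅', "shi", '虱', "shi", '子', "zi");

    // Emits the terms that the settings above would roughly produce for one term.
    static Set<String> tokens(String term) {
        StringBuilder joined = new StringBuilder();
        StringBuilder initials = new StringBuilder();
        for (char c : term.toCharArray()) {
            String py = PINYIN.get(c);
            if (py == null) {
                continue; // characters outside the toy table are skipped
            }
            joined.append(py);
            initials.append(py.charAt(0));
        }
        Set<String> out = new LinkedHashSet<>(); // a set drops duplicates, like remove_duplicated_term
        out.add(term);                           // keep_original: true
        if (joined.length() > 0) {
            out.add(joined.toString());          // keep_joined_full_pinyin: true
            out.add(initials.toString());        // first letters of each syllable
        }
        // keep_full_pinyin is false, so the per-character terms "shi" and "zi" are not emitted
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokens("獅子")); // [獅子, shizi, sz]
        System.out.println(tokens("虱子")); // [虱子, shizi, sz]
    }
}
```

Both words end up with the same pinyin terms, which is what lets a pinyin prefix find either of them.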

Testing the analyzer now yields the original Chinese terms together with their pinyin.

After inserting two documents whose name values are homophones, a pinyin search finds them.

A search in Chinese, however, is clearly wrong: the user asks for 獅子 (lion), yet the homophone 虱子 (louse) is returned as well.


The pinyin analyzer is well suited to building the inverted index, but it should not be used at search time.

When the inverted index is built, 獅子 and 虱子 are each stored under their original text and also under shared pinyin terms such as shizi.

If the pinyin analyzer is applied again at search time, the query 獅子 is itself converted to shizi, which matches both documents.

To solve this, use two analyzers, one for indexing and one for searching:

"analyzer": "my_analyzer",   # used when building the inverted index
"search_analyzer": "ik_smart" # used at search time

PUT /test
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "ik_max_word", "filter": "py"
        }
      },
      "filter": {
        "py": { ... }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "analyzer": "my_analyzer",
        "search_analyzer": "ik_smart"  // the important part: a non-pinyin analyzer at search time
      }
    }
  }
}
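The difference between the two analyzers can be reproduced with a toy inverted index in Java. This is a simplified sketch, not how ES or IK work internally: the class ToySearchAnalyzerDemo, the hard-coded pinyin table, and the single-term "plain" analyzer standing in for ik_smart are all assumptions made for the demo.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class ToySearchAnalyzerDemo {
    // Hard-coded joined pinyin for the two demo words (an assumption, not a real dictionary).
    static final Map<String, String> PINYIN = Map.of("獅子", "shizi", "虱子", "shizi");

    // Stand-in for the index-time analyzer: original term plus its pinyin.
    static List<String> indexAnalyzer(String text) {
        List<String> terms = new ArrayList<>();
        terms.add(text);
        String py = PINYIN.get(text);
        if (py != null) {
            terms.add(py);
        }
        return terms;
    }

    // Stand-in for a non-pinyin search analyzer: the query stays a single term.
    static List<String> plainAnalyzer(String text) {
        return List.of(text);
    }

    // Inverted index: term -> ids of the documents that contain it.
    static Map<String, Set<Integer>> buildIndex(Map<Integer, String> docs) {
        Map<String, Set<Integer>> index = new HashMap<>();
        docs.forEach((id, name) -> {
            for (String term : indexAnalyzer(name)) {
                index.computeIfAbsent(term, k -> new TreeSet<>()).add(id);
            }
        });
        return index;
    }

    // A document matches if it contains any of the query terms.
    static Set<Integer> search(Map<String, Set<Integer>> index, List<String> queryTerms) {
        Set<Integer> hits = new TreeSet<>();
        for (String term : queryTerms) {
            hits.addAll(index.getOrDefault(term, Set.of()));
        }
        return hits;
    }

    public static void main(String[] args) {
        Map<String, Set<Integer>> index = buildIndex(Map.of(1, "獅子", 2, "虱子"));
        // Pinyin analyzer at search time: the homophone is matched too.
        System.out.println(search(index, indexAnalyzer("獅子"))); // [1, 2]
        // Plain analyzer at search time: only the exact word is matched.
        System.out.println(search(index, plainAnalyzer("獅子"))); // [1]
        // Pinyin input still works, because the pinyin was stored at index time.
        System.out.println(search(index, plainAnalyzer("shizi"))); // [1, 2]
    }
}
```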

Searching in Chinese again now returns only the document that actually contains the word.

Summary: a custom analyzer is configured under settings.analysis when the index is created and combines character filters, a tokenizer, and token filters. To avoid matching homophones, use the pinyin analyzer only at index time and a non-pinyin search_analyzer at search time.

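3. Completion suggester queries

ES provides the completion suggester query to implement autocomplete. It matches and returns the terms that begin with the user's input. The fields involved have special requirements:

  • A field used in a completion query must be of type completion.
  • The content of the field is usually an array of the terms to offer as completions.
// Create the index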
PUT test
{
  "mappings": {
    "properties": {
      "title":{
        "type": "completion"   // note that the field type is completion
      }
    }
  }
}

// Sample data
POST test/_doc
{
  "title": ["Sony", "WH-1000XM3"]
}
POST test/_doc
{
  "title": ["SK-II", "PITERA"]
}
POST test/_doc
{
  "title": ["Nintendo", "switch"]
}

Syntax of a completion suggester query:

// Autocomplete query
GET /test/_search
{
  "suggest": { // query type: suggest
    "title_suggest": { // a name of your choice for this suggestion
      "text": "s", // the text typed by the user
      "completion": {
        "field": "title", // the field to complete against
        "skip_duplicates": true, // skip duplicate results
        "size": 10 // return the top 10 results
      }
    }
  }
}
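The semantics of this query (prefix match, skip_duplicates, size) can be approximated in a few lines of Java. This is a toy model only: the class ToyCompletion is hypothetical, it scans a list instead of using the specialized in-memory structures ES relies on, it ignores case so that s matches Sony as in the example, and its result order is simply insertion order rather than ES scoring.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class ToyCompletion {
    // Returns up to `size` distinct inputs that start with `prefix`, ignoring case.
    static List<String> suggest(List<List<String>> docs, String prefix, int size) {
        String p = prefix.toLowerCase(Locale.ROOT);
        Set<String> seen = new LinkedHashSet<>(); // distinct values, mimicking skip_duplicates
        for (List<String> inputs : docs) {
            for (String input : inputs) {
                if (input.toLowerCase(Locale.ROOT).startsWith(p)) {
                    seen.add(input);
                }
            }
        }
        List<String> out = new ArrayList<>(seen);
        return out.subList(0, Math.min(size, out.size())); // mimics size
    }

    public static void main(String[] args) {
        // Same sample data as the three documents indexed above.
        List<List<String>> docs = List.of(
                List.of("Sony", "WH-1000XM3"),
                List.of("SK-II", "PITERA"),
                List.of("Nintendo", "switch"));
        System.out.println(suggest(docs, "s", 10)); // [Sony, SK-II, switch]
    }
}
```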

Running the DSL returns the suggestions that begin with s.

4. Updating the hotel index

The hotel index created earlier needs a new structure. Because PUT /hotel creates the index, an existing hotel index has to be deleted first (DELETE /hotel). Then run the following DSL:

PUT /hotel
{
  "settings": {
    "analysis": {
      "analyzer": {
        "text_anlyzer": {  // first analyzer, for full-text fields
          "tokenizer": "ik_max_word",  // split with ik_max_word
          "filter": "py"  // then convert with the pinyin filter
        },
        "completion_analyzer": {   // second analyzer, for autocomplete: no splitting, straight to pinyin
          "tokenizer": "keyword",  // keyword tokenizer: each array element used for completion is already a single term
          "filter": "py"
        }
      },
      "filter": {  // the pinyin filter referenced above
        "py": {
          "type": "pinyin",
          "keep_full_pinyin": false,
          "keep_joined_full_pinyin": true,
          "keep_original": true,
          "limit_first_letter_length": 16,
          "remove_duplicated_term": true,
          "none_chinese_pinyin_tokenize": false
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "id":{
        "type": "keyword"
      },
      "name":{
        "type": "text",
        "analyzer": "text_anlyzer",  // used when building the inverted index
        "search_analyzer": "ik_smart",  // used for full-text search
        "copy_to": "all"
      },
      "address":{
        "type": "keyword",
        "index": false
      },
      "price":{
        "type": "integer"
      },
      "score":{
        "type": "integer"
      },
      "brand":{
        "type": "keyword",
        "copy_to": "all"
      },
      "city":{
        "type": "keyword"
      },
      "starName":{
        "type": "keyword"
      },
      "business":{
        "type": "keyword",
        "copy_to": "all"
      },
      "location":{
        "type": "geo_point"
      },
      "pic":{
        "type": "keyword",
        "index": false
      },
      "all":{
        "type": "text",
        "analyzer": "text_anlyzer",  // index-time analyzer
        "search_analyzer": "ik_smart"  // search-time analyzer
      },
      "suggestion":{   // new field, used for autocomplete
          "type": "completion",  // of type completion
          "analyzer": "completion_analyzer"  // no splitting, straight to pinyin
      }
    }
  }
}

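The DSL above does the following:

  • Changes the structure of the hotel index and defines the custom pinyin analyzers
  • Makes the name and all fields use the custom analyzer
  • Adds a new suggestion field of type completion that uses its own custom analyzer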

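5. Code changes

After the index is updated, the code from the previous section has to change as well:

  • Add a suggestion field to the HotelDoc class, containing brand and business
  • Re-import the data into the hotel index
// Changes to the HotelDoc class
import lombok.Data;
import lombok.NoArgsConstructor;

import java.util.Arrays;
import java.util.List;

@Data
@NoArgsConstructor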
public class HotelDoc {
    private Long id;
    private String name;
    private String address;
    private Integer price;
    private Integer score;
    private String brand;
    private String city;
    private String starName;
    private String business;
    private String location;
    private String pic;
    // distance
    private Object distance;
    // whether the hotel is a paid ad placement
    private Boolean isAD;
    // completion field in ES; it holds an array, which maps to a List here
    private List<String> suggestion;

    public HotelDoc(Hotel hotel) {
        this.id = hotel.getId();
        this.name = hotel.getName();
        this.address = hotel.getAddress();
        this.price = hotel.getPrice();
        this.score = hotel.getScore();
        this.brand = hotel.getBrand();
        this.city = hotel.getCity();
        this.starName = hotel.getStarName();
        this.business = hotel.getBusiness();
        this.location = hotel.getLatitude() + ", " + hotel.getLongitude();
        this.pic = hotel.getPic();
        this.suggestion = Arrays.asList(this.brand,this.business);

    }
}

Note the Arrays.asList call above, which fills the suggestion field with brand and business.
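One detail of Arrays.asList matters if more entries need to be appended later: the list it returns is fixed-size, so add throws UnsupportedOperationException. The sketch below demonstrates this with a hypothetical class AsListDemo; copying into an ArrayList gives a list that can grow.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AsListDemo {
    // Returns true if the list rejects the attempt to add an element.
    static boolean addFails(List<String> list) {
        try {
            list.add("extra");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> fixed = Arrays.asList("brand", "business");
        System.out.println(addFails(fixed));                  // true: fixed-size view of the array
        System.out.println(addFails(new ArrayList<>(fixed))); // false: a growable copy
    }
}
```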

// Unit test that reads the data from MySQL and bulk-indexes the documents into ES
@SpringBootTest
public class HotelDocumentTest {

    @Resource
    IHotelService iHotelService;

    private RestHighLevelClient client;

    @Test
    void testInit(){

        System.out.println(client);
    }

    @BeforeEach
    void setUp(){
        this.client = new RestHighLevelClient(RestClient.builder(
                HttpHost.create("http://10.4.130.220:9200")
        ));
    }

    @AfterEach
    void tearDown() throws IOException {
        this.client.close();
    }


    @Test
    void testBulk() throws IOException {
        List<Hotel> hotels = iHotelService.list();
        BulkRequest request = new BulkRequest();
        for(Hotel hotel : hotels){
            HotelDoc hotelDoc = new HotelDoc(hotel);
            request.add(new IndexRequest("hotel")
                    .id(hotelDoc.getId().toString())
                    .source(JSON.toJSONString(hotelDoc),XContentType.JSON)
            );
        }
        client.bulk(request,RequestOptions.DEFAULT);



    }


}

Inspecting the indexed documents shows that for some hotels the business value stored in suggestion contains several business districts joined by a slash (/).

Modify the HotelDoc constructor to add a check: when the business field contains a slash, split it first and then add the parts to suggestion.

@Data
@NoArgsConstructor
public class HotelDoc {
    private Long id;
    private String name;
    private String address;
    private Integer price;
    private Integer score;
    private String brand;
    private String city;
    private String starName;
    private String business;
    private String location;
    private String pic;
    // distance
    private Object distance;
    // whether the hotel is a paid ad placement
    private Boolean isAD;
    // completion field in ES; it holds an array, which maps to a List here
    private List<String> suggestion;

    public HotelDoc(Hotel hotel) {
        this.id = hotel.getId();
        this.name = hotel.getName();
        this.address = hotel.getAddress();
        this.price = hotel.getPrice();
        this.score = hotel.getScore();
        this.brand = hotel.getBrand();
        this.city = hotel.getCity();
        this.starName = hotel.getStarName();
        this.business = hotel.getBusiness();
        this.location = hotel.getLatitude() + ", " + hotel.getLongitude();
        this.pic = hotel.getPic();
        if(this.business != null && this.business.contains("/")){
            // business holds several values; split them before adding them to suggestion
            String[] arr = this.business.split("/");
            // add the elements
            this.suggestion = new ArrayList<>();
            Collections.addAll(this.suggestion,arr);
            this.suggestion.add(this.brand);
        }else{
            this.suggestion = Arrays.asList(this.brand,this.business);
        }

    }
}
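The splitting logic can also be pulled out into a small helper that is easy to test on its own. The following is a sketch rather than the article's code: the class SuggestionInputs, the method build, and the sample values are hypothetical, and unlike the constructor above it always puts the brand first and skips null or blank values.

```java
import java.util.ArrayList;
import java.util.List;

public class SuggestionInputs {
    // Builds the completion inputs from brand and business, splitting business on "/".
    static List<String> build(String brand, String business) {
        List<String> inputs = new ArrayList<>();
        addIfPresent(inputs, brand);
        if (business != null) {
            for (String part : business.split("/")) {
                addIfPresent(inputs, part);
            }
        }
        return inputs;
    }

    // Adds the trimmed value unless it is null or blank.
    private static void addIfPresent(List<String> inputs, String value) {
        if (value != null && !value.trim().isEmpty()) {
            inputs.add(value.trim());
        }
    }

    public static void main(String[] args) {
        System.out.println(build("BrandA", "District1/District2")); // [BrandA, District1, District2]
        System.out.println(build("BrandA", "Downtown"));            // [BrandA, Downtown]
        System.out.println(build("BrandA", null));                  // [BrandA]
    }
}
```

Because String.split with no slash in the input returns the whole string as a single element, one code path covers both the single-value and the multi-value case.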


Rerun the unit test to re-insert the documents; the business values are now split.

Then run the autocomplete DSL against the suggestion field, for example with the prefix h.

6. Autocomplete with the RestAPI

The Java code mirrors the DSL, both when building the query and when parsing the response. Try it in a unit test first:

@Test
void testSuggest() throws IOException {
    // 1. Prepare the request
    SearchRequest request = new SearchRequest("hotel");
    // 2. Build the DSL
    request.source()
            .suggest(new SuggestBuilder().addSuggestion(
                    "mySuggestion",
                    SuggestBuilders.completionSuggestion("suggestion")
                            .prefix("h")  // the user's input; suggestions must start with this prefix
                            .skipDuplicates(true)
                            .size(10)
            ));
    // 3. Send the request
    SearchResponse response = client.search(request, RequestOptions.DEFAULT);
    // 4. Parse the response
    Suggest suggest = response.getSuggest();
    // 4.1 Look up the result by the name given to the suggestion
    CompletionSuggestion suggestion = suggest.getSuggestion("mySuggestion");
    // 4.2 Get the options
    List<CompletionSuggestion.Entry.Option> options = suggestion.getOptions();
    // 4.3 Iterate over them
    for (CompletionSuggestion.Entry.Option option : options) {
        String text = option.getText().toString();
        System.out.println(text);
    }
}

Running it prints all the suggestions that start with h.

7. Requirement: autocomplete in the search box

On the front-end page, every keystroke in the input box sends an ajax request.

Next, implement the endpoint behind that request so that the search box autocompletes:

  • Controller endpoint definition
import cn.itcast.hotel.domain.dto.RequestParams;
import cn.itcast.hotel.domain.vo.PageResult;
import cn.itcast.hotel.service.IHotelService;
import org.springframework.web.bind.annotation.*;

import javax.annotation.Resource;
import java.util.List;
import java.util.Map;
@RestController
@RequestMapping("/hotel")
public class HotelSearchController {

    @Resource
    IHotelService hotelService;

    @GetMapping("/suggestion")
    public List<String> getSuggestions(@RequestParam("key") String prefix){
        return hotelService.getSuggestion(prefix);
    }



}
  • Service interface
public interface IHotelService extends IService<Hotel> {
 
    List<String> getSuggestion(String prefix);
}
  • Implementation
@Override
public List<String> getSuggestion(String prefix) {
    try {
        SearchRequest request = new SearchRequest("hotel");
        request.source().suggest(new SuggestBuilder().addSuggestion(
                "mySuggestion",
                SuggestBuilders.completionSuggestion("suggestion")
                        .prefix(prefix)
                        .skipDuplicates(true)
                        .size(15)
        ));
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        Suggest suggest = response.getSuggest();
        CompletionSuggestion mySuggestion = suggest.getSuggestion("mySuggestion");
        List<CompletionSuggestion.Entry.Option> options = mySuggestion.getOptions();
        return options.stream()
                .map(t -> t.getText().toString())
                .collect(Collectors.toList());
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
}



  • For completeness, the client Bean used above is defined like this:
@SpringBootApplication
public class HotelDemoApplication {

    public static void main(String[] args) {
        SpringApplication.run(HotelDemoApplication.class, args);
    }

    @Bean
    public RestHighLevelClient client(){
        return new RestHighLevelClient(RestClient.builder(
                HttpHost.create("http://10.4.130.220:9200")
        ));
    }

}

Restart the service and try it out.
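The endpoint can also be exercised outside the browser with a small Java client. This is a sketch under assumptions: the class SuggestionRequest is hypothetical, the base URL http://localhost:8080 assumes the service runs locally on Spring Boot's default port, and the path and the key parameter are taken from the controller above. Only the URI is built in main, so the example runs without a server; fetch shows how the request could be sent with the JDK's HttpClient (Java 11+).

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class SuggestionRequest {
    // Builds the URI of the suggestion endpoint, URL-encoding the typed text.
    static URI suggestionUri(String baseUrl, String key) {
        String encoded = URLEncoder.encode(key, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/hotel/suggestion?key=" + encoded);
    }

    // Sends the GET request and returns the raw response body (a JSON array of strings).
    static String fetch(String baseUrl, String key) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(suggestionUri(baseUrl, key)).GET().build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) {
        // Assumed base URL; adjust host and port to where the service actually runs.
        System.out.println(suggestionUri("http://localhost:8080", "h"));
        // prints http://localhost:8080/hotel/suggestion?key=h
    }
}
```

Encoding the key matters because users will often type Chinese characters or spaces, which are not valid in a raw query string.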

Autocomplete now works!
