
Building a knowledge-base question-answering application with ChatGLM

2 years ago · Author: Charlotte_jc · Category: Toy Blog


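Background

Since ChatGPT burst onto the scene, ChatGPT-like large language models have sprung up across the internet like bamboo shoots after rain. What is still missing is a mature approach for putting these models to work in real production settings.

Introduction

This article uses ChatGLM to show how an open-source language model can be applied to intelligent Q&A and knowledge-base Q&A, working through a series of hands-on examples to explain the overall approach.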

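Prerequisites

An open-source language model. ChatGLM-6B is recommended here: an open-source dialogue model that supports both Chinese and English and has low GPU memory requirements, so it can be deployed on a personal PC.
Python 3.8+
milvus, the vector index store
PyTorch, plus the CUDA toolkit and NVIDIA driver needed to run ChatGLM-6B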
…

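Document-based knowledge-base Q&A

Implementation steps

Clean the knowledge-base documents, vectorize them, and store them in the vector database
The user asks a question
Vectorize the question and query the vector database for the N best-matching pieces of knowledge
Build a prompt from the matched knowledge and process the user's question through langchain
Call the LLM with the prompt to answer the user's question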
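
The steps above can be sketched end to end in plain Python before any real components are wired in. This is only a toy illustration: the bag-of-words `embed` function stands in for a real embedding model, the in-memory list stands in for the vector database, and all function names and sample sentences are assumptions made for this sketch rather than part of the original project.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: a unit-length bag-of-words vector (stand-in for a real model)."""
    counts = Counter(re.findall(r"\w+", text.lower()))
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {word: c / norm for word, c in counts.items()}


def similarity(a, b):
    """Inner product of two sparse vectors stored as dicts."""
    return sum(weight * b.get(word, 0.0) for word, weight in a.items())


def retrieve(question, knowledge, n=2):
    """Steps 2-3: embed the question and return the n best-matching entries."""
    q = embed(question)
    ranked = sorted(knowledge, key=lambda doc: similarity(q, embed(doc)), reverse=True)
    return ranked[:n]


def build_prompt(question, matches):
    """Step 4: place the retrieved knowledge in front of the question."""
    context = "\n".join(f"- {m}" for m in matches)
    return f"Known information:\n{context}\n\nQuestion:\n{question}\n"


# Step 1 (simplified): the "knowledge base" is a list of already-cleaned sentences.
knowledge = [
    "Milvus listens on port 19530 by default.",
    "ChatGLM-6B is a bilingual dialogue model.",
    "jieba is a Chinese word segmentation library.",
]
question = "Which port does Milvus listen on?"
prompt = build_prompt(question, retrieve(question, knowledge, n=1))
print(prompt)  # step 5 would send this prompt to the LLM
```

The rest of the article replaces each stand-in with a real component: BERT for `embed`, milvus for the list, and ChatGLM for the final call.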

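Vector index

First we need a vector index store. I chose milvus to handle document vector indexing and similarity matching.

To simplify deployment, I start the milvus service with docker-compose.

The milvus documentation describes the deployment procedure for the latest version under "Install Milvus Standalone with Docker Compose".
If that is too much trouble, you can copy the yaml file below directly.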

version: '3.5'

services:
  etcd:
    container_name: milvus-etcd
    image: quay.io/coreos/etcd:v3.5.0
    environment:
      - ETCD_AUTO_COMPACTION_MODE=revision
      - ETCD_AUTO_COMPACTION_RETENTION=1000
      - ETCD_QUOTA_BACKEND_BYTES=4294967296
      - ETCD_SNAPSHOT_COUNT=50000
    volumes:
      - ${DOCKER_VOLUME_DIRECTORY:-.}/volumes/etcd:/etcd
    command: etcd -advertise-client-urls=http://127.0.0.1:2379 -listen-client-urls http://0.0.0.0:2379 --data-dir /etcd

  minio:
    container_name: milvus-minio
    image: minio/minio:RELEASE.2023-03-20T20-16-18Z
    environment:
      MINIO_ACCESS_KEY: minioadmin
      MINIO_SECRET_KEY: minioadmin
    volumes:
      - ${DOCKER_VOLUME_DIRECTORY:-.}/volumes/minio:/minio_data
    command: minio server /minio_data
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:9000/minio/health/live"]
      interval: 30s
      timeout: 20s
      retries: 3

  standalone:
    container_name: milvus-standalone
    image: milvusdb/milvus:v2.2.5
    command: ["milvus", "run", "standalone"]
    environment:
      ETCD_ENDPOINTS: etcd:2379
      MINIO_ADDRESS: minio:9000
    volumes:
      - ${DOCKER_VOLUME_DIRECTORY:-.}/volumes/milvus:/var/lib/milvus
    ports:
      - "19530:19530"
      - "9091:9091"
    depends_on:
      - "etcd"
      - "minio"

networks:
  default:
    name: milvus

Once the docker-compose.yml file is in place, run docker-compose up -d on the command line to start the milvus service.
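
The containers can take a little while to come up, so it may help to wait until the published port accepts connections before running the scripts that follow. The helper below is a minimal sketch using only the standard library; `wait_for_port` is a hypothetical name introduced here, and a successful TCP connection only shows that the port is open, not that milvus has finished initializing.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # create_connection raises OSError while nothing is listening yet
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

For the compose file above, a call such as `wait_for_port("localhost", 19530)` before connecting would cover the common case where the script is launched immediately after `docker-compose up -d`.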

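Next comes document preprocessing.

Document preprocessing

Once we have collected enough documents, we need to clean them so that later vector matching is more accurate.
This involves the following steps:

Connect to the milvus vector store
Create the corresponding collection
Walk through and read the documents
Preprocess each document
Convert document content to vectors
Store the vectors in the vector store

To do this, we write the following code: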

import os
import re
import jieba
import torch
import pandas as pd
from pymilvus import utility
from pymilvus import connections, CollectionSchema, FieldSchema, Collection, DataType
from transformers import AutoTokenizer, AutoModel

connections.connect(
    alias="default",
    host='localhost',
    port='19530'
)

# Define the collection name and vector dimension
collection_name = "document"
dimension = 768
docs_folder = "./knowledge/"

tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
model = AutoModel.from_pretrained("bert-base-chinese")


# Get the vector for a piece of text
def get_vector(text):
    input_ids = tokenizer(text, padding=True, truncation=True, return_tensors="pt")["input_ids"]
    with torch.no_grad():
        output = model(input_ids)[0][:, 0, :].numpy()
    return output.tolist()[0]


def create_collection():
    # Define the collection fields
    fields = [
        FieldSchema(name="id", dtype=DataType.INT64, is_primary=True, auto_id=True, description="primary id"),
        FieldSchema(name="title", dtype=DataType.VARCHAR, max_length=50),
        FieldSchema(name="content", dtype=DataType.VARCHAR, max_length=10000),
        FieldSchema(name="vector", dtype=DataType.FLOAT_VECTOR, dim=dimension),
    ]

    # Define the collection schema
    schema = CollectionSchema(fields=fields, description="collection schema")

    if utility.has_collection(collection_name):
        # To keep adding documents to the existing collection, return here instead.
        # To rebuild the collection from scratch, drop it and fall through.
        # return
        utility.drop_collection(collection_name)

    # Create the collection
    collection = Collection(name=collection_name, schema=schema, using='default', shards_num=2)
    # Create the index
    default_index = {"index_type": "IVF_FLAT", "params": {"nlist": 2048}, "metric_type": "IP"}
    collection.create_index(field_name="vector", index_params=default_index)
    print(f"Collection {collection_name} created successfully")


def init_knowledge():
    collection = Collection(collection_name)
    # Walk every file under the knowledge folder and import it into the Milvus collection
    docs = []
    for root, dirs, files in os.walk(docs_folder):
        for file in files:
            # Only process text files ending in .txt
            if file.endswith(".txt"):
                file_path = os.path.join(root, file)
                with open(file_path, "r", encoding="utf-8") as f:
                    content = f.read()
                # Clean the text: collapse runs of whitespace into a single space
                content = re.sub(r"\s+", " ", content)
                title = os.path.splitext(file)[0]
                # Segment the text into words
                words = jieba.lcut(content)
                # Join the segmented words back into one string
                content = " ".join(words)
                # Embed the title together with the content
                vector = get_vector(title + content)
                docs.append({"title": title, "content": content, "vector": vector})

    # Nothing to insert if the folder contains no .txt files
    if not docs:
        print("No documents found")
        return

    # Insert the text and vectors into the collection together via a DataFrame
    df = pd.DataFrame(docs)
    collection.insert(df)
    print("Documents inserted successfully")


if __name__ == "__main__":
    create_collection()
    init_knowledge()

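As you can see, we created a collection named document with four fields: id, title, content, and vector, where vector stores the embedding of the title concatenated with the content. (This is only the most basic form of vector indexing. For more accurate and efficient matching, consider splitting large documents into paragraphs, embedding each one separately, and linking the pieces to one another.)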
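
One possible way to do the paragraph splitting suggested above is sketched below. It assumes paragraphs are separated by blank lines, so it would have to run before whitespace is collapsed; `split_paragraphs`, `to_records`, and the `chunk_index` field are hypothetical names that do not exist in the schema defined earlier.

```python
import re


def split_paragraphs(text, max_chars=500):
    """Split text on blank lines, then merge short paragraphs up to max_chars.

    A single paragraph longer than max_chars is kept whole rather than cut.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks, current = [], ""
    for para in paragraphs:
        candidate = f"{current}\n{para}" if current else para
        if current and len(candidate) > max_chars:
            # The current chunk is full: emit it and start a new one
            chunks.append(current)
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def to_records(title, text, max_chars=500):
    """Build one record per chunk; title plus chunk_index link the pieces together."""
    return [
        {"title": title, "chunk_index": i, "content": chunk}
        for i, chunk in enumerate(split_paragraphs(text, max_chars))
    ]
```

Each record could then be embedded on its own, with the shared title and the chunk position making it possible to fetch neighbouring chunks of the same document at query time.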

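We also used jieba as the word-segmentation library and a regular expression that collapses runs of whitespace in each document. These steps are meant to make vector matching more accurate.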
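
To make the cleaning step concrete, the sketch below shows what the whitespace regex does on its own, followed by an optional stricter filter. `normalize_whitespace` and `strip_symbols` are hypothetical helper names, and the set of punctuation kept by `strip_symbols` is an assumption that would need tuning for real documents.

```python
import re


def normalize_whitespace(text):
    """Collapse tabs, newlines, and full-width spaces into single spaces."""
    return re.sub(r"\s+", " ", text).strip()


def strip_symbols(text):
    """Drop characters that are not word characters, whitespace, or common punctuation."""
    return re.sub(r"[^\w\s，。！？、；：,.!?;:()（）\-]", "", text)


print(normalize_whitespace("  a\t\tb\n\nc\u3000d "))
print(strip_symbols("Milvus★ index■!"))
```

Note that the whitespace regex by itself leaves decorative symbols untouched; a filter along the lines of `strip_symbols` is one way to remove them if they hurt matching.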

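Once all of these steps have run, we can match user questions against the vector store.

Matching user questions against the knowledge base

First, we need to convert the user's question into a vector so that it can be compared with the vectors in the database. The get_vector function from the previous step turns text into a vector, so we can simply call it again here.

Next, we use the question vector to search the vector store and retrieve the closest matches. The code is as follows: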

import torch
from document_preprocess import get_vector
from pymilvus import Collection

collection = Collection("document")  # Get an existing collection.
collection.load()
DEVICE = "cuda" if torch.cuda.is_available() else "mps" if torch.backends.mps.is_available() else "cpu"


# Define the search function
def search_similar_text(input_text):
    # Convert the input text to a vector
    input_vector = get_vector(input_text)
    # Search for the IDs of the three best-matching vectors
    similarity = collection.search(
        data=[input_vector],
        anns_field="vector",
        param={"metric_type": "IP", "params": {"nprobe": 10}, "offset": 0},
        limit=3,
        expr=None,
        consistency_level="Strong"
    )
    ids = list(similarity[0].ids)
    # Look up the corresponding knowledge-base documents by ID
    res = collection.query(
        expr=f"id in {ids}",
        offset=0,
        limit=3,
        output_fields=["id", "content", "title"],
        consistency_level="Strong"
    )
    print(res)
    return res


if __name__ == "__main__":
    question = input('Please enter your question: ')
    search_similar_text(question)

With the vector index we have now retrieved and printed the documents closest to the question. The final step is to get the model's answer.
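
A note on the IP (inner product) metric used in the search: inner product only ranks by direction, as cosine similarity does, when the vectors have unit length, and the raw BERT vectors produced earlier are not normalized. The pure-Python sketch below illustrates the difference on made-up two-dimensional vectors; `l2_normalize` and `top_k` are hypothetical helpers, not part of pymilvus.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return list(vector) if norm == 0 else [x / norm for x in vector]


def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def top_k(query, vectors, k=3, normalize=True):
    """Return the indices of the k vectors with the largest inner product."""
    if normalize:
        query = l2_normalize(query)
        vectors = [l2_normalize(v) for v in vectors]
    scores = [inner_product(query, v) for v in vectors]
    return sorted(range(len(vectors)), key=lambda i: scores[i], reverse=True)[:k]


vectors = [[10.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
query = [1.0, 1.0]
print(top_k(query, vectors, k=1, normalize=False))  # raw IP favours the longest vector
print(top_k(query, vectors, k=1, normalize=True))   # normalized IP favours the same direction
```

If ranking by direction is what you want, one option is to normalize each vector before inserting it and before searching.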

Getting an accurate answer with a prompt template

In this step we load the pretrained ChatGLM model and ask it for an answer.

from transformers import AutoModel, AutoTokenizer
from knowledge_query import search_similar_text


tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b-int4", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm-6b-int4", trust_remote_code=True).half().cuda()
model = model.eval()


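def predict(input, max_length=2048, top_p=0.7, temperature=0.95, history=None):
    # Avoid a mutable default argument; start from an empty history
    history = [] if history is None else history
    res = search_similar_text(input)
    # Prompt (kept in Chinese for the model): answer concisely and professionally from the
    # known information below; if no answer can be found, reply with the fixed message
    # "the current session only supports one type of question, please clear the history
    # and retry"; do not make anything up; answer in Chinese.
    prompt_template = f"""基于以下已知信息，簡潔和專業的來回答用戶的問題。
如果無法從中得到答案，請說 "當前會話僅支持解決一個類型的問題，請清空歷史信息重試"，不允許在答案中添加編造成分，答案請使用中文。

已知內容:
{res}

問題:
{input}
"""
    query = prompt_template
    for response, history in model.stream_chat(tokenizer, query, history, max_length=max_length, top_p=top_p,
                                               temperature=temperature):
        # Stream each partial response together with the updated history
        yield response, history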

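The code above uses a prompt template: the retrieved documents are passed to the model as context for it to reason over when answering. With that, we have a simple knowledge-base Q&A application.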
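
The template interpolates the query result directly, so the model sees the Python representation of a list of dictionaries. A possible refinement, sketched here under the assumption that each record carries the title and content fields requested in the query, is to render the records as numbered passages and cap their total length. `format_context`, `build_prompt`, the English template text, and the sample records are all illustrative.

```python
def format_context(records, max_chars=1500):
    """Render query results as numbered passages within a rough character budget.

    The first passage is always kept, even if it alone exceeds max_chars.
    """
    parts, used = [], 0
    for number, record in enumerate(records, start=1):
        passage = f"[{number}] {record['title']}: {record['content']}"
        if parts and used + len(passage) > max_chars:
            break
        parts.append(passage)
        used += len(passage)
    return "\n".join(parts)


def build_prompt(question, records, max_chars=1500):
    """Combine the formatted passages and the question into one prompt string."""
    context = format_context(records, max_chars)
    return f"Known information:\n{context}\n\nQuestion:\n{question}\n"


# Records shaped like the query output above (sample values are made up)
sample = [
    {"id": 1, "title": "milvus", "content": "Milvus is a vector database."},
    {"id": 2, "title": "chatglm", "content": "ChatGLM-6B is a dialogue model."},
]
print(build_prompt("What is Milvus?", sample))
```

Capping the context this way also leaves room for the question and the answer within the max_length passed to the model.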

If you want to ask questions in a web page the way you do with ChatGPT, you can extend the code above:

from transformers import AutoModel, AutoTokenizer
import gradio as gr
import mdtex2html

from knowledge_query import search_similar_text

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b-int4", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm-6b-int4", trust_remote_code=True).half().cuda()
model = model.eval()
is_knowledge = True

"""Override Chatbot.postprocess"""


def postprocess(self, y):
    if y is None:
        return []
    for i, (message, response) in enumerate(y):
        y[i] = (
            None if message is None else mdtex2html.convert((message)),
            None if response is None else mdtex2html.convert(response),
        )
    return y


gr.Chatbot.postprocess = postprocess


def parse_text(text):
    """copy from https://github.com/GaiZhenbiao/ChuanhuChatGPT/"""
    lines = text.split("\n")
    lines = [line for line in lines if line != ""]
    count = 0
    for i, line in enumerate(lines):
        if "```" in line:
            count += 1
            items = line.split('`')
            if count % 2 == 1:
                lines[i] = f'<pre><code class="language-{items[-1]}">'
            else:
                lines[i] = f'<br></code></pre>'
        else:
            if i > 0:
                if count % 2 == 1:
                    line = line.replace("`", "\\`")
                    line = line.replace("<", "&lt;")
                    line = line.replace(">", "&gt;")
                    line = line.replace(" ", "&nbsp;")
                    line = line.replace("*", "&ast;")
                    line = line.replace("_", "&lowbar;")
                    line = line.replace("-", "&#45;")
                    line = line.replace(".", "&#46;")
                    line = line.replace("!", "&#33;")
                    line = line.replace("(", "&#40;")
                    line = line.replace(")", "&#41;")
                    line = line.replace("$", "&#36;")
                lines[i] = "<br>"+line
    text = "".join(lines)
    return text


def predict(input, chatbot, max_length, top_p, temperature, history):
    global is_knowledge

    chatbot.append((parse_text(input), ""))
    query = input
    if is_knowledge:
        res = search_similar_text(input)
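        # Prompt (kept in Chinese for the model): answer concisely and professionally from the
        # known information; if no answer can be found, reply with the fixed "clear the
        # history and retry" message; do not make anything up; answer in Chinese.
        prompt_template = f"""基于以下已知信息，簡潔和專業的來回答用戶的問題。
如果無法從中得到答案，請說 "當前會話僅支持解決一個類型的問題，請清空歷史信息重試"，不允許在答案中添加編造成分，答案請使用中文。

已知內容:
{res}

問題:
{input}
"""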
        query = prompt_template
        is_knowledge = False
    for response, history in model.stream_chat(tokenizer, query, history, max_length=max_length, top_p=top_p,
                                               temperature=temperature):
        chatbot[-1] = (parse_text(input), parse_text(response))

        yield chatbot, history


def reset_user_input():
    return gr.update(value='')


def reset_state():
    global is_knowledge

    # Re-enable retrieval so the first question after clearing history uses the knowledge base again
    is_knowledge = True
    return [], []


with gr.Blocks() as demo:
    gr.HTML("""<h1 align="center">ChatGLM</h1>""")

    chatbot = gr.Chatbot()
    with gr.Row():
        with gr.Column(scale=4):
            with gr.Column(scale=12):
                user_input = gr.Textbox(show_label=False, placeholder="Input...", lines=10).style(
                    container=False)
            with gr.Column(min_width=32, scale=1):
                submitBtn = gr.Button("Submit", variant="primary")
        with gr.Column(scale=1):
            emptyBtn = gr.Button("Clear History")
            max_length = gr.Slider(0, 4096, value=2048, step=1.0, label="Maximum length", interactive=True)
            top_p = gr.Slider(0, 1, value=0.7, step=0.01, label="Top P", interactive=True)
            temperature = gr.Slider(0, 1, value=0.95, step=0.01, label="Temperature", interactive=True)

    history = gr.State([])

    submitBtn.click(predict, [user_input, chatbot, max_length, top_p, temperature, history], [chatbot, history],
                    show_progress=True)
    submitBtn.click(reset_user_input, [], [user_input])

    emptyBtn.click(reset_state, outputs=[chatbot, history], show_progress=True)

demo.queue().launch(share=False, inbrowser=True)

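With a small rewrite of the web_demo code from ChatGLM, we get a front end that looks identical, except that it can now answer questions based on our knowledge base.

Summary

This article covered only the simplest way to build knowledge-base Q&A: a vector index store plus an AI model plus prompt engineering. The vector indexing and document processing here are crude, and more accurate matching will require adjustments based on your actual documents and use case.

The code has been uploaded to GitHub as knowledge_with_chatglm; feel free to clone it and try it out.

Improving the code with langchain

langchain is currently a very popular open-source library, and using it for knowledge-base Q&A can greatly improve development efficiency and reduce the workload. The document preprocessing and question-matching steps above, for example, took quite a lot of code. With langchain they become much simpler, as the following example shows: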

from langchain.vectorstores import Milvus
from langchain.document_loaders import DirectoryLoader, TextLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings import HuggingFaceEmbeddings

# Load every .txt file in the folder
loader = DirectoryLoader('./knowledge/', glob='**/*.txt', show_progress=True, loader_cls=TextLoader,
                         loader_kwargs={"encoding": "utf-8"})
documents = loader.load()

# Initialize the text splitter
text_splitter = CharacterTextSplitter(chunk_size=100, chunk_overlap=0)
# Split the loaded documents
split_docs = text_splitter.split_documents(documents)

embeddings = HuggingFaceEmbeddings(model_name="shibing624/text2vec-base-chinese")
vector_db = Milvus.from_documents(split_docs, embeddings, connection_args={"host": "127.0.0.1", "port": "19530"},
                                  collection_name="langchain_knowledge", drop_old=True)

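As the code above shows, langchain wraps a large number of tools for us. DirectoryLoader and TextLoader load the documents directly, and CharacterTextSplitter splits the loaded documents into chunks of the configured size. langchain's vector-store integration then vectorizes the documents and persists them with little effort. In just six lines of code we accomplish what previously took dozens, without having to think about defining fields or maintaining the database.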
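
To build some intuition for the chunk_size and chunk_overlap parameters, the sketch below implements a plain fixed-size sliding window. It is a deliberate simplification and not langchain's algorithm: CharacterTextSplitter first splits on a separator and then merges the pieces, whereas this hypothetical `sliding_chunks` function cuts purely by character position.

```python
def sliding_chunks(text, chunk_size=100, chunk_overlap=0):
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the window has reached the end, to avoid a redundant tail chunk
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


print(sliding_chunks("abcdefghij", chunk_size=4, chunk_overlap=0))
print(sliding_chunks("abcdefghij", chunk_size=4, chunk_overlap=2))
```

A non-zero overlap repeats the end of one chunk at the start of the next, which can help when an answer straddles a chunk boundary, at the cost of storing more vectors.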
