
Building a Localized, Customized Knowledge-Base AI Chatbot with Llama2 and LangChain

This article introduces how to build a localized, customized knowledge-base AI chatbot with Llama2 and LangChain. I hope it is useful; if anything is wrong or incomplete, feedback is welcome.

References:

Project: https://github.com/PromtEngineer/localGPT

Model: https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGML

Cloud-based counterpart: "Building a Cloud-Based Customized PDF Knowledge-Base AI Chatbot with GPT-4 and LangChain" (Entropy-Go, CSDN blog)

1. Abstract

Unlike OpenAI's ChatGPT, which requires a network connection and calls the model in the cloud through an API key (raising data-privacy concerns), this project deploys a trained LLM locally, so you can ask questions about your documents with no network connection at all. The deployment is 100% private and local: no data ever leaves your environment. You can ingest documents and ask questions completely offline.

This application lets users harness the capabilities of a language model without an Internet connection, serving as an indispensable resource for obtaining information beyond the constraints of conventional, cloud-bound tools such as ChatGPT.

A key advantage of this application is that you retain control of your data. This matters most when handling files that must stay inside an organization, or personal documents of the highest confidentiality, because it eliminates the need to send information through third-party channels.

Integrating your own files is seamless. Whether they are text, PDF, CSV, or Excel files, you simply supply the documents you want to query. The application processes them quickly, building a comprehensive database for the model to draw on for accurate, in-depth answers.

A notable strength of this approach is its efficient use of resources. Unlike the resource-intensive retraining required by alternative approaches, ingesting documents here demands far less computation. That efficiency makes for a streamlined user experience and saves both time and compute.

In short, this tool lets users exploit the full potential of a language model while offline, opening a new way to access information, boosting productivity, and expanding what is possible with your own data.

2. Prerequisites

2.1 Meta's Llama 2 7B Chat GGML

These files are GGML-format model files for Meta's Llama 2 7B Chat.

GGML files are for CPU + GPU inference using llama.cpp and the libraries and UIs that support this format.
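
For a quick sanity check of the format, a GGML file can be loaded directly with llama-cpp-python (a minimal sketch, assuming a GGML-era llama-cpp-python release such as 0.1.x, since current releases read only GGUF, and a model file already on disk):

# test_ggml.py - verify that the GGML model loads and generates (illustrative)
from llama_cpp import Llama

llm = Llama(model_path="llama-2-7b-chat.ggmlv3.q4_0.bin", n_ctx=2048)
out = llm("Q: What is GGML? A:", max_tokens=64, stop=["Q:"])
print(out["choices"][0]["text"])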

2.2 Install Conda

"Quickly Installing the Conda Package Manager on CentOS" (Entropy-Go, CSDN blog)

2.3 Upgrade gcc

"Introduction to gcc on CentOS and How to Upgrade Quickly" (Entropy-Go, CSDN blog)

3. Clone or download the localGPT project

git clone https://github.com/PromtEngineer/localGPT.git

4. Install dependencies

4.1 Create and activate the Conda environment

conda create -n localGPT
conda activate localGPT
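
Pinning a Python version at creation time (the exact version here is an assumption; pick one supported by the project's requirements.txt) guarantees the environment ships its own interpreter instead of falling back to the system Python:

conda create -n localGPT python=3.10
conda activate localGPT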

4.2 Install the Python packages

If the Conda environment variables are set correctly, a plain pip install suffices:

pip install -r requirements.txt

Otherwise the system Python would be used instead. In that case, invoke pip through Conda's absolute path, and keep using Conda's python for all subsequent steps:

whereis conda
conda: /root/miniconda3/bin/conda /root/miniconda3/condabin/conda
/root/miniconda3/bin/pip install -r requirements.txt

If installation fails with the error below, upgrade gcc as described in section 2.3 (gcc 11 is recommended):

ERROR: Could not build wheels for llama-cpp-python, hnswlib, lxml, which is required to install pyproject.toml-based project
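
For reference, on CentOS 7 a common route to gcc 11 is the SCL devtoolset (a sketch only; package names and availability vary by distro release, see the post linked in section 2.3 for details):

yum install -y centos-release-scl
yum install -y devtoolset-11-gcc devtoolset-11-gcc-c++
scl enable devtoolset-11 bash   # opens a shell with gcc 11 on PATH
gcc --version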

5. Add documents to the knowledge base

5.1 Document directory and sample document

Replace the sample document with your own as needed:

~/localGPT/SOURCE_DOCUMENTS/constitution.pdf

Before ingesting, check the help output. As noted earlier, use Conda's python via its absolute path:

/root/miniconda3/bin/python ingest.py --help
Usage: ingest.py [OPTIONS]

Options:
  --device_type [cpu|cuda|ipu|xpu|mkldnn|opengl|opencl|ideep|hip|ve|fpga|ort|xla|lazy|vulkan|mps|meta|hpu|mtia]
                                  Device to run on. (Default is cuda)
  --help                          Show this message and exit.

5.2 Ingest the documents

By default, ingestion runs on cuda/GPU:

/root/miniconda3/bin/python ingest.py

Or force the CPU:

/root/miniconda3/bin/python ingest.py --device_type cpu

On the first ingestion, the required embedding model is downloaded and the resulting vector database is stored in /root/localGPT/DB.

First ingestion run:

/root/miniconda3/bin/python ingest.py
2023-08-18 09:36:55,389 - INFO - ingest.py:122 - Loading documents from /root/localGPT/SOURCE_DOCUMENTS
all files: ['constitution.pdf']
2023-08-18 09:36:55,398 - INFO - ingest.py:34 - Loading document batch
2023-08-18 09:36:56,818 - INFO - ingest.py:131 - Loaded 1 documents from /root/localGPT/SOURCE_DOCUMENTS
2023-08-18 09:36:56,818 - INFO - ingest.py:132 - Split into 72 chunks of text
2023-08-18 09:36:57,994 - INFO - SentenceTransformer.py:66 - Load pretrained SentenceTransformer: hkunlp/instructor-large
Downloading (…)c7233/.gitattributes: 100%|███████████████████████████████████████████████████████████████████████████| 1.48k/1.48k [00:00<00:00, 4.13MB/s]
Downloading (…)_Pooling/config.json: 100%|████████████████████████████████████████████████████████████████████████████████| 270/270 [00:00<00:00, 915kB/s]
Downloading (…)/2_Dense/config.json: 100%|████████████████████████████████████████████████████████████████████████████████| 116/116 [00:00<00:00, 380kB/s]
Downloading pytorch_model.bin: 100%|█████████████████████████████████████████████████████████████████████████████████| 3.15M/3.15M [00:01<00:00, 2.99MB/s]
Downloading (…)9fb15c7233/README.md: 100%|████████████████████████████████████████████████████████████████████████████| 66.3k/66.3k [00:00<00:00, 359kB/s]
Downloading (…)b15c7233/config.json: 100%|███████████████████████████████████████████████████████████████████████████| 1.53k/1.53k [00:00<00:00, 5.70MB/s]
Downloading (…)ce_transformers.json: 100%|████████████████████████████████████████████████████████████████████████████████| 122/122 [00:00<00:00, 485kB/s]
Downloading pytorch_model.bin: 100%|█████████████████████████████████████████████████████████████████████████████████| 1.34G/1.34G [03:15<00:00, 6.86MB/s]
Downloading (…)nce_bert_config.json: 100%|██████████████████████████████████████████████████████████████████████████████| 53.0/53.0 [00:00<00:00, 109kB/s]
Downloading (…)cial_tokens_map.json: 100%|███████████████████████████████████████████████████████████████████████████| 2.20k/2.20k [00:00<00:00, 8.96MB/s]
Downloading spiece.model: 100%|████████████████████████████████████████████████████████████████████████████████████████| 792k/792k [00:00<00:00, 3.46MB/s]
Downloading (…)c7233/tokenizer.json: 100%|███████████████████████████████████████████████████████████████████████████| 2.42M/2.42M [00:00<00:00, 3.01MB/s]
Downloading (…)okenizer_config.json: 100%|███████████████████████████████████████████████████████████████████████████| 2.41k/2.41k [00:00<00:00, 9.75MB/s]
Downloading (…)15c7233/modules.json: 100%|███████████████████████████████████████████████████████████████████████████████| 461/461 [00:00<00:00, 1.92MB/s]
load INSTRUCTOR_Transformer
2023-08-18 09:40:26,658 - INFO - instantiator.py:21 - Created a temporary directory at /tmp/tmp47gnnhwi
2023-08-18 09:40:26,658 - INFO - instantiator.py:76 - Writing /tmp/tmp47gnnhwi/_remote_module_non_scriptable.py
max_seq_length  512
2023-08-18 09:40:30,076 - INFO - __init__.py:88 - Running Chroma using direct local API.
2023-08-18 09:40:30,248 - WARNING - __init__.py:43 - Using embedded DuckDB with persistence: data will be stored in: /root/localGPT/DB
2023-08-18 09:40:30,252 - INFO - ctypes.py:22 - Successfully imported ClickHouse Connect C data optimizations
2023-08-18 09:40:30,257 - INFO - json_impl.py:45 - Using python library for writing JSON byte strings
2023-08-18 09:40:30,295 - INFO - duckdb.py:454 - No existing DB found in /root/localGPT/DB, skipping load
2023-08-18 09:40:30,295 - INFO - duckdb.py:466 - No existing DB found in /root/localGPT/DB, skipping load
2023-08-18 09:40:32,800 - INFO - duckdb.py:414 - Persisting DB to disk, putting it in the save folder: /root/localGPT/DB
2023-08-18 09:40:32,813 - INFO - duckdb.py:414 - Persisting DB to disk, putting it in the save folder: /root/localGPT/DB

Project file listing:

ls
ACKNOWLEDGEMENT.md  CONTRIBUTING.md  ingest.py   localGPT_UI.py  README.md            run_localGPT.py
constants.py        DB               LICENSE     __pycache__     requirements.txt     SOURCE_DOCUMENTS
constitution.pdf    Dockerfile       localGPTUI  pyproject.toml  run_localGPT_API.py 

6. Run the knowledge-base AI chatbot

You can now chat with your local knowledge base!

6.1 Asking questions from the command line

On the first run, the default model configured in ~/localGPT/constants.py is downloaded:

# model link: https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGML

MODEL_ID = "TheBloke/Llama-2-7B-Chat-GGML"

MODEL_BASENAME = "llama-2-7b-chat.ggmlv3.q4_0.bin"

The model is downloaded to /root/.cache/huggingface/hub/models--TheBloke--Llama-2-7B-Chat-GGML.
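
Under the hood this is roughly a huggingface_hub download into the local cache (a simplified sketch; run_localGPT.py then hands the resolved path to llama.cpp):

from huggingface_hub import hf_hub_download

# fetches the file into the HF cache (or reuses it) and returns the local path
model_path = hf_hub_download(
    repo_id="TheBloke/Llama-2-7B-Chat-GGML",
    filename="llama-2-7b-chat.ggmlv3.q4_0.bin",
)
print(model_path)  # resolves under /root/.cache/huggingface/hub/...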

Run it directly:

/root/miniconda3/bin/python run_localGPT.py

Entering a query

English works out of the box; Chinese input needs UTF-8 handling (see the sketch after the prompt below):

Enter a query:
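
One illustrative way to make Chinese queries decode cleanly (an assumption, not part of the project code) is to force UTF-8 on stdin before reading:

import io
import sys

# re-wrap stdin so input() decodes bytes as UTF-8 regardless of the shell locale
sys.stdin = io.TextIOWrapper(sys.stdin.buffer, encoding="utf-8")
query = input("Enter a query: ")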

Chat transcript:

/root/miniconda3/bin/python run_localGPT.py
2023-08-18 09:43:02,433 - INFO - run_localGPT.py:180 - Running on: cuda
2023-08-18 09:43:02,433 - INFO - run_localGPT.py:181 - Display Source Documents set to: False
2023-08-18 09:43:02,676 - INFO - SentenceTransformer.py:66 - Load pretrained SentenceTransformer: hkunlp/instructor-large
load INSTRUCTOR_Transformer
max_seq_length  512
2023-08-18 09:43:05,301 - INFO - __init__.py:88 - Running Chroma using direct local API.
2023-08-18 09:43:05,317 - WARNING - __init__.py:43 - Using embedded DuckDB with persistence: data will be stored in: /root/localGPT/DB
2023-08-18 09:43:05,328 - INFO - ctypes.py:22 - Successfully imported ClickHouse Connect C data optimizations
2023-08-18 09:43:05,336 - INFO - json_impl.py:45 - Using python library for writing JSON byte strings
2023-08-18 09:43:05,402 - INFO - duckdb.py:460 - loaded in 72 embeddings
2023-08-18 09:43:05,405 - INFO - duckdb.py:472 - loaded in 1 collections
2023-08-18 09:43:05,406 - INFO - duckdb.py:89 - collection with name langchain already exists, returning existing collection
2023-08-18 09:43:05,406 - INFO - run_localGPT.py:45 - Loading Model: TheBloke/Llama-2-7B-Chat-GGML, on: cuda
2023-08-18 09:43:05,406 - INFO - run_localGPT.py:46 - This action can take a few minutes!
2023-08-18 09:43:05,406 - INFO - run_localGPT.py:50 - Using Llamacpp for GGML quantized models
Downloading (…)chat.ggmlv3.q4_0.bin: 100%|███████████████████████████████████████████████████████████████████████████| 3.79G/3.79G [09:53<00:00, 6.39MB/s]
llama.cpp: loading model from /root/.cache/huggingface/hub/models--TheBloke--Llama-2-7B-Chat-GGML/snapshots/b616819cd4777514e3a2d9b8be69824aca8f5daf/llama-2-7b-chat.ggmlv3.q4_0.bin
llama_model_load_internal: format     = ggjt v3 (latest)
llama_model_load_internal: n_vocab    = 32000
llama_model_load_internal: n_ctx      = 2048
llama_model_load_internal: n_embd     = 4096
llama_model_load_internal: n_mult     = 256
llama_model_load_internal: n_head     = 32
llama_model_load_internal: n_layer    = 32
llama_model_load_internal: n_rot      = 128
llama_model_load_internal: ftype      = 2 (mostly Q4_0)
llama_model_load_internal: n_ff       = 11008
llama_model_load_internal: n_parts    = 1
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size =    0.07 MB
llama_model_load_internal: mem required  = 5407.71 MB (+ 1026.00 MB per state)
llama_new_context_with_model: kv self size  = 1024.00 MB
AVX = 1 | AVX2 = 1 | AVX512 = 1 | AVX512_VBMI = 1 | AVX512_VNNI = 1 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 |

Enter a query:

Alternatively, add the --show_sources flag so each answer lists the source passages it drew on:

/root/miniconda3/bin/python run_localGPT.py --show_sources

Chat transcript:

/root/miniconda3/bin/python run_localGPT.py --show_sources
2023-08-18 10:03:55,466 - INFO - run_localGPT.py:180 - Running on: cuda
2023-08-18 10:03:55,466 - INFO - run_localGPT.py:181 - Display Source Documents set to: True
2023-08-18 10:03:55,708 - INFO - SentenceTransformer.py:66 - Load pretrained SentenceTransformer: hkunlp/instructor-large
load INSTRUCTOR_Transformer
max_seq_length  512
2023-08-18 10:03:58,302 - INFO - __init__.py:88 - Running Chroma using direct local API.
2023-08-18 10:03:58,307 - WARNING - __init__.py:43 - Using embedded DuckDB with persistence: data will be stored in: /root/localGPT/DB
2023-08-18 10:03:58,312 - INFO - ctypes.py:22 - Successfully imported ClickHouse Connect C data optimizations
2023-08-18 10:03:58,318 - INFO - json_impl.py:45 - Using python library for writing JSON byte strings
2023-08-18 10:03:58,372 - INFO - duckdb.py:460 - loaded in 72 embeddings
2023-08-18 10:03:58,373 - INFO - duckdb.py:472 - loaded in 1 collections
2023-08-18 10:03:58,373 - INFO - duckdb.py:89 - collection with name langchain already exists, returning existing collection
2023-08-18 10:03:58,374 - INFO - run_localGPT.py:45 - Loading Model: TheBloke/Llama-2-7B-Chat-GGML, on: cuda
2023-08-18 10:03:58,374 - INFO - run_localGPT.py:46 - This action can take a few minutes!
2023-08-18 10:03:58,374 - INFO - run_localGPT.py:50 - Using Llamacpp for GGML quantized models
llama.cpp: loading model from /root/.cache/huggingface/hub/models--TheBloke--Llama-2-7B-Chat-GGML/snapshots/b616819cd4777514e3a2d9b8be69824aca8f5daf/llama-2-7b-chat.ggmlv3.q4_0.bin
llama_model_load_internal: format     = ggjt v3 (latest)
llama_model_load_internal: n_vocab    = 32000
llama_model_load_internal: n_ctx      = 2048
llama_model_load_internal: n_embd     = 4096
llama_model_load_internal: n_mult     = 256
llama_model_load_internal: n_head     = 32
llama_model_load_internal: n_layer    = 32
llama_model_load_internal: n_rot      = 128
llama_model_load_internal: ftype      = 2 (mostly Q4_0)
llama_model_load_internal: n_ff       = 11008
llama_model_load_internal: n_parts    = 1
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size =    0.07 MB
llama_model_load_internal: mem required  = 5407.71 MB (+ 1026.00 MB per state)
llama_new_context_with_model: kv self size  = 1024.00 MB
AVX = 1 | AVX2 = 1 | AVX512 = 1 | AVX512_VBMI = 1 | AVX512_VNNI = 1 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 |

Enter a query: how many times could president act, and how many years as max?

llama_print_timings:        load time = 19737.32 ms
llama_print_timings:      sample time =   101.14 ms /   169 runs   (    0.60 ms per token,  1671.02 tokens per second)
llama_print_timings: prompt eval time = 19736.91 ms /   925 tokens (   21.34 ms per token,    46.87 tokens per second)
llama_print_timings:        eval time = 36669.35 ms /   168 runs   (  218.27 ms per token,     4.58 tokens per second)
llama_print_timings:       total time = 56849.80 ms


> Question:
how many times could president act, and how many years as max?

> Answer:
The answer to this question can be found in Amendment XXII and Amendment XXIII of the US Constitution. According to these amendments, a person cannot be elected President more than twice, and no person can hold the office of President for more than two years of a term to which someone else was elected President. However, if the President is unable to discharge their powers and duties due to incapacity, the Vice President will continue to act as President until Congress determines the issue.
In summary, a person can be elected President at most twice, and they cannot hold the office for more than two years of a term to which someone else was elected President. If the President becomes unable to discharge their powers and duties, the Vice President will continue to act as President until Congress makes a determination.
----------------------------------SOURCE DOCUMENTS---------------------------

> /root/localGPT/SOURCE_DOCUMENTS/constitution.pdf:
Amendment ?XXII.

Amendment ?XXIII.

Passed by Congress March 21, 1947. Ratified February 27,

Passed by Congress June 16, 1960. Ratified March 29, 1961.

951.

SECTION 1

...

SECTION 2

....

----------------------------------SOURCE DOCUMENTS---------------------------

Enter a query: exit

6.2 Asking questions through the Web UI

6.2.1 Start the API server

To use the Web UI, first start the API server, which listens on port 5110:

http://127.0.0.1:5110

/root/miniconda3/bin/python run_localGPT_API.py

If you hit the error below, the cause is again that the script invokes a python other than the one on Conda's PATH:

/root/miniconda3/bin/python run_localGPT_API.py
load INSTRUCTOR_Transformer
max_seq_length ?512
The directory does not exist
run_langest_commands ['python', 'ingest.py']
Traceback (most recent call last):
? File "/root/localGPT/run_localGPT_API.py", line 56, in <module>
? ? raise FileNotFoundError(
FileNotFoundError: No files were found inside SOURCE_DOCUMENTS, please put a starter file inside before starting the API!

To fix it, edit ~/localGPT/run_localGPT_API.py so the subprocess uses Conda's python. Change

run_langest_commands = ["python", "ingest.py"]

to

run_langest_commands = ["/root/miniconda3/bin/python", "ingest.py"]

Startup log

Once you see INFO:werkzeug: the server has started successfully; keep this window open for debugging.

/root/miniconda3/bin/python run_localGPT_API.py
load INSTRUCTOR_Transformer
max_seq_length  512
WARNING:chromadb:Using embedded DuckDB with persistence: data will be stored in: /root/localGPT/DB
llama.cpp: loading model from /root/.cache/huggingface/hub/models--TheBloke--Llama-2-7B-Chat-GGML/snapshots/b616819cd4777514e3a2d9b8be69824aca8f5daf/llama-2-7b-chat.ggmlv3.q4_0.bin
llama_model_load_internal: format     = ggjt v3 (latest)
llama_model_load_internal: n_vocab    = 32000
llama_model_load_internal: n_ctx      = 2048
llama_model_load_internal: n_embd     = 4096
llama_model_load_internal: n_mult     = 256
llama_model_load_internal: n_head     = 32
llama_model_load_internal: n_layer    = 32
llama_model_load_internal: n_rot      = 128
llama_model_load_internal: ftype      = 2 (mostly Q4_0)
llama_model_load_internal: n_ff       = 11008
llama_model_load_internal: n_parts    = 1
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size =    0.07 MB
llama_model_load_internal: mem required  = 5407.71 MB (+ 1026.00 MB per state)
llama_new_context_with_model: kv self size  = 1024.00 MB
AVX = 1 | AVX2 = 1 | AVX512 = 1 | AVX512_VBMI = 1 | AVX512_VNNI = 1 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 |
 * Serving Flask app 'run_localGPT_API'
 * Debug mode: on
INFO:werkzeug:WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
?* Running on http://127.0.0.1:5110
INFO:werkzeug:Press CTRL+C to quit
INFO:werkzeug: * Restarting with watchdog (inotify)

6.2.2 Start the UI server

Open a new terminal and run ~/localGPT/localGPTUI/localGPTUI.py; the UI server listens on port 5111:

http://127.0.0.1:5111

/root/miniconda3/bin/python localGPTUI.py

For LAN access, edit localGPTUI.py and change 127.0.0.1 to 0.0.0.0 (note that the help string below still describes the old default):

parser.add_argument("--host", type=str, default="0.0.0.0",
                        help="Host to run the UI on. Defaults to 127.0.0.1. "
                             "Set to 0.0.0.0 to make the UI externally "
                             "accessible from other devices.")

Startup log

/root/miniconda3/bin/python localGPTUI.py
 * Serving Flask app 'localGPTUI'
 * Debug mode: on
WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
 * Running on all addresses (0.0.0.0)
 * Running on http://127.0.0.1:5111
 * Running on http://IP:5111

Port usage:

netstat -nltp | grep 511
tcp        0      0 127.0.0.1:5110          0.0.0.0:*               LISTEN      57479/python
tcp        0      0 0.0.0.0:5111            0.0.0.0:*               LISTEN      21718/python

6.2.3 Open the Web UI in a browser

Local: http://127.0.0.1:5111

LAN: http://IP:5111

[Screenshot: Web UI chat page]

You can chat freely in the browser; Chinese input is supported.

Screenshots

[Screenshot: chatting with the knowledge base in the Web UI]

6.3 Swapping in your own documents

6.3.1 Command line

Add your documents directly to ~/localGPT/SOURCE_DOCUMENTS/

and rerun ingest.py to rebuild the vector database; once it finishes, you can ask questions against the new documents as usual.

6.3.2 Web UI

Uploading files

1. To upload documents for the application to ingest as its new knowledge base, click the upload button.

2. Select the documents you want to chat against.

3. You will then be prompted to add the documents to the existing knowledge base, reset the knowledge base to just the selected documents, or cancel the upload.

4. There is a short wait while the documents are embedded into the vector database as the new knowledge base.

[Screenshot: uploading documents in the Web UI]

Ingesting a Chinese document into the knowledge base

Generating a response

Response returned

[Screenshot: the Web UI ingesting a Chinese document and returning the answer]

7. Troubleshooting

7.1 Ingesting Chinese documents

In run_localGPT_API.py, increase the context size:

max_ctx_size = 4096

In ingest.py, reduce the chunk size:

text_splitter = RecursiveCharacterTextSplitter(chunk_size=800, chunk_overlap=200)
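
For Chinese text it can also help (an assumption, not in the original project) to give the splitter sentence-level separators, so chunks break at Chinese punctuation rather than mid-sentence:

from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=800,
    chunk_overlap=200,
    # fall back from paragraphs to newlines to Chinese sentence endings
    separators=["\n\n", "\n", "。", "!", "?", ""],
)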

7.2 The page loads but questions get no reply (response.status_code = 504 or 304)

If your environment uses a proxy, unset it before starting the UI server:

unset http_proxy
unset https_proxy
unset ftp_proxy
/root/miniconda3/bin/python localGPTUI.py
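
Alternatively (also just an illustrative workaround), keep the proxy but exempt loopback addresses:

export no_proxy="127.0.0.1,localhost"
export NO_PROXY="127.0.0.1,localhost"
/root/miniconda3/bin/python localGPTUI.py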

7.3 How localGPT works

By selecting the right local models and leveraging the power of LangChain, you can run the entire pipeline locally, with no data leaving your environment and with reasonable performance.

  • ingest.py uses LangChain tools to parse the documents and create embeddings locally with InstructorEmbeddings, then stores the result in a local Chroma vector store (see the sketch after this list).
  • run_localGPT.py uses a local LLM to understand questions and compose answers. The context for each answer is pulled from the local vector store via a similarity search that locates the right passages in the docs.
  • You can replace the local LLM with any other LLM from HuggingFace, as long as it is in the HF format.
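
A condensed sketch of that pipeline (simplified from what ingest.py and run_localGPT.py actually do; the loader choice, parameters, and question here are illustrative):

from langchain.chains import RetrievalQA
from langchain.document_loaders import PyPDFLoader
from langchain.embeddings import HuggingFaceInstructEmbeddings
from langchain.llms import LlamaCpp
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Chroma

# ingest: parse the document, split it into chunks, embed locally, persist to Chroma
docs = PyPDFLoader("SOURCE_DOCUMENTS/constitution.pdf").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200).split_documents(docs)
embeddings = HuggingFaceInstructEmbeddings(model_name="hkunlp/instructor-large")
db = Chroma.from_documents(chunks, embeddings, persist_directory="DB")
db.persist()

# query: a similarity search supplies the context, a local LLM writes the answer
llm = LlamaCpp(model_path="llama-2-7b-chat.ggmlv3.q4_0.bin", n_ctx=2048)
qa = RetrievalQA.from_chain_type(llm=llm, retriever=db.as_retriever())
print(qa.run("How many times can a person be elected president?"))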

7.4 Choosing a different LLM

To select a different LLM for generating responses:

  1. Open constants.py in the editor of your choice.

  2. Change MODEL_ID and MODEL_BASENAME. If you are using a quantized model (GGML, GPTQ), you must provide MODEL_BASENAME. For unquantized models, set MODEL_BASENAME to None.

  3. A number of example models from HuggingFace have already been tested, both original trained models (ending with HF, or with a .bin file under "Files and versions") and quantized models (ending with GPTQ, or with .no-act-order or .safetensors files under "Files and versions").

  4. For models that end with HF or have a .bin file under "Files and versions" on their HuggingFace page:

    • Make sure MODEL_ID is set, for example MODEL_ID = "TheBloke/guanaco-7B-HF".
    • On its HuggingFace repo, under "Files and versions", you will see model files ending with a .bin extension.
    • Any model file with a .bin extension is loaded by the code under the # load the LLM for generating Natural Language responses comment.
  5. For models whose name contains GPTQ, or that have .no-act-order or .safetensors files under "Files and versions" on their HuggingFace page:

    • Make sure MODEL_ID is set, for example model_id = "TheBloke/wizardLM-7B-GPTQ".

    • You also need the model basename, for example model_basename = "wizardLM-7B-GPTQ-4bit.compat.no-act-order.safetensors".

    • On its HuggingFace repo, under "Files and versions", you will see a model file ending with a .safetensors extension.

    • Any model file with a no-act-order or .safetensors extension is loaded by the code under the # load the LLM for generating Natural Language responses comment.

    • MODEL_ID = "TheBloke/WizardLM-7B-uncensored-GPTQ"

      MODEL_BASENAME = "WizardLM-7B-uncensored-GPTQ-4bit-128g.compat.no-act-order.safetensors"

  6. Comment out all other assignments of MODEL_ID, MODEL_BASENAME, and llm = load_model(args*).
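
Putting rules 4 and 5 together, constants.py ends up looking like one of the following (values taken from the examples above; keep exactly one pair active):

# (a) unquantized HF model: basename stays None
# MODEL_ID = "TheBloke/guanaco-7B-HF"
# MODEL_BASENAME = None

# (b) GPTQ quantized model: both fields required
MODEL_ID = "TheBloke/WizardLM-7B-uncensored-GPTQ"
MODEL_BASENAME = "WizardLM-7B-uncensored-GPTQ-4bit-128g.compat.no-act-order.safetensors"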

7.5 Further reading

Issues · PromtEngineer/localGPT · GitHub
