Llama2 is an open-source large language model. GitHub repository:
facebookresearch/llama (Inference code for LLaMA models): https://github.com/facebookresearch/llama
Chinese community repository:
FlagAlpha/Llama2-Chinese (Llama Chinese community; a fully open-source, commercially usable Chinese Llama model): https://github.com/FlagAlpha/Llama2-Chinese
Below I share how to deploy this model on Linux. I first tried Windows, but Llama2 cannot run on the GPU there; if you want GPU support, consider MLC-LLM, another open-source project that can run Llama2:
https://mlc.ai/mlc-llm/docs/get_started/try_out.html
Deploying on Linux
I deployed on the AutoDL compute platform. Checking the docs before deploying, I found that the 13B and 70B variants demand too much compute, so I went with 7B.
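As a rough back-of-the-envelope estimate (my own, not from the official docs): fp16 weights take 2 bytes per parameter, so the weights alone need about 7B × 2 ≈ 14 GB for 7B, 26 GB for 13B, and 140 GB for 70B, before counting activations and the KV cache. The 8-bit loading used in the demo below roughly halves the weight memory, which is why 7B fits a single mid-range GPU comfortably while the larger variants do not.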
1. Clone the GitHub repository
git clone https://github.com/facebookresearch/llama.git
2. Enter the llama directory
cd llama
3. Install dependencies
pip install -e .
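Before moving on, it is worth a quick sanity check that PyTorch can see the GPU (a minimal check; assumes PyTorch is already installed on the instance):

python -c "import torch; print(torch.cuda.is_available())"

It should print True; if it prints False, the demo below will fail when moving tensors to 'cuda'.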
4. Demo code
However, I found that running the official demo as-is raises an HTTPError, which essentially says you do not have permission to access meta-llama/Llama-2-7b-chat-hf, because the model is downloaded on the fly from Hugging Face (an open model hub, https://huggingface.co/). So the code needs a small modification. (The official demo follows.)
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the model with 8-bit quantization, spreading weights across available devices
model = AutoModelForCausalLM.from_pretrained('meta-llama/Llama-2-7b-chat-hf', device_map='auto', torch_dtype=torch.float16, load_in_8bit=True)
model = model.eval()

tokenizer = AutoTokenizer.from_pretrained('meta-llama/Llama-2-7b-chat-hf', use_fast=False)
tokenizer.pad_token = tokenizer.eos_token

# Prompt in the chat format the model expects; the question means "Introduce China"
input_ids = tokenizer(['<s>Human: 介紹一下中國(guó)\n</s><s>Assistant: '], return_tensors="pt", add_special_tokens=False).input_ids.to('cuda')
generate_input = {
    "input_ids": input_ids,
    "max_new_tokens": 512,        # upper bound on the generated length
    "do_sample": True,            # sample instead of greedy decoding
    "top_k": 50,
    "top_p": 0.95,
    "temperature": 0.3,
    "repetition_penalty": 1.3,
    "eos_token_id": tokenizer.eos_token_id,
    "bos_token_id": tokenizer.bos_token_id,
    "pad_token_id": tokenizer.pad_token_id
}
generate_ids = model.generate(**generate_input)
text = tokenizer.decode(generate_ids[0])
print(text)
The fix is to modify the two from_pretrained calls (the AutoModelForCausalLM line and the AutoTokenizer line) to pass your token:
model = AutoModelForCausalLM.from_pretrained('meta-llama/Llama-2-7b-chat-hf', device_map='auto', torch_dtype=torch.float16, load_in_8bit=True, use_auth_token="YOUR_TOKEN")
tokenizer = AutoTokenizer.from_pretrained('meta-llama/Llama-2-7b-chat-hf', use_fast=False, use_auth_token="YOUR_TOKEN")
We need to supply a token here. To get one, open the page below, register and log in, then generate a token; apparently only a token with write permission works.
https://huggingface.co/settings/tokens
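Alternatively, instead of hard-coding the token in the script, you can log in once on the machine and let transformers pick up the cached credential automatically (assumes the huggingface_hub package, which provides this CLI, is installed):

pip install huggingface_hub
huggingface-cli login   # paste your token when prompted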
If an error says you are missing the transformers or accelerate package, just pip install them:
pip install transformers
pip install accelerate
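One more dependency worth mentioning: load_in_8bit=True in the demo relies on the bitsandbytes library, so if you see an error about 8-bit quantization, install it the same way:

pip install bitsandbytes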
Once the model finishes downloading (it takes a while), you are set. PS: Llama-2-7b-chat-hf is tuned mainly for chat; there are also Llama-2-7b-hf, Llama-2-13b-chat-hf, Llama-2-13b-hf, and other variants, so download whichever fits your needs.
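If you would rather download the weights once and then load them from a local path (handy when the connection is slow), here is a minimal sketch using huggingface_hub's snapshot_download (assumes a recent huggingface_hub; the local directory name is just an example):

from huggingface_hub import snapshot_download

# Download the whole model repo once; later runs can load from this directory
snapshot_download(
    repo_id='meta-llama/Llama-2-7b-chat-hf',
    local_dir='./Llama-2-7b-chat-hf',  # example path, pick your own
    token='YOUR_TOKEN',                # the same Hugging Face token as above
)

Afterwards, pass './Llama-2-7b-chat-hf' to from_pretrained instead of the hub name.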
Then run the script and it will print the result.
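If you want to ask the model several questions without reloading the weights each time, a small interactive loop around the same generate call works; this is just a sketch reusing the model, tokenizer, and prompt format from the demo above:

# Sketch: interactive loop reusing the already-loaded model and tokenizer
while True:
    question = input('Human: ')
    if not question:          # empty input exits the loop
        break
    prompt = f'<s>Human: {question}\n</s><s>Assistant: '
    input_ids = tokenizer([prompt], return_tensors='pt',
                          add_special_tokens=False).input_ids.to('cuda')
    generate_ids = model.generate(input_ids=input_ids, max_new_tokens=512,
                                  do_sample=True, top_k=50, top_p=0.95,
                                  temperature=0.3, repetition_penalty=1.3,
                                  eos_token_id=tokenizer.eos_token_id,
                                  bos_token_id=tokenizer.bos_token_id,
                                  pad_token_id=tokenizer.pad_token_id)
    print(tokenizer.decode(generate_ids[0]))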