Reference: chatglm2ptuning
Known issue 1: AttributeError: 'Seq2SeqTrainer' object has no attribute 'is_deepspeed_enabled'
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
The library versions are probably too new; refer to the working environment listed below.
1. P-Tuning fine-tuning of ChatGLM2-6B
ChatGLM2-6B repo: https://github.com/THUDM/ChatGLM2-6B
Model weights: https://huggingface.co/THUDM/chatglm2-6b
The detailed steps are the same as in: "ChatGLM-6B P-Tuning fine-tuning: detailed steps and result verification".
Note: with the environment given in the official ChatGLM2-6B repo (Python 3.8.10/3.10.6 + torch 2.0.1 + transformers 4.30.2), P-Tuning fine-tuning fails with:
AttributeError: 'Seq2SeqTrainer' object has no attribute 'is_deepspeed_enabled'
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
This is most likely because the transformers version is too high. Using the ChatGLM-6B environment (see the ChatGLM-6B deployment tutorial) works, i.e.:
Python 3.8.10
CUDA Version: 12.0
torch 2.0.1
transformers 4.27.1
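As a quick sanity check before launching the P-Tuning scripts, a minimal sketch like the following can confirm the environment matches the combination that worked here (the pinned versions above are the ones reported to work; the check itself is only illustrative):

# Minimal environment sanity check (illustrative).
import sys
import torch
import transformers

print("python      :", sys.version.split()[0])        # expect 3.8.10
print("torch       :", torch.__version__)             # expect 2.0.1
print("transformers:", transformers.__version__)      # expect 4.27.1
print("cuda available:", torch.cuda.is_available())

assert transformers.__version__.startswith("4.27"), (
    "transformers >= 4.30 triggered the 'is_deepspeed_enabled' "
    "AttributeError with the bundled Seq2SeqTrainer"
)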
2. Fine-tuning setup
2.1 Dataset & hyperparameters
Training set: 412 samples
Validation set: 83 samples
max_source_length: 3500
max_target_length: 180
Question length: see the sketch below for measuring it against max_source_length.
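Since max_source_length was raised to 3500, it is worth checking the tokenized lengths of your own data. A minimal sketch, assuming the ADGEN-style JSON-lines format used by the P-Tuning example (fields "content"/"summary"); the file name train.json and the model path ../THUDM-model are placeholders matching the prediction snippet later in the post:

# Sketch: check tokenized prompt/response lengths against
# max_source_length / max_target_length. Adjust field names and paths.
import json
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("../THUDM-model", trust_remote_code=True)

src_lens, tgt_lens = [], []
with open("train.json", encoding="utf-8") as f:
    for line in f:
        sample = json.loads(line)
        src_lens.append(len(tokenizer.encode(sample["content"])))
        tgt_lens.append(len(tokenizer.encode(sample["summary"])))

print(f"source tokens: max={max(src_lens)}, mean={sum(src_lens)/len(src_lens):.0f}")
print(f"target tokens: max={max(tgt_lens)}, mean={sum(tgt_lens)/len(tgt_lens):.0f}")
# Anything above max_source_length=3500 / max_target_length=180 is truncated.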
2.2 Model training
Hardware: 2× A100 80 GB at roughly 16% GPU utilization; 3000 training steps took about 21 h.
2.3 Model prediction
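For context on what 3000 steps means for a 412-sample training set, the rough arithmetic below assumes the P-Tuning example script's default batch settings (per_device_train_batch_size=1, gradient_accumulation_steps=16) on 2 GPUs; these batch settings are an assumption, not stated in the post:

# Rough epoch count implied by 3000 optimizer steps on 412 samples.
train_samples = 412
per_device_batch_size = 1      # assumed default
gradient_accumulation = 16     # assumed default
num_gpus = 2

samples_per_step = per_device_batch_size * gradient_accumulation * num_gpus  # 32
steps = 3000
print(f"~{steps * samples_per_step / train_samples:.0f} epochs")  # ~233 epochs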
from transformers import AutoConfig, AutoModel, AutoTokenizer
import torch
import os

# "../THUDM-model" is the local directory holding the chatglm2-6b weights
# (adjust to your own path).
tokenizer = AutoTokenizer.from_pretrained("../THUDM-model", trust_remote_code=True)
CHECKPOINT_PATH = './output/adgen-chatglm2-6b-pt-64-2e-2/checkpoint-3000'
PRE_SEQ_LEN = 64

config = AutoConfig.from_pretrained("../THUDM-model", trust_remote_code=True, pre_seq_len=PRE_SEQ_LEN)
model = AutoModel.from_pretrained("../THUDM-model", config=config, trust_remote_code=True).half().cuda()

# Load only the prefix-encoder weights from the P-Tuning checkpoint.
prefix_state_dict = torch.load(os.path.join(CHECKPOINT_PATH, "pytorch_model.bin"))
new_prefix_state_dict = {}
for k, v in prefix_state_dict.items():
    if k.startswith("transformer.prefix_encoder."):
        new_prefix_state_dict[k[len("transformer.prefix_encoder."):]] = v
model.transformer.prefix_encoder.load_state_dict(new_prefix_state_dict)

# If the checkpoint came from full-parameter fine-tuning instead of P-Tuning,
# load it directly:
# model = AutoModel.from_pretrained(CHECKPOINT_PATH, trust_remote_code=True)
model = model.eval()

response, history = model.chat(tokenizer, "你好", history=[])
print(response)
Following the P-Tuning README, note:
(1) pre_seq_len may need to be changed to the value actually used during training.
(2) If loading the model from a local directory, change THUDM/chatglm2-6b to the local model path (note: not the checkpoint path); CHECKPOINT_PATH also needs to be updated.
(3) If you hit RuntimeError: "addmm_impl_cpu_" not implemented for 'Half', append .half().cuda() to model = AutoModel.from_pretrained("../THUDM-model", config=config, trust_remote_code=True).
495 samples, no history, single-turn: model loading plus prediction took about 3 minutes.
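That 495-sample single-turn evaluation can be scripted in the same way as the single-prompt call. A minimal sketch, reusing the model and tokenizer loaded above; the file name test.json and the "content" field are placeholders:

# Sketch: single-turn batch prediction over a JSON-lines test file.
import json, time

start = time.time()
predictions = []
with open("test.json", encoding="utf-8") as f:
    for line in f:
        prompt = json.loads(line)["content"]
        response, _ = model.chat(tokenizer, prompt, history=[])  # no history, single turn
        predictions.append(response)

print(f"{len(predictions)} samples in {(time.time() - start) / 60:.1f} min")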
到了這里,關于手把手帶你實現(xiàn)ChatGLM2-6B的P-Tuning微調(diào)的文章就介紹完了。如果您還想了解更多內(nèi)容,請在右上角搜索TOY模板網(wǎng)以前的文章或繼續(xù)瀏覽下面的相關文章,希望大家以后多多支持TOY模板網(wǎng)!