Create the ChatGLM3 demo with the officially provided script:
cd basic_demo
python web_demo_gradio.py
The output is abnormal:
====conversation====
[{'role': 'user', 'content': '你好'}, {'role': 'assistant', 'content': '你好,有什么我可以幫助你的嗎?\n\n<|im_end|>'}, {'role': 'user', 'content': '你好'}]
No chat template is defined for this tokenizer - using a default chat template that implements the ChatML format (without BOS/EOS tokens!). If the default is not appropriate for your model, please set `tokenizer.chat_template` to an appropriate template. See https://huggingface.co/docs/transformers/main/chat_templating for more information.
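Cause analysis:
- The model version does not match the code: the tokenizer_config.json configuration file has no prompt template.
- The official code has a problem: it does not yet support calling the apply_chat_template method on a local model.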
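One way to confirm the first cause is to inspect the local model directory directly. The sketch below is a minimal check, assuming the template, when present, is stored under a `chat_template` key in tokenizer_config.json; the helper name `has_chat_template` is hypothetical and not part of any library.

```python
import json
from pathlib import Path


def has_chat_template(model_dir):
    """Return True if tokenizer_config.json in model_dir has a non-empty chat_template.

    Assumption: the template is stored under the "chat_template" key.
    """
    config_path = Path(model_dir) / "tokenizer_config.json"
    if not config_path.is_file():
        # No config file at all, so there cannot be a template.
        return False
    with config_path.open(encoding="utf-8") as f:
        config = json.load(f)
    # A missing key, None, or an empty string all count as "no template".
    return bool(config.get("chat_template"))
```

Calling `has_chat_template()` with the path of the local model directory returns False when the file or the key is missing, which would match the warning shown above.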
Solution: change how the input is tokenized. Do not use the apply_chat_template method; for single-turn conversations, use the build_chat_input method instead.
def predict(history, max_length, top_p, temperature):
    stop = StopOnTokens()
    # Convert the Gradio history (user/model pairs) into role-tagged messages.
    messages = []
    for idx, (user_msg, model_msg) in enumerate(history):
        if idx == len(history) - 1 and not model_msg:
            messages.append({"role": "user", "content": user_msg})
            break
        if user_msg:
            messages.append({"role": "user", "content": user_msg})
        if model_msg:
            messages.append({"role": "assistant", "content": model_msg})
    print("\n\n====conversation====\n", messages)
    # Old: apply_chat_template, which triggers the warning above (kept only for comparison).
    model_inputs = tokenizer.apply_chat_template(messages,
                                                 add_generation_prompt=True,
                                                 tokenize=True,
                                                 return_tensors="pt").to(next(model.parameters()).device)
    print('debug: old: model_inputs: {}'.format(model_inputs))
    # New: build_chat_input with only the latest user message (single turn, no history).
    model_inputs = tokenizer.build_chat_input(messages[-1]['content'], history=None, role="user").input_ids.to(model.device)
    print('debug: new: model_inputs: {}'.format(model_inputs))
    streamer = TextIteratorStreamer(tokenizer, timeout=60, skip_prompt=True, skip_special_tokens=True)
    generate_kwargs = {
        "input_ids": model_inputs,
        "streamer": streamer,
        "max_new_tokens": max_length,
        "do_sample": True,
        "top_p": top_p,
        "temperature": temperature,
        "stopping_criteria": StoppingCriteriaList([stop]),
        "repetition_penalty": 1.2,
    }
    # Generate in a background thread and stream tokens back to the UI.
    t = Thread(target=model.generate, kwargs=generate_kwargs)
    t.start()
    for new_token in streamer:
        if new_token != '':
            history[-1][1] += new_token
            yield history
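The call above passes history=None, so earlier turns are dropped. If multi-turn context is needed, one possible approach is to hand the earlier turns to build_chat_input through its history argument. The sketch below only prepares the two arguments. It assumes that history is accepted as a list of role/content dicts, which should be verified against the tokenization code shipped with your model; the helper name `split_query_and_history` is hypothetical.

```python
def split_query_and_history(messages):
    """Split role-tagged messages into (latest user query, earlier turns).

    Assumption: the earlier turns can be passed on as a list of
    {"role": ..., "content": ...} dicts.
    """
    if not messages or messages[-1]["role"] != "user":
        raise ValueError("the last message must come from the user")
    query = messages[-1]["content"]
    # Copy each earlier message so the caller's list is not modified later.
    history = [dict(m) for m in messages[:-1]]
    return query, history
```

With the conversation from the log above, `query` would be the final '你好' and `history` the first two messages, which could then be tried as `tokenizer.build_chat_input(query, history=history, role="user")`.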
About tokenizer.chat_template
Once you set the tokenizer.chat_template attribute, the next call to apply_chat_template() uses your new template. The attribute is saved in the tokenizer_config.json file, so you can use push_to_hub() to upload the new template to the Hub and make sure everyone uses the right template for your model.
If a model has no chat template set but its model class has a default template, the ConversationalPipeline class and methods such as apply_chat_template use the class template instead. You can find out what your tokenizer's default template is by checking the tokenizer.default_chat_template attribute.
# The same predict function, with debug prints for both template attributes.
def predict(history, max_length, top_p, temperature):
    stop = StopOnTokens()
    messages = []
    for idx, (user_msg, model_msg) in enumerate(history):
        if idx == len(history) - 1 and not model_msg:
            messages.append({"role": "user", "content": user_msg})
            break
        if user_msg:
            messages.append({"role": "user", "content": user_msg})
        if model_msg:
            messages.append({"role": "assistant", "content": model_msg})
    print("\n\n====conversation====\n", messages)
    # Show the template set on the tokenizer and the class-level default.
    print('debug: tokenizer.chat_template:\n{}'.format(tokenizer.chat_template))
    print('debug: tokenizer.default_chat_template:\n{}'.format(tokenizer.default_chat_template))
    model_inputs = tokenizer.apply_chat_template(messages,
                                                 add_generation_prompt=True,
                                                 tokenize=True,
                                                 return_tensors="pt").to(next(model.parameters()).device)
    streamer = TextIteratorStreamer(tokenizer, timeout=600, skip_prompt=True, skip_special_tokens=True)
    generate_kwargs = {
        "input_ids": model_inputs,
        "streamer": streamer,
        "max_new_tokens": max_length,
        "do_sample": True,
        "top_p": top_p,
        "temperature": temperature,
        "stopping_criteria": StoppingCriteriaList([stop]),
        "repetition_penalty": 1.2,
    }
    t = Thread(target=model.generate, kwargs=generate_kwargs)
    t.start()
    for new_token in streamer:
        if new_token != '':
            history[-1][1] += new_token
            yield history
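For reference, the ChatML layout named in the warning can be approximated in plain Python. This is only a sketch of the format, not the Jinja template that transformers actually ships, and the details may differ between versions; the function name `render_chatml` is hypothetical.

```python
def render_chatml(messages, add_generation_prompt=True):
    """Render role-tagged messages in a ChatML-style layout (approximation)."""
    parts = []
    for m in messages:
        # Each turn is wrapped in <|im_start|>role ... <|im_end|> markers.
        parts.append("<|im_start|>" + m["role"] + "\n" + m["content"] + "<|im_end|>\n")
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)
```

Printing `render_chatml(messages)` for the conversation above shows the `<|im_start|>` and `<|im_end|>` markers wrapped around every turn. If ChatGLM3 was not trained with this layout, that would plausibly explain the stray `<|im_end|>` in the model's reply.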