When calling the Qwen/Qwen-1_8B-Chat model through LangChain, the conversation fails mid-chat with this error:
ERROR: object of type 'NoneType' has no len()
Traceback (most recent call last):
  File "/root/anaconda3/envs/chatchat/lib/python3.10/site-packages/langchain/chains/base.py", line 385, in acall
    raise e
  File "/root/anaconda3/envs/chatchat/lib/python3.10/site-packages/langchain/chains/base.py", line 379, in acall
    await self._acall(inputs, run_manager=run_manager)
  File "/root/anaconda3/envs/chatchat/lib/python3.10/site-packages/langchain/chains/llm.py", line 275, in _acall
    response = await self.agenerate([inputs], run_manager=run_manager)
  File "/root/anaconda3/envs/chatchat/lib/python3.10/site-packages/langchain/chains/llm.py", line 142, in agenerate
    return await self.llm.agenerate_prompt(
  File "/root/anaconda3/envs/chatchat/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 506, in agenerate_prompt
    return await self.agenerate(
  File "/root/anaconda3/envs/chatchat/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 466, in agenerate
    raise exceptions[0]
  File "/root/anaconda3/envs/chatchat/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 569, in _agenerate_with_cache
    return await self._agenerate(
  File "/root/anaconda3/envs/chatchat/lib/python3.10/site-packages/langchain_community/chat_models/openai.py", line 519, in _agenerate
    return await agenerate_from_stream(stream_iter)
  File "/root/anaconda3/envs/chatchat/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 85, in agenerate_from_stream
    async for chunk in stream:
  File "/root/anaconda3/envs/chatchat/lib/python3.10/site-packages/langchain_community/chat_models/openai.py", line 490, in _astream
    if len(chunk["choices"]) == 0:
TypeError: object of type 'NoneType' has no len()
Puzzlingly, every other LLM I tried ran fine; only Qwen failed. I searched around and found plenty of conflicting advice, none of which solved it.
So back to the traceback. The last frame points at File "/root/anaconda3/envs/chatchat/lib/python3.10/site-packages/langchain_community/chat_models/openai.py", line 490, so let's open that file and look at the source around line 490:
if not isinstance(chunk, dict):
    chunk = chunk.dict()
if len(chunk["choices"]) == 0:
    continue
choice = chunk["choices"][0]
So the error is evidently that this chunk carries no choices.
Let's print the chunk and see what it actually contains, by temporarily modifying the file to:
if not isinstance(chunk, dict):
    chunk = chunk.dict()
print(f'chunk:{chunk}')
if len(chunk["choices"]) == 0:
    continue
choice = chunk["choices"][0]
Running again, the chunk prints as:
chunk:{'id': None, 'choices': None, 'created': None, 'model': None, 'object': None, 'system_fingerprint': None, 'text': '**NETWORK ERROR DUE TO HIGH TRAFFIC. PLEASE REGENERATE OR REFRESH THIS PAGE.**\n\n(FlashAttention only supports Ampere GPUs or newer.)', 'error_code': 50001}
At last, the real error message: NETWORK ERROR DUE TO HIGH TRAFFIC. PLEASE REGENERATE OR REFRESH THIS PAGE: FlashAttention only supports Ampere GPUs or newer.
So the actual problem lies in flash-attention.
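Incidentally, the crash itself could have been avoided: the streaming loop calls len() on choices without checking for None, even though some OpenAI-compatible backends report failures by returning choices=None along with an error payload. A minimal sketch of a more defensive guard; the helper name and the RuntimeError wrapping are my own, not LangChain's API:

```python
def safe_choices(chunk: dict) -> list:
    # Some OpenAI-compatible backends signal failure with choices=None
    # plus 'text'/'error_code' fields; surface that instead of crashing
    # with "object of type 'NoneType' has no len()".
    if chunk.get("choices") is None:
        raise RuntimeError(
            f"upstream error {chunk.get('error_code')}: {chunk.get('text')}"
        )
    return chunk["choices"]

# The error chunk printed above would now raise a readable exception:
bad = {"choices": None,
       "text": "FlashAttention only supports Ampere GPUs or newer.",
       "error_code": 50001}
try:
    safe_choices(bad)
except RuntimeError as e:
    print(e)  # upstream error 50001: FlashAttention only supports Ampere GPUs or newer.
```

With a guard like this, the server-side error message would have surfaced immediately instead of hiding behind a TypeError.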
Checking the installation notes for Qwen (Tongyi Qianwen) on Hugging Face:
Dependency
To run Qwen-1.8B-Chat, make sure the requirements above are met, then install the dependency libraries with:
pip install transformers==4.32.0 accelerate tiktoken einops scipy transformers_stream_generator==0.0.4 peft deepspeed
In addition, installing the flash-attention library (flash attention 2 is now supported) is recommended for higher efficiency and lower memory usage.
git clone https://github.com/Dao-AILab/flash-attention
cd flash-attention && pip install .
# The installs below are optional and may be slow.
# pip install csrc/layer_norm
# pip install csrc/rotary
Per the documentation, flash-attention was installed correctly, so the problem isn't the installation itself.
An issue on the QwenLM repo suggests uninstalling flash-attn: https://github.com/QwenLM/Qwen/issues/438
And a Hugging Face discussion explains the situation: https://huggingface.co/Qwen/Qwen-7B-Chat/discussions/37
flash attention is an optional accelerator for model training and inference, and it only works on NVIDIA GPUs with the Turing, Ampere, Ada, or Hopper architectures (e.g. H100, A100, RTX 3090, T4, RTX 2080); the model runs inference normally without flash attention installed.
Checking my own GPU against that list made everything clear: my GPU simply isn't supported by flash attention!
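For reference, whether a card counts as "Ampere or newer" comes down to its CUDA compute capability: Ampere starts at 8.0 (on a live machine you could query torch.cuda.get_device_capability()). A small sketch of that check; the helper name and the example capabilities in the comments are my own additions:

```python
def supports_flash_attention_2(major: int, minor: int) -> bool:
    # FlashAttention 2 requires NVIDIA compute capability >= 8.0 (Ampere or newer);
    # this is what triggers "FlashAttention only supports Ampere GPUs or newer".
    return (major, minor) >= (8, 0)

# Example compute capabilities: A100 is 8.0, RTX 3090 is 8.6, T4 and RTX 2080 are 7.5.
print(supports_flash_attention_2(8, 6))  # True  (RTX 3090)
print(supports_flash_attention_2(7, 5))  # False (T4 / RTX 2080, pre-Ampere)
```

A pre-Ampere card like mine fails this check, which is exactly why the server-side error appeared.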
So the solution is simply to uninstall it (note the package name is flash-attn):
pip uninstall flash-attn
That wraps up this troubleshooting note on the Qwen (Tongyi Qianwen) runtime error "FlashAttention only supports Ampere GPUs or newer".