About the ollama project
https://github.com/ollama/ollama/tree/main/docs
https://www.bilibili.com/video/BV1oS421w7aM/
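[ollama] (2): Set up the environment on Linux, build the ollama source code, and test the qwen large model; it runs very fast locally and is essentially a wrapper around the llama.cpp project
1. Download the code and submodules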
git clone --recurse-submodules https://github.com/ollama/ollama.git
Cloning into 'ollama'...
remote: Enumerating objects: 11260, done.
remote: Counting objects: 100% (494/494), done.
remote: Compressing objects: 100% (263/263), done.
remote: Total 11260 (delta 265), reused 357 (delta 190), pack-reused 10766
Receiving objects: 100% (11260/11260), 6.92 MiB | 57.00 KiB/s, done.
Resolving deltas: 100% (6984/6984), done.
Submodule 'llama.cpp' (https://github.com/ggerganov/llama.cpp.git) registered for path 'llm/llama.cpp'
Cloning into '/data/home/test/go/src/ollama/llm/llama.cpp'...
fatal: unable to access 'https://github.com/ggerganov/llama.cpp.git/': GnuTLS recv error (-110): The TLS connection was non-properly terminated.
If the submodule clone fails like this, rerun the submodule update:
$ git submodule update
Cloning into '/data/home/test/go/src/ollama/llm/llama.cpp'...
remote: Enumerating objects: 12802, done.
remote: Counting objects: 100% (12802/12802), done.
remote: Compressing objects: 100% (3561/3561), done.
remote: Total 12483 (delta 9258), reused 12045 (delta 8826), pack-reused 0
Receiving objects: 100% (12483/12483), 10.19 MiB | 679.00 KiB/s, done.
Resolving deltas: 100% (9258/9258), completed with 260 local objects.
From https://github.com/ggerganov/llama.cpp
* branch c2101a2e909ac7c08976d414e64e96c90ee5fa9e -> FETCH_HEAD
Submodule path 'llm/llama.cpp': checked out 'c2101a2e909ac7c08976d414e64e96c90ee5fa9e'
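Because the failure above is a transient network error, the rerun can be automated. The following is a minimal sketch, not part of the ollama tooling: the helper name run_with_retries and its parameters are made up for illustration, and it simply repeats the same git command until it exits with status 0.

```python
import subprocess
import time


def run_with_retries(cmd, attempts=3, delay_seconds=5.0, runner=subprocess.run):
    """Run cmd until it exits with status 0 or the attempts are used up.

    Returns the number of the attempt that succeeded; raises RuntimeError
    if every attempt fails. run_with_retries is a hypothetical helper.
    """
    for attempt in range(1, attempts + 1):
        result = runner(cmd)
        if result.returncode == 0:
            return attempt
        if attempt < attempts:
            time.sleep(delay_seconds)  # wait before retrying a flaky connection
    raise RuntimeError(f"{cmd!r} failed after {attempts} attempts")


# Example, run from the root of the ollama checkout:
# run_with_retries(["git", "submodule", "update"])
```

The runner parameter exists only so the retry logic can be exercised without touching the network.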
2. Now the project can be built
go build
llm/payload_linux.go:7:12: pattern llama.cpp/build/linux/*/*/lib/*: no matching files found
The build fails because the llama.cpp code has to be compiled first:
test@thinkPadE15:~/go/src/ollama$ cd llm/generate/
test@thinkPadE15:~/go/src/ollama/llm/generate$ ls
gen_common.sh gen_darwin.sh generate_darwin.go generate_linux.go generate_windows.go gen_linux.sh gen_windows.ps1
test@thinkPadE15:~/go/src/ollama/llm/generate$ bash gen_linux.sh
- Build files have been written to: /data/home/test/go/src/ollama/llm/llama.cpp/build/linux/x86_64/cpu
+ cmake --build ../llama.cpp/build/linux/x86_64/cpu --target ext_server -j8
[ 6%] Generating build details from Git
[ 20%] Building C object CMakeFiles/ggml.dir/ggml-backend.c.o
[ 20%] Building C object CMakeFiles/ggml.dir/ggml.c.o
[ 20%] Building C object CMakeFiles/ggml.dir/ggml-quants.c.o
[ 26%] Building C object CMakeFiles/ggml.dir/ggml-alloc.c.o
-- Found Git: /usr/bin/git (found version "2.34.1")
[ 33%] Building CXX object common/CMakeFiles/build_info.dir/build-info.cpp.o
[ 33%] Built target build_info
After a lengthy compile, the output is generated in the build folder, which now contains the relevant lib files.
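Before rerunning go build, it can help to confirm that the files the embed pattern expects are really there. This sketch takes the pattern from the go build error message and resolves it relative to the llm directory, since the error was reported for llm/payload_linux.go. The function name find_embedded_libs is made up for illustration.

```python
import glob
import os

# Pattern copied from the go build error message; it is relative to llm/.
EMBED_PATTERN = os.path.join("llama.cpp", "build", "linux", "*", "*", "lib", "*")


def find_embedded_libs(llm_dir):
    """Return the sorted paths under llm_dir that match the embed pattern."""
    return sorted(glob.glob(os.path.join(llm_dir, EMBED_PATTERN)))


if __name__ == "__main__":
    # Run from the root of the ollama checkout.
    libs = find_embedded_libs("llm")
    if libs:
        print("\n".join(libs))
    else:
        print("no matching files yet; run gen_linux.sh first")
```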
3. Run go build again and it succeeds
go build
ls -lha ollama
-rwxrwxr-x 1 test test 34M 3月 9 22:13 ollama
There are a few warnings, but they are harmless:
gpu_info.h:33:23: note: in definition of macro ‘LOG’
33 | fprintf(stderr, __VA_ARGS__); \
| ^~~~~~~~~~~
gpu_info_cuda.c:159:41: note: format string is defined here
159 | LOG(h.verbose, "[%d] CUDA usedMem %ld\n", i, memInfo.used);
| ~~^
| |
| long int
| %lld
Run the qwen model:
https://ollama.com/library/qwen
./ollama run qwen:0.5b
pulling manifest
pulling fad2a06e4cc7... 25% ▕█████████████ ▏ 96 MB/394 MB 368 KB/s 13m28s
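The framework is essentially built on the llama.cpp project.
Responses came back very quickly in testing (the session was conducted in Chinese and is translated here):
>>> Hello
Hello! Is there anything I can help you with?
>>> Recommend some sights in Beijing
Beijing has many places worth visiting. Here are some recommended sights in Beijing:
1. The Forbidden City: a World Cultural Heritage site with rich historical, cultural, and artistic value.
2. The Great Wall: a World Natural Heritage site with magnificent mountain and river scenery.
3. Tiananmen Square in Beijing: a national symbol of China with spectacular architecture.
Those are some recommended sights in Beijing. I hope this helps!
>>> test who are you
I am Tongyi Qianwen, a large-scale language model developed by Alibaba Cloud. If you have any questions or need help, please let me know at any time, and I will do my best to provide support and answers.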
4. Check where the model files are stored
- macOS: `~/.ollama/models`
- Linux: `/usr/share/ollama/.ollama/models`
- Windows: `C:\Users\<username>\.ollama\models`
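To see which of these locations is in use on a given machine, the candidates can simply be probed. The sketch below rests on two assumptions to verify against your ollama version: that the OLLAMA_MODELS environment variable overrides the location, and that a binary started by your own user (as in this article) stores models under ~/.ollama/models rather than the /usr/share/ollama path. The function names are made up for illustration.

```python
import os


def candidate_model_dirs(home, env):
    """List the directories to check, most specific first."""
    candidates = []
    override = env.get("OLLAMA_MODELS")  # assumed override variable
    if override:
        candidates.append(override)
    # Per-user location (macOS and Windows in the list above; assumed for
    # a self-built Linux binary run by a normal user).
    candidates.append(os.path.join(home, ".ollama", "models"))
    # Linux location from the list above.
    candidates.append("/usr/share/ollama/.ollama/models")
    return candidates


def existing_model_dirs(home=None, env=None):
    """Return only the candidate directories that exist on this machine."""
    home = os.path.expanduser("~") if home is None else home
    env = os.environ if env is None else env
    return [d for d in candidate_model_dirs(home, env) if os.path.isdir(d)]


if __name__ == "__main__":
    for path in existing_model_dirs():
        print(path)
```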
5. Test over HTTP: /api/chat is ollama's native endpoint, and /v1/chat/completions follows the OpenAI API standard
curl http://localhost:11434/api/chat -d '{
"model": "qwen:7b",
"messages": [
{
"role": "user",
"content": "你好"
}
]
}'
curl http://localhost:11434/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "qwen:7b","stream":true,
"messages": [
{
"role": "user",
"content": "你好"
}
]
}'
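The second curl request can also be issued from Python using only the standard library. This is a minimal sketch that assumes the server is listening on localhost:11434 as in the curl commands and that a non-streaming reply has the usual OpenAI shape (choices[0].message.content). The function names are made up for illustration.

```python
import json
import urllib.request

BASE_URL = "http://localhost:11434"  # same address as the curl examples


def build_chat_request(model, user_text, stream=False, base_url=BASE_URL):
    """Build a POST request for the OpenAI-compatible chat endpoint."""
    payload = {
        "model": model,
        "stream": stream,
        "messages": [{"role": "user", "content": user_text}],
    }
    return urllib.request.Request(
        base_url + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat_once(model, user_text):
    """Send a non-streaming request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(model, user_text)) as resp:
        body = json.load(resp)
    # Assumed response shape: the standard OpenAI chat completion object.
    return body["choices"][0]["message"]["content"]


# Example, with the ollama server running and the model already pulled:
# print(chat_once("qwen:7b", "你好"))
```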
Streaming responses are supported:
data: {"id":"chatcmpl-263","object":"chat.completion.chunk","created":1710001611,"model":"qwen:7b","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"你好"},"finish_reason":null}]}
data: {"id":"chatcmpl-263","object":"chat.completion.chunk","created":1710001611,"model":"qwen:7b","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"!"},"finish_reason":null}]}
data: {"id":"chatcmpl-263","object":"chat.completion.chunk","created":1710001612,"model":"qwen:7b","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"有什么"},"finish_reason":null}]}
data: {"id":"chatcmpl-263","object":"chat.completion.chunk","created":1710001612,"model":"qwen:7b","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"问题"},"finish_reason":null}]}
data: {"id":"chatcmpl-263","object":"chat.completion.chunk","created":1710001612,"model":"qwen:7b","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"或者"},"finish_reason":null}]}
data: {"id":"chatcmpl-263","object":"chat.completion.chunk","created":1710001612,"model":"qwen:7b","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"需要"},"finish_reason":null}]}
data: {"id":"chatcmpl-263","object":"chat.completion.chunk","created":1710001612,"model":"qwen:7b","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"帮助"},"finish_reason":null}]}
data: {"id":"chatcmpl-263","object":"chat.completion.chunk","created":1710001613,"model":"qwen:7b","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"吗"},"finish_reason":null}]}
data: {"id":"chatcmpl-263","object":"chat.completion.chunk","created":1710001613,"model":"qwen:7b","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"?"},"finish_reason":null}]}
data: {"id":"chatcmpl-263","object":"chat.completion.chunk","created":1710001613,"model":"qwen:7b","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"我"},"finish_reason":null}]}
data: {"id":"chatcmpl-263","object":"chat.completion.chunk","created":1710001613,"model":"qwen:7b","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"在这里"},"finish_reason":null}]}
data: {"id":"chatcmpl-263","object":"chat.completion.chunk","created":1710001613,"model":"qwen:7b","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"。"},"finish_reason":null}]}
data: {"id":"chatcmpl-263","object":"chat.completion.chunk","created":1710001614,"model":"qwen:7b","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":"\n"},"finish_reason":null}]}
data: {"id":"chatcmpl-263","object":"chat.completion.chunk","created":1710001614,"model":"qwen:7b","system_fingerprint":"fp_ollama","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":"stop"}]}
data: [DONE]
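Each data: line in the stream carries one JSON chunk, and the reply is the concatenation of the delta.content fields up to the [DONE] marker. A minimal sketch of that reassembly follows; it works on lines that have already been read and decoded, and the function name collect_stream_text is made up for illustration.

```python
import json


def collect_stream_text(lines):
    """Join the delta contents of an OpenAI-style event stream into one string."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and anything that is not a data event
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream marker, as in the output above
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            # The final chunk has empty content and finish_reason "stop".
            parts.append(choice.get("delta", {}).get("content") or "")
    return "".join(parts)
```

Feeding it the lines of the stream shown above should reproduce the full reply text in order.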