Enabled CUDA acceleration on a T4 server:

CMAKE_ARGS="-DGGML_CUDA=ON" pip install -U chatglm-cpp
CMAKE_ARGS="-DGGML_CUDA=ON" pip install 'chatglm-cpp[api]'

Started the quantized model:

MODEL=/home/ops/chatglm/chatglm.cpp/models/chatglm3-q8-0-ggml.bin uvicorn chatglm_cpp.openai_api:app --host xx.xx.xx.xx --port 8000

The application does not use any GPU resources after startup. What could be the cause, and is there a good way to fix it?

uvicorn chatglm_cpp.openai_api:app --host xx.xx.xx.xx --port 8000
INFO:     Started server process [6821]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://xx.xx.xx.xx:8000 (Press CTRL+C to quit)

One additional note: running the model with the following command does use GPU resources:

./build/bin/main -m /home/ops/chatglm/chatglm.cpp/models/chatglm3-q8-0-ggml.bin -i --top_p 0.8 --temp 0.8
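One way to check whether the server is actually doing GPU work is to send it a request and watch nvidia-smi while it generates. The sketch below builds an OpenAI-style chat completion request for the server started above; the endpoint path /v1/chat/completions is an assumption based on the OpenAI-compatible API that chatglm_cpp.openai_api exposes, and the host is the placeholder from the logs.

```python
import json

# Minimal sketch: an OpenAI-style chat completion request for the
# chatglm_cpp.openai_api server. The endpoint path is assumed from the
# OpenAI-compatible API; the host/port placeholder comes from the issue.
url = "http://xx.xx.xx.xx:8000/v1/chat/completions"
payload = {
    "messages": [{"role": "user", "content": "你好"}],
    "temperature": 0.8,
    "top_p": 0.8,
}
body = json.dumps(payload, ensure_ascii=False)

# To actually send it (requires the `requests` package and a running server):
# import requests
# resp = requests.post(url, data=body.encode("utf-8"),
#                      headers={"Content-Type": "application/json"})
# print(resp.json())
print(body)
```

If nvidia-smi shows no memory allocated or utilization while a completion is being generated, the installed Python wheel was most likely built without CUDA support.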
Same situation here.