Installing the right CUDA version on Windows
Check the CUDA version on Windows
nvidia-smi
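The version check can also be scripted. The following is a minimal sketch, assuming the header printed by nvidia-smi contains a field of the form "CUDA Version: X.Y" (this field reports the highest CUDA version the installed driver supports). The function names and the REQUIRED constant are illustrative and not part of any NVIDIA or vLLM tooling.

```python
import re
import subprocess

# Minimum CUDA version this guide states for the vllm/vllm-openai:latest image.
REQUIRED = (12, 8)


def parse_cuda_version(smi_output):
    """Return (major, minor) from the 'CUDA Version: X.Y' field, or None if absent."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def cuda_is_sufficient(smi_output, required=REQUIRED):
    """True when the reported CUDA version is at least `required`."""
    version = parse_cuda_version(smi_output)
    return version is not None and version >= required


def read_nvidia_smi():
    """Run nvidia-smi and return its text output (requires the NVIDIA driver)."""
    result = subprocess.run(
        ["nvidia-smi"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

On a machine with the driver installed, cuda_is_sufficient(read_nvidia_smi()) returns False when the driver needs to be upgraded before the image is pulled.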
Pull the vLLM image. This version requires CUDA 12.8 or later.
docker pull vllm/vllm-openai:latest
Download the model
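git lfs install
cd e:\ai
mkdir vllm\models\qwen2
cd vllm\models
# Option 1: download with git
git clone https://www.modelscope.cn/qwen/qwen2-0.5b.git Qwen2-0.5B
# Option 2: download with the ModelScope SDK (the two lines after pip install are Python)
pip install modelscope
from modelscope import snapshot_download
# Raw string, so that \a and \v in the Windows path are not read as escape sequences
model_dir = snapshot_download('qwen/qwen2-0.5b', local_dir=r'e:\ai\vllm\models\qwen2')
# Option 3: download with the modelscope command line
conda create --name vllm python=3.10 -y
conda activate vllm
pip install modelscope
modelscope download --model qwen/qwen2-0.5b --local_dir e:\ai\vllm\models\qwen2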
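Depending on the download method, the model files may land directly in the target folder or in a nested subfolder, and the --model argument passed to vLLM has to point at the folder that actually holds the weights. The sketch below is one way to locate it, assuming the model folder is the one containing config.json (as Hugging Face-style checkpoints such as Qwen2 do). Both helper names are hypothetical, and "/models" is the container mount point used in this guide.

```python
from pathlib import Path


def find_model_dirs(root):
    """Return every directory under `root` that directly contains a config.json."""
    return sorted(path.parent for path in Path(root).rglob("config.json"))


def to_container_path(model_dir, host_root, mount_point="/models"):
    """Map a host directory under `host_root` to its path inside the container.

    Assumes `host_root` is the folder mounted at `mount_point`.
    """
    relative = Path(model_dir).relative_to(host_root)
    base = mount_point.rstrip("/")
    if not relative.parts:
        return base
    return "/".join([base, *relative.parts])
```

For example, calling find_model_dirs on the host models folder and passing each result to to_container_path yields candidate values for --model.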
Download result
Run vLLM
services:
  vllm:
    container_name: vllm
    restart: "no"
    image: vllm/vllm-openai:latest
    runtime: nvidia
    ipc: host
    #environment:
    #  - HF_HUB_OFFLINE=1
    #  - CUDA_VISIBLE_DEVICES=0
    volumes:
      - E:\ai\vllm\models\Qwen2:/models
    command: ["--model", "/models/Qwen/qwen2-0___5b", "--served_model_name", "qen2", "--gpu_memory_utilization", "0.90", "--max_model_len", "1024", "--tensor-parallel-size", "1"]
    ports:
      - 8000:8000
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              capabilities: [ gpu ]
              count: all
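When vLLM starts, it reports the CUDA version it requires. After running it, check the CUDA version on the machine.
If the installed version is too old, CUDA can be upgraded.
CUDA download page: CUDA Toolkit Archive | NVIDIA Developer
To upgrade, choose the Custom option in the installer and install all components.
Start vLLM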
cd E:\project\vllm-main
docker-compose up -d
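Once the container is up, the server can be exercised through its OpenAI-compatible HTTP API. The following is a minimal sketch using only the Python standard library, assuming the service is reachable at http://localhost:8000 (the port published in the compose file) and exposes the /v1/completions endpoint. The helper names are illustrative, and the model argument must match the value given to --served_model_name.

```python
import json
import urllib.request

# Assumption: port 8000 of the container is published on the local machine.
BASE_URL = "http://localhost:8000"


def build_completion_request(model, prompt, max_tokens=32, base_url=BASE_URL):
    """Build a POST request for the OpenAI-compatible completions endpoint."""
    payload = {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(model, prompt, max_tokens=32, base_url=BASE_URL, timeout=60):
    """Send the request and return the generated text of the first choice."""
    request = build_completion_request(model, prompt, max_tokens, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

With the compose file from this guide, a call would look like complete("qen2", "Hello, my name is"). The model can take some time to load, so a connection error right after startup usually means the server is not ready yet.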