Table of Contents
- 1 I may not be a fool, but I just want to be one
- 2 Requirements
- 3 Installation
- 3.1 Download the source code
- 3.2 Create a virtual environment
- 3.3 Install
- 4 Download the datasets
- 5 List supported models and datasets
- 6 Evaluation
- 6.1 Specify a model path
- 6.2 Specify a config file
- 6.2.1 Evaluate a local qwen2.5 model
- 6.2.1.1 List the qwen2.5 models supported by opencompass
- 6.2.1.2 Create the config file
- 6.2.1.3 Check again with `python tools/list_configs.py | grep hf_qwen2_5`
- 6.2.1.4 Run
- 6.2.2 Evaluate an ollama model
- 6.2.2.1 Create the config file `eval_ollama.py`
- 6.2.2.2 Start the ollama service
- 6.2.2.3 Run: option 1
- 6.2.2.4 Run: option 2
- 6.2.3 Speed up evaluation with lmdeploy
- 6.2.3.1 Install lmdeploy
- 6.2.3.2 List the qwen2.5 models that support lmdeploy acceleration
- 6.2.3.3 Create the config file
- 6.2.3.4 Run
1 I may not be a fool, but I just want to be one
OpenCompass announces on GitHub, with some pride, that it has been officially recommended by Meta AI as a standard evaluation tool for large models, so it is clearly capable. The official tutorial, however, is written rather haphazardly. I just wanted to follow it mindlessly, get things running quickly, and dig deeper later; instead I fiddled for half a day, something went wrong at every turn, nothing would run, and I was nearly put off for good. Hence this beginner tutorial: the goal is zero-thought operation, so you can lie back and play the fool!
2 Requirements
- Preferably Linux or WSL; on Windows you may run into unexpected errors
- Python 3.10 is a must; I started with Python 3.12 and couldn't get it installed no matter how long I tried
- Install from source! Do not `pip install opencompass` and then follow the official tutorial; that is a straight path to giving up
3 Installation
3.1 Download the source code
git clone https://github.com/open-compass/opencompass opencompass
3.2 Create a virtual environment
conda create -n open-compass python=3.10 -y
conda activate open-compass
python --version
3.3 Install
cd opencompass
pip install -e . -i https://mirrors.aliyun.com/pypi/simple/
pip list | grep opencompass
opencompass 0.4.2 /home/ubuntu/ws/opencompass
4 Download the datasets
# First cd into the opencompass directory
# Note: there is another opencompass folder inside opencompass; don't enter the wrong one
# cd opencompass
wget https://github.com/open-compass/opencompass/releases/download/0.2.2.rc1/OpenCompassData-core-20240207.zip
# The archive extracts into the data folder
unzip OpenCompassData-core-20240207.zip
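After unzipping, a quick sanity check (a hypothetical snippet of my own, not part of opencompass) confirms the datasets landed in `./data`:

```python
# Hypothetical check: list a few entries under ./data after unzipping.
from pathlib import Path

data_dir = Path('data')
assert data_dir.is_dir(), 'data/ missing - run this from the opencompass root'
print(sorted(p.name for p in data_dir.iterdir())[:5])
```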
5 List supported models and datasets
OpenCompass runs on config files and naming conventions. The model and dataset arguments given at run time are not passed arbitrarily; they are resolved through a predefined mapping from names to config files. You only need to supply a predefined name and the system automatically locates the matching config file. To see all available mappings, run `python tools/list_configs.py`.
# cd opencompass
# conda activate open-compass
python tools/list_configs.py
+-----------------------------------------------+-------------------------------------------------------------------------------------+
| Model | Config Path |
|-----------------------------------------------+-------------------------------------------------------------------------------------|
| README | opencompass/configs/models/qwen/README.md |
| README | opencompass/configs/models/hf_internlm/README.md |
| accessory_llama2_7b                           | opencompass/configs/models/accessory/accessory_llama2_7b.py                        |
+-----------------------------------------------+-------------------------------------------------------------------------------------+
| Dataset | Config Path |
|-----------------------------------------------+-------------------------------------------------------------------------------------|
| ARC_c_clean_ppl | opencompass/configs/datasets/ARC_c/ARC_c_clean_ppl.py |
| ARC_c_cot_gen_926652 | opencompass/configs/datasets/ARC_c/ARC_c_cot_gen_926652.py |
| ARC_c_few_shot_gen_e9b043 | opencompass/configs/datasets/ARC_c/ARC_c_few_shot_gen_e9b043.py |
6 Evaluation
opencompass has two main ways of running. In one, you specify a model path and opencompass loads and runs the model directly for evaluation. In the other, you specify a config file, which opencompass parses to either load a local model or call a model's serving API for the evaluation.
6.1 Specify a model path
# cd opencompass
# conda activate open-compass
# You must be inside the opencompass directory; that is where run.py lives
# --hf-type   chat or base, the model type
# --hf-path   path to the local model; open-compass loads it the huggingface way
# --datasets  dataset name; must be one listed by tools/list_configs.py
python run.py \
    --hf-type chat \
    --hf-path /mnt/d/models/Qwen/Qwen2.5-1.5B-Instruct \
    --datasets demo_gsm8k_chat_gen \
    --work-dir eval_results/hf/Qwen2.5-1.5B-Instruct \
    --debug
6.2 Specify a config file
6.2.1 Evaluate a local qwen2.5 model
6.2.1.1 List the qwen2.5 models supported by opencompass
# cd opencompass
# conda activate open-compass
python tools/list_configs.py | grep hf_qwen2_5
| hf_qwen2_57b_a14b | opencompass/configs/models/qwen/hf_qwen2_57b_a14b.py |
| hf_qwen2_5_0_5b_instruct | opencompass/configs/models/qwen2_5/hf_qwen2_5_0_5b_instruct.py |
| hf_qwen2_5_14b_instruct | opencompass/configs/models/qwen2_5/hf_qwen2_5_14b_instruct.py |
| hf_qwen2_5_1_5b_instruct | opencompass/configs/models/qwen2_5/hf_qwen2_5_1_5b_instruct.py |
| hf_qwen2_5_32b_instruct | opencompass/configs/models/qwen2_5/hf_qwen2_5_32b_instruct.py |
| hf_qwen2_5_3b_instruct | opencompass/configs/models/qwen2_5/hf_qwen2_5_3b_instruct.py |
| hf_qwen2_5_72b_instruct | opencompass/configs/models/qwen2_5/hf_qwen2_5_72b_instruct.py |
| hf_qwen2_5_7b_instruct | opencompass/configs/models/qwen2_5/hf_qwen2_5_7b_instruct.py |
6.2.1.2 Create the config file
Take Qwen2.5-1.5B-Instruct as an example, i.e. hf_qwen2_5_1_5b_instruct. Its config file is opencompass/configs/models/qwen2_5/hf_qwen2_5_1_5b_instruct.py. Using it as the template, copy it, modify it as follows, and name it local_hf_qwen2_5_1_5b_instruct.py:
from opencompass.models import HuggingFacewithChatTemplate

models = [
    dict(
        type=HuggingFacewithChatTemplate,
        abbr='local_qwen2.5-1.5b-instruct-hf',
        path='/mnt/d/models/Qwen/Qwen2.5-1.5B-Instruct',  # local model path
        max_out_len=4096,
        batch_size=8,
        run_cfg=dict(num_gpus=1),
    )
]
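Optionally, you can verify that the new file parses as a valid config. This is a sketch of my own using `mmengine` (an opencompass dependency); the path assumes you saved the file next to the template:

```python
# Sketch: load the new config with mmengine and print the model abbreviation.
from mmengine.config import Config

cfg = Config.fromfile(
    'opencompass/configs/models/qwen2_5/local_hf_qwen2_5_1_5b_instruct.py')
print(cfg.models[0]['abbr'])  # expected: local_qwen2.5-1.5b-instruct-hf
```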
6.2.1.3 Check again with `python tools/list_configs.py | grep hf_qwen2_5`
| hf_qwen2_57b_a14b | opencompass/configs/models/qwen/hf_qwen2_57b_a14b.py |
| hf_qwen2_5_0_5b_instruct | opencompass/configs/models/qwen2_5/hf_qwen2_5_0_5b_instruct.py |
| hf_qwen2_5_14b_instruct | opencompass/configs/models/qwen2_5/hf_qwen2_5_14b_instruct.py |
| hf_qwen2_5_1_5b_instruct | opencompass/configs/models/qwen2_5/hf_qwen2_5_1_5b_instruct.py |
| hf_qwen2_5_32b_instruct | opencompass/configs/models/qwen2_5/hf_qwen2_5_32b_instruct.py |
| hf_qwen2_5_3b_instruct | opencompass/configs/models/qwen2_5/hf_qwen2_5_3b_instruct.py |
| hf_qwen2_5_72b_instruct | opencompass/configs/models/qwen2_5/hf_qwen2_5_72b_instruct.py |
| hf_qwen2_5_7b_instruct | opencompass/configs/models/qwen2_5/hf_qwen2_5_7b_instruct.py |
| local_hf_qwen2_5_1_5b_instruct | opencompass/configs/models/qwen2_5/local_hf_qwen2_5_1_5b_instruct.py |
6.2.1.4 Run
# cd opencompass
# conda activate open-compass
# --models must name a model config known to tools/list_configs.py
python run.py \
    --models local_hf_qwen2_5_1_5b_instruct \
    --datasets demo_gsm8k_chat_gen \
    --work-dir eval_results/hf/Qwen2.5-1.5B-Instruct \
    --debug
6.2.2 Evaluate an ollama model
6.2.2.1 Create the config file `eval_ollama.py`
from mmengine.config import read_base
from opencompass.models import OpenAI
from opencompass.partitioners import NaivePartitioner
from opencompass.runners import LocalRunner
from opencompass.tasks import OpenICLInferTask

with read_base():
    from opencompass.configs.datasets.cmmlu.cmmlu_gen import cmmlu_datasets

api_meta_template = dict(
    round=[
        dict(role='HUMAN', api_role='HUMAN'),
        dict(role='BOT', api_role='BOT', generate=True),
    ],
)

models = [
    dict(
        abbr='ollama',
        type=OpenAI,
        # The model inside ollama; check with `ollama list` and pull it first
        # if it is not there yet.
        path='qwen3:0.6b',
        # If not set here, run `export OPENAI_BASE_URL=http://192.168.56.1:11434/v1`
        # in the shell instead. You cannot `import os` in this file (it raises
        # an error), so os.environ['OPENAI_BASE_URL'] = ... is not an option either.
        # Use the Windows IP here; localhost is not reachable from WSL.
        openai_api_base='http://192.168.56.1:11434/v1/chat/completions',
        # If not set here, run `export OPENAI_API_KEY=None` in the shell instead.
        key='None',
        meta_template=api_meta_template,
        query_per_second=1,
        max_out_len=2048,
        max_seq_len=2048,
        batch_size=2,
    ),
]

infer = dict(
    partitioner=dict(type=NaivePartitioner),
    runner=dict(type=LocalRunner,
                max_num_workers=2,
                task=dict(type=OpenICLInferTask)),
)

datasets = cmmlu_datasets

if __name__ == '__main__':
    from opencompass.cli.main import main
    import sys
    from pathlib import Path
    sys.argv.append(str(Path(__file__)))
    sys.argv.extend(['--work-dir', 'eval_results/ollama/qwen3_0_6b'])
    sys.argv.append('--debug')
    main()
Notes:
- `path='qwen3:0.6b'` names the `qwen3:0.6b` model inside ollama; check what is available with `ollama list`, and pull the model first if it is missing.
- In `openai_api_base='http://192.168.56.1:11434/v1/chat/completions'`, `192.168.56.1` is the Windows IP; using `localhost` would fail because ollama cannot be reached that way from WSL. A connectivity sketch follows these notes.
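Before launching the full evaluation, it is worth confirming that WSL can actually reach the ollama server. The sketch below is my own (ollama exposes an OpenAI-compatible endpoint; `requests` and the IP `192.168.56.1` are assumptions matching this setup):

```python
# Sketch: hit the ollama OpenAI-compatible endpoint from WSL.
import requests  # assumed installed: pip install requests

resp = requests.post(
    'http://192.168.56.1:11434/v1/chat/completions',  # Windows host IP
    json={
        'model': 'qwen3:0.6b',
        'messages': [{'role': 'user', 'content': 'Say hi in one word.'}],
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()['choices'][0]['message']['content'])
```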
6.2.2.2 Start the ollama service
Open PowerShell on Windows:
# Configure the ollama environment variables
$env:OLLAMA_HOST="0.0.0.0:11434"
$env:OLLAMA_MODELS="D:\models\ollama"
$env:OLLAMA_DEBUG="2"
# Start the ollama service
ollama serve
6.2.2.3 Run: option 1
# cd opencompass
# conda activate open-compass
# Without --debug you may hit an error like:
# 08/14 23:36:45 - OpenCompass - ERROR - /home/ubuntu/ws/opencompass/opencompass/runners/local.py - _launch - 241 - task OpenICLInfer[ollama/cmmlu-chinese_civil_service_exam] fail, see outputs/default/20250814_233629/logs/infer/ollama/cmmlu-chinese_civil_service_exam.out
python run.py eval_ollama.py --debug
6.2.2.4 Run: option 2
This way the script can be run under PyCharm's debugger, which makes debugging much easier.
# cd opencompass
# conda activate open-compass
python eval_ollama.py
6.2.3 Speed up evaluation with lmdeploy
By default models are loaded with transformers, which makes inference fairly slow. Loading the model with lmdeploy speeds it up.
6.2.3.1 Install lmdeploy
conda activate open-compass
pip install lmdeploy
If your CUDA version is not 12+, refer to the official guide: "Accelerating Evaluation with LMDeploy" in the OpenCompass 0.4.2 documentation.
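Once installed, a standalone smoke test can confirm that lmdeploy loads and runs the model before wiring it into opencompass. This is a minimal sketch of my own using lmdeploy's `pipeline` API, with the same local model path used throughout this guide:

```python
# Sketch: load the model with lmdeploy and run a single prompt.
from lmdeploy import pipeline

pipe = pipeline('/mnt/d/models/Qwen/Qwen2.5-1.5B-Instruct')
print(pipe(['Introduce yourself in one sentence.']))
```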
6.2.3.2 List the qwen2.5 models that support lmdeploy acceleration
# cd opencompass
# conda activate open-compass
python tools/list_configs.py lmdeploy_qwen2_5
+--------------------------------+----------------------------------------------------------------------+
| Model | Config Path |
|--------------------------------+----------------------------------------------------------------------|
| lmdeploy_qwen2_5_0_5b_instruct | opencompass/configs/models/qwen2_5/lmdeploy_qwen2_5_0_5b_instruct.py |
| lmdeploy_qwen2_5_14b | opencompass/configs/models/qwen2_5/lmdeploy_qwen2_5_14b.py |
| lmdeploy_qwen2_5_14b_instruct | opencompass/configs/models/qwen2_5/lmdeploy_qwen2_5_14b_instruct.py |
| lmdeploy_qwen2_5_1_5b | opencompass/configs/models/qwen2_5/lmdeploy_qwen2_5_1_5b.py |
| lmdeploy_qwen2_5_1_5b_instruct | opencompass/configs/models/qwen2_5/lmdeploy_qwen2_5_1_5b_instruct.py |
| lmdeploy_qwen2_5_32b | opencompass/configs/models/qwen2_5/lmdeploy_qwen2_5_32b.py |
| lmdeploy_qwen2_5_32b_instruct | opencompass/configs/models/qwen2_5/lmdeploy_qwen2_5_32b_instruct.py |
| lmdeploy_qwen2_5_3b_instruct | opencompass/configs/models/qwen2_5/lmdeploy_qwen2_5_3b_instruct.py |
| lmdeploy_qwen2_5_72b | opencompass/configs/models/qwen2_5/lmdeploy_qwen2_5_72b.py |
| lmdeploy_qwen2_5_72b_instruct | opencompass/configs/models/qwen2_5/lmdeploy_qwen2_5_72b_instruct.py |
| lmdeploy_qwen2_5_7b | opencompass/configs/models/qwen2_5/lmdeploy_qwen2_5_7b.py |
| lmdeploy_qwen2_5_7b_instruct | opencompass/configs/models/qwen2_5/lmdeploy_qwen2_5_7b_instruct.py |
+--------------------------------+----------------------------------------------------------------------+
6.2.3.3 Create the config file
Using lmdeploy_qwen2_5_1_5b_instruct as the template, copy it to create local_lmdeploy_qwen2_5_1_5b_instruct.py and modify it as follows:
from opencompass.models import TurboMindModelwithChatTemplate

models = [
    dict(
        type=TurboMindModelwithChatTemplate,
        abbr='local-qwen2.5-1.5b-instruct-turbomind',
        path='/mnt/d/models/Qwen/Qwen2.5-1.5B-Instruct',  # model path
        engine_config=dict(session_len=16384, max_batch_size=16, tp=1),
        gen_config=dict(top_k=1, temperature=1e-6, top_p=0.9,
                        max_new_tokens=4096),
        max_seq_len=16384,
        max_out_len=4096,
        batch_size=16,
        run_cfg=dict(num_gpus=1),
    )
]
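Note that `gen_config` pairs `top_k=1` with a near-zero `temperature`, which makes decoding effectively greedy; this keeps evaluation results reproducible across runs.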
6.2.3.4 Run
# cd opencompass
# conda activate open-compass
# --models must name a model config known to tools/list_configs.py
python run.py \
    --models local_lmdeploy_qwen2_5_1_5b_instruct \
    --datasets demo_gsm8k_chat_gen \
    --work-dir eval_results/lmdeploy/Qwen2.5-1.5B-Instruct \
    --debug