Introduction
MinerU 2.0 uses sglang for acceleration and differs significantly from earlier versions, so it is recommended to start it via the official Docker image.
Docker Image
Dockerfile
This is the official Dockerfile:
# Use the official sglang image
FROM lmsysorg/sglang:v0.4.7-cu124

# install mineru latest
RUN python3 -m pip install -U 'mineru[core]' -i https://mirrors.aliyun.com/pypi/simple --break-system-packages

# Download models and update the configuration file
RUN /bin/bash -c "mineru-models-download -s modelscope -m all"

# Set the entry point to activate the virtual environment and run the command line tool
ENTRYPOINT ["/bin/bash", "-c", "export MINERU_MODEL_SOURCE=local && exec \"$@\"", "--"]
It is recommended to use the Dockerfile below instead. Compared with the official one, it adds a build cache (which speeds up subsequent builds) and downloads only the vlm models (the official one also downloads the pipeline models).
# Use the official sglang image
FROM lmsysorg/sglang:v0.4.7-cu124

# install mineru latest
RUN --mount=type=cache,id=mineru_cache,target=/root/.cache,sharing=locked \
    python3 -m pip install -U 'mineru[core]' -i https://mirrors.aliyun.com/pypi/simple --break-system-packages

# Download models and update the configuration file
RUN --mount=type=cache,id=mineru_cache,target=/root/.cache,sharing=locked \
    mineru-models-download -s modelscope -m vlm && \
    cp -r /root/.cache/modelscope /tmp/modelscope

# The cache mount is not persisted into the image layer, so move the downloaded
# models back into the real /root/.cache in a normal layer
RUN mkdir -p /root/.cache && \
    mv /tmp/modelscope /root/.cache/modelscope

# Set the entry point to activate the virtual environment and run the command line tool
ENTRYPOINT ["/bin/bash", "-c", "export MINERU_MODEL_SOURCE=local && exec \"$@\"", "--"]
Build the Docker Image
docker build -t mineru-sglang:latest -f Dockerfile .
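Note that the cache-enabled Dockerfile above relies on RUN --mount=type=cache, which requires Docker BuildKit. BuildKit is the default builder on recent Docker releases; on older versions, enable it explicitly when building:

DOCKER_BUILDKIT=1 docker build -t mineru-sglang:latest -f Dockerfile .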
Startup
Docker
# or use --gpus all to expose all GPUs instead of selected devices
docker run -e MINERU_MODEL_SOURCE=local --gpus '"device=0,1"' \
  --shm-size 100g \
  -p 80:80 \
  --ipc=host \
  mineru-sglang:latest \
  mineru-sglang-server --host 0.0.0.0 --port 80 --enable-torch-compile --tp 2
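After the container starts, you can verify that the server is up by hitting its health endpoint (the same endpoint the Compose healthcheck below uses); port 80 here matches the -p 80:80 mapping above:

curl -f http://localhost:80/health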
Docker Compose
services:
  mineru-sglang:
    image: mineru-sglang:latest
    container_name: mineru-sglang
    restart: always
    ports:
      - 30000:30000
    environment:
      MINERU_MODEL_SOURCE: local
    entrypoint: mineru-sglang-server
    command:
      --host 0.0.0.0
      --port 30000
      # --enable-torch-compile  # You can also enable torch.compile to accelerate inference speed by approximately 15%
      # --dp 2  # If you have more than two GPUs with 24GB VRAM or above, you can use sglang's multi-GPU parallel mode to increase throughput
      # --tp 2  # If you have two GPUs with 12GB or 16GB VRAM, you can use the Tensor Parallel (TP) mode
      # --mem-fraction-static 0.7  # If you have two GPUs with 11GB VRAM, in addition to Tensor Parallel mode, you need to reduce the KV cache size
    ulimits:
      memlock: -1
      stack: 67108864
    ipc: host
    healthcheck:
      test: ["CMD-SHELL", "curl -f http://localhost:30000/health || exit 1"]
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ["0"]
              capabilities: [gpu]
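To bring the service up with this file (assuming it is saved as docker-compose.yml in the current directory) and follow the startup logs:

docker compose up -d
docker compose logs -f mineru-sglang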
Testing
"""
pip install -U mineru -i https://mirrors.aliyun.com/pypi/simple
"""
import json
import os
import time

from mineru.backend.vlm.vlm_analyze import doc_analyze as vlm_doc_analyze
from mineru.backend.vlm.vlm_middle_json_mkcontent import union_make as vlm_union_make
from mineru.cli.common import convert_pdf_bytes_to_bytes_by_pypdfium2, prepare_env
from mineru.data.data_reader_writer import FileBasedDataWriter
from mineru.utils.enum_class import MakeMode


def process_pdf(file_path: str):
    output_dir = 'output'
    server_url = 'http://<mineru_sglang_ip>:<port>'
    f_make_md_mode = MakeMode.MM_MD
    f_dump_md = True
    f_dump_content_list = True
    f_dump_middle_json = True
    f_dump_model_output = True

    start = time.time()
    parts = os.path.splitext(os.path.basename(file_path))
    pdf_file_name = parts[0]
    with open(file_path, 'rb') as f:
        pdf_bytes = f.read()
    pdf_bytes = convert_pdf_bytes_to_bytes_by_pypdfium2(pdf_bytes, 0, None)
    local_image_dir, local_md_dir = prepare_env(output_dir, pdf_file_name, 'auto')
    image_writer, md_writer = FileBasedDataWriter(local_image_dir), FileBasedDataWriter(local_md_dir)

    end1 = time.time()
    print(f'start to call sglang, cost, {end1 - start}')
    middle_json, infer_result = vlm_doc_analyze(
        pdf_bytes, image_writer=image_writer, backend='sglang-client', server_url=server_url)
    end2 = time.time()
    print(f'end to call sglang, cost, {end2 - end1}')

    pdf_info = middle_json["pdf_info"]
    # draw_layout_bbox(pdf_info, pdf_bytes, local_md_dir, f"{pdf_file_name}_layout.pdf")
    # draw_span_bbox(pdf_info, pdf_bytes, local_md_dir, f"{pdf_file_name}_span.pdf")

    if f_dump_md:
        image_dir = str(os.path.basename(local_image_dir))
        md_content_str = vlm_union_make(pdf_info, f_make_md_mode, image_dir)
        md_writer.write_string(f"{pdf_file_name}.md", md_content_str)
        end3 = time.time()
        print(f'end to gen md, cost, {end3 - end2}')

    if f_dump_content_list:
        image_dir = str(os.path.basename(local_image_dir))
        content_list = vlm_union_make(pdf_info, MakeMode.CONTENT_LIST, image_dir)
        md_writer.write_string(
            f"{pdf_file_name}_content_list.json",
            json.dumps(content_list, ensure_ascii=False, indent=4),
        )

    if f_dump_middle_json:
        md_writer.write_string(
            f"{pdf_file_name}_middle.json",
            json.dumps(middle_json, ensure_ascii=False, indent=4),
        )

    if f_dump_model_output:
        model_output = ("\n" + "-" * 50 + "\n").join(infer_result)
        md_writer.write_string(
            f"{pdf_file_name}_model_output.txt",
            model_output,
        )

    print(f"local output dir is {local_md_dir}")


if __name__ == '__main__':
    file = 'demo.pdf'
    process_pdf(file)
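For reference, with file = 'demo.pdf' the script writes its artifacts into the directory created by prepare_env (printed at the end of the run). Assuming the default layout of output/<pdf_file_name>/auto/, which may vary across MinerU versions, you should see:

output/demo/auto/demo.md                    # Markdown output
output/demo/auto/demo_content_list.json     # flattened content list
output/demo/auto/demo_middle.json           # intermediate parse structure
output/demo/auto/demo_model_output.txt      # raw model output
output/demo/auto/images/                    # extracted images referenced by the Markdown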
):output_dir = 'output'server_url = 'http://<mineru_sglang_ip>:<port>'f_make_md_mode = MakeMode.MM_MDf_dump_md = Truef_dump_content_list = Truef_dump_middle_json = Truef_dump_model_output = Truestart = time.time()parts = os.path.splitext(os.path.basename(file_path))pdf_file_name = parts[0]with open(file_path, 'rb') as f:pdf_bytes = f.read()pdf_bytes = convert_pdf_bytes_to_bytes_by_pypdfium2(pdf_bytes, 0, None)local_image_dir, local_md_dir = prepare_env(output_dir, pdf_file_name, 'auto')image_writer, md_writer = FileBasedDataWriter(local_image_dir), FileBasedDataWriter(local_md_dir)end1 = time.time()print(f'start to call sglang, cost, {end1 - start}')middle_json, infer_result = vlm_doc_analyze(pdf_bytes, image_writer=image_writer, backend='sglang-client',server_url=server_url)end2 = time.time()print(f'end to call sglang, cost, {end2 - end1}')pdf_info = middle_json["pdf_info"]# draw_layout_bbox(pdf_info, pdf_bytes, local_md_dir, f"{pdf_file_name}_layout.pdf")# draw_span_bbox(pdf_info, pdf_bytes, local_md_dir, f"{pdf_file_name}_span.pdf")if f_dump_md:image_dir = str(os.path.basename(local_image_dir))md_content_str = vlm_union_make(pdf_info, f_make_md_mode, image_dir)md_writer.write_string(f"{pdf_file_name}.md",md_content_str,)end3 = time.time()print(f'end to gen md, cost, {end3 - end2}')if f_dump_content_list:image_dir = str(os.path.basename(local_image_dir))content_list = vlm_union_make(pdf_info, MakeMode.CONTENT_LIST, image_dir)md_writer.write_string(f"{pdf_file_name}_content_list.json",json.dumps(content_list, ensure_ascii=False, indent=4),)if f_dump_middle_json:md_writer.write_string(f"{pdf_file_name}_middle.json",json.dumps(middle_json, ensure_ascii=False, indent=4),)if f_dump_model_output:model_output = ("\n" + "-" * 50 + "\n").join(infer_result)md_writer.write_string(f"{pdf_file_name}_model_output.txt",model_output,)print(f"local output dir is {local_md_dir}")if __name__ == '__main__':file = 'demo.pdf'process_pdf(file)