本地服務器端部署基于大模型的通用OCR項目——dots.ocr

本地服務器端部署基于大模型的通用OCR項目——dots.ocr

  • dots.ocr相關介紹
  • 本地服務器端部署
    • 第一步:安裝cuda12.8與CUDNN8.9.7
    • 第二步:創建項目所需的依賴環境
    • 第三步:啟動項目
    • 第四步:測試
    • 第五步:文本解析相關性測試
    • 第六步:利用gradio部署服務項目

dots.ocr相關介紹

dots.ocr 是一個強大的多語言文檔解析器,它在一個單一的視覺-語言模型中統一了布局檢測和內容識別,同時保持良好的閱讀順序。盡管其基礎是緊湊的1.7B參數LLM,但它實現了最先進的(SOTA)性能。

強大的性能: dots.ocr 在OmniDocBench上實現了文本、表格和閱讀順序的SOTA性能,同時在公式識別方面達到了與Doubao-1.5和gemini2.5-pro等更大模型相當的結果。
多語言支持: dots.ocr 對低資源語言展示了強大的解析能力,在我們內部的多語言文檔基準測試中,在布局檢測和內容識別方面都取得了決定性的優勢。
統一且簡單的架構: 通過利用單一的視覺-語言模型,dots.ocr 提供了一個比依賴復雜多模型流水線的傳統方法更簡潔的架構。通過簡單地改變輸入提示即可在任務之間切換,證明VLM可以與傳統的檢測模型如DocLayout-YOLO相比達到有競爭力的檢測結果。
高效且快速的性能: 基于緊湊的1.7B LLM構建,dots.ocr 比許多基于更大基礎的高性能模型提供了更快的推理速度。

本地服務器端部署

項目先決條件:
該項目官方要求的pytroch為2.7、vllm為0.9.1,并且由于該項目依賴于flash-attn==2.8.0.post2,該包的編譯安裝依賴于cuda版本為12.8,所以,需要項目需要較新版本的英偉達顯卡驅動與CUDA12.8(要求是12.8)

第一步:安裝cuda12.8與CUDNN8.9.7

下載地址依次是:

https://developer.nvidia.com/cuda-12-8-0-download-archive?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=22.04&target_type=runfile_local

在這里插入圖片描述
Ubutnu命令行運行如下指令:
下載cuda12.8

wget https://developer.download.nvidia.com/compute/cuda/12.8.0/local_installers/cuda_12.8.0_570.86.10_linux.run

安裝cuda12.8

# 賦予權限
chmod +x cuda_12.8.0_570.86.10_linux.run
# 執行安裝
./cuda_12.8.0_570.86.10_linux.run

修改~/.bashrc文件

vim ~/.bashrc

將如下內容添加到文件中

export CUDA_HOME=/usr/local/cuda-12.8
export PATH=$CUDA_HOME/bin:$PATH
export LD_LIBRARY_PATH=$CUDA_HOME/lib64:$LD_LIBRARY_PATH

在這里插入圖片描述
修改完后,執行如下指令以生效:

 source ~/.bashrc

確保nvcc可以被找到

which nvcc
nvcc -V

在這里插入圖片描述

下載cudnn8.9.7

https://developer.nvidia.com/rdp/cudnn-archive

在這里插入圖片描述

# 解壓壓縮包文件
tar -xvf cudnn-linux-x86_64-8.9.7.29_cuda12-archive.tar.xz

拷貝文件并授權

cd cudnn-linux-x86_64-8.9.7.29_cuda12-archivesudo cp include/cudnn*.h /usr/local/cuda-12.8/include/
sudo cp lib/libcudnn* /usr/local/cuda-12.8/lib64/
sudo chmod a+r /usr/local/cuda-12.8/include/cudnn*.h /usr/local/cuda-12.8/lib64/libcudnn*

驗證是否成功

cat /usr/local/cuda/include/cudnn_version.h | grep CUDNN_MAJOR -A 2

在這里插入圖片描述

第二步:創建項目所需的依賴環境

安裝好Miniconda后,執行如下指令創建虛擬環境

# 創建
conda create -n vllm_0_9_1 python=3.12
# 激活
conda activate vllm_0_9_1

下載項目代碼

git clone https://github.com/rednote-hilab/dots.ocr.git

安裝依賴庫

# 進入到項目目錄內
cd dots.ocr# 執行如下語句
pip install torch==2.7.0 torchvision==0.22.0 torchaudio==2.7.0 --index-url https://download.pytorch.org/whl/cu128   --default-timeout=100pip install vllm==0.9.1  --default-timeout=100 -i https://pypi.tuna.tsinghua.edu.cn/simplepip install -e .  --default-timeout=100 -i https://pypi.tuna.tsinghua.edu.cn/simple

模型下載
項目結構如下所示:

(vllm_0_9_1) root@d205d5131c12:/home/data/dots.ocr-master# tree -L 2
.
├── LICENSE
├── README.md
├── assets
│   ├── blog.md
│   ├── chart.png
│   ├── logo.png
│   ├── showcase
│   ├── showcase_origin
│   └── wechat.jpg
├── demo
│   ├── demo_gradio.py
│   ├── demo_gradio_annotion.py
│   ├── demo_hf.py
│   ├── demo_image1.jpg
│   ├── demo_pdf1.pdf
│   ├── demo_streamlit.py
│   ├── demo_vllm.py
│   └── launch_model_vllm.sh
├── docker
│   ├── Dockerfile
│   └── docker-compose.yml
├── dots_ocr
│   ├── __init__.py
│   ├── model
│   ├── parser.py
│   └── utils
├── dots_ocr.egg-info
│   ├── PKG-INFO
│   ├── SOURCES.txt
│   ├── dependency_links.txt
│   ├── requires.txt
│   └── top_level.txt
├── requirements.txt
├── setup.py
├── tools
│   └── download_model.py
└── weights└── DotsOCR12 directories, 26 files

在項目根目錄下執行如下指令下載模型:

python3 tools/download_model.py -t modelscope

第三步:啟動項目

以下啟動指令需要切換到項目根目錄下執行


export hf_model_path=./weights/DotsOCR  # Path to your downloaded model weights, Please use a directory name without periods (e.g., `DotsOCR` instead of `dots.ocr`) for the model save path. This is a temporary workaround pending our integration with Transformers.
export PYTHONPATH=$(dirname "$hf_model_path"):$PYTHONPATH
sed -i '/^from vllm\.entrypoints\.cli\.main import main$/a\
from DotsOCR import modeling_dots_ocr_vllm' `which vllm`  # If you downloaded model weights by yourself, please replace `DotsOCR` by your model saved directory name, and remember to use a directory name without periods (e.g., `DotsOCR` instead of `dots.ocr`) # launch vllm server# 單張顯卡
CUDA_VISIBLE_DEVICES=0 vllm serve ${hf_model_path} \--tensor-parallel-size 1 \--host 0.0.0.0     \--port 8070     \--gpu-memory-utilization 0.95 \--chat-template-content-format string \--served-model-name model \--trust-remote-code# If you get a ModuleNotFoundError: No module named 'DotsOCR', please check the note above on the saved model directory name.

項目順利啟動成功:
在這里插入圖片描述

第四步:測試

上述vllm拉起項目后,便可運行官方自帶的測試腳本進行測試:
dots.ocr-master/demo目錄下,有個名稱為demo_vllm.py的腳本,內容如下:

import argparse
import osfrom openai import OpenAI
from transformers.utils.versions import require_version
from PIL import Image
import io
import base64
from dots_ocr.utils import dict_promptmode_to_prompt
from dots_ocr.model.inference import inference_with_vllmparser = argparse.ArgumentParser()
parser.add_argument("--ip", type=str, default="localhost")
parser.add_argument("--port", type=str, default="8000")
parser.add_argument("--model_name", type=str, default="model")
parser.add_argument("--prompt_mode", type=str, default="prompt_layout_all_en")args = parser.parse_args()require_version("openai>=1.5.0", "To fix: pip install openai>=1.5.0")def main():addr = f"http://{args.ip}:{args.port}/v1"image_path = "demo/demo_image1.jpg"prompt = dict_promptmode_to_prompt[args.prompt_mode]image = Image.open(image_path)response = inference_with_vllm(image,prompt, ip="localhost",port=8000,temperature=0.1,top_p=0.9,)print(f"response: {response}")if __name__ == "__main__":main()

**注意:**如果在拉起vllm項目時修改了訪問ip與端口號后,直接運行上述腳本會報錯(原因是該腳本訪問ip與端口號被寫死)。
需要進行相應改動,修改完后的腳本如下:

import argparse
import osfrom openai import OpenAI
from transformers.utils.versions import require_version
from PIL import Image
import io
import base64
from dots_ocr.utils import dict_promptmode_to_prompt
from dots_ocr.model.inference import inference_with_vllmparser = argparse.ArgumentParser()
parser.add_argument("--ip", type=str, default="localhost")
parser.add_argument("--port", type=str, default="8070")
parser.add_argument("--model_name", type=str, default="model")
parser.add_argument("--prompt_mode", type=str, default="prompt_layout_all_en")args = parser.parse_args()require_version("openai>=1.5.0", "To fix: pip install openai>=1.5.0")def main():addr = f"http://{args.ip}:{args.port}/v1"image_path = "demo/demo_image1.jpg"prompt = dict_promptmode_to_prompt[args.prompt_mode]image = Image.open(image_path)response = inference_with_vllm(image,prompt, ip=args.ip,port=int(args.port),temperature=0.1,top_p=0.9,)print(f"response: {response}")if __name__ == "__main__":main()

然后直接運行下屬指令便可返回識別結果

python3 ./demo/demo_vllm.py --prompt_mode prompt_layout_all_en --ip 192.168.4.22 --port 8070

在這里插入圖片描述

第五步:文本解析相關性測試

進入到dots.ocr-master/dots_ocr目錄下,有個名為parser.py的文件,可以提供如下測試:
針對圖像或 PDF 文件進行 版面分析(Layout Parsing)和文字識別(OCR)。每條命令根據不同的參數控制功能側重。下面是詳細解釋:


# 基礎命令:同時做版面檢測 + OCR#對一張圖片進行:#版面分析(如識別段落、標題、表格、圖片區域等布局結構)#OCR 識別(提取其中的文字內容)
python3 dots_ocr/parser.py demo/demo_image1.jpg# -------------------------------------------------------------# 對一個 PDF 文件進行版面檢測和 OCR。# 額外加了 --num_thread 64,說明:# 多線程解析 PDF 的每一頁(尤其是多頁 PDF),提高處理速度# 可以嘗試更高線程數(如 --num_thread 128)以加速大文檔處理
python3 dots_ocr/parser.py demo/demo_pdf1.pdf  --num_thread 64  # try bigger num_threads for pdf with a large number of pages# 只做版面分析(不提取文字)
python3 dots_ocr/parser.py demo/demo_image1.jpg --prompt prompt_layout_only_en# 只做 OCR,不提取頁眉頁腳的文字 Parse text only, except Page-header and Page-footer
python3 dots_ocr/parser.py demo/demo_image1.jpg --prompt prompt_ocr#  對指定區域(bounding box)做 OCR+Layout 分析 Parse layout info by bbox
python3 dots_ocr/parser.py demo/demo_image1.jpg --prompt prompt_grounding_ocr --bbox 163 241 1536 705

如果上述vllm在啟動模型時配置了相關的ip與端口,則在啟動時需要加上–ip與–port
如下所示:

python3 dots_ocr/parser.py demo/demo_pdf1.pdf  --num_thread 64  --ip 192.168.4.22 --port 8070

腳本代碼如下所示:

import os
import json
from tqdm import tqdm
from multiprocessing.pool import ThreadPool, Pool
import argparsefrom dots_ocr.model.inference import inference_with_vllm
from dots_ocr.utils.consts import image_extensions, MIN_PIXELS, MAX_PIXELS
from dots_ocr.utils.image_utils import get_image_by_fitz_doc, fetch_image, smart_resize
from dots_ocr.utils.doc_utils import fitz_doc_to_image, load_images_from_pdf
from dots_ocr.utils.prompts import dict_promptmode_to_prompt
from dots_ocr.utils.layout_utils import post_process_output, draw_layout_on_image, pre_process_bboxes
from dots_ocr.utils.format_transformer import layoutjson2mdclass DotsOCRParser:"""parse image or pdf file"""def __init__(self, ip='localhost',port=8000,model_name='model',temperature=0.1,top_p=1.0,max_completion_tokens=16384,num_thread=64,dpi = 200, output_dir="./output", min_pixels=None,max_pixels=None,):self.dpi = dpi# default args for vllm serverself.ip = ipself.port = portself.model_name = model_name# default args for inferenceself.temperature = temperatureself.top_p = top_pself.max_completion_tokens = max_completion_tokensself.num_thread = num_threadself.output_dir = output_dirself.min_pixels = min_pixelsself.max_pixels = max_pixelsassert self.min_pixels is None or self.min_pixels >= MIN_PIXELSassert self.max_pixels is None or self.max_pixels <= MAX_PIXELSdef _inference_with_vllm(self, image, prompt):response = inference_with_vllm(image,prompt, model_name=self.model_name,ip=self.ip,port=self.port,temperature=self.temperature,top_p=self.top_p,max_completion_tokens=self.max_completion_tokens,)return responsedef get_prompt(self, prompt_mode, bbox=None, origin_image=None, image=None, min_pixels=None, max_pixels=None):prompt = dict_promptmode_to_prompt[prompt_mode]if prompt_mode == 'prompt_grounding_ocr':assert bbox is not Nonebboxes = [bbox]bbox = pre_process_bboxes(origin_image, bboxes, input_width=image.width, input_height=image.height, min_pixels=min_pixels, max_pixels=max_pixels)[0]prompt = prompt + str(bbox)return prompt# def post_process_results(self, response, prompt_mode, save_dir, save_name, origin_image, image, min_pixels, max_pixels)def _parse_single_image(self, origin_image, prompt_mode, save_dir, save_name, source="image", page_idx=0, bbox=None,fitz_preprocess=False,):min_pixels, max_pixels = self.min_pixels, self.max_pixelsif prompt_mode == "prompt_grounding_ocr":min_pixels = min_pixels or MIN_PIXELS  # preprocess image to the final inputmax_pixels = max_pixels or MAX_PIXELSif min_pixels is not None: assert min_pixels >= MIN_PIXELS, f"min_pixels should >= {MIN_PIXELS}"if max_pixels is not None: assert max_pixels <= MAX_PIXELS, f"max_pixels should <+ {MAX_PIXELS}"if source == 'image' and fitz_preprocess:image = get_image_by_fitz_doc(origin_image, target_dpi=self.dpi)image = fetch_image(image, min_pixels=min_pixels, max_pixels=max_pixels)else:image = fetch_image(origin_image, min_pixels=min_pixels, max_pixels=max_pixels)input_height, input_width = smart_resize(image.height, image.width)prompt = self.get_prompt(prompt_mode, bbox, origin_image, image, min_pixels=min_pixels, max_pixels=max_pixels)response = self._inference_with_vllm(image, prompt)result = {'page_no': page_idx,"input_height": input_height,"input_width": input_width}if source == 'pdf':save_name = f"{save_name}_page_{page_idx}"if prompt_mode in ['prompt_layout_all_en', 'prompt_layout_only_en', 'prompt_grounding_ocr']:cells, filtered = post_process_output(response, prompt_mode, origin_image, image,min_pixels=min_pixels, max_pixels=max_pixels,)if filtered and prompt_mode != 'prompt_layout_only_en':  # model output json failed, use filtered processjson_file_path = os.path.join(save_dir, f"{save_name}.json")with open(json_file_path, 'w') as w:json.dump(response, w, ensure_ascii=False)image_layout_path = os.path.join(save_dir, f"{save_name}.jpg")origin_image.save(image_layout_path)result.update({'layout_info_path': json_file_path,'layout_image_path': image_layout_path,})md_file_path = os.path.join(save_dir, f"{save_name}.md")with open(md_file_path, "w", encoding="utf-8") as md_file:md_file.write(cells)result.update({'md_content_path': md_file_path})result.update({'filtered': True})else:try:image_with_layout = draw_layout_on_image(origin_image, cells)except Exception as e:print(f"Error drawing layout on image: {e}")image_with_layout = origin_imagejson_file_path = os.path.join(save_dir, f"{save_name}.json")with open(json_file_path, 'w') as w:json.dump(cells, w, ensure_ascii=False)image_layout_path = os.path.join(save_dir, f"{save_name}.jpg")image_with_layout.save(image_layout_path)result.update({'layout_info_path': json_file_path,'layout_image_path': image_layout_path,})if prompt_mode != "prompt_layout_only_en":  # no text md when detection onlymd_content = layoutjson2md(origin_image, cells, text_key='text')md_content_no_hf = layoutjson2md(origin_image, cells, text_key='text', no_page_hf=True) # used for clean output or metric of omnidocbench、olmbench md_file_path = os.path.join(save_dir, f"{save_name}.md")with open(md_file_path, "w", encoding="utf-8") as md_file:md_file.write(md_content)md_nohf_file_path = os.path.join(save_dir, f"{save_name}_nohf.md")with open(md_nohf_file_path, "w", encoding="utf-8") as md_file:md_file.write(md_content_no_hf)result.update({'md_content_path': md_file_path,'md_content_nohf_path': md_nohf_file_path,})else:image_layout_path = os.path.join(save_dir, f"{save_name}.jpg")origin_image.save(image_layout_path)result.update({'layout_image_path': image_layout_path,})md_content = responsemd_file_path = os.path.join(save_dir, f"{save_name}.md")with open(md_file_path, "w", encoding="utf-8") as md_file:md_file.write(md_content)result.update({'md_content_path': md_file_path,})return resultdef parse_image(self, input_path, filename, prompt_mode, save_dir, bbox=None, fitz_preprocess=False):origin_image = fetch_image(input_path)result = self._parse_single_image(origin_image, prompt_mode, save_dir, filename, source="image", bbox=bbox, fitz_preprocess=fitz_preprocess)result['file_path'] = input_pathreturn [result]def parse_pdf(self, input_path, filename, prompt_mode, save_dir):print(f"loading pdf: {input_path}")images_origin = load_images_from_pdf(input_path, dpi=self.dpi)total_pages = len(images_origin)tasks = [{"origin_image": image,"prompt_mode": prompt_mode,"save_dir": save_dir,"save_name": filename,"source":"pdf","page_idx": i,} for i, image in enumerate(images_origin)]def _execute_task(task_args):return self._parse_single_image(**task_args)num_thread = min(total_pages, self.num_thread)print(f"Parsing PDF with {total_pages} pages using {num_thread} threads...")results = []with ThreadPool(num_thread) as pool:with tqdm(total=total_pages, desc="Processing PDF pages") as pbar:for result in pool.imap_unordered(_execute_task, tasks):results.append(result)pbar.update(1)results.sort(key=lambda x: x["page_no"])for i in range(len(results)):results[i]['file_path'] = input_pathreturn resultsdef parse_file(self, input_path, output_dir="", prompt_mode="prompt_layout_all_en",bbox=None,fitz_preprocess=False):output_dir = output_dir or self.output_diroutput_dir = os.path.abspath(output_dir)filename, file_ext = os.path.splitext(os.path.basename(input_path))save_dir = os.path.join(output_dir, filename)os.makedirs(save_dir, exist_ok=True)if file_ext == '.pdf':results = self.parse_pdf(input_path, filename, prompt_mode, save_dir)elif file_ext in image_extensions:results = self.parse_image(input_path, filename, prompt_mode, save_dir, bbox=bbox, fitz_preprocess=fitz_preprocess)else:raise ValueError(f"file extension {file_ext} not supported, supported extensions are {image_extensions} and pdf")print(f"Parsing finished, results saving to {save_dir}")with open(os.path.join(output_dir, os.path.basename(filename)+'.jsonl'), 'w') as w:for result in results:w.write(json.dumps(result, ensure_ascii=False) + '\n')return resultsdef main():prompts = list(dict_promptmode_to_prompt.keys())parser = argparse.ArgumentParser(description="dots.ocr Multilingual Document Layout Parser",)parser.add_argument("input_path", type=str,help="Input PDF/image file path")parser.add_argument("--output", type=str, default="./output",help="Output directory (default: ./output)")parser.add_argument("--prompt", choices=prompts, type=str, default="prompt_layout_all_en",help="prompt to query the model, different prompts for different tasks")parser.add_argument('--bbox', type=int, nargs=4, metavar=('x1', 'y1', 'x2', 'y2'),help='should give this argument if you want to prompt_grounding_ocr')parser.add_argument("--ip", type=str, default="localhost",help="")parser.add_argument("--port", type=int, default=8000,help="")parser.add_argument("--model_name", type=str, default="model",help="")parser.add_argument("--temperature", type=float, default=0.1,help="")parser.add_argument("--top_p", type=float, default=1.0,help="")parser.add_argument("--dpi", type=int, default=200,help="")parser.add_argument("--max_completion_tokens", type=int, default=16384,help="")parser.add_argument("--num_thread", type=int, default=16,help="")# parser.add_argument(#     "--fitz_preprocess", type=bool, default=False,#     help="False will use tikz dpi upsample pipeline, good for images which has been render with low dpi, but maybe result in higher computational costs"# )parser.add_argument("--min_pixels", type=int, default=None,help="")parser.add_argument("--max_pixels", type=int, default=None,help="")args = parser.parse_args()dots_ocr_parser = DotsOCRParser(ip=args.ip,port=args.port,model_name=args.model_name,temperature=args.temperature,top_p=args.top_p,max_completion_tokens=args.max_completion_tokens,num_thread=args.num_thread,dpi=args.dpi,output_dir=args.output, min_pixels=args.min_pixels,max_pixels=args.max_pixels,)result = dots_ocr_parser.parse_file(args.input_path, prompt_mode=args.prompt,bbox=args.bbox,)if __name__ == "__main__":main()

輸出結果

  • 結構化布局數據 (demo_image1.json):包含檢測到的布局元素(包括它們的邊界框、類別和提取的文本)的 JSON 文件。
  • 處理后的 Markdown 文件 (demo_image1.md):從所有檢測到的單元格拼接文本生成的 Markdown 文件。

另提供了一個版本 demo_image1_nohf.md,它排除了頁面頭部和底部,以兼容如 Omnidocbench 和 olmOCR-bench 等基準測試。

  • 布局可視化 (demo_image1.jpg):原始圖像上繪制了檢測到的布局邊界框。

第六步:利用gradio部署服務項目

思路是:vllm已經加載模型啟動項目,使用gradio啟動一個服務,訪問vllm加載的模型,返回識別結果
在dots.ocr-master/demo目錄下,修改demo_gradio.py文件,配置成vllm項目的ip與端口,如下所示
在這里插入圖片描述

執行如下指令啟動gradio服務

python demo/demo_gradio.py

腳本代碼:

"""
Layout Inference Web Application with GradioA Gradio-based layout inference tool that supports image uploads and multiple backend inference engines.
It adopts a reference-style interface design while preserving the original inference logic.
"""import gradio as gr
import json
import os
import io
import tempfile
import base64
import zipfile
import uuid
import re
from pathlib import Path
from PIL import Image
import requests# Local tool imports
from dots_ocr.utils import dict_promptmode_to_prompt
from dots_ocr.utils.consts import MIN_PIXELS, MAX_PIXELS
from dots_ocr.utils.demo_utils.display import read_image
from dots_ocr.utils.doc_utils import load_images_from_pdf# Add DotsOCRParser import
from dots_ocr.parser import DotsOCRParser# ==================== Configuration ====================
DEFAULT_CONFIG = {'ip': "192.168.4.22",'port_vllm': 8070,'min_pixels': MIN_PIXELS,'max_pixels': MAX_PIXELS,'test_images_dir': "./assets/showcase_origin",
}# ==================== Global Variables ====================
# Store current configuration
current_config = DEFAULT_CONFIG.copy()# Create DotsOCRParser instance
dots_parser = DotsOCRParser(ip=DEFAULT_CONFIG['ip'],port=DEFAULT_CONFIG['port_vllm'],dpi=200,min_pixels=DEFAULT_CONFIG['min_pixels'],max_pixels=DEFAULT_CONFIG['max_pixels']
)# Store processing results
processing_results = {'original_image': None,'processed_image': None,'layout_result': None,'markdown_content': None,'cells_data': None,'temp_dir': None,'session_id': None,'result_paths': None,'pdf_results': None  # Store multi-page PDF results
}# PDF caching mechanism
pdf_cache = {"images": [],"current_page": 0,"total_pages": 0,"file_type": None,  # 'image' or 'pdf'"is_parsed": False,  # Whether it has been parsed"results": []  # Store parsing results for each page
}def read_image_v2(img):"""Reads an image, supports URLs and local paths"""if isinstance(img, str) and img.startswith(("http://", "https://")):with requests.get(img, stream=True) as response:response.raise_for_status()img = Image.open(io.BytesIO(response.content))elif isinstance(img, str):img, _, _ = read_image(img, use_native=True)elif isinstance(img, Image.Image):passelse:raise ValueError(f"Invalid image type: {type(img)}")return imgdef load_file_for_preview(file_path):"""Loads a file for preview, supports PDF and image files"""global pdf_cacheif not file_path or not os.path.exists(file_path):return None, "<div id='page_info_box'>0 / 0</div>"file_ext = os.path.splitext(file_path)[1].lower()if file_ext == '.pdf':try:# Read PDF and convert to images (one image per page)pages = load_images_from_pdf(file_path)pdf_cache["file_type"] = "pdf"except Exception as e:return None, f"<div id='page_info_box'>PDF loading failed: {str(e)}</div>"elif file_ext in ['.jpg', '.jpeg', '.png']:# For image files, read directly as a single-page imagetry:image = Image.open(file_path)pages = [image]pdf_cache["file_type"] = "image"except Exception as e:return None, f"<div id='page_info_box'>Image loading failed: {str(e)}</div>"else:return None, "<div id='page_info_box'>Unsupported file format</div>"pdf_cache["images"] = pagespdf_cache["current_page"] = 0pdf_cache["total_pages"] = len(pages)pdf_cache["is_parsed"] = Falsepdf_cache["results"] = []return pages[0], f"<div id='page_info_box'>1 / {len(pages)}</div>"def turn_page(direction):"""Page turning function"""global pdf_cacheif not pdf_cache["images"]:return None, "<div id='page_info_box'>0 / 0</div>", "", ""if direction == "prev":pdf_cache["current_page"] = max(0, pdf_cache["current_page"] - 1)elif direction == "next":pdf_cache["current_page"] = min(pdf_cache["total_pages"] - 1, pdf_cache["current_page"] + 1)index = pdf_cache["current_page"]current_image = pdf_cache["images"][index]  # Use the original image by defaultpage_info = f"<div id='page_info_box'>{index + 1} / {pdf_cache['total_pages']}</div>"# If parsed, display the results for the current pagecurrent_md = ""current_md_raw = ""current_json = ""if pdf_cache["is_parsed"] and index < len(pdf_cache["results"]):result = pdf_cache["results"][index]if 'md_content' in result:# Get the raw markdown contentcurrent_md_raw = result['md_content']# Process the content after LaTeX renderingcurrent_md = result['md_content'] if result['md_content'] else ""if 'cells_data' in result:try:current_json = json.dumps(result['cells_data'], ensure_ascii=False, indent=2)except:current_json = str(result.get('cells_data', ''))# Use the image with layout boxes (if available)if 'layout_image' in result and result['layout_image']:current_image = result['layout_image']return current_image, page_info, current_jsondef get_test_images():"""Gets the list of test images"""test_images = []test_dir = current_config['test_images_dir']if os.path.exists(test_dir):test_images = [os.path.join(test_dir, name) for name in os.listdir(test_dir) if name.lower().endswith(('.png', '.jpg', '.jpeg', '.pdf'))]return test_imagesdef convert_image_to_base64(image):"""Converts a PIL image to base64 encoding"""buffered = io.BytesIO()image.save(buffered, format="PNG")img_str = base64.b64encode(buffered.getvalue()).decode()return f"data:image/png;base64,{img_str}"def create_temp_session_dir():"""Creates a unique temporary directory for each processing request"""session_id = uuid.uuid4().hex[:8]temp_dir = os.path.join(tempfile.gettempdir(), f"dots_ocr_demo_{session_id}")os.makedirs(temp_dir, exist_ok=True)return temp_dir, session_iddef parse_image_with_high_level_api(parser, image, prompt_mode, fitz_preprocess=False):"""Processes using the high-level API parse_image from DotsOCRParser"""# Create a temporary session directorytemp_dir, session_id = create_temp_session_dir()try:# Save the PIL Image as a temporary filetemp_image_path = os.path.join(temp_dir, f"input_{session_id}.png")image.save(temp_image_path, "PNG")# Use the high-level API parse_imagefilename = f"demo_{session_id}"results = parser.parse_image(# input_path=temp_image_path,input_path=image,filename=filename, prompt_mode=prompt_mode,save_dir=temp_dir,fitz_preprocess=fitz_preprocess)# Parse the resultsif not results:raise ValueError("No results returned from parser")result = results[0]  # parse_image returns a list with a single result# Read the result fileslayout_image = Nonecells_data = Nonemd_content = Noneraw_response = Nonefiltered = False# Read the layout imageif 'layout_image_path' in result and os.path.exists(result['layout_image_path']):layout_image = Image.open(result['layout_image_path'])# Read the JSON dataif 'layout_info_path' in result and os.path.exists(result['layout_info_path']):with open(result['layout_info_path'], 'r', encoding='utf-8') as f:cells_data = json.load(f)# Read the Markdown contentif 'md_content_path' in result and os.path.exists(result['md_content_path']):with open(result['md_content_path'], 'r', encoding='utf-8') as f:md_content = f.read()# Check for the raw response file (when JSON parsing fails)if 'filtered' in result:filtered = result['filtered']return {'layout_image': layout_image,'cells_data': cells_data,'md_content': md_content,'filtered': filtered,'temp_dir': temp_dir,'session_id': session_id,'result_paths': result,'input_width': result['input_width'],'input_height': result['input_height'],}except Exception as e:# Clean up the temporary directory on errorimport shutilif os.path.exists(temp_dir):shutil.rmtree(temp_dir, ignore_errors=True)raise edef parse_pdf_with_high_level_api(parser, pdf_path, prompt_mode):"""Processes using the high-level API parse_pdf from DotsOCRParser"""# Create a temporary session directorytemp_dir, session_id = create_temp_session_dir()try:# Use the high-level API parse_pdffilename = f"demo_{session_id}"results = parser.parse_pdf(input_path=pdf_path,filename=filename,prompt_mode=prompt_mode,save_dir=temp_dir)# Parse the resultsif not results:raise ValueError("No results returned from parser")# Handle multi-page resultsparsed_results = []all_md_content = []all_cells_data = []for i, result in enumerate(results):page_result = {'page_no': result.get('page_no', i),'layout_image': None,'cells_data': None,'md_content': None,'filtered': False}# Read the layout imageif 'layout_image_path' in result and os.path.exists(result['layout_image_path']):page_result['layout_image'] = Image.open(result['layout_image_path'])# Read the JSON dataif 'layout_info_path' in result and os.path.exists(result['layout_info_path']):with open(result['layout_info_path'], 'r', encoding='utf-8') as f:page_result['cells_data'] = json.load(f)all_cells_data.extend(page_result['cells_data'])# Read the Markdown contentif 'md_content_path' in result and os.path.exists(result['md_content_path']):with open(result['md_content_path'], 'r', encoding='utf-8') as f:page_content = f.read()page_result['md_content'] = page_contentall_md_content.append(page_content)# Check for the raw response file (when JSON parsing fails)page_result['filtered'] = Falseif 'filtered' in page_result:page_result['filtered'] = page_result['filtered']parsed_results.append(page_result)# Merge the content of all pagescombined_md = "\n\n---\n\n".join(all_md_content) if all_md_content else ""return {'parsed_results': parsed_results,'combined_md_content': combined_md,'combined_cells_data': all_cells_data,'temp_dir': temp_dir,'session_id': session_id,'total_pages': len(results)}except Exception as e:# Clean up the temporary directory on errorimport shutilif os.path.exists(temp_dir):shutil.rmtree(temp_dir, ignore_errors=True)raise e# ==================== Core Processing Function ====================
def process_image_inference(test_image_input, file_input,prompt_mode, server_ip, server_port, min_pixels, max_pixels,fitz_preprocess=False):"""Core function to handle image/PDF inference"""global current_config, processing_results, dots_parser, pdf_cache# First, clean up previous processing results to avoid confusion with the download buttonif processing_results.get('temp_dir') and os.path.exists(processing_results['temp_dir']):import shutiltry:shutil.rmtree(processing_results['temp_dir'], ignore_errors=True)except Exception as e:print(f"Failed to clean up previous temporary directory: {e}")# Reset processing resultsprocessing_results = {'original_image': None,'processed_image': None,'layout_result': None,'markdown_content': None,'cells_data': None,'temp_dir': None,'session_id': None,'result_paths': None,'pdf_results': None}# Update configurationcurrent_config.update({'ip': server_ip,'port_vllm': server_port,'min_pixels': min_pixels,'max_pixels': max_pixels})# Update parser configurationdots_parser.ip = server_ipdots_parser.port = server_portdots_parser.min_pixels = min_pixelsdots_parser.max_pixels = max_pixels# Determine the input sourceinput_file_path = Noneimage = None# Prioritize file input (supports PDF)if file_input is not None:input_file_path = file_inputfile_ext = os.path.splitext(input_file_path)[1].lower()if file_ext == '.pdf':# PDF file processingtry:return process_pdf_file(input_file_path, prompt_mode)except Exception as e:return None, f"PDF processing failed: {e}", "", "", gr.update(value=None), None, ""elif file_ext in ['.jpg', '.jpeg', '.png']:# Image file processingtry:image = Image.open(input_file_path)except Exception as e:return None, f"Failed to read image file: {e}", "", "", gr.update(value=None), None, ""# If no file input, check the test image inputif image is None:if test_image_input and test_image_input != "":file_ext = os.path.splitext(test_image_input)[1].lower()if file_ext == '.pdf':return process_pdf_file(test_image_input, prompt_mode)else:try:image = read_image_v2(test_image_input)except Exception as e:return None, f"Failed to read test image: {e}", "", "", gr.update(value=None), gr.update(value=None), None, ""if image is None:return None, "Please upload image/PDF file or select test image", "", "", gr.update(value=None), None, ""try:# Clear PDF cache (for image processing)pdf_cache["images"] = []pdf_cache["current_page"] = 0pdf_cache["total_pages"] = 0pdf_cache["is_parsed"] = Falsepdf_cache["results"] = []# Process using the high-level API of DotsOCRParseroriginal_image = imageparse_result = parse_image_with_high_level_api(dots_parser, image, prompt_mode, fitz_preprocess)# Extract parsing resultslayout_image = parse_result['layout_image']cells_data = parse_result['cells_data']md_content = parse_result['md_content']filtered = parse_result['filtered']# Handle parsing failure caseif filtered:# JSON parsing failed, only text content is availableinfo_text = f"""
**Image Information:**
- Original Size: {original_image.width} x {original_image.height}
- Processing: JSON parsing failed, using cleaned text output
- Server: {current_config['ip']}:{current_config['port_vllm']}
- Session ID: {parse_result['session_id']}"""# Store resultsprocessing_results.update({'original_image': original_image,'processed_image': None,'layout_result': None,'markdown_content': md_content,'cells_data': None,'temp_dir': parse_result['temp_dir'],'session_id': parse_result['session_id'],'result_paths': parse_result['result_paths']})return (original_image,  # No layout imageinfo_text,md_content,md_content,  # Display raw markdown textgr.update(visible=False),  # Hide download buttonNone,  # Page info""     # Current page JSON output)# JSON parsing successful case# Save the raw markdown content (before LaTeX processing)md_content_raw = md_content or "No markdown content generated"# Store resultsprocessing_results.update({'original_image': original_image,'processed_image': None,  # High-level API does not return processed_image'layout_result': layout_image,'markdown_content': md_content,'cells_data': cells_data,'temp_dir': parse_result['temp_dir'],'session_id': parse_result['session_id'],'result_paths': parse_result['result_paths']})# Prepare display informationnum_elements = len(cells_data) if cells_data else 0info_text = f"""
**Image Information:**
- Original Size: {original_image.width} x {original_image.height}
- Model Input Size: {parse_result['input_width']} x {parse_result['input_height']}
- Server: {current_config['ip']}:{current_config['port_vllm']}
- Detected {num_elements} layout elements
- Session ID: {parse_result['session_id']}"""# Current page JSON outputcurrent_json = ""if cells_data:try:current_json = json.dumps(cells_data, ensure_ascii=False, indent=2)except:current_json = str(cells_data)# Create the download ZIP filedownload_zip_path = Noneif parse_result['temp_dir']:download_zip_path = os.path.join(parse_result['temp_dir'], f"layout_results_{parse_result['session_id']}.zip")try:with zipfile.ZipFile(download_zip_path, 'w', zipfile.ZIP_DEFLATED) as zipf:for root, dirs, files in os.walk(parse_result['temp_dir']):for file in files:if file.endswith('.zip'):continuefile_path = os.path.join(root, file)arcname = os.path.relpath(file_path, parse_result['temp_dir'])zipf.write(file_path, arcname)except Exception as e:print(f"Failed to create download ZIP: {e}")download_zip_path = Nonereturn (layout_image,info_text,md_content or "No markdown content generated",md_content_raw,  # Raw markdown textgr.update(value=download_zip_path, visible=True) if download_zip_path else gr.update(visible=False),  # Set the download fileNone,  # Page info (not displayed for image processing)current_json     # Current page JSON)except Exception as e:return None, f"Error during processing: {e}", "", "", gr.update(value=None), None, ""def process_pdf_file(pdf_path, prompt_mode):"""Dedicated function for processing PDF files"""global pdf_cache, processing_results, dots_parsertry:# First, load the PDF for previewpreview_image, page_info = load_file_for_preview(pdf_path)# Parse the PDF using DotsOCRParserpdf_result = parse_pdf_with_high_level_api(dots_parser, pdf_path, prompt_mode)# Update the PDF cachepdf_cache["is_parsed"] = Truepdf_cache["results"] = pdf_result['parsed_results']# Handle LaTeX table renderingcombined_md = pdf_result['combined_md_content']combined_md_raw = combined_md or "No markdown content generated"  # Save the raw content# Store resultsprocessing_results.update({'original_image': None,'processed_image': None,'layout_result': None,'markdown_content': combined_md,'cells_data': pdf_result['combined_cells_data'],'temp_dir': pdf_result['temp_dir'],'session_id': pdf_result['session_id'],'result_paths': None,'pdf_results': pdf_result['parsed_results']})# Prepare display informationtotal_elements = len(pdf_result['combined_cells_data'])info_text = f"""
**PDF Information:**
- Total Pages: {pdf_result['total_pages']}
- Server: {current_config['ip']}:{current_config['port_vllm']}
- Total Detected Elements: {total_elements}
- Session ID: {pdf_result['session_id']}"""# Content of the current page (first page)current_page_md = ""current_page_md_raw = ""current_page_json = ""current_page_layout_image = preview_image  # Use the original preview image by defaultif pdf_cache["results"] and len(pdf_cache["results"]) > 0:current_result = pdf_cache["results"][0]if current_result['md_content']:# Raw markdown contentcurrent_page_md_raw = current_result['md_content']# Process the content after LaTeX renderingcurrent_page_md = current_result['md_content']if current_result['cells_data']:try:current_page_json = json.dumps(current_result['cells_data'], ensure_ascii=False, indent=2)except:current_page_json = str(current_result['cells_data'])# Use the image with layout boxes (if available)if 'layout_image' in current_result and current_result['layout_image']:current_page_layout_image = current_result['layout_image']# Create the download ZIP filedownload_zip_path = Noneif pdf_result['temp_dir']:download_zip_path = os.path.join(pdf_result['temp_dir'], f"layout_results_{pdf_result['session_id']}.zip")try:with zipfile.ZipFile(download_zip_path, 'w', zipfile.ZIP_DEFLATED) as zipf:for root, dirs, files in os.walk(pdf_result['temp_dir']):for file in files:if file.endswith('.zip'):continuefile_path = os.path.join(root, file)arcname = os.path.relpath(file_path, pdf_result['temp_dir'])zipf.write(file_path, arcname)except Exception as e:print(f"Failed to create download ZIP: {e}")download_zip_path = Nonereturn (current_page_layout_image,  # Use the image with layout boxesinfo_text,combined_md or "No markdown content generated",  # Display the markdown for the entire PDFcombined_md_raw or "No markdown content generated",  # Display the raw markdown for the entire PDFgr.update(value=download_zip_path, visible=True) if download_zip_path else gr.update(visible=False),  # Set the download filepage_info,current_page_json)except Exception as e:# Reset the PDF cachepdf_cache["images"] = []pdf_cache["current_page"] = 0pdf_cache["total_pages"] = 0pdf_cache["is_parsed"] = Falsepdf_cache["results"] = []raise edef clear_all_data():"""Clears all data"""global processing_results, pdf_cache# Clean up the temporary directoryif processing_results.get('temp_dir') and os.path.exists(processing_results['temp_dir']):import shutiltry:shutil.rmtree(processing_results['temp_dir'], ignore_errors=True)except Exception as e:print(f"Failed to clean up temporary directory: {e}")# Reset processing resultsprocessing_results = {'original_image': None,'processed_image': None,'layout_result': None,'markdown_content': None,'cells_data': None,'temp_dir': None,'session_id': None,'result_paths': None,'pdf_results': None}# Reset the PDF cachepdf_cache = {"images": [],"current_page": 0,"total_pages": 0,"file_type": None,"is_parsed": False,"results": []}return (None,  # Clear file input"",    # Clear test image selectionNone,  # Clear result image"Waiting for processing results...",  # Reset info display"## Waiting for processing results...",  # Reset Markdown display"馃晲 Waiting for parsing result...",    # Clear raw Markdown textgr.update(visible=False),  # Hide download button"<div id='page_info_box'>0 / 0</div>",  # Reset page info"馃晲 Waiting for parsing result..."     # Clear current page JSON)def update_prompt_display(prompt_mode):"""Updates the prompt display content"""return dict_promptmode_to_prompt[prompt_mode]# ==================== Gradio Interface ====================
def create_gradio_interface():"""Creates the Gradio interface"""# CSS styles, matching the reference stylecss = """#parse_button {background: #FF576D !important; /* !important 紜繚瑕嗙洊涓婚榛樿鏍峰紡 */border-color: #FF576D !important;}/* 榧犳爣鎮仠鏃剁殑棰滆壊 */#parse_button:hover {background: #F72C49 !important;border-color: #F72C49 !important;}#page_info_html {display: flex;align-items: center;justify-content: center;height: 100%;margin: 0 12px;}#page_info_box {padding: 8px 20px;font-size: 16px;border: 1px solid #bbb;border-radius: 8px;background-color: #f8f8f8;text-align: center;min-width: 80px;box-shadow: 0 1px 3px rgba(0,0,0,0.1);}#markdown_output {min-height: 800px;overflow: auto;}footer {visibility: hidden;}#info_box {padding: 10px;background-color: #f8f9fa;border-radius: 8px;border: 1px solid #dee2e6;margin: 10px 0;font-size: 14px;}#result_image {border-radius: 8px;}#markdown_tabs {height: 100%;}"""with gr.Blocks(theme="ocean", css=css, title='dots.ocr') as demo:# Titlegr.HTML("""<div style="display: flex; align-items: center; justify-content: center; margin-bottom: 20px;"><h1 style="margin: 0; font-size: 2em;">馃攳 dots.ocr</h1></div><div style="text-align: center; margin-bottom: 10px;"><em>Supports image/PDF layout analysis and structured output</em></div>""")with gr.Row():# Left side: Input and Configurationwith gr.Column(scale=1, elem_id="left-panel"):gr.Markdown("### 馃摜 Upload & Select")file_input = gr.File(label="Upload PDF/Image", type="filepath", file_types=[".pdf", ".jpg", ".jpeg", ".png"],)test_images = get_test_images()test_image_input = gr.Dropdown(label="Or Select an Example",choices=[""] + test_images,value="",)gr.Markdown("### 鈿欙笍 Prompt & Actions")prompt_mode = gr.Dropdown(label="Select Prompt",choices=["prompt_layout_all_en", "prompt_layout_only_en", "prompt_ocr"],value="prompt_layout_all_en",show_label=True)# Display current prompt contentprompt_display = gr.Textbox(label="Current Prompt Content",value=dict_promptmode_to_prompt[list(dict_promptmode_to_prompt.keys())[0]],lines=4,max_lines=8,interactive=False,show_copy_button=True)with gr.Row():process_btn = gr.Button("馃攳 Parse", variant="primary", scale=2, elem_id="parse_button")clear_btn = gr.Button("馃棏鋦?Clear", variant="secondary", scale=1)with gr.Accordion("馃洜鋦?Advanced Configuration", open=False):fitz_preprocess = gr.Checkbox(label="Enable fitz_preprocess for images", value=True,info="Processes image via a PDF-like pipeline (image->pdf->200dpi image). Recommended if your image DPI is low.")with gr.Row():server_ip = gr.Textbox(label="Server IP", value=DEFAULT_CONFIG['ip'])server_port = gr.Number(label="Port", value=DEFAULT_CONFIG['port_vllm'], precision=0)with gr.Row():min_pixels = gr.Number(label="Min Pixels", value=DEFAULT_CONFIG['min_pixels'], precision=0)max_pixels = gr.Number(label="Max Pixels", value=DEFAULT_CONFIG['max_pixels'], precision=0)# Right side: Result Displaywith gr.Column(scale=6, variant="compact"):with gr.Row():# Result Imagewith gr.Column(scale=3):gr.Markdown("### 馃憗鋦?File Preview")result_image = gr.Image(label="Layout Preview",visible=True,height=800,show_label=False)# Page navigation (shown during PDF preview)with gr.Row():prev_btn = gr.Button("猬?Previous", size="sm")page_info = gr.HTML(value="<div id='page_info_box'>0 / 0</div>", elem_id="page_info_html")next_btn = gr.Button("Next 鉃?, size="sm")# Info Displayinfo_display = gr.Markdown("Waiting for processing results...",elem_id="info_box")# Markdown Resultwith gr.Column(scale=3):gr.Markdown("### 鉁旓笍 Result Display")with gr.Tabs(elem_id="markdown_tabs"):with gr.TabItem("Markdown Render Preview"):md_output = gr.Markdown("## Please click the parse button to parse or select for single-task recognition...",label="Markdown Preview",max_height=600,latex_delimiters=[{"left": "$$", "right": "$$", "display": True},{"left": "$", "right": "$", "display": False},],show_copy_button=False,elem_id="markdown_output")with gr.TabItem("Markdown Raw Text"):md_raw_output = gr.Textbox(value="馃晲 Waiting for parsing result...",label="Markdown Raw Text",max_lines=100,lines=38,show_copy_button=True,elem_id="markdown_output",show_label=False)with gr.TabItem("Current Page JSON"):current_page_json = gr.Textbox(value="馃晲 Waiting for parsing result...",label="Current Page JSON",max_lines=100,lines=38,show_copy_button=True,elem_id="markdown_output",show_label=False)# Download Buttonwith gr.Row():download_btn = gr.DownloadButton("猬囷笍 Download Results",visible=False)# When the prompt mode changes, update the display contentprompt_mode.change(fn=update_prompt_display,inputs=prompt_mode,outputs=prompt_display,show_progress=False)# Show preview on file uploadfile_input.upload(fn=load_file_for_preview,inputs=file_input,outputs=[result_image, page_info],show_progress=False)# Page navigationprev_btn.click(fn=lambda: turn_page("prev"), outputs=[result_image, page_info, current_page_json],show_progress=False)next_btn.click(fn=lambda: turn_page("next"), outputs=[result_image, page_info, current_page_json],show_progress=False)process_btn.click(fn=process_image_inference,inputs=[test_image_input, file_input,prompt_mode, server_ip, server_port, min_pixels, max_pixels, fitz_preprocess],outputs=[result_image, info_display, md_output, md_raw_output,download_btn, page_info, current_page_json],show_progress=True)clear_btn.click(fn=clear_all_data,outputs=[file_input, test_image_input,result_image, info_display, md_output, md_raw_output,download_btn, page_info, current_page_json],show_progress=False)return demo# ==================== Main Program ====================
if __name__ == "__main__":demo = create_gradio_interface()demo.queue().launch(server_name="0.0.0.0", server_port=8091, debug=True)

本文來自互聯網用戶投稿,該文觀點僅代表作者本人,不代表本站立場。本站僅提供信息存儲空間服務,不擁有所有權,不承擔相關法律責任。
如若轉載,請注明出處:http://www.pswp.cn/web/92187.shtml
繁體地址,請注明出處:http://hk.pswp.cn/web/92187.shtml
英文地址,請注明出處:http://en.pswp.cn/web/92187.shtml

如若內容造成侵權/違法違規/事實不符,請聯系多彩編程網進行投訴反饋email:809451989@qq.com,一經查實,立即刪除!

相關文章

Text2SQL 智能問答系統開發-spider驗證集(三)

概述 已完成 基礎 Text2SQL 功能實現 實現用戶輸入自然語言問題后&#xff0c;系統能夠自動生成 SQL 并執行返回結果。用戶交互優化 支持用戶通過補充信息對查詢進行調整&#xff0c;提升易用性。模糊時間處理機制 對“最近”“近期”等模糊時間關鍵詞進行補全或引導&#xf…

ElementUI常用的組件展示

文章目錄1、要使用ElementUI先導入組件庫2、自定義表頭&#xff0c;可以改為添加和批量刪除的按鈕3、Dialog模態框&#xff0c;主要用于添加和修改時展示信息4、抽屜5、消息提示&#xff1a;用于提示是否操作成功6、詢問&#xff1a;常用于詢問是否確定刪除7、批量選擇復選框8、…

在電腦上可以存儲文件并合理備份文件的工具用哪個?

每天被群消息、報表、PPT 輪番轟炸的上班族&#xff0c;最怕的不是加班&#xff0c;而是——文件突然失蹤&#xff01;別再把“CtrlS”當護身符&#xff0c;今天一口氣測完 4 款熱門“文件保險箱”&#xff0c;看看誰才真正配得上你的 Deadline。 敬業簽 首先登場的是敬業簽&am…

JavaWeb(04)

MyBatis 時一款優秀的持久層框架&#xff0c;用于簡化JDBC的開發 The MyBatis Blog 目錄 MyBatis入門Mybatis基礎CRUDMybatis動態SQL Mybatis入門 快速入門 JDBC介紹 數據庫連接池 lombok 準備工作(創建springboot工程&#xff0c;數據庫表user&#xff0c;實體類User) …

統計學1:伯努利模型的參數估計與等價性分析

伯努利模型的參數估計方法 1. 統計學習方法三要素對比方法模型策略算法極大似然估計概率模型經驗風險最小化數值解貝葉斯估計概率模型結構風險最小化解析解2. 極大似然估計 2.1 模型設定 設P(x1)θP(x1)\thetaP(x1)θ&#xff0c;則P(x0)1?θP(x0)1-\thetaP(x0)1?θ 2.2 似然…

游戲行業DDoS攻防實戰指南

一、游戲DDoS攻擊特征分析游戲行業DDoS攻擊呈現高度復合化特征&#xff0c;攻擊手段日益專業化。2023年Akamai監測數據顯示&#xff0c;63%的游戲服務器攻擊采用UDP反射放大&#xff08;如NTP、Memcached協議&#xff09;與HTTP慢速攻擊&#xff08;如Slowloris&#xff09;相結…

[自動化Adapt] 錄制引擎 | iframe 穿透 | NTP | AIOSQLite | 數據分片

鏈接&#xff1a;https://github.com/OpenAdaptAI/OpenAdapt/wiki/OpenAdapt-Architecture-(draft) docs&#xff1a;OpenAdapt OpenAdapt 是一個開源項目&#xff0c;旨在 記錄 和 回放 用戶在計算機上的交互行為。 它如同智能助手般 觀察 我們的操作&#xff08;鼠標點擊、…

ipv6學習

ipv6的歷史背景和及展望ipv6普及不夠&#xff0c;ipv4快要用完。ipv6技術部分ivp6包頭結構ipv6不允許分片&#xff0c;減輕中間設備壓力。IPv6 包頭結構可按字段分層解析&#xff0c;核心特點是 固定頭部長度&#xff08;40 字節&#xff09; &#xff0c;將可選功能移至擴展頭…

軟件定義汽車 --- 電子電氣架構的驅動

我是穿拖鞋的漢子,魔都中堅持長期主義的汽車電子工程師。 老規矩,分享一段喜歡的文字,避免自己成為高知識低文化的工程師: 做到欲望極簡,了解自己的真實欲望,不受外在潮流的影響,不盲從,不跟風。把自己的精力全部用在自己。一是去掉多余,凡事找規律,基礎是誠信;二是…

HTML5 語義元素

HTML5 語義元素 引言 HTML5 作為現代網頁開發的基礎&#xff0c;引入了許多新的語義元素&#xff0c;這些元素使得網頁內容更加結構化&#xff0c;便于搜索引擎更好地理解和索引頁面內容。本文將詳細介紹 HTML5 中的語義元素&#xff0c;并探討其在網頁設計中的應用。 HTML5…

vue3 el-select el-option 使用

在 Vue 3 中&#xff0c;el-select 是 Element Plus 組件庫中的一個選擇器組件&#xff0c;它允許用戶從下拉菜單中選擇一個或多個選項。如果你想在使用 Vue 3 和 Element Plus 時讓 el-select 支持多種選擇&#xff08;即多選&#xff09;&#xff0c;你可以通過設置 multiple…

windows搬運文件腳本

使用方法&#xff1a;copy_files_by_prefix.bat [目標目錄] [結果目錄] [文件名前綴] [可選參數&#xff1a;文件包含內容]echo off chcp 65001 >nul setlocal enabledelayedexpansion:: Check parameters if "%~3""" (echo Usage: %~nx0 [SourceDir] […

C++ 中 initializer_list 類型推導

在 C 中&#xff0c;initializer_list 是一種用于表示列表初始化的標準庫模板類&#xff0c;提供了一種方便的方式來初始化容器或者進行函數調用時傳遞一組參數。initializer_list&& 類型推導涉及到右值引用和移動語義&#xff0c;這在現代 C 中變得越來越重要。initia…

自動駕駛中的傳感器技術22——Camera(13)

1、可靠性驗證的目標車載攝像頭作為自動駕駛和高級駕駛輔助系統&#xff08;ADAS&#xff09;的核心傳感器&#xff0c;其可靠性直接影響到行車安全。可靠性驗證的目標如下&#xff1a;暴露產品缺陷&#xff1a;在研制階段&#xff0c;通過測試發現并修正產品設計中的問題&…

一周學會Matplotlib3 Python 數據可視化-圖形的組成部分

鋒哥原創的Matplotlib3 Python數據可視化視頻教程&#xff1a; 2026版 Matplotlib3 Python 數據可視化 視頻教程(無廢話版) 玩命更新中~_嗶哩嗶哩_bilibili 課程介紹 本課程講解利用python進行數據可視化 科研繪圖-Matplotlib&#xff0c;學習Matplotlib圖形參數基本設置&…

三萬字帶你了解那些年面過的Java八股文

Java基礎 1. String 和StringBuffer 和 StringBuilder的區別&#xff1f; String 字符串常量 StringBuffer 字符串變量&#xff08;線程安全&#xff09; StringBuilder 字符串變量&#xff08;非線程安全&#xff09; 2. sleep() 區間wait()區間有什么區別&#xff1f; sleep…

HTML 媒體元素概述

HTML 提供了多種元素用于嵌入和控制多媒體內容&#xff0c;包括音頻、視頻、圖像、畫布等。以下是常用的 HTML 媒體元素及其用法&#xff1a;音頻 (<audio>)<audio> 元素用于嵌入音頻內容&#xff0c;支持 MP3、WAV、OGG 等格式。 示例代碼&#xff1a;<audio c…

http請求結構體解析

copy了一個接口的curl用來說明http請求的三個結構&#xff1a;請求行&#xff0c;請求頭&#xff0c;請求體 文章目錄一、請求的curl報文示例二、解析1. 請求行&#xff08;Request Line&#xff09;2. 請求頭&#xff08;Request Headers&#xff09;3. 請求體&#xff08;Req…

無人機遙控器舵量技術解析

一、舵量的核心作用1. 精確控制的核心 舵量值&#xff08;通常以PWM微秒值表示&#xff09;量化了操作指令的強度&#xff1a; 小舵量&#xff08;1000μs&#xff09;&#xff1a;對應舵機最小角度或電機最低轉速&#xff1b; 中點&#xff08;1500μs&#xff09;&#xf…

Git分支相關命令

在 Git 中&#xff0c;分支管理是非常重要的一部分。下面是一些常用的 Git 分支操作命令及其示例。 1. 查看所有分支 要查看項目中的所有分支&#xff08;包括本地和遠程&#xff09;&#xff0c;可以使用&#xff1a; git branch -a僅查看本地分支&#xff1a;git branch2. 創…