[Level Up - 03] The Complete Guide to YOLO11/YOLO12/YOLOv10/YOLOv8: From Theory to Hands-On Code, a Must-Read Tutorial for Beginners

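Introduction: Why Choose YOLO?

In object detection, the YOLO (You Only Look Once) family of models has long attracted attention for its efficiency and accuracy. Each new release builds on its predecessors with a range of improvements, including higher detection accuracy, faster inference, and better small-object detection.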


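Whether you are a beginner in computer vision or a developer who wants to deploy an object detection system quickly, the YOLO family is an ideal choice. This guide starts from the theoretical foundations and takes you step by step through the core principles and practical skills of the YOLO family.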

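1. Core YOLO Theory

1.1 The Basic Idea Behind YOLO

The core idea of the YOLO family is to cast object detection as a regression problem. Unlike traditional two-stage detectors (which first generate region proposals and then classify them), YOLO uses a single-stage strategy that predicts object locations and classes simultaneously in a single pass over the image.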


This design makes YOLO much faster than two-stage methods and fast enough for real-time detection.

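1.2 Key Improvements in YOLO11/12 (a brief overview; this article focuses on practice)

YOLO11 makes several key improvements over earlier versions:

New backbone: a more efficient feature-extraction network that reduces computation while improving feature representation
Improved neck: an optimized feature-fusion mechanism that strengthens multi-scale feature handling
Optimized loss function: better detection of small and occluded objects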
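Anchor-free detection head: like YOLOv8, YOLO11 predicts boxes directly instead of relying on preset anchor boxes, so no anchor parameters need to be tuned for each dataset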

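YOLO12 introduces an attention-centric architecture. It departs from the traditional CNN-based approach of earlier YOLO models while retaining the real-time inference speed that many applications require. Through novel changes to the attention mechanism and the overall network architecture, the model achieves state-of-the-art object detection accuracy while maintaining real-time performance.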

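1.3 Basic Concepts in Object Detection

Before diving into the YOLO family, we need to understand a few key concepts:

Bounding box: a rectangle that localizes an object, usually written as (x, y, w, h)
Confidence: how strongly the model believes a predicted box contains an object
Intersection over Union (IoU): a measure of how much a predicted box overlaps the ground-truth box
Non-Maximum Suppression (NMS): a post-processing step that removes duplicate detection boxes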
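
To make IoU and NMS concrete, here is a minimal pure-Python sketch. It assumes boxes are given as corner coordinates (x1, y1, x2, y2) rather than the (x, y, w, h) form above, and it implements plain greedy NMS; the function names and the 0.5 threshold are illustrative choices, not what any particular YOLO release uses internally.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (if any)
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of the boxes to keep, best score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```

For example, with boxes [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)] and scores [0.9, 0.8, 0.7], the second box overlaps the first with an IoU of about 0.68, so it is suppressed and the kept indices are 0 and 2.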

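2. Environment Setup: Configuring YOLO11/YOLO12/YOLOv10/YOLOv8 from Scratch

2.1 Hardware Requirements

The YOLO family is not extreme in its hardware demands, but for a smooth training and inference experience the following is recommended:

CPU: at least 4 cores
GPU: an NVIDIA card (RTX 3060 or better recommended, with CUDA support)
Memory: at least 8 GB (16 GB or more recommended)
Disk: at least 10 GB of free space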

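2.2 Software Installation Steps

See our earlier article (with a video walkthrough):
[Level Up - 01] A Step-by-Step Machine Vision Starter Guide: Hardware Selection + Full Installation of CUDA/cuDNN/Miniconda/PyTorch/PyCharm (with Version-Matching Tips)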

2.2.1 Installing ultralytics

The Ultralytics library bundles YOLO11/YOLO12/YOLOv10/YOLOv8. Install it with:

pip3 install ultralytics

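2.2.2 Verifying the Installation

from ultralytics import YOLO

# Load a pretrained model (the weights are downloaded on first use)
model = YOLO('yolo11n.pt')

# Log a summary of the model
model.info()

If this runs without errors and shows the model information, the installation succeeded.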
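
If the import fails, it helps to see which packages the current interpreter can actually find. The sketch below uses only the standard library. The package names in the usage section are assumptions: OpenCV in particular may be installed under a different distribution name (for example a headless build), in which case it will show as missing here even though import cv2 works.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Distribution names are assumptions; adjust them to your environment
    for name, version in installed_versions(["ultralytics", "torch", "opencv-python"]).items():
        print(f"{name}: {version or 'not installed'}")
```

Run it with the same Python interpreter you use for training, since a mismatch between environments is a common cause of "module not found" errors.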

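3. Hands-On with YOLO11/YOLO12/YOLOv10/YOLOv8: Image and Video Detection

3.1 Image Detection with a Pretrained Model

YOLO11/YOLO12/YOLOv10/YOLOv8 ship several pretrained models, from the smallest (n, nano) to the largest (x, extra-large), so you can pick one to match your needs: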

from ultralytics import YOLO
import cv2

# Load a pretrained model
model = YOLO('yolo11n.pt')  # nano: smallest and fastest
# model = YOLO('yolo11s.pt')  # small: balances speed and accuracy
# model = YOLO('yolo11m.pt')  # medium: higher accuracy
# model = YOLO('yolo11l.pt')  # large
# model = YOLO('yolo11x.pt')  # extra-large: highest accuracy, slowest

# Load a COCO-pretrained YOLO12n model
# model = YOLO("yolo12n.pt")
# Load a COCO-pretrained YOLOv10n model
# model = YOLO("yolov10n.pt")
# Load a COCO-pretrained YOLOv8n model
# model = YOLO("yolov8n.pt")

# Run detection on a single image
results = model('test.jpg')  # replace with your image path

# Process the results
for result in results:
    # Draw the detection boxes (returns a BGR NumPy array)
    annotated_img = result.plot()

    # Show the result
    cv2.imshow('YOLO11 Detection', annotated_img)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

    # Save the result
    result.save(filename='result.jpg')

3.2 Video Object Detection

YOLO11/YOLO12/YOLOv10/YOLOv8 also support detection on video files and live camera streams:

from ultralytics import YOLO
import cv2

# Load the model
model = YOLO('yolo11n.pt')

# Load a COCO-pretrained YOLO12n model
# model = YOLO("yolo12n.pt")
# Load a COCO-pretrained YOLOv10n model
# model = YOLO("yolov10n.pt")
# Load a COCO-pretrained YOLOv8n model
# model = YOLO("yolov8n.pt")

# Detection on a video file
video_path = "input.mp4"
cap = cv2.VideoCapture(video_path)

# Get the video properties
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fps = cap.get(cv2.CAP_PROP_FPS)

# Set up the output video
output_path = "output.mp4"
fourcc = cv2.VideoWriter_fourcc(*'mp4v')
out = cv2.VideoWriter(output_path, fourcc, fps, (width, height))

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break

    # Run detection on the frame
    results = model(frame)

    # Draw the detections
    annotated_frame = results[0].plot()

    # Show the frame
    cv2.imshow('YOLO Video Detection', annotated_frame)

    # Write the frame to the output video
    out.write(annotated_frame)

    # Press q to quit
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release resources
cap.release()
out.release()
cv2.destroyAllWindows()

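3.3 Real-Time Camera Detection

To run real-time detection, simply replace the video path with a camera index:

# Real-time camera detection
cap = cv2.VideoCapture(0)  # 0 is the default camera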

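4. Model Training: Training YOLO11/YOLO12/YOLOv10/YOLOv8 on a Custom Dataset

4.1 Preparing the Dataset

The YOLO family expects the dataset in a specific layout. The basic structure is: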

dataset/
├── images/
│   ├── train/
│   │   ├── img1.jpg
│   │   ├── img2.jpg
│   │   └── ...
│   └── val/
│       ├── img1.jpg
│       ├── img2.jpg
│       └── ...
└── labels/
    ├── train/
    │   ├── img1.txt
    │   ├── img2.txt
    │   └── ...
    └── val/
        ├── img1.txt
        ├── img2.txt
        └── ...
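
The tree above pairs images/<split>/<name>.jpg with labels/<split>/<name>.txt. A quick way to catch naming mistakes before training is to list the images that have no same-named label file. This is a minimal sketch under that assumption; the helper name and the set of image suffixes are illustrative. Note that an image without a label file may be intentional (for example, a background image with no objects), so treat the output as a list to review rather than a list of errors.

```python
from pathlib import Path

# Image extensions to consider; an illustrative choice, extend as needed
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def images_without_labels(dataset_root, split):
    """Return names of images in images/<split> lacking labels/<split>/<stem>.txt."""
    root = Path(dataset_root)
    image_dir = root / "images" / split
    label_dir = root / "labels" / split

    missing = []
    for image_path in sorted(image_dir.iterdir()):
        if image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip non-image files
        if not (label_dir / f"{image_path.stem}.txt").is_file():
            missing.append(image_path.name)
    return missing
```

Calling images_without_labels("dataset", "train") and images_without_labels("dataset", "val") covers both splits of the layout shown above.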

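Each image has a matching label file with one line per object, in the format:

class_id x_center y_center width height

All coordinates are normalized to the range 0-1, relative to the image width and height.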
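
Annotation tools often give boxes as pixel corners, so a small conversion is usually needed. The sketch below assumes the input box is (x1, y1, x2, y2) in pixels with the origin at the top-left corner; the function name and the six-decimal formatting are illustrative choices.

```python
def to_yolo_label(class_id, x1, y1, x2, y2, img_w, img_h):
    """Convert a pixel corner box to one normalized YOLO label line."""
    x_center = (x1 + x2) / 2 / img_w
    y_center = (y1 + y2) / 2 / img_h
    width = (x2 - x1) / img_w
    height = (y2 - y1) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"
```

For a 640x480 image, a class-0 box from (100, 200) to (300, 400) becomes the line 0 0.312500 0.625000 0.312500 0.416667.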

4.2 Creating the Configuration File

Create a YAML configuration file (for example, coco8.yaml):

# Ultralytics 🚀 AGPL-3.0 License - https://ultralytics.com/license

# COCO8 dataset (first 8 images from COCO train2017) by Ultralytics
# Documentation: https://docs.ultralytics.com/datasets/detect/coco8/
# Example usage: yolo train data=coco8.yaml
# parent
# ├── ultralytics
# └── datasets
#     └── coco8 ← downloads here (1 MB)

# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: coco8 # dataset root dir
train: images/train # train images (relative to 'path') 4 images
val: images/val # val images (relative to 'path') 4 images
test: # test images (optional)

# Classes
names:
  0: person
  1: bicycle
  2: car
  3: motorcycle
  4: airplane
  5: bus
  6: train
  7: truck
  8: boat
  9: traffic light
  10: fire hydrant
  11: stop sign
  12: parking meter
  13: bench
  14: bird
  15: cat
  16: dog
  17: horse
  18: sheep
  19: cow
  20: elephant
  21: bear
  22: zebra
  23: giraffe
  24: backpack
  25: umbrella
  26: handbag
  27: tie
  28: suitcase
  29: frisbee
  30: skis
  31: snowboard
  32: sports ball
  33: kite
  34: baseball bat
  35: baseball glove
  36: skateboard
  37: surfboard
  38: tennis racket
  39: bottle
  40: wine glass
  41: cup
  42: fork
  43: knife
  44: spoon
  45: bowl
  46: banana
  47: apple
  48: sandwich
  49: orange
  50: broccoli
  51: carrot
  52: hot dog
  53: pizza
  54: donut
  55: cake
  56: chair
  57: couch
  58: potted plant
  59: bed
  60: dining table
  61: toilet
  62: tv
  63: laptop
  64: mouse
  65: remote
  66: keyboard
  67: cell phone
  68: microwave
  69: oven
  70: toaster
  71: sink
  72: refrigerator
  73: book
  74: clock
  75: vase
  76: scissors
  77: teddy bear
  78: hair drier
  79: toothbrush

# Download script/URL (optional)
download: https://github.com/ultralytics/assets/releases/download/v0.0.0/coco8.zip
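
For your own data, the file only needs the same keys with your own paths and class names. As a rough sketch, the helper below generates such a file from a list of class names. The file name custom_data.yaml, the dataset root "dataset", and the classes "cat" and "dog" are placeholders, and the helper does no YAML escaping, so it assumes simple class names without special characters such as colons or quotes.

```python
from pathlib import Path


def build_dataset_yaml(root, class_names):
    """Build the text of a minimal dataset config in the coco8.yaml style."""
    lines = [
        f"path: {root}  # dataset root dir",
        "train: images/train  # train images (relative to 'path')",
        "val: images/val  # val images (relative to 'path')",
        "names:",
    ]
    # Class ids are assigned in list order, starting at 0
    lines += [f"  {index}: {name}" for index, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder root and class names; replace with your own
    text = build_dataset_yaml("dataset", ["cat", "dog"])
    Path("custom_data.yaml").write_text(text, encoding="utf-8")
```

The class ids written here must match the class_id values used in the label files.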

4.3 Starting Training

from ultralytics import YOLO

# Load a base model
model = YOLO('yolo11n.pt')

# Load a COCO-pretrained YOLO12n model
# model = YOLO("yolo12n.pt")
# Load a COCO-pretrained YOLOv10n model
# model = YOLO("yolov10n.pt")
# Load a COCO-pretrained YOLOv8n model
# model = YOLO("yolov8n.pt")

# Train the model
results = model.train(
    data='custom_data.yaml',      # path to the dataset config file
    epochs=50,                    # number of training epochs
    imgsz=640,                    # input image size
    batch=16,                     # batch size
    device=0,                     # GPU index; use device='cpu' to train on the CPU
    workers=4,                    # number of data-loading workers
    project='my_yolo11_project',  # project name
    name='custom_training'        # run name
)

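4.4 Monitoring Training

During training you can monitor progress in several ways:

Console output: shows the loss values, accuracy metrics, and so on for each epoch
TensorBoard: run tensorboard --logdir=my_yolo11_project/custom_training to view detailed curves
Generated plots: saved to my_yolo11_project/custom_training/results.png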
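
If you prefer numbers to plots, the per-epoch metrics can also be read programmatically. The sketch below assumes the run directory contains a results.csv with an "epoch" column and a metric column named "metrics/mAP50(B)"; both the file and the column names are assumptions based on Ultralytics versions I have seen, so check the header of your own file first. Header names are stripped because some versions pad them with spaces.

```python
import csv
import io


def best_epoch(csv_text, metric="metrics/mAP50(B)"):
    """Return (epoch, value) for the row with the highest value of `metric`."""
    reader = csv.reader(io.StringIO(csv_text))
    header = [name.strip() for name in next(reader)]
    epoch_col = header.index("epoch")    # raises ValueError if absent
    metric_col = header.index(metric)    # raises ValueError if absent

    best = None
    for row in reader:
        if not row:
            continue  # skip blank lines
        epoch = int(float(row[epoch_col]))
        value = float(row[metric_col])
        if best is None or value > best[1]:
            best = (epoch, value)
    return best
```

A typical call would pass the text of the file, for example best_epoch(open("my_yolo11_project/custom_training/results.csv").read()), assuming that file exists in your run.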

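5. Model Evaluation and Optimization

5.1 Evaluating Model Performance

After training, you can evaluate the model on the validation set:

# Evaluate the model
metrics = model.val()

# Print the evaluation metrics
print(f"mAP@0.5: {metrics.box.map50:.3f}")
print(f"mAP@0.5:0.95: {metrics.box.map:.3f}")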

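Key evaluation metrics:

mAP@0.5: mean average precision at an IoU threshold of 0.5
mAP@0.5:0.95: mean average precision averaged over IoU thresholds from 0.5 to 0.95 (in steps of 0.05)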
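
To see where these numbers come from, here is a simplified sketch of average precision (AP) for one class at one IoU threshold. It assumes the detections are already sorted by descending confidence and already marked as true or false positives at that threshold, and it uses all-point interpolation (the area under the monotone precision envelope). Evaluation tools, including the one behind model.val(), may use a different interpolation scheme, so the values need not match exactly.

```python
def average_precision(is_true_positive, num_ground_truth):
    """AP from detections sorted by descending confidence.

    is_true_positive: one boolean per detection (True = matched a ground truth).
    num_ground_truth: number of ground-truth objects of this class.
    """
    if num_ground_truth == 0:
        return 0.0

    precisions, recalls = [], []
    true_positives = 0
    for rank, hit in enumerate(is_true_positive, start=1):
        true_positives += int(hit)
        precisions.append(true_positives / rank)
        recalls.append(true_positives / num_ground_truth)

    # Precision envelope: each point takes the best precision at equal or higher recall
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Sum the area under the envelope wherever recall increases
    ap, previous_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - previous_recall) * precision
        previous_recall = recall
    return ap


def mean_over_thresholds(ap_values):
    """Average AP values computed at several IoU thresholds (e.g. 0.5 to 0.95)."""
    return sum(ap_values) / len(ap_values)
```

mAP@0.5 is then the mean of average_precision over all classes at IoU 0.5, and mAP@0.5:0.95 additionally averages over the ten IoU thresholds.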

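5.2 Model Optimization Strategies

If the model underperforms, try the following strategies:

Add training data: collect more, and more varied, samples (the most commonly used approach)
Data augmentation: apply stronger augmentation during training, for example by adjusting augmentation hyperparameters such as degrees, fliplr, and mosaic

 model.train(data='custom_data.yaml', epochs=50, degrees=10.0, fliplr=0.5, mosaic=1.0)

Adjust the learning rate: tune it based on the training curves
Use a larger model: for example, switch from yolo11n to yolo11s or larger
Train longer: increase the number of epochs
Adjust the image size: try a larger input size (such as 800 or 1024)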
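
When picking a larger input size, note that YOLO detection models downsample the image by a maximum stride of 32, so sizes are normally chosen as multiples of 32 (800 and 1024 both are). The helper below is a small illustrative sketch for rounding an arbitrary size up to the next valid value; the function name is hypothetical and the default stride of 32 is an assumption that holds for the standard detection models.

```python
import math


def round_up_to_stride(size, stride=32):
    """Round an image size up to the nearest multiple of the model stride."""
    if size <= 0 or stride <= 0:
        raise ValueError("size and stride must be positive")
    return math.ceil(size / stride) * stride
```

For example, round_up_to_stride(1000) gives 1024, while 800 is returned unchanged. Keep in mind that larger inputs increase GPU memory use, so the batch size may need to be reduced.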

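6. Model Deployment

6.1 Exporting to Other Formats

YOLO11/YOLO12/YOLOv10/YOLOv8 can be exported to several deployment formats:

# Export to ONNX
model.export(format='onnx')

# Export to TensorRT (requires TensorRT to be installed)
model.export(format='engine')

# Export to CoreML (for iOS devices)
model.export(format='coreml')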

6.2 Building a Simple Web Application

Build a simple YOLO object detection web service with Flask:

from flask import Flask, request, jsonify
from ultralytics import YOLO
import cv2
import base64
import numpy as np

app = Flask(__name__)
model = YOLO('yolo11n.pt')
# yolo12n
# model = YOLO("yolo12n.pt")
# yolov10n
# model = YOLO("yolov10n.pt")
# yolov8n
# model = YOLO("yolov8n.pt")


@app.route('/detect', methods=['POST'])
def detect():
    # Get the image data (base64-encoded, in the 'image' field of the JSON body)
    data = request.json
    img_data = base64.b64decode(data['image'])
    nparr = np.frombuffer(img_data, np.uint8)
    img = cv2.imdecode(nparr, cv2.IMREAD_COLOR)
    if img is None:
        # cv2.imdecode returns None when the bytes are not a valid image
        return jsonify({'error': 'could not decode image'}), 400

    # Run detection
    results = model(img)

    # Process the results
    detections = []
    for result in results:
        for box in result.boxes:
            x1, y1, x2, y2 = box.xyxy[0].tolist()
            conf = box.conf[0].item()
            cls = box.cls[0].item()
            detections.append({
                'class': model.names[int(cls)],
                'confidence': conf,
                'bbox': [x1, y1, x2, y2]
            })

    return jsonify({'detections': detections})


if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
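
To try the service, a client has to send the image as a base64 string in the 'image' field of a JSON body, which is what the handler above decodes. Here is a minimal client sketch using only the standard library. The URL assumes the server is running locally on port 5000 as in the code above, and the function names are illustrative.

```python
import base64
import json
import urllib.request


def build_detect_request(image_bytes, url="http://localhost:5000/detect"):
    """Build a POST request whose JSON body carries the image as base64."""
    payload = json.dumps({"image": base64.b64encode(image_bytes).decode("ascii")})
    return urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def request_detections(image_path, url="http://localhost:5000/detect"):
    """Send an image file to the service and return the list of detections."""
    with open(image_path, "rb") as image_file:
        request = build_detect_request(image_file.read(), url)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["detections"]
```

With the server running, request_detections("test.jpg") should return a list of dictionaries with the 'class', 'confidence', and 'bbox' keys produced by the handler.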