(6) Machine Learning for Beginners, YOLOv: Image Data Preprocessing

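(1) Machine Learning for Beginners, YOLOv: From Concept to Practice
(2) Machine Learning for Beginners, YOLOv: From Module Optimization to Engineering Deployment
(3) Machine Learning for Beginners, YOLOv: Unlocking Image Classification
(4) Machine Learning for Beginners, YOLOv: A Hands-On Guide to Image Annotation
(5) Machine Learning for Beginners, YOLOv: Data Requirements and Strategies for Coping with Too Few Images
(6) Machine Learning for Beginners, YOLOv: Image Data Preprocessing
(7) Machine Learning for Beginners, YOLOv: Model Training in Detail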

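Before you use a YOLOv model for object detection, image data preprocessing is a crucial step: it largely determines how well the trained model performs and whether it generalizes to real-world scenes. Below I walk through the data preprocessing techniques and steps for YOLOv in detail, to help you prepare your training data.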

I. YOLO Dataset Structure Requirements

1. Common folder structure (for example, when using the darknet format):

yolov_dataset/
│
├── images/             # images
│   ├── train/
│   └── val/
│
└── labels/             # matching label files, one .txt file per image
    ├── train/
    └── val/
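The layout above can be created with a few lines of Python. This is a minimal sketch: the function name `create_dataset_skeleton` is hypothetical, and the root folder name is whatever you pass in.

```python
from pathlib import Path


def create_dataset_skeleton(root):
    """Create images/{train,val} and labels/{train,val} under root.

    Returns the list of folders, in creation order.
    """
    root = Path(root)
    created = []
    for kind in ("images", "labels"):
        for split in ("train", "val"):
            folder = root / kind / split
            # exist_ok=True makes the call safe to repeat on an existing dataset
            folder.mkdir(parents=True, exist_ok=True)
            created.append(folder)
    return created
```

Calling `create_dataset_skeleton("yolov_dataset")` produces the four leaf folders shown in the tree.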

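2. Image naming requirements:

Store all images in .jpg or .png format.

Each image in train/ or val/ must have a label file with the same base name, for example:

images/train/1.jpg
labels/train/1.txt
images/train/2.jpg
labels/train/2.txt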
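A quick way to catch naming mismatches is to compare the base names in the two folders. The sketch below assumes the images/ and labels/ layout described earlier; `find_unpaired` is a hypothetical helper name. Some training tools treat an image without a label file as a background image, so a missing label is not always an error, but it is worth knowing about.

```python
from pathlib import Path

IMAGE_EXTS = {".jpg", ".png"}


def find_unpaired(root, split="train"):
    """Return (images without a label file, label files without an image).

    Both lists contain base names (file names without extension), sorted.
    """
    root = Path(root)
    image_stems = {
        p.stem
        for p in (root / "images" / split).iterdir()
        if p.suffix.lower() in IMAGE_EXTS
    }
    label_stems = {p.stem for p in (root / "labels" / split).glob("*.txt")}
    return sorted(image_stems - label_stems), sorted(label_stems - image_stems)
```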

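3. Label file format:

Each image's label .txt file contains one line per object in the image. Every line has the following structure:

class_id x_center y_center width height

x_center, y_center, width, height: the box center and box size, normalized to the range 0~1 (x values are divided by the image width, y values by the image height).
class_id: the index of the object's class in the class list (starting from 0).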
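To make the format concrete, here is a small worked example that converts one label line back into pixel corner coordinates. The function name `yolo_line_to_pixel_box` is hypothetical, and the 640x480 image size in the usage note is just an illustration.

```python
def yolo_line_to_pixel_box(line, image_width, image_height):
    """Convert 'class_id x_center y_center width height' (normalized)
    into (class_id, (x_min, y_min, x_max, y_max)) in pixels."""
    parts = line.split()
    class_id = int(parts[0])
    x_center, y_center, width, height = (float(v) for v in parts[1:5])
    x_min = (x_center - width / 2) * image_width
    y_min = (y_center - height / 2) * image_height
    x_max = (x_center + width / 2) * image_width
    y_max = (y_center + height / 2) * image_height
    return class_id, (x_min, y_min, x_max, y_max)
```

For the line `0 0.5 0.5 0.25 0.5` on a 640x480 image, the box is centered in the image, 160 pixels wide and 240 pixels tall, so the corners come out as (240, 120) and (400, 360).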

II. YOLO Data Preprocessing Techniques and Steps

1. Image standardization (Resize Image)

Resize every image to the input size used for model training, for example:

from PIL import Image

def resize_image(img, target_size=(640, 640)):
    # 640x640 is a common YOLOv5 input resolution
    return img.resize(target_size)
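A plain resize to a square stretches non-square images. Normalized labels remain valid under such a stretch, but the objects are distorted. A common alternative in YOLO pipelines is "letterboxing": scale the image to fit while keeping its aspect ratio, then pad the rest. The sketch below shows only the arithmetic, in pure Python, assuming centered padding and a square target; the names `letterbox_params` and `letterbox_box` are hypothetical and not taken from any library.

```python
def letterbox_params(src_w, src_h, target=640):
    """Scale factor, resized size, and per-side padding for a square target."""
    scale = min(target / src_w, target / src_h)
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    pad_x = (target - new_w) / 2  # padding added on the left (and right)
    pad_y = (target - new_h) / 2  # padding added on the top (and bottom)
    return scale, new_w, new_h, pad_x, pad_y


def letterbox_box(box, src_w, src_h, target=640):
    """Map a normalized (x_center, y_center, width, height) box from the
    source image to the letterboxed square image (still normalized)."""
    xc, yc, w, h = box
    _, new_w, new_h, pad_x, pad_y = letterbox_params(src_w, src_h, target)
    return (
        (xc * new_w + pad_x) / target,
        (yc * new_h + pad_y) / target,
        w * new_w / target,
        h * new_h / target,
    )
```

For a 1280x720 image and a 640 target, the image is scaled to 640x360 and padded by 140 pixels at the top and bottom, so a box that covered half the source height covers a smaller fraction of the padded square.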

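2. Pixel normalization (Normalize Pixel)

During YOLO training, images are generally normalized as follows:

import numpy as np

def normalize(img):
    # Convert the image to an np.array and scale pixel values to the [0, 1] range
    img_array = np.array(img) / 255.0
    return img_array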

3. Label normalization (Label Normalization)

If the annotation file stores x_center, y_center, width, height in pixels, normalize them by the image size, for example:

def normalize_label(label_path, image_width, image_height):
    labels = []
    with open(label_path, 'r') as f:
        lines = f.readlines()
    for line in lines:
        parts = line.strip().split()
        if not parts:
            continue  # skip blank lines
        class_id, x_center, y_center, width, height = map(float, parts)
        # Normalize to 0~1
        x_center_norm = x_center / image_width
        y_center_norm = y_center / image_height
        width_norm = width / image_width
        height_norm = height / image_height
        labels.append(f"{int(class_id)} {x_center_norm:.6f} {y_center_norm:.6f} {width_norm:.6f} {height_norm:.6f}")
    return labels
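After normalizing, it helps to sanity-check the label lines before training, since values outside 0~1 usually mean the coordinates are still in pixels or were divided by the wrong dimension. This is a minimal sketch; `validate_label_line` and its `num_classes` parameter are hypothetical names, and the checks are illustrative rather than exhaustive.

```python
def validate_label_line(line, num_classes):
    """Return a list of problems found in one label line (empty if none)."""
    parts = line.split()
    if len(parts) != 5:
        return [f"expected 5 fields, got {len(parts)}"]
    try:
        class_id = int(parts[0])
        coords = [float(v) for v in parts[1:]]
    except ValueError:
        return ["class_id is not an integer or a coordinate is not a number"]

    problems = []
    if not 0 <= class_id < num_classes:
        problems.append("class_id out of range")
    if any(not 0.0 <= v <= 1.0 for v in coords):
        problems.append("coordinate outside [0, 1]")
    if coords[2] <= 0 or coords[3] <= 0:
        problems.append("non-positive width or height")
    return problems
```

A line such as `0 320 240 100 50` is flagged because its coordinates are still in pixels.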

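4. Image augmentation (Image Augmentation) (optional, but recommended)

Image augmentation is a powerful way to improve a model's ability to generalize. You can augment your data as follows:

Data augmentation with `albumentations`:

import albumentations as A
import numpy as np

# Flips and rotations move the objects, so the boxes are passed through
# bbox_params and transformed together with the image.
transform = A.Compose([
    A.HorizontalFlip(p=0.5),
    A.RandomBrightnessContrast(p=0.2),
    A.Rotate(limit=15, p=0.5),
], bbox_params=A.BboxParams(format='yolo', label_fields=['class_labels']))

# A.Cutout is deprecated in favor of A.CoarseDropout and missing from recent releases.
# The hole-size argument names differ between versions, so the defaults are used here.
dropout = A.CoarseDropout(p=0.3)

def augment_image(img, bboxes, class_labels):
    out = transform(image=np.array(img), bboxes=bboxes, class_labels=class_labels)
    image = dropout(image=out['image'])['image']
    return image, out['bboxes'], out['class_labels']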
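For object detection, any geometric augmentation has to be applied to the boxes as well as to the pixels, otherwise the labels no longer match the image. The simplest case is a horizontal flip: in YOLO's normalized format only x_center changes, becoming 1 - x_center. The pure-Python sketch below illustrates that rule; `hflip_yolo_boxes` is a hypothetical helper, not part of any library.

```python
def hflip_yolo_boxes(boxes):
    """Mirror normalized YOLO boxes left-to-right.

    Each box is (class_id, x_center, y_center, width, height).
    Width, height and y_center are unchanged by a horizontal flip.
    """
    return [(cls, 1.0 - xc, yc, w, h) for cls, xc, yc, w, h in boxes]
```

A box centered a quarter of the way from the left edge ends up a quarter of the way from the right edge, and flipping twice returns the original box.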

5. Splitting the dataset (Train/Val/Test)

Split the data into a training set and a validation set with sklearn or your own method:

from sklearn.model_selection import train_test_split

# Assume images_list is your list of image file names
train_files, val_files = train_test_split(images_list, test_size=0.2, random_state=42)
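If you also want a held-out test set, or prefer not to depend on sklearn, a three-way split takes only a few lines with the standard library. This is a sketch under assumed ratios (80/10/10) with a fixed seed for reproducibility; `split_dataset` and its parameters are hypothetical names.

```python
import random


def split_dataset(files, val_ratio=0.1, test_ratio=0.1, seed=42):
    """Shuffle file names deterministically and split into train/val/test."""
    files = sorted(files)  # sort first so the result does not depend on input order
    rng = random.Random(seed)
    rng.shuffle(files)
    n_val = int(len(files) * val_ratio)
    n_test = int(len(files) * test_ratio)
    val = files[:n_val]
    test = files[n_val:n_val + n_test]
    train = files[n_val + n_test:]
    return train, val, test
```

Split by image base name and move each image together with its label file, so that the pairs stay in the same subset.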

III. Generating Labels with LabelImg or Similar Tools (Optional)

You can also write a script that converts .xml annotation files into label .txt files that YOLO can read:

import xml.etree.ElementTree as ET

def convert_xml_to_yolo(xml_path, img_w, img_h):
    tree = ET.parse(xml_path)
    root = tree.getroot()
    labels = []
    for obj in root.findall('object'):
        class_name = obj.find('name').text
        class_id = 0  # placeholder: map class_name to the index in your own class list
        box = obj.find('bndbox')
        x_min = int(box.find('xmin').text)
        y_min = int(box.find('ymin').text)
        x_max = int(box.find('xmax').text)
        y_max = int(box.find('ymax').text)
        width = x_max - x_min
        height = y_max - y_min
        # Box center and size, normalized by the image dimensions
        xc = (x_min + x_max) / 2 / img_w
        yc = (y_min + y_max) / 2 / img_h
        w = width / img_w
        h = height / img_h
        labels.append(f"{class_id} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}")
    return labels
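The class index has to come from your own class list. One possible way to fill it in, sketched below, is to look up each object's name in an ordered list of class names and to read the image size from the `<size>` element that Pascal VOC style XML files normally contain. The function name `voc_objects_to_yolo`, the `class_names` parameter, and the choice to skip unknown class names are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def voc_objects_to_yolo(xml_text, class_names):
    """Convert Pascal VOC style XML text into YOLO label lines.

    class_names is an ordered list; a class's position is its class_id.
    Objects whose name is not in class_names are skipped.
    """
    root = ET.fromstring(xml_text)
    size = root.find("size")
    img_w = int(size.find("width").text)
    img_h = int(size.find("height").text)

    lines = []
    for obj in root.findall("object"):
        name = obj.find("name").text
        if name not in class_names:
            continue
        class_id = class_names.index(name)
        box = obj.find("bndbox")
        x_min, y_min, x_max, y_max = (
            float(box.find(tag).text) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        xc = (x_min + x_max) / 2 / img_w
        yc = (y_min + y_max) / 2 / img_h
        w = (x_max - x_min) / img_w
        h = (y_max - y_min) / img_h
        lines.append(f"{class_id} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}")
    return lines
```

With `class_names = ["cat", "dog"]`, a "dog" object spanning (160, 120) to (480, 360) in a 640x480 image becomes `1 0.500000 0.500000 0.500000 0.500000`.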

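IV. Complete Preprocessing Flow (Optional)

[Flowchart: complete preprocessing flow]

You can preprocess your training data with the steps covered above: organize the dataset folders, resize the images, normalize the pixel values, normalize the labels, optionally augment the data, and split it into training and validation sets.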

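V. Summary

Step	Description
Image standardization	Resize images to a uniform size
Label processing	Normalize the coordinates in the label .txt files to [0, 1]
Data augmentation (optional)	Increase diversity through rotation, flipping, brightness changes, and similar transforms
Train/validation split	Hold out data to evaluate performance and check how well the model generalizes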
