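A correction program for time-frequency image datasets: removing the white axis borders and adjusting the corresponding label values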

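When a dataset consists of time-frequency images, there can be an awkward problem: once the dataset has been built, you find that the images have white borders.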

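This does not actually prevent training, and its effect on the results is often negligible, so in most cases I simply train on the whole image. In some cases, however, the model trains well with the borders removed, while with the borders left in, certain classes perform poorly. The BPSK and Frank signals in the figure are an example.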

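At first I set the network input size to 416*416, and the detection rate for BPSK and Frank signals was very low, roughly 2 out of 10. The detection results above come from training at 640*640. The signals are detected, but very narrow signals such as BPSK and Frank get low confidence scores; see the figure below.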

In this situation the white border is no longer optional: it has to be removed. It takes up quite a large share of the image area.

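Cropping the images is not hard. The hard part is correcting the values in the labels to match. So I wrote a correction program that can both crop the images and fix the labels.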

# Author: zhouzhichao
# Date: 2025-07-04
# Purpose: crop the dataset images and adjust the matching label values

import os
from PIL import Image, ImageDraw

# origin_img_path = ""
# target_img_path = ""
# origin_label_path = ""
# target_label_path = ""

# Size of the original images, in pixels
origin_width = 875
origin_height = 656

# Crop box: top-left (x1, y1), bottom-right (x2, y2)
x1 = 114
x2 = 791
y1 = 49
y2 = 583

after_cut_width = x2 - x1
after_cut_height = y2 - y1


def cut_img(target_path):
    # Images are read from and written back to the same folder (in place)
    folder_path = target_path
    output_folder = target_path

    # Create the output folder if it does not exist
    if not os.path.exists(output_folder):
        os.makedirs(output_folder)

    # Process every image in the folder
    for filename in os.listdir(folder_path):
        if filename.endswith(('.png', '.jpg', '.jpeg')):  # adjust the formats as needed
            img_path = os.path.join(folder_path, filename)
            img = Image.open(img_path)

            # Crop region: top-left (114, 49) to bottom-right (791, 583)
            cropped_img = img.crop((x1, y1, x2, y2))

            # Save the cropped image, overwriting the original
            output_path = os.path.join(output_folder, filename)
            cropped_img.save(output_path)

    print("Cropping done!")


def modify_label(labels_folder):
    # Label files are rewritten in place
    output_folder = labels_folder

    for filename in os.listdir(labels_folder):
        if filename.endswith('.txt'):
            txt_path = os.path.join(labels_folder, filename)

            with open(txt_path, 'r') as file:
                lines = file.readlines()

            # The corrected rows are collected here
            modified_lines = []

            for line in lines:
                parts = line.strip().split()  # class_id x_center y_center width height
                if len(parts) < 5:
                    continue  # skip blank or malformed rows
                class_id = parts[0]
                x_center = float(parts[1])
                y_center = float(parts[2])
                width = float(parts[3])
                height = float(parts[4])

                # Column 2 (x_center) is set to 0.5 and column 4 (width) to 1;
                # only y_center and height are rescaled to the cropped image
                modified_y_center = (y_center * origin_height - y1) / after_cut_height
                modified_y_height = height * origin_height / after_cut_height

                # Assemble the corrected row
                modified_line = f"{class_id} {0.5:.3f} {modified_y_center:.3f} {1:.3f} {modified_y_height:.3f}\n"
                modified_lines.append(modified_line)

            # Save the corrected file
            output_txt_path = os.path.join(output_folder, filename)
            with open(output_txt_path, 'w') as output_file:
                output_file.writelines(modified_lines)


def watch(images_folder, labels_folder, output_folder):
    # Make sure the folder for the annotated images exists
    os.makedirs(output_folder, exist_ok=True)

    for filename in os.listdir(labels_folder):
        if filename.endswith('.txt'):
            txt_path = os.path.join(labels_folder, filename)

            # Name of the matching image (assumes the images are .jpg files)
            img_filename = filename.replace('.txt', '.jpg')
            img_path = os.path.join(images_folder, img_filename)

            # Open the image
            img = Image.open(img_path)
            draw = ImageDraw.Draw(img)

            # Read the label file
            with open(txt_path, 'r') as file:
                lines = file.readlines()

            # Draw one rectangle per label row
            for line in lines:
                parts = line.strip().split()
                if len(parts) < 5:
                    continue  # skip blank or malformed rows
                class_id = int(parts[0])
                x_center = float(parts[1]) * img.width
                y_center = float(parts[2]) * img.height
                width = float(parts[3]) * img.width
                height = float(parts[4]) * img.height

                # Top-left and bottom-right corners of the box, in pixels
                box_x1 = x_center - width / 2
                box_y1 = y_center - height / 2
                box_x2 = x_center + width / 2
                box_y2 = y_center + height / 2

                # Draw the box with a white outline
                draw.rectangle([box_x1, box_y1, box_x2, box_y2], outline="white", width=2)

            # Save the annotated image
            output_img_path = os.path.join(output_folder, img_filename)
            img.save(output_img_path)


if __name__ == "__main__":
    # Raw strings keep the Windows backslashes literal
    img_path = r"D:\english\yolov7\datasets_higher_cut\images\train"
    label_path = r"D:\english\yolov7\datasets_higher_cut\labels\train"
    output_img_path = r"D:\english\yolov7\datasets_higher_cut\watch"

    # cut_img(img_path)
    # modify_label(label_path)
    watch(img_path, label_path, output_img_path)

Here, cut_img crops the images, modify_label corrects the labels, and watch checks the corrected result.

① Cropping result:

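Each image was originally 875*656. The region from the top-left corner (114, 49) to the bottom-right corner (791, 583) is cropped out and saved over the original image.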
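To make the crop geometry concrete, here is a small sketch that derives the cropped size and the share of the image area that the crop removes. The numbers come from the article; the helper name `crop_stats` is made up for illustration.

```python
# Numbers taken from the article; the helper name is hypothetical.
ORIGIN_SIZE = (875, 656)          # original width, height in pixels
CROP_BOX = (114, 49, 791, 583)    # left, top, right, bottom


def crop_stats(origin_size, box):
    """Return (cropped_width, cropped_height, fraction_of_area_removed)."""
    origin_w, origin_h = origin_size
    left, top, right, bottom = box
    cropped_w = right - left
    cropped_h = bottom - top
    removed = 1 - (cropped_w * cropped_h) / (origin_w * origin_h)
    return cropped_w, cropped_h, removed


w, h, removed = crop_stats(ORIGIN_SIZE, CROP_BOX)
print(f"cropped size: {w}*{h}, area removed: {removed:.1%}")
# prints: cropped size: 677*534, area removed: 37.0%
```

With these coordinates the cropped image is 677*534, and a bit over a third of the original area was border.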

② Label correction result. Original labels:

Corrected labels:

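The main changes: column 2 of every row is set to 0.5 and column 4 is set to 1. Column 3 becomes the original value multiplied by the original image height, minus y1, then divided by the cropped image height. The last column becomes the original value multiplied by the original image height, then divided by the cropped image height.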
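The rule can be tried on a single row without touching any files. The sketch below applies the same arithmetic as modify_label to one label line. The function name `remap_label_line` and the sample row are hypothetical, chosen only for illustration; the image height and crop coordinates are the ones from the article.

```python
def remap_label_line(line, origin_height=656, y1=49, y2=583):
    """Rescale one 'class x_center y_center width height' row to the cropped image.

    Follows the article's rule: x_center is forced to 0.5 and width to 1,
    while y_center and height are converted to pixels, shifted by the top
    crop offset where needed, and renormalised by the cropped height.
    """
    class_id, _x_center, y_center, _width, height = line.split()
    cut_height = y2 - y1
    new_y_center = (float(y_center) * origin_height - y1) / cut_height
    new_height = float(height) * origin_height / cut_height
    return f"{class_id} {0.5:.3f} {new_y_center:.3f} {1:.3f} {new_height:.3f}"


# Hypothetical sample row, not taken from the real dataset
print(remap_label_line("0 0.5 0.5 0.774 0.1"))
# prints: 0 0.500 0.522 1.000 0.123
```

Forcing x_center to 0.5 and width to 1 only makes sense when every box is meant to span the full width of the cropped image, as in this dataset. For boxes that do not, the x columns would need the analogous rescaling with x1 and the cropped width.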

③ Checking the result

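Finally, one caution when using this program: do not run it on the same folder twice. Otherwise the images get cropped a second time and the label values get scrambled.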
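One possible safeguard, which is not part of the original program, is to look at an image's size before processing it: an image that still measures 875*656 has not been cropped, while one that measures 677*534 already has. The sketch below shows that check on plain (width, height) tuples. The name `crop_state` is hypothetical; with Pillow, the size would come from `Image.open(path).size`.

```python
ORIGIN_SIZE = (875, 656)          # size before cropping, from the article
CROP_BOX = (114, 49, 791, 583)    # left, top, right, bottom, from the article


def crop_state(size, origin_size=ORIGIN_SIZE, box=CROP_BOX):
    """Classify an image by its (width, height): 'original', 'cropped' or 'unknown'."""
    left, top, right, bottom = box
    cropped_size = (right - left, bottom - top)
    if tuple(size) == origin_size:
        return "original"
    if tuple(size) == cropped_size:
        return "cropped"
    return "unknown"


for size in [(875, 656), (677, 534), (640, 640)]:
    print(size, crop_state(size))
```

Under this scheme, an image and its label file would be processed together only when the state is "original", and skipped otherwise. Since the label files themselves carry no size information, the decision has to be made from the matching image before it is overwritten.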
