Converting between COCO and VOC

Converting LabelImg YOLO-format labels to VOC-format labels, and VOC-format labels to YOLO-format labels

點亮～黑夜 2020-07-07 11:08:24
Category: 19 - Object Detection. Article tags: voc, yolo

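Contents:
1 VOC and YOLO label formats produced by LabelImg
1.1 VOC data format from LabelImg
1.2 YOLO data format from LabelImg
2 Computing the VOC-to-YOLO conversion
3 Computing the YOLO-to-VOC conversion
4 Code: converting YOLO-format labels to VOC-format labels
5 Code: converting VOC-format labels to YOLO-format labels
1 VOC and YOLO label formats produced by LabelImg
How to use the LabelImg tool itself is covered in a separate guide and is not repeated here.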

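1.1 VOC data format from LabelImg
In VOC format, the labels annotated on each image are saved to one XML file per image.

For example, for the image JPEGImage/000001.jpg annotated above, the labels are saved to Annotation/000001.xml. The contents of 000001.xml are as follows: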

<annotation>
    <folder>JPEGImage</folder>
    <filename>000000.jpg</filename>
    <path>D:\ZF\2_ZF_data\3_stamp_data\標注公章數據\JPEGImage\000000.jpg</path>
    <source>
        <database>Unknown</database>
    </source>
    <size>
        <width>500</width>
        <height>402</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented>
    <object>
        <name>circle_red</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>168</xmin>
            <ymin>2</ymin>
            <xmax>355</xmax>
            <ymax>186</ymax>
        </bndbox>
    </object>
    <object>
        <name>circle_red</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>2</xmin>
            <ymin>154</ymin>
            <xmax>208</xmax>
            <ymax>367</ymax>
        </bndbox>
    </object>
    <object>
        <name>circle_red</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>305</xmin>
            <ymin>174</ymin>
            <xmax>493</xmax>
            <ymax>364</ymax>
        </bndbox>
    </object>
</annotation>

Key information in the XML:

The image file name
The bounding-box coordinates of each object, i.e. its top-left and bottom-right corners:
xmin
ymin
xmax
ymax
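As a rough illustration of reading these fields, the sketch below parses a shortened annotation with Python's standard `xml.etree.ElementTree`. The helper name `read_voc_boxes` and the inline `SAMPLE_XML` string are assumptions made for this example; they are not part of the original scripts.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the annotation shown above (one object only), used as sample input.
SAMPLE_XML = """<annotation>
    <filename>000000.jpg</filename>
    <size><width>500</width><height>402</height><depth>3</depth></size>
    <object>
        <name>circle_red</name>
        <bndbox><xmin>168</xmin><ymin>2</ymin><xmax>355</xmax><ymax>186</ymax></bndbox>
    </object>
</annotation>"""


def read_voc_boxes(xml_text):
    """Return (filename, (width, height), [(name, xmin, ymin, xmax, ymax), ...])."""
    root = ET.fromstring(xml_text)
    filename = root.findtext("filename")
    size = root.find("size")
    width = int(size.findtext("width"))
    height = int(size.findtext("height"))
    boxes = []
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        boxes.append((
            obj.findtext("name"),
            int(bndbox.findtext("xmin")), int(bndbox.findtext("ymin")),
            int(bndbox.findtext("xmax")), int(bndbox.findtext("ymax")),
        ))
    return filename, (width, height), boxes


print(read_voc_boxes(SAMPLE_XML))
```

For a file on disk, `ET.parse(path).getroot()` could replace `ET.fromstring(xml_text)`.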
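1.2 YOLO data format from LabelImg
In YOLO format, the labels annotated on each image are saved to one txt file per image.

For example, for the image JPEGImage/000001.jpg annotated above, the labels are saved to Annotation/000001.txt (a classes.txt file is generated as well and saved as Annotation/classes.txt). The contents of 000001.txt are as follows: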

0 0.521000 0.235075 0.362000 0.450249
0 0.213000 0.645522 0.418000 0.519900
0 0.794000 0.665423 0.376000 0.470149
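Notes on the txt contents:

Each line describes one annotated object.
The first number is the class index of the object. The first class is circle_red, so its index is 0.
The next four numbers are the center coordinates of the box and its relative width and height (all normalized; the author explains the normalization in a separate blog post, and section 2 works through it).
From left to right, the five values are: class_index, x_center, y_center, w, h (the last four are normalized).
A class-name file, Annotation/classes.txt, is generated at the same time, with the following contents: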

circle_red
circle_gray
rectangle_red
rectangle_gray
fingeprint_red
fingeprint_gray
other
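To make the line layout concrete, here is a minimal sketch that splits one label line into its five fields and looks up the class name. The function name `parse_yolo_line` is hypothetical; the `CLASSES` list simply copies the classes.txt contents shown above, spelling included.

```python
# Class names in the same order as classes.txt above (index 0 is circle_red).
CLASSES = ["circle_red", "circle_gray", "rectangle_red", "rectangle_gray",
           "fingeprint_red", "fingeprint_gray", "other"]


def parse_yolo_line(line, classes):
    """Split one label line into (class_name, x, y, w, h) with float coordinates."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError("expected 5 fields, got %d" % len(parts))
    class_index = int(parts[0])
    x, y, w, h = (float(value) for value in parts[1:])
    return classes[class_index], x, y, w, h


print(parse_yolo_line("0 0.521000 0.235075 0.362000 0.450249", CLASSES))
```

The coordinates stay normalized here; turning them back into pixels needs the image width and height, which the txt file does not store.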
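2 Computing the VOC-to-YOLO conversion
An annotated VOC-format XML label file mainly stores:

The image file name
The image height, width, and number of channels (depth)
The bounding-box coordinates: xmin, ymin, xmax, ymax
For example, take an image laid out as follows:

The red frame is the full image: height=8, width=8
The blue frame is the box around the annotated object: top-left corner (xmin, ymin)=(2,2), bottom-right corner (xmax, ymax)=(6,6)

The purpose of voc_label.py is to convert VOC-format annotations into YOLO-format annotations:
VOC-format label: the actual image width and height, plus the top-left and bottom-right corners of the box
YOLO-format label: the center coordinates of the box (normalized) and its width and height (normalized)
Formulas for converting a VOC-format label into a YOLO-format label: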
Actual (pixel) coordinates of the box center (in practice, 1 is sometimes subtracted afterwards):
$$x\_center=\frac{xmax+xmin}{2}=\frac{6+2}{2}=4$$

$$y\_center=\frac{ymax+ymin}{2}=\frac{6+2}{2}=4$$

Normalized center coordinates of the box (x, y):
$$x=\frac{x\_center}{width}=\frac{4}{8}=0.5$$

$$y=\frac{y\_center}{height}=\frac{4}{8}=0.5$$

Normalized width and height of the box (w, h):
$$w=\frac{xmax-xmin}{width}=\frac{6-2}{8}=0.5$$

$$h=\frac{ymax-ymin}{height}=\frac{6-2}{8}=0.5$$
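The formulas can be checked with a few lines of Python. This is only a sketch of the arithmetic for the 8x8 example; the function name `voc_to_yolo` is an assumption, and the optional subtraction of 1 is left out.

```python
def voc_to_yolo(width, height, xmin, ymin, xmax, ymax):
    """Convert pixel corner coordinates to normalized (x, y, w, h)."""
    x_center = (xmax + xmin) / 2.0   # box center in pixels
    y_center = (ymax + ymin) / 2.0
    x = x_center / width             # normalize by the image size
    y = y_center / height
    w = (xmax - xmin) / width
    h = (ymax - ymin) / height
    return x, y, w, h


# The 8x8 image with the box (2, 2)-(6, 6) from the example above
print(voc_to_yolo(8, 8, 2, 2, 6, 6))  # (0.5, 0.5, 0.5, 0.5)
```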

3 Computing the YOLO-to-VOC conversion
VOC stores the coordinates xmin, ymin, xmax, ymax, so it is enough to derive these four values from the formulas above. Here x_center and y_center are the pixel coordinates of the box center, i.e. x_center = x · width and y_center = y · height. The derivation is:

Deriving xmin and xmax:
$$\begin{cases} xmax+xmin=2x\_center \\ xmax-xmin=w \cdot width \end{cases}$$

$$\begin{cases} 2xmax=2x\_center+w \cdot width \Rightarrow xmax=x\_center+\frac{1}{2} \cdot w \cdot width \\ 2xmin=2x\_center-w \cdot width \Rightarrow xmin=x\_center-\frac{1}{2} \cdot w \cdot width \end{cases}$$

Deriving ymin and ymax:
$$\begin{cases} ymax+ymin=2y\_center \\ ymax-ymin=h \cdot height \end{cases}$$

$$\begin{cases} 2ymax=2y\_center+h \cdot height \Rightarrow ymax=y\_center+\frac{1}{2} \cdot h \cdot height \\ 2ymin=2y\_center-h \cdot height \Rightarrow ymin=y\_center-\frac{1}{2} \cdot h \cdot height \end{cases}$$
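As a sanity check on the derivation, the sketch below applies the four resulting formulas to the normalized values of the 8x8 example and recovers the original corners. The function name `yolo_to_voc` is hypothetical, and no rounding or 1-pixel offset is applied here.

```python
def yolo_to_voc(width, height, x, y, w, h):
    """Convert normalized (x, y, w, h) back to pixel corners (xmin, ymin, xmax, ymax)."""
    x_center = x * width             # pixel-space center
    y_center = y * height
    half_w = 0.5 * w * width         # half of the box width in pixels
    half_h = 0.5 * h * height
    xmin = x_center - half_w
    ymin = y_center - half_h
    xmax = x_center + half_w
    ymax = y_center + half_h
    return xmin, ymin, xmax, ymax


# Normalized values from section 2 map back to the box (2, 2)-(6, 6)
print(yolo_to_voc(8, 8, 0.5, 0.5, 0.5, 0.5))  # (2.0, 2.0, 6.0, 6.0)
```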

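4 Code: converting YOLO-format labels to VOC-format labels
The code converts txt labels into VOC (XML) labels
The code supports several objects in one label file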
__Author__ = "Shliang"
__Email__ = "shliang0603@gmail.com"

import os
from xml.dom.minidom import Document
import cv2

'''
Signature of xml.dom.minidom.Document().writexml(), kept for reference:
def writexml(self,
             writer: Any,
             indent: str = "",
             addindent: str = "",
             newl: str = "",
             encoding: Any = None) -> None
'''

class YOLO2VOCConvert:
    def __init__(self, txts_path, xmls_path, imgs_path):
        self.txts_path = txts_path   # folder holding the YOLO-format txt label files
        self.xmls_path = xmls_path   # folder where the converted VOC-format xml labels are saved
        self.imgs_path = imgs_path   # folder holding the images; their sizes and names are stored in the xml files
        # class names, listed in the same order as the class indices used in the txt labels
        self.classes = ["shirt", "non_shirt", "western_style_clothes", "coat", "down_filled_coat",
                        "cotton", "sweater", "silk_scarf", "tie", "bow_tie"]

    # Collect every class that appears in the txt files; YOLO labels store the class as a number 0, 1, ...
    # The writer flag was meant to save the result to './Annotations/classes.txt' (that code is still commented out)
    def search_all_classes(self, writer=False):
        # read every txt label file and take the class index of each object
        all_names = set()
        txts = os.listdir(self.txts_path)
        # keep only files with the txt extension; skip classes.txt, which holds names rather than label lines
        txts = [txt for txt in txts if txt.split('.')[-1] == 'txt' and txt != 'classes.txt']
        print(len(txts), txts)
        # 11 ['0002030.txt', '0002031.txt', ... '0002039.txt', '0002040.txt']
        for txt in txts:
            txt_file = os.path.join(self.txts_path, txt)
            with open(txt_file, 'r') as f:
                for line in f:
                    fields = line.strip().split()
                    if not fields:
                        continue  # skip blank lines
                    print(fields)  # ['2', '0.506667', '0.553333', '0.490667', '0.658667']
                    all_names.add(int(fields[0]))

        print("All class labels:", all_names, "Labelled files in the dataset: %d" % len(txts))

        # Write the collected classes to './Annotations/classes.txt'
        # if writer:
        #     with open('./Annotations/classes.txt', 'w') as f:
        #         for label in all_names:
        #             f.write(str(label) + '\n')

        return list(all_names)

    def yolo2voc(self):
        # create the folder for the xml label files
        os.makedirs(self.xmls_path, exist_ok=True)

        # An earlier version read the images and the txt label files in two separate loops;
        # they are merged into the single loop below.
        # Both listings are sorted so that images and labels line up by name.
        imgs = sorted(os.listdir(self.imgs_path))
        txts = sorted(os.listdir(self.txts_path))
        # keep only txt label files and filter out classes.txt
        txts = [txt for txt in txts if txt.split('.')[-1] == 'txt' and txt.split('.')[0] != "classes"]
        print(len(txts), txts)
        # Note: the number of images must equal the number of txt label files, and the names must
        # correspond one to one (to be improved later by checking that each txt name has a matching image)
        if len(imgs) != len(txts):
            print("Image count (%d) and label count (%d) differ, nothing converted" % (len(imgs), len(txts)))
            return

        for img_name, txt_name in zip(imgs, txts):
            # read the image size
            print("Reading image:", img_name)
            img = cv2.imread(os.path.join(self.imgs_path, img_name))
            if img is None:
                print("Could not read image, skipped:", img_name)
                continue
            height_img, width_img, depth_img = img.shape
            print(height_img, width_img, depth_img)   # h is the number of rows (image height), w the number of columns (image width)

            # read the annotations from the txt label file
            all_objects = []
            txt_file = os.path.join(self.txts_path, txt_name)
            with open(txt_file, 'r') as f:
                for line in f:
                    fields = line.strip().split()
                    if not fields:
                        continue  # skip blank lines
                    all_objects.append(fields)
                    print(fields)  # ['2', '0.506667', '0.553333', '0.490667', '0.658667']

            # build the xml document
            xmlBuilder = Document()
            # create the annotation tag, which is the root tag
            annotation = xmlBuilder.createElement("annotation")
            # attach the root tag to the document
            xmlBuilder.appendChild(annotation)

            # folder tag: the folder that holds the images, e.g. JPEGImages
            folder = xmlBuilder.createElement("folder")
            folderContent = xmlBuilder.createTextNode(os.path.basename(os.path.normpath(self.imgs_path)))
            folder.appendChild(folderContent)  # put the text into the tag
            annotation.appendChild(folder)     # put the filled folder tag under the annotation root

            # filename tag: the image file name, e.g. 000250.jpg
            filename = xmlBuilder.createElement("filename")
            filenameContent = xmlBuilder.createTextNode(img_name)
            filename.appendChild(filenameContent)
            annotation.appendChild(filename)

            # size tag: store the image shape in the xml file
            size = xmlBuilder.createElement("size")
            # width child of size
            width = xmlBuilder.createElement("width")
            widthContent = xmlBuilder.createTextNode(str(width_img))  # everything stored in xml is a string
            width.appendChild(widthContent)
            size.appendChild(width)    # add width as a child of size
            # height child of size
            height = xmlBuilder.createElement("height")
            heightContent = xmlBuilder.createTextNode(str(height_img))
            height.appendChild(heightContent)
            size.appendChild(height)   # add height as a child of size
            # depth child of size
            depth = xmlBuilder.createElement("depth")
            depthContent = xmlBuilder.createTextNode(str(depth_img))
            depth.appendChild(depthContent)
            size.appendChild(depth)    # add depth as a child of size
            annotation.appendChild(size)   # add size as a child of annotation

            # each entry is one annotated object, e.g. ['2', '0.506667', '0.553333', '0.490667', '0.658667']
            for object_info in all_objects:
                # create the object tag that holds the label information of this object
                obj = xmlBuilder.createElement("object")
                # name tag: the class name looked up from the class index
                objName = xmlBuilder.createElement("name")
                objNameContent = xmlBuilder.createTextNode(self.classes[int(object_info[0])])
                objName.appendChild(objNameContent)
                obj.appendChild(objName)  # add name as a child of object

                # pose tag
                pose = xmlBuilder.createElement("pose")
                poseContent = xmlBuilder.createTextNode("Unspecified")
                pose.appendChild(poseContent)
                obj.appendChild(pose)

                # truncated tag
                truncated = xmlBuilder.createElement("truncated")
                truncatedContent = xmlBuilder.createTextNode("0")
                truncated.appendChild(truncatedContent)
                obj.appendChild(truncated)

                # difficult tag
                difficult = xmlBuilder.createElement("difficult")
                difficultContent = xmlBuilder.createTextNode("0")
                difficult.appendChild(difficultContent)
                obj.appendChild(difficult)

                # convert the coordinates first:
                # (x_center, y_center, w, h) -> (xmin, ymin, xmax, ymax)
                # the fields in object_info are strings, hence the float() calls;
                # the + 1 offset is kept from the original script
                x_center = float(object_info[1]) * width_img + 1
                y_center = float(object_info[2]) * height_img + 1
                xminVal = int(x_center - 0.5 * float(object_info[3]) * width_img)
                yminVal = int(y_center - 0.5 * float(object_info[4]) * height_img)
                xmaxVal = int(x_center + 0.5 * float(object_info[3]) * width_img)
                ymaxVal = int(y_center + 0.5 * float(object_info[4]) * height_img)

                # bndbox tag (third level) with four children (fourth level):
                # in VOC format, (xmin, ymin) is the top-left corner and (xmax, ymax) the bottom-right corner
                bndbox = xmlBuilder.createElement("bndbox")
                # 1. xmin tag
                xmin = xmlBuilder.createElement("xmin")
                xminContent = xmlBuilder.createTextNode(str(xminVal))
                xmin.appendChild(xminContent)
                bndbox.appendChild(xmin)
                # 2. ymin tag
                ymin = xmlBuilder.createElement("ymin")
                yminContent = xmlBuilder.createTextNode(str(yminVal))
                ymin.appendChild(yminContent)
                bndbox.appendChild(ymin)
                # 3. xmax tag
                xmax = xmlBuilder.createElement("xmax")
                xmaxContent = xmlBuilder.createTextNode(str(xmaxVal))
                xmax.appendChild(xmaxContent)
                bndbox.appendChild(xmax)
                # 4. ymax tag
                ymax = xmlBuilder.createElement("ymax")
                ymaxContent = xmlBuilder.createTextNode(str(ymaxVal))
                ymax.appendChild(ymaxContent)
                bndbox.appendChild(ymax)

                obj.appendChild(bndbox)
                annotation.appendChild(obj)  # add object as a child of annotation

            # write the xml file, named after the txt label file
            xml_file = os.path.join(self.xmls_path, os.path.splitext(txt_name)[0] + '.xml')
            with open(xml_file, 'w', encoding='utf-8') as f:
                xmlBuilder.writexml(f, indent='\t', newl='\n', addindent='\t', encoding='utf-8')

if __name__ == '__main__':
    txts_path1 = './Annotations_txt'
    xmls_path1 = './Annotations_xml'
    imgs_path1 = './JPEGImages'

    yolo2voc_obj1 = YOLO2VOCConvert(txts_path1, xmls_path1, imgs_path1)
    labels = yolo2voc_obj1.search_all_classes()
    print('labels: ', labels)
    yolo2voc_obj1.yolo2voc()
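The script above expects the image folder and the label folder to contain the same number of files with matching names, and its comments mention matching by file name as a later improvement. One possible way to do that is sketched below. The function `pair_by_stem` and the `IMAGE_EXTS` set are assumptions for illustration, not part of the original code.

```python
import os

IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp"}  # assumed set of image extensions


def pair_by_stem(img_names, txt_names):
    """Return sorted (image, label) pairs whose base names match; classes.txt is skipped."""
    images = {}
    for name in img_names:
        stem, ext = os.path.splitext(name)
        if ext.lower() in IMAGE_EXTS:
            images[stem] = name
    pairs = []
    for name in sorted(txt_names):
        stem, ext = os.path.splitext(name)
        if ext.lower() != ".txt" or stem == "classes":
            continue
        if stem in images:  # labels without an image are ignored
            pairs.append((images[stem], name))
    return pairs


print(pair_by_stem(["000002.jpg", "000001.jpg", "notes.md"],
                   ["classes.txt", "000001.txt", "000003.txt", "000002.txt"]))
```

With real folders the call would be `pair_by_stem(os.listdir(imgs_path), os.listdir(txts_path))`, and the resulting pairs could be used instead of zipping the two listings.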
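5 Code: converting VOC-format labels to YOLO-format labels
Code references:

Github yolov3: https://github.com/AlexeyAB/darknet/blob/master/scripts/voc_label.py
YOLO official site: https://pjreddie.com/media/files/voc_label.py
The script converts annotated VOC-format .xml label files into YOLO-format txt label files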

import os
import xml.etree.ElementTree as ET

# classes = ['hard_hat', 'other', 'regular', 'long_hair', 'braid', 'bald', 'beard']

def convert(size, box):
    # size = (width, height)    box = (xmin, xmax, ymin, ymax)
    # x_center = (xmax+xmin)/2        y_center = (ymax+ymin)/2
    # x = x_center / width            y = y_center / height
    # w = (xmax-xmin) / width         h = (ymax-ymin) / height

    x_center = (box[0] + box[1]) / 2.0
    y_center = (box[2] + box[3]) / 2.0
    x = x_center / size[0]
    y = y_center / size[1]

    w = (box[1] - box[0]) / size[0]
    h = (box[3] - box[2]) / size[1]

    # print(x, y, w, h)
    return (x, y, w, h)

def convert_annotation(xml_files_path, save_txt_files_path, classes):
    # make sure the output folder exists and only process .xml files
    os.makedirs(save_txt_files_path, exist_ok=True)
    xml_files = [name for name in os.listdir(xml_files_path) if name.lower().endswith('.xml')]
    print(xml_files)
    for xml_name in xml_files:
        print(xml_name)
        xml_file = os.path.join(xml_files_path, xml_name)
        out_txt_path = os.path.join(save_txt_files_path, os.path.splitext(xml_name)[0] + '.txt')
        tree = ET.parse(xml_file)
        root = tree.getroot()
        size = root.find('size')
        w = int(size.find('width').text)
        h = int(size.find('height').text)

        # the with block closes the txt file once all objects are written
        with open(out_txt_path, 'w') as out_txt_f:
            for obj in root.iter('object'):
                difficult = obj.find('difficult').text
                cls = obj.find('name').text
                # skip classes that are not listed and objects marked as difficult
                if cls not in classes or int(difficult) == 1:
                    continue
                cls_id = classes.index(cls)
                xmlbox = obj.find('bndbox')
                # b = (xmin, xmax, ymin, ymax)
                b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text),
                     float(xmlbox.find('ymin').text), float(xmlbox.find('ymax').text))
                print(w, h, b)
                bb = convert((w, h), b)
                out_txt_f.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')

if __name__ == "__main__":
    # test run
    # classes = ['hard_hat', 'other', 'regular', 'long_hair', 'braid', 'bald', 'beard']
    # xml_files = r'D:\ZF\1_ZF_proj\3_腳本程序\2_voc格式轉yolo格式\voc_labels'
    # save_txt_files = r'D:\ZF\1_ZF_proj\3_腳本程序\2_voc格式轉yolo格式\yolo_labels'
    # convert_annotation(xml_files, save_txt_files, classes)

    #====================================================================================================
    # Convert the VOC xml label files of the hat / hair / beard dataset into YOLO txt label files
    # 1. classes of the hat / hair / beard dataset
    classes1 = ['hard_hat', 'other', 'regular', 'long_hair', 'braid', 'bald', 'beard']
    # 2. path of the VOC-format xml label files
    xml_files1 = r'D:\ZF\2_ZF_data\19_Yolov5_dataset\VOCdevkit_hat_hair_beard_補過標簽_合并類別\VOC2007\Annotations_合并類別之后的標簽'
    # 3. path where the converted YOLO-format txt label files are stored
    save_txt_files1 = r'D:\ZF\2_ZF_data\19_Yolov5_dataset\VOCdevkit_hat_hair_beard_yolo\labels'

    convert_annotation(xml_files1, save_txt_files1, classes1)
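The script writes each value with `str()`, so the number of decimals varies, whereas the LabelImg sample in section 1.2 shows six decimals per value. If a fixed layout is wanted, a small formatter along these lines could be used when writing each line. The name `format_yolo_line` and the `precision` parameter are hypothetical.

```python
def format_yolo_line(cls_id, box, precision=6):
    """Format one YOLO label line with a fixed number of decimals per coordinate."""
    # Builds e.g. "%d %.6f %.6f %.6f %.6f" for precision=6
    fmt = "%d " + " ".join(["%." + str(precision) + "f"] * 4)
    return fmt % ((cls_id,) + tuple(box))


print(format_yolo_line(0, (0.5, 0.5, 0.5, 0.5)))  # 0 0.500000 0.500000 0.500000 0.500000
```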

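Copyright notice: this is an original article by CSDN blogger 「點亮～黑夜」, released under the CC 4.0 BY-SA license. When reposting, please include the link to the original article and this notice.
Original link: https://blog.csdn.net/weixin_41010198/article/details/107175968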