1. YOLO Detection Parameter Settings for Large Aspect-Ratio Targets
Here is a label .txt file from a YOLOv7 dataset:
1 0.500 0.201 1.000 0.091
2 0.500 0.402 1.000 0.150
3 0.500 0.604 1.000 0.093
0 0.500 0.804 1.000 0.217
The corresponding sample:
The aspect ratios (width/height) are: 1/0.091 = 10.99, 1/0.150 = 6.67, 1/0.093 = 10.75, 1/0.217 = 4.61.
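For reference, a minimal sketch that computes these ratios from a label file (the path is hypothetical; each YOLO label line is "class x_center y_center width height", normalized to [0, 1], so w/h equals the true aspect ratio for square images):

# hypothetical label path; computes w/h for every labeled box
with open('labels/sample.txt') as f:
    for line in f:
        cls, x, y, w, h = line.split()
        print(f'class {cls}: aspect ratio = {float(w) / float(h):.2f}')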
The program for recomputing anchors:
import utils.autoanchor as autoAC
# recompute anchors for the dataset
new_anchors = autoAC.kmean_anchors('D:\實驗室\論文\論文-多信號參數估計\實驗\YOLOv7\yolov7-main\zzc-multisignals-dataset-yolov7.yaml', 4, 416, 11, 1000, True)
print(new_anchors)
Here, 4 means clustering 4 anchor boxes, 416 is the training image size, 11 is the anchor-to-label width/height ratio threshold (it has to be large because of the extreme box shapes here), and 1000 is the number of generations for which a genetic algorithm evolves the k-means result (the k-means itself runs a fixed 30 iterations internally).
At first it threw an error:
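The same call with each positional value commented makes the meaning explicit (a sketch based on the kmean_anchors parameter order in utils/autoanchor.py; verify against your local copy):

import utils.autoanchor as autoAC

new_anchors = autoAC.kmean_anchors(
    'zzc-multisignals-dataset-yolov7.yaml',  # dataset yaml (path shortened here)
    4,     # n: number of anchors to cluster
    416,   # img_size: training image size
    11,    # thr: anchor-to-label wh ratio threshold (hyp['anchor_t'])
    1000,  # gen: generations of genetic evolution after k-means
    True,  # verbose
)
print(new_anchors)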
C:\Users\14115\.conda\envs\yolov7\python.exe "D:\實驗室\論文\論文-多信號參數估計\實驗\YOLOv7\yolov7-main\calculate anchors.py"
Scanning 'D:\english\yolov7\datasets_higher_cut\train.cache' images and labels... 400 found, 0 missing, 0 empty, 0 corrupted: 100%|██████████| 400/400 [00:00<?, ?it/s]
D:\實驗室\論文\論文-多信號參數估計\實驗\YOLOv7\yolov7-main\utils\autoanchor.py:125: RuntimeWarning: divide by zero encountered in divide
  k, dist = kmeans(wh / s, n, iter=30)  # points, mean distance
Traceback (most recent call last):
  File "D:\實驗室\論文\論文-多信號參數估計\實驗\YOLOv7\yolov7-main\calculate anchors.py", line 4, in <module>
    new_anchors = autoAC.kmean_anchors('D:\實驗室\論文\論文-多信號參數估計\實驗\YOLOv7\yolov7-main\zzc-multisignals-dataset-yolov7.yaml', 4, 416, 11, 1000, True)
  File "D:\實驗室\論文\論文-多信號參數估計\實驗\YOLOv7\yolov7-main\utils\autoanchor.py", line 125, in kmean_anchors
    k, dist = kmeans(wh / s, n, iter=30)  # points, mean distance
  File "C:\Users\14115\.conda\envs\yolov7\Lib\site-packages\scipy\_lib\_util.py", line 440, in wrapper
    return fun(*args, **kwargs)
  File "C:\Users\14115\.conda\envs\yolov7\Lib\site-packages\scipy\cluster\vq.py", line 467, in kmeans
    obs = _asarray(obs, xp=xp, check_finite=check_finite)
  File "C:\Users\14115\.conda\envs\yolov7\Lib\site-packages\scipy\_lib\_array_api.py", line 193, in _asarray
    _check_finite(array, xp)
  File "C:\Users\14115\.conda\envs\yolov7\Lib\site-packages\scipy\_lib\_array_api.py", line 109, in _check_finite
    raise ValueError(msg)
ValueError: array must not contain infs or NaNs
autoanchor: Running kmeans for 4 anchors on 1600 points...

Process finished with exit code 1
The problem turns out to be the standard-deviation whitening step inside kmean_anchors in yolov7-main/utils/autoanchor.py:
s = wh.std(0) # sigmas for whitening
k, dist = kmeans(wh / s, n, iter=30)
wh
array([[ 322.4, 23.079],
       [ 322.4, 38.049],
       [ 322.4, 23.703],
       ...,
       [ 322.4, 26.198],
       [ 322.4, 34.931],
       [ 322.4, 25.574]])
wh.shape
(1600, 2)
s
array([ 0, 8.5888])
As you can see, the standard deviation of one dimension (the width, which is identical for every box) is 0, so the usual whitening step divides by zero. The fix: detect the zero elements and assign them a small value:
s[s == 0] = 1e-8
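In context, the patched whitening step inside kmean_anchors looks like this (a sketch of the surrounding lines; check your local autoanchor.py):

s = wh.std(0)  # sigmas for whitening
s[s == 0] = 1e-8  # every box here has the same width, so std(width) = 0; avoid dividing by zero
k, dist = kmeans(wh / s, n, iter=30)  # points, mean distance
k *= s  # un-whiten the cluster centers back to pixel units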
The result:
It shows that these anchors suit my multi-signal time-frequency image data:
[[ 322.6   26.134]
 [ 323.99  32.985]
 [ 322     40.793]
 [ 322.72  47.953]]
Alternatively... if the samples in the dataset all have similar aspect ratios, you can design the anchors yourself: estimate the samples' width/height ratio and scale the default anchors proportionally.
The default anchors:
# anchors
anchors:
  - [12,16, 19,36, 40,28]  # P3/8
  - [36,75, 76,55, 72,146]  # P4/16
  - [142,110, 192,243, 459,401]  # P5/32
My samples' aspect ratios are roughly 4:1 to 11:1, so I adjusted the anchor values by eye:
# anchors
anchors:
  - [20,10, 20,8, 20,4]  # P3/8  640->80 416->52
  - [80,40, 80,16, 80,8]  # P4/16  640->40 416->26
  - [300,100, 300,60, 300,30]  # P5/32  640->20 416->13
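A quick standalone sanity check (my own sketch, not part of YOLOv7) of how well these hand-designed anchors cover the 4:1 to 11:1 label range:

# w/h ratio of each hand-designed anchor per detection scale
anchors = {
    'P3/8':  [(20, 10), (20, 8), (20, 4)],
    'P4/16': [(80, 40), (80, 16), (80, 8)],
    'P5/32': [(300, 100), (300, 60), (300, 30)],
}
for scale, whs in anchors.items():
    print(scale, [round(w / h, 1) for w, h in whs])
# P3/8  [2.0, 2.5, 5.0]
# P4/16 [2.0, 5.0, 10.0]
# P5/32 [3.0, 5.0, 10.0]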
But this configuration ran into problems...
So I set NMS to run only in the vertical direction. First, locate the non-maximum suppression function:
However, the function found by a plain search is not necessarily the one that actually executes; locating the NMS function through a breakpoint is more reliable:
Found the NMS function:
def non_max_suppression(prediction, conf_thres=0.25, iou_thres=0.45, classes=None, agnostic=False, multi_label=False,
                        labels=()):
    """Runs Non-Maximum Suppression (NMS) on inference results

    Returns:
         list of detections, on (n,6) tensor per image [xyxy, conf, cls]
    """

    nc = prediction.shape[2] - 5  # number of classes
    xc = prediction[..., 4] > conf_thres  # candidates

    # Settings
    min_wh, max_wh = 2, 4096  # (pixels) minimum and maximum box width and height
    max_det = 300  # maximum number of detections per image
    max_nms = 30000  # maximum number of boxes into torchvision.ops.nms()
    time_limit = 10.0  # seconds to quit after
    redundant = True  # require redundant detections
    multi_label &= nc > 1  # multiple labels per box (adds 0.5ms/img)
    merge = False  # use merge-NMS

    t = time.time()
    output = [torch.zeros((0, 6), device=prediction.device)] * prediction.shape[0]
    for xi, x in enumerate(prediction):  # image index, image inference
        # Apply constraints
        # x[((x[..., 2:4] < min_wh) | (x[..., 2:4] > max_wh)).any(1), 4] = 0  # width-height
        x = x[xc[xi]]  # confidence

        # Cat apriori labels if autolabelling
        if labels and len(labels[xi]):
            l = labels[xi]
            v = torch.zeros((len(l), nc + 5), device=x.device)
            v[:, :4] = l[:, 1:5]  # box
            v[:, 4] = 1.0  # conf
            v[range(len(l)), l[:, 0].long() + 5] = 1.0  # cls
            x = torch.cat((x, v), 0)

        # If none remain process next image
        if not x.shape[0]:
            continue

        # Compute conf
        if nc == 1:
            x[:, 5:] = x[:, 4:5]  # for models with one class, cls_loss is 0 and cls_conf is always 0.5,
                                  # so there is no need to multiplicate.
        else:
            x[:, 5:] *= x[:, 4:5]  # conf = obj_conf * cls_conf

        # Box (center x, center y, width, height) to (x1, y1, x2, y2)
        # at this point the probabilities of LFM and SFM are already far higher than BPSK and Frank
        box = xywh2xyxy(x[:, :4])

        # Detections matrix nx6 (xyxy, conf, cls)
        if multi_label:
            i, j = (x[:, 5:] > conf_thres).nonzero(as_tuple=False).T
            x = torch.cat((box[i], x[i, j + 5, None], j[:, None].float()), 1)
        else:  # best class only
            conf, j = x[:, 5:].max(1, keepdim=True)
            x = torch.cat((box, conf, j.float()), 1)[conf.view(-1) > conf_thres]

        # Filter by class
        if classes is not None:
            x = x[(x[:, 5:6] == torch.tensor(classes, device=x.device)).any(1)]

        # Apply finite constraint
        # if not torch.isfinite(x).all():
        #     x = x[torch.isfinite(x).all(1)]

        # Check shape
        # only the LFM and SFM classes remain here
        n = x.shape[0]  # number of boxes
        if not n:  # no boxes
            continue
        elif n > max_nms:  # excess boxes
            x = x[x[:, 4].argsort(descending=True)[:max_nms]]  # sort by confidence

        # Batched NMS
        c = x[:, 5:6] * (0 if agnostic else max_wh)  # classes
        boxes, scores = x[:, :4] + c, x[:, 4]  # boxes (offset by class), scores
        i = torchvision.ops.nms(boxes, scores, iou_thres)  # NMS
        if i.shape[0] > max_det:  # limit detections
            i = i[:max_det]
        if merge and (1 < n < 3E3):  # Merge NMS (boxes merged using weighted mean)
            # update boxes as boxes(i,4) = weights(i,n) * boxes(n,4)
            iou = box_iou(boxes[i], boxes) > iou_thres  # iou matrix
            weights = iou * scores[None]  # box weights
            x[i, :4] = torch.mm(weights, x[:, :4]).float() / weights.sum(1, keepdim=True)  # merged boxes
            if redundant:
                i = i[iou.sum(1) > 1]  # require redundancy

        output[xi] = x[i]
        if (time.time() - t) > time_limit:
            print(f'WARNING: NMS time limit {time_limit}s exceeded')
            break  # time limit exceeded

    return output
One line in it is key:
i = torchvision.ops.nms(boxes, scores, iou_thres) # NMS
If we want NMS to act only in the vertical direction, we just set x1 and x2 of every box to the far left and far right edges of the image; the IoU computed that way ignores the horizontal direction (see the corrected sketch after the placement note below).
Note that adding the constraint lines in the position below is wrong:
# Batched NMS
c = x[:, 5:6] * (0 if agnostic else max_wh)  # classes
boxes, scores = x[:, :4] + c, x[:, 4]  # boxes (offset by class), scores
boxes[:, 0] = 0    # wrong position: boxes has already been offset by c,
boxes[:, 2] = 450  # so its values no longer correspond to x
i = torchvision.ops.nms(boxes, scores, iou_thres)  # NMS
The two assignment lines must be added before the + c offset.
The + c offset is what makes NMS class-aware: boxes of different classes are pushed far apart so they never suppress each other.
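A sketch of the corrected placement (my own version, assuming the image width is 450 px as in the snippet above; cloning keeps x untouched, since output[xi] = x[i] later returns coordinates taken from x):

# Batched NMS, vertical-only variant: stretch every box to the full image
# width BEFORE adding the class offset c, so the IoU ignores the horizontal
# direction while class separation is preserved
c = x[:, 5:6] * (0 if agnostic else max_wh)  # class offsets
wide = x[:, :4].clone()  # clone so the coordinates returned in x stay intact
wide[:, 0] = 0           # x1 -> left image edge
wide[:, 2] = 450         # x2 -> right image edge (450 px in this dataset)
boxes, scores = wide + c, x[:, 4]
i = torchvision.ops.nms(boxes, scores, iou_thres)  # vertical-only NMS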
Normal boxes:
Boxes when the constraint is applied after + c:
The final result is essentially perfect:
My other blog post records the early experimental phenomena:
YOLOv7 training with 4 classes only detects 2 classes
2. Deeper Optimization of Detection Performance
2.1 Anchors with larger aspect ratios
Even with samples of larger aspect ratios, the box regression is still good (though the confidence is slightly lower, so the anchors need further optimization):
Boxes like the ones above enclose only the main lobe; below is the corresponding label:
1 0.500 0.201 1.000 0.037
2 0.500 0.402 1.000 0.150
3 0.500 0.604 1.000 0.045
0 0.500 0.804 1.000 0.217
The aspect ratios are: 1/0.037 = 27.03, 1/0.150 = 6.67, 1/0.045 = 22.22, 1/0.217 = 4.61.
Given that, the anchor aspect ratios across the feature-map scales should ideally cover roughly 5:1 up to 27:1.
My newly designed anchors:
# anchors
anchors:
  - [20,4, 20,3, 20,2]  # P3/8  640->80 416->52
  - [80,16, 80,8, 80,2]  # P4/16  640->40 416->26
  - [300,60, 300,20, 300,11]  # P5/32  640->20 416->13
The result:
The confidence still feels low.
2.2 Anchors with more prior information
Build more prior information into the anchors: the boxes are known to be about 416 px wide:
anchors:
  - [20,4, 20,3, 20,2]  # P3/8  640->80 416->52
  - [80,16, 80,8, 80,2]  # P4/16  640->40 416->26
  - [410,82, 410,15, 410,5]  # P5/32  640->20 416->13
The result:
2.3 Constraining the anchor width across all three scales
Trying another anchor scheme of mine:
anchors:
  - [410,4, 410,3, 410,2]  # P3/8  640->80 416->52
  - [410,32, 410,16, 410,8]  # P4/16  640->40 416->26
  - [410,85, 410,60, 410,45]  # P5/32  640->20 416->13
The result:
It feels... about the same as before.
2.4 Large-ratio anchors at 640×640
Trying large-ratio anchors on a 640×640 input:
parser.add_argument('--img-size', nargs='+', type=int, default=[640, 640], help='[train, test] image sizes')
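The "640->80 416->52" style comments in the anchor YAMLs above are simply img_size divided by each detection head's stride; a trivial standalone check:

# grid size = img_size / stride for each detection head (P3/8, P4/16, P5/32)
for name, stride in (('P3', 8), ('P4', 16), ('P5', 32)):
    print(name, 640 // stride, 416 // stride)
# P3 80 52
# P4 40 26
# P5 20 13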