Lately I have been shuttling between two lines of work, medical aesthetics and industrial vision: portrait beautification one moment, industrial defect inspection the next, and it is fairly tiring. A project I took on recently involves defect detection. The target images are not especially large, but the tiny defects only become visible after zooming in. There are two kinds of images: 912*5000 and 1024*2048. Deep learning training puts limits on input size; my machine, for example, can handle at most 1024*1024 before it runs out of memory and training fails. That makes the 912*5000 images awkward to train on: forcibly resizing them to 912*912 risks losing the small targets, so the only option is to crop. How to crop, and how large the crops should be, depends on your own images. If they contain redundant information, for instance, consider cropping the blank regions away; in short, analyze the specific problem concretely. I won't go into which model I ultimately used for the detection; the goal here is just to summarize a few image-cropping methods, with code, for reference.
Method 1:
#include <cmath>
#include <cstdint>
#include <vector>

// Compute evenly spaced tile origins along each dimension, such that adjacent
// tiles overlap by roughly (1 - tile_step_size) and the last tile ends exactly
// at the image border.
std::vector<std::vector<int64_t>> compute_steps_for_sliding_window(
    std::vector<int64_t> image_size, std::vector<int64_t> tile_size, double tile_step_size)
{
    // Desired step between tile origins, in pixels (voxels).
    std::vector<double> target_step_sizes_in_voxels(tile_size.size());
    for (size_t i = 0; i < tile_size.size(); ++i)
        target_step_sizes_in_voxels[i] = tile_size[i] * tile_step_size;

    // Number of tiles needed per dimension to cover the whole image.
    std::vector<int64_t> num_steps(tile_size.size());
    for (size_t i = 0; i < image_size.size(); ++i)
        num_steps[i] = static_cast<int64_t>(
            std::ceil((image_size[i] - tile_size[i]) / target_step_sizes_in_voxels[i])) + 1;

    std::vector<std::vector<int64_t>> steps;
    for (size_t dim = 0; dim < tile_size.size(); ++dim) {
        // Spread num_steps[dim] origins evenly over [0, image_size - tile_size].
        int64_t max_step_value = image_size[dim] - tile_size[dim];
        double actual_step_size;
        if (num_steps[dim] > 1)
            actual_step_size = static_cast<double>(max_step_value) / (num_steps[dim] - 1);
        else
            actual_step_size = 99999999999;  // unused: a single tile starts at offset 0

        std::vector<int64_t> steps_here(num_steps[dim]);
        for (int64_t i = 0; i < num_steps[dim]; ++i)
            steps_here[i] = static_cast<int64_t>(std::round(actual_step_size * i));
        steps.push_back(steps_here);
    }
    return steps;
}
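Note that method 1 only computes the tile origins; the cropping itself is left to the caller. Below is a minimal usage sketch, assuming OpenCV is available and the tile fits inside the image; the helper name cropWithSlidingWindow is my own:

#include <opencv2/opencv.hpp>
#include <vector>

// Minimal usage sketch (assumption: tiles no larger than the image).
// steps[0] holds the y offsets (rows), steps[1] the x offsets (cols).
std::vector<cv::Mat> cropWithSlidingWindow(const cv::Mat& img, int64_t tileH, int64_t tileW,
                                           double tile_step_size)
{
    std::vector<std::vector<int64_t>> steps =
        compute_steps_for_sliding_window({img.rows, img.cols}, {tileH, tileW}, tile_step_size);
    std::vector<cv::Mat> tiles;
    for (int64_t y : steps[0])
        for (int64_t x : steps[1])
            tiles.push_back(img(cv::Rect(static_cast<int>(x), static_cast<int>(y),
                                         static_cast<int>(tileW), static_cast<int>(tileH))).clone());
    return tiles;
}

For instance, along a 5000-pixel axis with 912-pixel tiles and tile_step_size = 0.5, this works out to 10 overlapping tiles, with the first at offset 0 and the last ending exactly at the border.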
Method 2:
#include <opencv2/opencv.hpp>
#include <vector>

// Split an image into non-overlapping blockSize x blockSize tiles.
std::vector<cv::Mat> splitImageIntoBlocks(const cv::Mat& image, int blockSize) {
    std::vector<cv::Mat> blocks;
    int rows = image.rows / blockSize;
    int cols = image.cols / blockSize;
    for (int i = 0; i < rows; ++i) {
        for (int j = 0; j < cols; ++j) {
            cv::Rect roi(j * blockSize, i * blockSize, blockSize, blockSize);
            blocks.push_back(image(roi).clone());
        }
    }
    return blocks;
}
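One caveat with method 2: rows and cols come from integer division, so when the image size is not an exact multiple of blockSize, the right and bottom remainders are silently dropped, which is exactly where a small defect can hide. One possible fix is sketched below, under the assumption that the image is at least blockSize in each dimension (splitImageKeepEdges is a hypothetical name of mine): the last window in each direction is shifted back so it still ends at the border, at the cost of some duplicated pixels.

#include <algorithm>
#include <opencv2/opencv.hpp>
#include <vector>

// Variant sketch (my addition, not part of the original method): keep the
// right/bottom edges covered by clamping the last window to the border.
// All tiles stay exactly blockSize x blockSize; the edge tiles overlap
// their neighbors instead of shrinking.
std::vector<cv::Mat> splitImageKeepEdges(const cv::Mat& image, int blockSize) {
    std::vector<cv::Mat> blocks;
    for (int y = 0; y < image.rows; y += blockSize) {
        int y0 = std::min(y, image.rows - blockSize);  // clamp the last row of tiles
        for (int x = 0; x < image.cols; x += blockSize) {
            int x0 = std::min(x, image.cols - blockSize);  // clamp the last column
            blocks.push_back(image(cv::Rect(x0, y0, blockSize, blockSize)).clone());
        }
    }
    return blocks;
}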
Method 3:
#include <algorithm>
#include <iostream>
#include <opencv2/opencv.hpp>
#include <vector>

// Split an image into blockWidth x blockHeight tiles. Unlike method 2, the
// right/bottom remainders are kept as smaller tiles instead of being dropped.
int divideImage(const cv::Mat& img, int blockWidth, int blockHeight,
                std::vector<cv::Mat>& blocks)
{
    int imgWidth = img.cols;
    int imgHeight = img.rows;
    std::cout << "IMAGE SIZE: (" << imgWidth << "," << imgHeight << ")" << std::endl;

    int y0 = 0;
    while (y0 < imgHeight) {
        // Block height, clipped at the bottom border.
        int bhSize = std::min(blockHeight, imgHeight - y0);
        int x0 = 0;
        while (x0 < imgWidth) {
            // Block width, clipped at the right border.
            int bwSize = std::min(blockWidth, imgWidth - x0);
            // Crop the block.
            blocks.push_back(img(cv::Rect(x0, y0, bwSize, bhSize)).clone());
            x0 += blockWidth;
        }
        y0 += blockHeight;
    }
    return 0;
}
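Because method 3 walks the image in a fixed row-major order, the tiles are also easy to put back, for example to stitch per-tile detection visualizations into a full-size image. A companion sketch follows; mergeBlocks is my own hypothetical helper, and it assumes the blocks came from divideImage above in unchanged order and that out is preallocated to the original image size and type.

#include <opencv2/opencv.hpp>
#include <vector>

// Reassembly sketch (hypothetical helper, my addition): walk the same grid as
// divideImage and copy each tile back to its original position. The smaller
// remainder tiles are handled because the Rect uses each block's actual size.
void mergeBlocks(const std::vector<cv::Mat>& blocks, cv::Mat& out,
                 int blockWidth, int blockHeight)
{
    size_t idx = 0;
    for (int y0 = 0; y0 < out.rows; y0 += blockHeight)
        for (int x0 = 0; x0 < out.cols; x0 += blockWidth) {
            const cv::Mat& b = blocks[idx++];
            b.copyTo(out(cv::Rect(x0, y0, b.cols, b.rows)));
        }
}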
I won't describe the code details; they are straightforward enough to work through yourself. The above are the C++ implementations. A Python version is even simpler: use SAHI, a sliding-window slicing library. Just pip install it and call its sliced-prediction function.
The code is as follows:
# Arrange a detection model for testing and run sliced (sliding-window) inference.
import time

from sahi import AutoDetectionModel
from sahi.predict import get_sliced_prediction

model_path = 'runs/train/exp/weights/best.pt'
detection_model = AutoDetectionModel.from_pretrained(
    model_type='xxx',            # see the model_type mapping below
    model_path=model_path,
    confidence_threshold=0.3,
    device="cuda:0",             # or "cpu"
)

image_name = "anormal.jpg"
currentTime = time.time()
result = get_sliced_prediction(
    "test/" + image_name,
    detection_model,
    slice_height=640,
    slice_width=640,
    overlap_height_ratio=0.2,
    overlap_width_ratio=0.2,
)
result.export_visuals(export_dir="test/", file_name="output_" + image_name)  # save the visualization, output_anormal.jpg
endTime = time.time()
print("elapsed:", endTime - currentTime)
About the model_type value, which I have written as 'xxx' here: in your IDE, hold Ctrl and click the AutoDetectionModel function to jump into the corresponding class's source file. At the top of that script is the model_type mapping; pick the entry for the model you are using. For example, if you are using yolov8, replace 'xxx' with 'yolov8'.
MODEL_TYPE_TO_MODEL_CLASS_NAME = {
    "yolov8": "Yolov8DetectionModel",
    "rtdetr": "RTDetrDetectionModel",
    "mmdet": "MmdetDetectionModel",
    "yolov5": "Yolov5DetectionModel",
    "detectron2": "Detectron2DetectionModel",
    "huggingface": "HuggingfaceDetectionModel",
    "torchvision": "TorchVisionDetectionModel",
    "yolov5sparse": "Yolov5SparseDetectionModel",
    "yolonas": "YoloNasDetectionModel",
    "yolov8onnx": "Yolov8OnnxDetectionModel",
}
Then just run it. I won't go into further detail; dig into it yourself, and if anything is unclear, feel free to ask in the comments.