1-First, rename the dataset images we received (dataset1: images cropped after color correction; dataset2: original images).
Code that renames the image files starting from 1.jpg:
import os
import shutil

folder_path = r'C:\Users\23608\Desktop\Luli_work\data\fanStudent\tongueseg\Fan\Fan\.jpg'
new_folder = r'C:\Users\23608\Desktop\Luli_work\data\fanStudent\tongueseg\imgOrig'

jpg_files = [f for f in os.listdir(folder_path) if f.endswith('.jpg')]
n = 1
for jpg_file in jpg_files:
    new_filename = f'{n}.jpg'
    n = n + 1
    # build the full paths of the original file and of the renamed copy
    original_path = os.path.join(folder_path, jpg_file)
    new_path = os.path.join(new_folder, new_filename)
    # copy the file into the new folder under its new name
    shutil.copy(original_path, new_path)
print("Renaming finished!")
2-Upload the preprocessed data to the server, then run the yolov8SAM code to produce the tongue masks (a rough sketch of this step is given after the path list below):
Data locations
imgOrig: /share1/luli/tongueseg/data/dataset2/imgOrig/
imgCrop: /share1/luli/tongueseg/data/dataset1/imgCrop/
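The yolov8SAM script itself is not copied here, so the following is only a minimal sketch of the idea, assuming the usual ultralytics + segment_anything setup: a YOLOv8 detector proposes a tongue bounding box, which is then given to SAM as a box prompt. The weight files yolov8_tongue.pt and sam_vit_b_01ec64.pth, the input file name, and the one-tongue-per-image assumption are placeholders, not the project's actual files.
import cv2
import numpy as np
from ultralytics import YOLO
from segment_anything import sam_model_registry, SamPredictor

det_model = YOLO('yolov8_tongue.pt')  # placeholder: tongue detector weights
sam = sam_model_registry['vit_b'](checkpoint='sam_vit_b_01ec64.pth')  # placeholder checkpoint
predictor = SamPredictor(sam)

image_bgr = cv2.imread('imgOrig/1.jpg')                      # placeholder input image
image_rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)
boxes = det_model(image_bgr)[0].boxes.xyxy.cpu().numpy()     # (N, 4) tongue boxes in xyxy format

predictor.set_image(image_rgb)
# use the first detected box as the prompt; one tongue per image is assumed here
masks, _, _ = predictor.predict(box=boxes[0], multimask_output=False)
cv2.imwrite('1_mask.png', (masks[0] * 255).astype(np.uint8))  # black-and-white binary mask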
Notes from looking into how to fine-tune SAM
How to Fine-Tune Segment Anything
1. How to Fine-Tune Segment Anything
We gave an overview of the SAM architecture in the introduction section. The image encoder has a complex architecture with many parameters. To fine-tune the model, it makes sense for us to focus on the mask decoder which is lightweight and therefore easier, faster and more memory efficient to fine-tune.
In order to fine tune SAM, we need to extract the underlying pieces of its architecture (image and prompt encoders, mask decoder). We cannot use SamPredictor.predict (link) for two reasons:
· We want to fine tune only the mask decoder
· This function calls SamPredictor.predict_torch which has the @torch.no_grad() decorator (link), which prevents us from computing gradients
Thus, we need to examine the SamPredictor.predict function and call the appropriate functions with gradient calculation enabled on the part we want to fine tune (the mask decoder). Doing this is also a good way to learn more about how SAM works.
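To make this concrete, here is a minimal sketch of one fine-tuning forward pass with the encoders frozen and only the mask decoder receiving gradients, using the interfaces of the official segment_anything package. It assumes input_image has already been preprocessed to SAM's (1, 3, 1024, 1024) input and box_torch is a (1, 4) box-prompt tensor prepared elsewhere; the checkpoint name is a placeholder.
import torch
from segment_anything import sam_model_registry

sam = sam_model_registry['vit_b'](checkpoint='sam_vit_b_01ec64.pth')  # placeholder checkpoint
optimizer = torch.optim.Adam(sam.mask_decoder.parameters(), lr=1e-4)  # only decoder params are trained

with torch.no_grad():  # keep the heavy encoders frozen
    image_embedding = sam.image_encoder(input_image)  # input_image: (1, 3, 1024, 1024), preprocessed
    sparse_emb, dense_emb = sam.prompt_encoder(points=None, boxes=box_torch, masks=None)

low_res_masks, iou_pred = sam.mask_decoder(  # gradients flow only through the mask decoder
    image_embeddings=image_embedding,
    image_pe=sam.prompt_encoder.get_dense_pe(),
    sparse_prompt_embeddings=sparse_emb,
    dense_prompt_embeddings=dense_emb,
    multimask_output=False,
)
# upscale low_res_masks to the GT mask size, compute the loss, then loss.backward(); optimizer.step()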
2. Creating a Custom Dataset
We need three things to fine tune our model: (note: the post does not actually specify the exact format of the GT masks or the dataset here)
· Images on which to draw segmentations
· Segmentation ground truth masks
· Prompts to feed into the model
Later in the code you can see that the images are PNG files and the masks are black-and-white binary images.
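The post does not spell out how the prompts are produced; a common choice, and the one I assume here, is to derive a bounding-box prompt from each ground-truth binary mask. mask_to_box and the padding value below are my own placeholder name/choice, not code from the post.
import numpy as np

def mask_to_box(mask: np.ndarray, pad: int = 5) -> np.ndarray:
    """Return an [x_min, y_min, x_max, y_max] box around the white pixels of a binary mask."""
    ys, xs = np.where(mask > 0)
    h, w = mask.shape
    x_min, x_max = max(int(xs.min()) - pad, 0), min(int(xs.max()) + pad, w - 1)
    y_min, y_max = max(int(ys.min()) - pad, 0), min(int(ys.max()) + pad, h - 1)
    return np.array([x_min, y_min, x_max, y_max])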
What I have right now are tongues annotated with labelme, i.e. JSON files, so the JSON files need to be converted into binary masks.
Three data-format conversions are involved here:
· PNG image → JSON file (i.e., first run a rough segmentation of the dataset with yolov8+SAM, then refine it manually; the manual refinement is done in labelme, whose annotation format is JSON. The mask-to-JSON code from TongueSAM is used for this step.)
· JSON file → PNG image, using labelme's own code (reference link) [labelme] Batch-convert JSON files to mask.png; the steps are as follows:
1. Use labelme to build the semantic-segmentation dataset, generating .json files, and place them all in one folder.
2. Locate the json_to_dataset.py file in the labelme installation directory (the Everything search tool can help find it),
and replace its contents with the code below:
import argparse
import json
import os
import os.path as osp
import warnings
import copy
import numpy as np
import PIL.Image
from skimage import io
import yaml
from labelme import utils


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('json_file')  # folder that holds the labelme .json annotation files
    parser.add_argument('-o', '--out', default=None)
    args = parser.parse_args()
    json_file = args.json_file
    list = os.listdir(json_file)  # list of json files
    for i in range(0, len(list)):
        path = os.path.join(json_file, list[i])  # absolute path of each json file
        filename = list[i][:-5]