1 卷積運算的兩個問題:
1.1 圖像邊緣信息使用少
? ? ? ? 邊緣的像素點可能只會被用一次或者2次,中間的會用的更多。
1.2 圖像被壓縮
? ? ? ? 5*5的圖像,如果經過3*3的卷積核后,大小變成3*3的。
? ? ? ? N*N的圖像,果經過F*F的卷積核后,大小變成(N-F+1)*(N-F+1)的。
為了解決這個問題,在邊緣會增加像素。像素點增加多少,取決于卷積核的尺寸和滑動的步長。
2 CNN模型:
卷積層,池化層,展開層,全連接層,如何組合成一個有效的模型。這可以參考經典的CNN模型。
一種方法,是參考經典的模型來搭建自己的模型。另外一種方法,可以利用經典模型的部分模塊對圖像進行預處理,然后用處理完的數據搭建自己的模型。
2.1 LeNet-5
輸入圖像:32*32灰度圖,1個通道
訓練參數:約60000個
隨著通道越來越深,圖像尺寸變小,通道增多;卷積和池化成對出現
2.2?AlexNet-5
原始圖像經過96個11*11的filter,步長為4變換成:(227-11)/4+1=55,圖像變化成55*55*96
55x55x96經過3*3步長為2的池化變化成:(55-3)/2+1=27,圖像變化成27*27*96
....
輸入圖像:227*227*3 RGB圖,3個通道
訓練參數:約60 000 000個
特點:
? ? ? 1 適用于識別較為復雜的彩色圖,可識別10000種類別
? ? ? 2 結構比LeNet更為復雜,使用Relu作為激活函數
2.3 VGG16
與AlexNet不同的是,在VGG16種卷積核和池化層的大小都是固定的。
輸入圖像:227*227*3 RGB圖,3個通道
訓練參數:約138 000 000個
特點:1 所有卷積核的寬和高都是3,步長是1.padding都使用same convolution
? ? ? 2 所有池化層的filter寬和高都是2,步長都是2
?? ? ?3 相比于alexnet,有更多的filter用于提取輪廓信息,具有更高的準確性
3 CNN模型的搭建
一種方法,是參考經典的模型來搭建自己的模型。
另外一種方法,可以利用經典模型的部分模塊對圖像進行預處理,然后用處理完的數據搭建自己的模型。
比如在VGG16種,可以去除掉FC層,替換成一個MLP層,然后再加一個FC層來復用原理的模型,然后吧7*7*512作為輸入給到MLP進行訓練。
4 代碼示例
4.1 建立CNN模型進行貓狗識別
1 加載數據
#load the data
from keras.preprocessing.image import ImageDataGenerator
#歸一化
train_datagen = ImageDataGenerator(rescale=1./255)
#從目錄里面導入數據
training_set = train_datagen.flow_from_directory('./cats_and_dogs_filtered/train/', target_size=(50,50), batch_size=32,class_mode='binary')
2 建立模型
#set up the cnn model
from keras.models import Sequential
from keras.layers import Conv2D, MaxPool2D, Flatten, Dense
model = Sequential()
#卷積層
model.add(Conv2D(32,(3,3), input_shape=(50,50,3), activation = 'relu'))
#池化層
model.add(MaxPool2D(pool_size=(2,2)))
#卷積層
model.add(Conv2D(32,(3,3), activation = 'relu'))
#池化層
model.add(MaxPool2D(pool_size=(2,2)))
#flattening layer
model.add(Flatten())
#FC layer
model.add(Dense(units=128, activation = 'relu'))
model.add(Dense(units=1, activation='sigmoid'))
3 參數配置
#configure the model
model.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics=['accuracy'])
4 查看模型結構
model.summary()
Model: "sequential_4" _________________________________________________________________Layer (type) Output Shape Param # =================================================================conv2d_5 (Conv2D) (None, 48, 48, 32) 896 max_pooling2d_4 (MaxPooling (None, 24, 24, 32) 0 2D) conv2d_6 (Conv2D) (None, 22, 22, 32) 9248 max_pooling2d_5 (MaxPooling (None, 11, 11, 32) 0 2D) flatten_2 (Flatten) (None, 3872) 0 dense_4 (Dense) (None, 128) 495744 dense_5 (Dense) (None, 1) 129 ================================================================= Total params: 506,017 Trainable params: 506,017 Non-trainable params: 0
5 訓練模型
#train the model
model.fit_generator(training_set, epochs=25)
6 計算訓練集的準確度
#accuracy on the training data
accuracy_train = model.evaluate(training_set)
print(accuracy_train)
7 計算測試集的準確度
#accuracy on the test data
test_set = train_datagen.flow_from_directory('./cats_and_dogs_filtered/validation/', target_size=(50,50), batch_size=32,class_mode='binary')
#accuracy on the training data
accuracy_test = model.evaluate(test_set)
print(accuracy_test)
8 網站下載圖片的測試
#load signal image
from keras.utils import load_img, img_to_array
#pic_dog = './cats_and_dogs_filtered/test_from_internet.jpg'
pic_dog = './cats_and_dogs_filtered/train/dogs/dog.1.jpg'pic_dog = load_img(pic_dog,target_size=(50,50))
pic_dog = img_to_array(pic_dog)
pic_dog = pic_dog/255
pic_dog = pic_dog.reshape(1,50,50,3)
predictions = model.predict(pic_dog)
print(predictions)# 對于二分類問題
if predictions.shape[1] == 1:result = (predictions > 0.5).astype(int)
# 對于多分類問題
else:result = np.argmax(predictions, axis=1)print(result)
4.2 改造VGG16進行識別
1 加載和預處理圖片格式,利用VGG16處理圖片
from keras.utils import load_img, img_to_array
from keras.applications.vgg16 import VGG16
from keras.applications.vgg16 import preprocess_input
import numpy as np
model_vgg = VGG16(weights='imagenet', include_top=False)
def modelProcess(img_path, model):img = load_img(img_path,target_size=(224,224))img = img_to_array(img)x = np.expand_dims(img, axis = 0)x = preprocess_input(x)x_vgg = model_vgg.predict(x)x_vgg = x_vgg.reshape(1,25088)return x_vgg
import os
folder = "./cats_and_dogs_filtered/train/cats"
dirs = os.listdir(folder)#generate path for the images
img_path = []
for i in dirs:if os.path.splitext(i)[1] == ".jpg":img_path.append(i)
img_path = [folder+"//"+i for i in img_path]#preprocess multiple images
features1 = np.zeros([len(img_path), 25088])
for i in range(len(img_path)):feature_i = modelProcess(img_path[i], model_vgg)print('preprocessed: ', img_path[i])features1[i] = feature_ifolder = "./cats_and_dogs_filtered/train/dogs"
dirs = os.listdir(folder)#generate path for the images
img_path = []
for i in dirs:if os.path.splitext(i)[1] == ".jpg":img_path.append(i)
img_path = [folder+"//"+i for i in img_path]#preprocess multiple images
features2 = np.zeros([len(img_path), 25088])
for i in range(len(img_path)):feature_i = modelProcess(img_path[i], model_vgg)print('preprocessed: ', img_path[i])features2[i] = feature_i#label the result
print(features1.shape, features2.shape)
y1 = np.zeros(1000)
y2 = np.ones(1000)X = np.concatenate((features1,features2),axis=0)
y = np.concatenate((y1,y2), axis=0)
2 分離測試和訓練數據
#split the traing and test data
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X,y,test_size=0.3,random_state=50)
print(X_train.shape, X_test.shape, X.shape)
3 建立模型
#set up the mlp model
from keras.models import Sequential
from keras.layers import Dense
model = Sequential()
model.add(Dense(units=10, activation='relu',input_dim=25088))
model.add(Dense(units=1, activation='sigmoid'))
model.summary()
Layer (type) Output Shape Param # =================================================================dense_12 (Dense) (None, 10) 250890 dense_13 (Dense) (None, 1) 11 ================================================================= Total params: 250,901 Trainable params: 250,901 Non-trainable params: 0
4 配置和訓練模型
#confg the model
model.compile(optimizer='adam', loss = 'binary_crossentropy', metrics=['accuracy'])
#train the model
model.fit(X_train, y_train, epochs=50)
5 訓練集測試
from sklearn.metrics import accuracy_score
y_train_predict_probs= model.predict(X_train)
y_train_predict = (y_train_predict_probs > 0.5).astype(int)
accuracy_train = accuracy_score(y_train,y_train_predict)
print(accuracy_train)
6?測試集測試
from sklearn.metrics import accuracy_score
y_test_predict_probs= model.predict(X_test)
y_test_predict = (y_test_predict_probs > 0.5).astype(int)
accuracy_test = accuracy_score(y_test,y_test_predict)
print(accuracy_test)
7 下載圖片測試
#load the data cat
from keras.utils import load_img, img_to_array
pic_path = './cats_and_dogs_filtered/test_from_ie_cat.jpeg'
pic_cat = load_img(pic_path,target_size=(224,224))
img = img_to_array(pic_cat)
x = np.expand_dims(img, axis = 0)
x = preprocess_input(x)
features = model_vgg.predict(x)
features = features.reshape(1,7*7*512)
result_tmp = model.predict(features)
result = (result_tmp > 0.5).astype(int)
print(result)