Reference article: cs231n assignment1 - SVM
SVM
In the training phase, the goal is to find suitable W and b. To do this, we introduce a loss function and then train the model with gradient descent.
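Concretely, the loss implemented below is the multiclass SVM (hinge) loss. For the i-th example with score vector s = x_i W and correct label y_i, the per-example loss and the full regularized objective are

$$L_i = \sum_{j \neq y_i} \max\bigl(0,\; s_j - s_{y_i} + 1\bigr), \qquad L = \frac{1}{N}\sum_{i=1}^{N} L_i + \frac{\lambda}{2}\lVert W \rVert_2^2$$

where N is the number of training examples and λ is the regularization strength (reg in the code).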
import numpy as np

def svm_loss_naive(W, X, y, reg):
    # Initialize the gradient as zero
    dW = np.zeros(W.shape)

    # Compute the loss and the gradient
    num_classes = W.shape[1]
    num_train = X.shape[0]
    loss = 0.0
    for i in range(num_train):
        # Class scores for the i-th example: x_i * W
        score = X[i].dot(W)
        correct_score = score[y[i]]
        for j in range(num_classes):
            # Skip the correct class
            if j == y[i]:
                continue
            # Hinge margin: s_j - s_{y_i} + 1
            margin = score[j] - correct_score + 1
            if margin > 0:
                loss += margin
                # Accumulate the gradient for margins that are violated
                dW[:, j] += X[i]
                dW[:, y[i]] -= X[i]

    # Average loss over the training set
    loss /= num_train
    # Add regularization 0.5 * reg * ||W||^2 to the loss
    # (the 0.5 keeps the gradient of this term at simply reg * W)
    loss += 0.5 * reg * np.sum(W * W)

    # Average the gradient and add the regularization gradient
    dW /= num_train
    dW += reg * W
    return loss, dW
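A minimal sketch of how this naive version can be exercised, assuming a small random problem (the 3073-dimensional inputs and 10 classes mirror the CIFAR-10 setup of the assignment; the variable names here are illustrative):

np.random.seed(0)
num_train, dim, num_classes = 500, 3073, 10
X_dev = np.random.randn(num_train, dim)
y_dev = np.random.randint(num_classes, size=num_train)
W = np.random.randn(dim, num_classes) * 0.0001

loss, grad = svm_loss_naive(W, X_dev, y_dev, reg=0.000005)
# With near-zero W all scores are close to 0, so each of the 9 incorrect
# classes contributes a margin of about 1 and the loss should be close to 9
print("loss: %f" % loss)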
Computing the loss function in vectorized form
def svm_loss_vectorized(W, X, y, reg):
    loss = 0.0
    dW = np.zeros(W.shape)
    num_train = X.shape[0]

    # Class scores for all examples at once: [N, C]
    score = X.dot(W)
    # Scores of the correct classes, reshaped to [N, 1] so they broadcast
    # against the [N, C] score matrix (arrays of different shapes cannot be
    # added or subtracted directly)
    correct_scores = score[range(num_train), list(y)].reshape(-1, 1)
    margin = np.maximum(0, score - correct_scores + 1)
    # The correct class must not contribute to the loss
    margin[range(num_train), list(y)] = 0

    # Average loss plus regularization
    loss = np.sum(margin) / num_train
    loss += 0.5 * reg * np.sum(W * W)

    # Gradient: entries with positive margin become 1, the rest stay 0
    margin[margin > 0] = 1
    # Each correct-class column receives minus the number of classes that
    # violated the margin for that example
    margin[range(num_train), y] -= np.sum(margin, axis=1)
    dW = X.T.dot(margin)
    dW = dW / num_train
    dW = dW + reg * W
    return loss, dW
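To sanity-check the vectorized version, its loss and gradient can be compared against the naive implementation (reusing the hypothetical W, X_dev, y_dev from the sketch above); the two should agree up to floating-point error:

loss_naive, grad_naive = svm_loss_naive(W, X_dev, y_dev, 0.000005)
loss_vec, grad_vec = svm_loss_vectorized(W, X_dev, y_dev, 0.000005)

print("loss difference: %e" % abs(loss_naive - loss_vec))
print("gradient difference: %e" % np.linalg.norm(grad_naive - grad_vec, ord="fro"))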
Optimizing the loss function with SGD
We use mini-batch stochastic gradient descent to update the parameters: at each iteration, batch_size examples are sampled at random and used to update W and b.
def train(self, X, y, learning_rate, reg, num_iters, batch_size, verbose=False):
    num_train = X.shape[0]
    loss_history = []
    for it in range(num_iters):
        X_batch = None
        y_batch = None

        # Sample batch_size examples (with replacement) for this update
        idxs = np.random.choice(num_train, batch_size, replace=True)
        X_batch = X[idxs]
        y_batch = y[idxs]

        # Evaluate the loss and gradient on the mini-batch
        loss, grad = self.loss(X_batch, y_batch, reg)
        loss_history.append(loss)

        # Gradient descent parameter update
        self.W -= learning_rate * grad

        if verbose and it % 100 == 0:
            print("iteration %d / %d: loss %f" % (it, num_iters, loss))
    return loss_history
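After training, prediction is just an argmax over the class scores. A minimal sketch of the matching predict method, assuming it lives in the same linear classifier class as the training loop above:

def predict(self, X):
    # Scores for all examples are [N, C]; the prediction is the highest-scoring class
    scores = X.dot(self.W)
    y_pred = np.argmax(scores, axis=1)
    return y_pred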
Tuning hyperparameters with cross-validation
To find good hyperparameters, we can split the full training set into a training set and a validation set, and then pick the combination of hyperparameters that achieves the highest accuracy on the validation set.
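A minimal sketch of that search, assuming a LinearSVM class that exposes the train and predict methods sketched above, pre-split arrays X_train/y_train and X_val/y_val, and purely illustrative grids of learning rates and regularization strengths:

import numpy as np

learning_rates = [1e-7, 5e-5]                # illustrative values, not tuned
regularization_strengths = [2.5e4, 5e4]

results = {}
best_val = -1
best_svm = None

for lr in learning_rates:
    for reg in regularization_strengths:
        svm = LinearSVM()
        svm.train(X_train, y_train, learning_rate=lr, reg=reg,
                  num_iters=1500, batch_size=200, verbose=False)
        train_acc = np.mean(svm.predict(X_train) == y_train)
        val_acc = np.mean(svm.predict(X_val) == y_val)
        results[(lr, reg)] = (train_acc, val_acc)
        # Keep the model that does best on the validation set
        if val_acc > best_val:
            best_val = val_acc
            best_svm = svm

for (lr, reg), (train_acc, val_acc) in sorted(results.items()):
    print("lr %e reg %e train accuracy: %f val accuracy: %f"
          % (lr, reg, train_acc, val_acc))
print("best validation accuracy achieved during cross-validation: %f" % best_val)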