SVM Machine Learning Algorithm
According to OpenCV's "Introduction to Support Vector Machines", a Support Vector Machine (SVM):
...is a discriminative classifier formally defined by a separating hyperplane. In other words, given labeled training data (supervised learning), the algorithm outputs an optimal hyperplane which categorizes new examples.
An SVM cost function seeks to approximate the logistic function with a piecewise-linear function. This machine learning algorithm is used for classification problems and belongs to the supervised learning family of algorithms.
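As a quick illustration of that piecewise-linear approximation (a minimal Python sketch with illustrative names and values, not code from the article), the hinge cost cost1(z) = max(0, 1 - z) mimics the shape of the logistic cost for a positive example: it decreases as z grows and is exactly zero once z >= 1.

import numpy as np

z = np.linspace(-3, 3, 7)

# Logistic cost for a positive example (y = 1)...
logistic_cost = np.log(1 + np.exp(-z))
# ...and the piecewise-linear surrogate used by the SVM, cost1(z) = max(0, 1 - z).
cost1 = np.maximum(0, 1 - z)

print(np.round(logistic_cost, 2))  # smooth, strictly positive curve
print(np.round(cost1, 2))          # linear for z < 1, exactly 0 for z >= 1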
The Cost Function
The Cost Function is used to train the SVM. By minimizing the value of J(theta), we can ensure that the SVM is as accurate as possible. In the equation below, the functions cost1 and cost0 refer to the cost for an example where y=1 and the cost for an example where y=0. For SVMs, cost is determined by kernel (similarity) functions.
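Written with the cost1 / cost0 notation mentioned above, a standard form of the SVM cost function is the following (a reconstruction of the usual formulation, not copied verbatim from the article):

J(\theta) = C \sum_{i=1}^{m} \Big[ y^{(i)} \,\mathrm{cost}_1\!\big(\theta^{T} x^{(i)}\big) + \big(1 - y^{(i)}\big)\,\mathrm{cost}_0\!\big(\theta^{T} x^{(i)}\big) \Big] + \frac{1}{2} \sum_{j=1}^{n} \theta_j^{2}

Here C acts as an inverse regularization strength, and cost1, cost0 are the piecewise-linear (hinge) costs described earlier.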
Kernels
Polynomial features tend to be computationally expensive, and may increase runtime with large datasets. Instead of adding more polynomial features, it's better to add landmarks and measure how close other data points are to them. Each member of the training set can be considered a landmark, and a kernel is the similarity function that measures how close an input is to those landmarks.
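For illustration, here is a minimal sketch of one such similarity function, the Gaussian (RBF) kernel, in Python; the names gaussian_kernel, landmark, and sigma are illustrative choices, not taken from the article:

import numpy as np

def gaussian_kernel(x, landmark, sigma=1.0):
    # Similarity is 1 when x sits exactly on the landmark and decays
    # toward 0 as the squared distance between them grows.
    return np.exp(-np.sum((x - landmark) ** 2) / (2 * sigma ** 2))

# Each training example can serve as a landmark.
x = np.array([1.0, 2.0])
landmark = np.array([1.5, 1.8])
print(gaussian_kernel(x, landmark))  # near 1, since the two points are close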
Large Margin Classifier
An SVM will find the line or hyperplane that splits the data with the largest possible margin. Although outliers can sway the line in one direction or another, a sufficiently small C value enforces regularization throughout.
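For a concrete sense of the C parameter (this example uses scikit-learn's SVC with toy data of our own choosing, not code from the original article), a smaller C tolerates more margin violations and therefore regularizes more strongly:

import numpy as np
from sklearn.svm import SVC

# Toy data: two clusters, plus one class-0 point sitting near the class-1 cluster.
X = np.array([[0, 0], [1, 1], [1, 0], [0, 1], [4, 4], [5, 5], [6, 5], [5, 6]])
y = np.array([0, 0, 0, 0, 0, 1, 1, 1])

# Small C: strong regularization, a wide margin even at the cost of some training errors.
soft = SVC(kernel='linear', C=0.01).fit(X, y)
# Large C: weak regularization, the fit bends toward classifying every training point correctly.
hard = SVC(kernel='linear', C=100.0).fit(X, y)

print(soft.predict([[4, 4]]), hard.predict([[4, 4]]))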
The following Python code trains an SVM, makes predictions, and computes its accuracy:
import numpy as np


class Svm(object):
    """ Svm classifier """

    def __init__(self, inputDim, outputDim):
        self.W = None
        # Generate a random svm weight matrix to compute loss, drawn from a
        # standard normal distribution with standard deviation = 0.01.
        sigma = 0.01
        self.W = sigma * np.random.randn(inputDim, outputDim)

    def calLoss(self, x, y, reg):
        """
        Svm loss function
        D: Input dimension.
        C: Number of Classes.
        N: Number of examples.

        Inputs:
        - x: A numpy array of shape (batchSize, D).
        - y: A numpy array of shape (N,) where value < C.
        - reg: (float) regularization strength.

        Returns a tuple of:
        - loss as a single float.
        - gradient with respect to weights self.W (dW), with the same shape as self.W.
        """
        loss = 0.0
        dW = np.zeros_like(self.W)
        # - Compute the svm (hinge) loss and store it in the loss variable.
        # - Compute the gradient and store it in dW.
        # - Use L2 regularization.

        # Calculating the score matrix
        s = x.dot(self.W)
        # Score of the correct class yi for each example
        s_yi = s[np.arange(x.shape[0]), y]
        # Finding the delta (margins)
        delta = s - s_yi[:, np.newaxis] + 1
        # Loss for samples
        loss_i = np.maximum(0, delta)
        loss_i[np.arange(x.shape[0]), y] = 0
        loss = np.sum(loss_i) / x.shape[0]
        # Loss with regularization
        loss += reg * np.sum(self.W * self.W)
        # Calculating ds, the gradient of the loss with respect to the scores
        ds = np.zeros_like(delta)
        ds[delta > 0] = 1
        ds[np.arange(x.shape[0]), y] = 0
        ds[np.arange(x.shape[0]), y] = -np.sum(ds, axis=1)
        dW = (1 / x.shape[0]) * (x.T).dot(ds)
        dW = dW + (2 * reg * self.W)

        return loss, dW

    def train(self, x, y, lr=1e-3, reg=1e-5, iter=100, batchSize=200, verbose=False):
        """
        Train this Svm classifier using stochastic gradient descent.
        D: Input dimension.
        C: Number of Classes.
        N: Number of examples.

        Inputs:
        - x: training data of shape (N, D)
        - y: output data of shape (N, ) where value < C
        - lr: (float) learning rate for optimization.
        - reg: (float) regularization strength.
        - iter: (integer) total number of iterations.
        - batchSize: (integer) number of examples in each batch.
        - verbose: (boolean) Print log of loss and training accuracy.

        Outputs:
        A list containing the value of the loss at each training iteration.
        """
        # Run stochastic gradient descent to optimize W.
        lossHistory = []
        for i in range(iter):
            xBatch = None
            yBatch = None
            # - Sample batchSize examples from the training data and save them to
            #   xBatch (shape (batchSize, D)) and yBatch (shape (batchSize, )).
            # - Use that sample for gradient descent optimization.
            # - Update the weights using the gradient and the learning rate.

            # Creating the batch
            num_train = np.random.choice(x.shape[0], batchSize)
            xBatch = x[num_train]
            yBatch = y[num_train]
            loss, dW = self.calLoss(xBatch, yBatch, reg)
            self.W = self.W - lr * dW
            lossHistory.append(loss)
            # Print loss for every 100 iterations
            if verbose and i % 100 == 0 and len(lossHistory) != 0:
                print('Loop {0} loss {1}'.format(i, lossHistory[i]))

        return lossHistory

    def predict(self, x):
        """
        Predict the y output.

        Inputs:
        - x: training data of shape (N, D)

        Returns:
        - yPred: output data of shape (N, ) where value < C
        """
        yPred = np.zeros(x.shape[0])
        # Store the predicted output in yPred
        s = x.dot(self.W)
        yPred = np.argmax(s, axis=1)
        return yPred

    def calAccuracy(self, x, y):
        acc = 0
        # Calculate the accuracy of the predictions and store it in acc
        yPred = self.predict(x)
        acc = np.mean(y == yPred) * 100
        return acc
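A minimal usage sketch for the class above (the synthetic blob data and hyperparameter values here are illustrative, not part of the original code):

import numpy as np

# Two well-separated Gaussian blobs: 200 examples, 2 features, 2 classes.
np.random.seed(0)
x = np.vstack([np.random.randn(100, 2) + 2, np.random.randn(100, 2) - 2])
y = np.hstack([np.zeros(100, dtype=int), np.ones(100, dtype=int)])

svm = Svm(inputDim=2, outputDim=2)
svm.train(x, y, lr=1e-2, reg=1e-4, iter=500, batchSize=50)
print('Training accuracy: {0:.2f}%'.format(svm.calAccuracy(x, y)))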
Translated from: https://www.freecodecamp.org/news/support-vector-machines/