Machine Learning: Perceptron
In this post, we are going to have a look at a program written in Python3 using numpy. We will discuss the basics of what a perceptron is, what the delta rule is, and how it can be used to make the perceptron's learning converge.
What is a perceptron?
The perceptron is an algorithm for supervised learning of binary classifiers (let's assume the classes are {1, 0}). We have a linear combination of the weight vector and the input data vector that is passed through an activation function and then compared to a threshold value. If the linear combination is greater than the threshold, we predict the class as 1, otherwise 0. Mathematically,
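In equation form, folding the bias into the weight vector as w0 with a constant input feature of 1 (which is exactly how the code below treats it, so the threshold becomes 0), the prediction is in essence:

$$f(x) = \begin{cases} 1 & \text{if } w \cdot x > 0 \\ 0 & \text{otherwise} \end{cases}$$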

Perceptrons can only represent linearly separable problems. They fail to converge if the training examples are not linearly separable. This brings the delta rule into the picture.
The delta rule converges towards a best-fit approximation of the target concept. The key idea is to use gradient descent to search the hypothesis space of all possible weight vectors.
Note: This provides the basis for the “Backpropagation” algorithm.
Now, let’s discuss the problem at hand. The program will read a dataset (tab-separated file) and treat the first column as the target concept. The values present in the target concept are A and B; we will consider A as the +ve class (1) and B as the -ve class (0). The program implements the perceptron training rule in batch mode with a constant learning rate and an annealing (decreasing as the number of iterations increases) learning rate, starting with a learning rate of 1.
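The batch update applied at each step, reconstructed from the gradient code further down (η denotes the learning rate), is essentially:

$$w \leftarrow w + \eta \sum_{x_i \in Y(x,\, w)} \big(y_i - f(x_i)\big)\, x_i$$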

where Y(x, w) is the set of samples which are misclassified. We will use the number of misclassified points as our error rate (i.e. |Y(x, w)|). The output will also be a tab-separated (tsv) file containing the error for each iteration, i.e. one column per iteration. It will have 2 rows, one for the constant learning rate and one for the annealing learning rate.
Now that we understand what a perceptron is, what the delta rule is, and how we are going to use it, let's get started with the Python3 implementation.
In the program, we are providing two inputs from the command line. They are:
1. data — The location of the data file.
2. output — Where to write the tsv solution to.
Therefore, the program should be able to start like this:
python3 perceptron.py --data data.tsv --output solution.tsv
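The data file is expected to be tab-separated, with the class label (A or B) in the first column followed by the numeric feature columns. A purely hypothetical data.tsv, just to illustrate the layout (the values below are made up, and the columns are separated by tabs):

A	0.25	1.30	2.10
B	1.75	0.40	0.85
A	0.10	1.90	1.45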
The program consists of 8 parts and we are going to have a look at them one at a time.
The import statements
import argparse # to read inputs from command line
import csv # to read and process dataset
import numpy as np # to perform mathematical functions
The code execution initializer block
# initialise argument parser, read arguments from command line with the respective flags and then call the main() function
if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument("-d", "--data", help="Data File")
    parser.add_argument("-o", "--output", help="output")
    main()
The main() function
def main():
    args = parser.parse_args()
    file, outputFile = args.data, args.output
    learningRate = 1
    with open(file) as tsvFile:
        reader = csv.reader(tsvFile, delimiter='\t')
        X = []
        Y = []
        for row in reader:
            # prepend 1.0 as the bias term; the remaining columns are the features
            X.append([1.0] + row[1:])
            # the first column is the target concept: A -> 1, B -> 0
            if row[0] == 'A':
                Y.append([1])
            else:
                Y.append([0])
    n = len(X)
    X = np.array(X).astype(float)
    Y = np.array(Y).astype(float)
    W = np.zeros(X.shape[1]).astype(float)
    W = W.reshape(X.shape[1], 1).astype(float)
    normalError = calculateNormalBatchLearning(X, Y, W, learningRate)
    annealError = calculateAnnealBatchLearning(X, Y, W, learningRate)
    with open(outputFile, 'w') as tsvFile:
        writer = csv.writer(tsvFile, delimiter='\t')
        writer.writerow(normalError)
        writer.writerow(annealError)
The flow of the main() function is as follows:
- Save the respective command line inputs into variables
- Set the starting learningRate = 1
- Read the dataset using csv with delimiter='\t', storing the independent variables in X and the dependent variable in Y. We add 1.0 to each row as the bias term for the independent data
- The independent and dependent data are converted to float
- The weight vector is initialised with zeroes, with the same number of dimensions as X (see the shape sketch just after this list)
- The normalError and annealError are calculated by calling their respective methods
- Finally, the output is saved into a tsv file
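To make the data layout concrete, here is a small sketch of the shapes you would expect once main() has read the file, for a dataset with n rows (n = len(X) in the code) and m feature columns (m is just a placeholder here, not a variable from the program):

# X.shape == (n, m + 1)   # features, with a bias column of 1.0 prepended
# Y.shape == (n, 1)       # labels mapped to 1.0 (class A) or 0.0 (class B)
# W.shape == (m + 1, 1)   # weight vector, initialised to zeros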
The calculateNormalBatchLearning() function
def calculateNormalBatchLearning(X, Y, W, learningRate):
    e = []
    for i in range(101):
        # record the current error count, then update the weights with a constant learning rate
        f_x = calculatePredicatedValue(X, W)
        errorCount = calculateError(Y, f_x)
        e.append(errorCount)
        gradient, W = calculateGradient(W, X, Y, f_x, learningRate)
    return e
The flow of calculateNormalBatchLearning() is as follows:
- Initialise a variable e to store the error counts
- A loop is run for 101 iterations (range(101) in the code)
- The predicted value is computed based on the perceptron rule described earlier, using the calculatePredicatedValue() method
- The error count is calculated using the calculateError() method
- The weights are updated based on the equation above, using the calculateGradient() method
The calculateAnnealBatchLearning() function
def calculateAnnealBatchLearning(X, Y, W, learningRate):
    e = []
    for i in range(101):
        f_x = calculatePredicatedValue(X, W)
        errorCount = calculateError(Y, f_x)
        e.append(errorCount)
        # anneal the learning rate: it decays as 1 / (iteration + 1)
        learningRate = 1 / (i + 1)
        gradient, W = calculateGradient(W, X, Y, f_x, learningRate)
    return e
The flow of calculateAnnealBatchLearning() is as follows:
- Initialise a variable e to store the error counts
- A loop is run for 101 iterations (range(101) in the code)
- The predicted value is computed based on the perceptron rule described earlier, using the calculatePredicatedValue() method
- The error count is calculated using the calculateError() method
- The learning rate is annealed to 1 / (i + 1) at each iteration (see the note after this list)
- The weights are updated based on the equation above, using the calculateGradient() method
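The only difference from calculateNormalBatchLearning() is this annealing step: the learning rate follows the schedule 1, 1/2, 1/3, …, so each successive batch update moves the weights by less and less, which tends to damp the oscillation of the weights when the data is not perfectly separable.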
The calculatePredicatedValue() function
def calculatePredicatedValue(X, W):
    # linear combination of inputs and weights, thresholded at 0
    f_x = np.dot(X, W)
    for i in range(len(f_x)):
        if f_x[i][0] > 0:
            f_x[i][0] = 1
        else:
            f_x[i][0] = 0
    return f_x
As described in the perceptron image, if the linear combination of W and X is greater than 0, then we predict the class as 1, otherwise 0.
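As a side note, the same thresholding can be written without the explicit loop. A minimal vectorised sketch, equivalent in behaviour but not the author's original code:

def calculatePredicatedValue(X, W):
    # threshold the linear combination at 0: 1.0 if positive, else 0.0
    return np.where(np.dot(X, W) > 0, 1.0, 0.0)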
The calculateError() function
def calculateError(Y, f_x):
    # count the samples where the prediction does not match the label
    errorCount = 0
    for i in range(len(f_x)):
        if Y[i][0] != f_x[i][0]:
            errorCount += 1
    return errorCount
We count the number of instances where the predicted value and the true value do not match, and this becomes our error count.
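If you prefer to avoid the Python loop here as well, a one-line NumPy sketch (again an equivalent, not the original code) would be:

errorCount = int(np.sum(Y != f_x))  # each mismatched position counts as 1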
The calculateGradient() function
def calculateGradient(W, X, Y, f_x, learningRate):
    # (Y - f_x) is zero for correctly classified samples, so only
    # misclassified samples contribute to the summed gradient
    gradient = (Y - f_x) * X
    gradient = np.sum(gradient, axis=0)
    # gradient = np.array([float("{0:.4f}".format(val)) for val in gradient])
    temp = np.array(learningRate * gradient).reshape(W.shape)
    W = W + temp
    return gradient, W.astype(float)
This method is a direct translation of the weight update formula mentioned above.
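For reference, the broadcast-and-sum in the code computes the same quantity as a single matrix product; a hedged equivalent (not the author's line) is:

gradient = (X.T @ (Y - f_x)).flatten()  # identical to np.sum((Y - f_x) * X, axis=0)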
Now that the whole code is out there, let's have a look at the execution of the program.

Here is what the output looks like:
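Based on the code above, solution.tsv holds two tab-separated rows of integer error counts, one value per iteration: the first row for the constant learning rate and the second for the annealed learning rate. The actual numbers depend entirely on the dataset.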
The final program
import argparse
import csv
import numpy as np


def main():
    args = parser.parse_args()
    file, outputFile = args.data, args.output
    learningRate = 1
    with open(file) as tsvFile:
        reader = csv.reader(tsvFile, delimiter='\t')
        X = []
        Y = []
        for row in reader:
            X.append([1.0] + row[1:])
            if row[0] == 'A':
                Y.append([1])
            else:
                Y.append([0])
    n = len(X)
    X = np.array(X).astype(float)
    Y = np.array(Y).astype(float)
    W = np.zeros(X.shape[1]).astype(float)
    W = W.reshape(X.shape[1], 1).astype(float)
    normalError = calculateNormalBatchLearning(X, Y, W, learningRate)
    annealError = calculateAnnealBatchLearning(X, Y, W, learningRate)
    with open(outputFile, 'w') as tsvFile:
        writer = csv.writer(tsvFile, delimiter='\t')
        writer.writerow(normalError)
        writer.writerow(annealError)


def calculateNormalBatchLearning(X, Y, W, learningRate):
    e = []
    for i in range(101):
        f_x = calculatePredicatedValue(X, W)
        errorCount = calculateError(Y, f_x)
        e.append(errorCount)
        gradient, W = calculateGradient(W, X, Y, f_x, learningRate)
    return e


def calculateAnnealBatchLearning(X, Y, W, learningRate):
    e = []
    for i in range(101):
        f_x = calculatePredicatedValue(X, W)
        errorCount = calculateError(Y, f_x)
        e.append(errorCount)
        learningRate = 1 / (i + 1)
        gradient, W = calculateGradient(W, X, Y, f_x, learningRate)
    return e


def calculateGradient(W, X, Y, f_x, learningRate):
    gradient = (Y - f_x) * X
    gradient = np.sum(gradient, axis=0)
    # gradient = np.array([float("{0:.4f}".format(val)) for val in gradient])
    temp = np.array(learningRate * gradient).reshape(W.shape)
    W = W + temp
    return gradient, W.astype(float)


def calculateError(Y, f_x):
    errorCount = 0
    for i in range(len(f_x)):
        if Y[i][0] != f_x[i][0]:
            errorCount += 1
    return errorCount


def calculatePredicatedValue(X, W):
    f_x = np.dot(X, W)
    for i in range(len(f_x)):
        if f_x[i][0] > 0:
            f_x[i][0] = 1
        else:
            f_x[i][0] = 0
    return f_x


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument("-d", "--data", help="Data File")
    parser.add_argument("-o", "--output", help="output")
    main()
Translated from: https://towardsdatascience.com/machine-learning-perceptron-implementation-b867016269ec