Get started with TensorFlow on law and statistics
by Daniel Deutsch
What this is about
What we will use
Get started
Shell commands for installing everything you need
Get data and draw a plot
Import everything you need
Create and plot some numbers
Build a TensorFlow model
Prepare data
Set up variables and operations for TensorFlow
Start the calculations with a TensorFlow session
Visualize the result and process
What this is about
As I am exploring TensorFlow, I wanted to build a beginner example and document it. This is a very basic example that uses gradient descent optimization to train parameters with TensorFlow. The key variables are evidence and convictions. It will illustrate:
- how the number of convictions depends on the number of pieces of evidence
- how to predict the number of convictions using a regression model
The Python file is in my repository on GitHub.
See the article in better formatting on GitHub.
What we will use
1. TensorFlow (as tf)
Tensors
- tf.placeholder
- tf.Variable
Helper function
- tf.global_variables_initializer
Math Operations
- tf.add
- tf.multiply
- tf.reduce_sum
- tf.pow
Building a graph
- tf.train.GradientDescentOptimizer
Session
- tf.Session
2. NumPy (as np)
- np.random.seed
- np.zeros
- np.random.randint
- np.random.randn
- np.asanyarray
3. Matplotlib
4. Math
Get started
Install TensorFlow with virtualenv. See the guide on the TF website.
Shell commands for installing everything you need
sudo easy_install pip
pip3 install --upgrade virtualenv
virtualenv --system-site-packages <targetDirectory>
cd <targetDirectory>
source ./bin/activate
pip3 install --upgrade pip
pip3 install tensorflow
pip3 install matplotlib
Get data and draw a plot
Import everything you need
import tensorflow as tf
import numpy as np
import math
import matplotlib
matplotlib.use('TkAgg')
import matplotlib.pyplot as plt
import matplotlib.animation as animation
As you can see, I am using the “TkAgg” backend from matplotlib. This lets me debug with my VS Code and macOS setup without any further complicated installation.
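The code in this article targets the TensorFlow 1.x API (tf.placeholder, tf.Session). Here is a minimal sketch of an import that keeps those names available, assuming a TensorFlow 2.x install that ships the tensorflow.compat.v1 module. The fallback branches are an assumption added for illustration and are not part of the original article.

```python
# Try the TF 1.x compatibility layer first (shipped with TensorFlow 2.x).
try:
    import tensorflow.compat.v1 as tf

    tf.disable_v2_behavior()  # restore graph mode so tf.Session works
except ImportError:
    try:
        import tensorflow as tf  # an older 1.x install exposes the API directly
    except ImportError:
        tf = None  # TensorFlow is not installed in this environment

if tf is None:
    print("TensorFlow not found; install it before running the examples.")
else:
    print("TensorFlow version:", tf.__version__)
```

With this import in place of the plain "import tensorflow as tf", the rest of the listing should not need to change.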
Create and plot some numbers
# Generate evidence numbers between 10 and 49 (the upper bound of randint is exclusive)
# Generate a number of convictions from the evidence with random noise added
np.random.seed(42)
sampleSize = 200
numEvid = np.random.randint(low=10, high=50, size=sampleSize)
numConvict = numEvid * 10 + np.random.randint(low=200, high=400, size=sampleSize)

# Plot the data to get a feeling
plt.title("Number of convictions based on evidence")
plt.plot(numEvid, numConvict, "bx")
plt.xlabel("Number of Evidence")
plt.ylabel("Number of Convictions")
plt.show(block=False)  # Use the keyword 'block' to override the blocking behavior
I am creating random values for the evidence. The number of convictions depends on the number of pieces of evidence, plus random noise. Of course, these numbers are made up; they only serve to illustrate the point.
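To make the relationship in the generated data explicit: each conviction count is ten times the evidence count plus an integer noise term. The sketch below uses the standard library's random module as a stand-in for np.random, an assumption made so that it runs without NumPy. The drawn values therefore differ from the article's, and the helper name make_sample is hypothetical.

```python
import random


def make_sample(sample_size, seed=42):
    """Return (evidence, convictions) lists following the article's recipe."""
    rng = random.Random(seed)
    # randrange, like np.random.randint, excludes the upper bound
    evidence = [rng.randrange(10, 50) for _ in range(sample_size)]
    noise = [rng.randrange(200, 400) for _ in range(sample_size)]
    convictions = [e * 10 + n for e, n in zip(evidence, noise)]
    return evidence, convictions


evidence, convictions = make_sample(200)
print("evidence range:", min(evidence), max(evidence))
print("conviction range:", min(convictions), max(convictions))
```

Because the slope of 10 is baked into the data, a well-trained regression line should recover a slope close to that value in the original units.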
Build a TensorFlow model
To build a basic machine learning model, we need to prepare the data. Then we make predictions, measure the loss, and optimize by minimizing the loss.
Prepare data
# create a function for normalizing values
def normalize(array):
    return (array - array.mean()) / array.std()

# use 70% of the data for training (the remaining 30% shall be used for testing)
numTrain = math.floor(sampleSize * 0.7)

# convert list to an array and normalize arrays
trainEvid = np.asanyarray(numEvid[:numTrain])
trainConvict = np.asanyarray(numConvict[:numTrain])
trainEvidNorm = normalize(trainEvid)
trainConvictdNorm = normalize(trainConvict)

testEvid = np.asanyarray(numEvid[numTrain:])
testConvict = np.asanyarray(numConvict[numTrain:])
testEvidNorm = normalize(testEvid)
testConvictdNorm = normalize(testConvict)
We split the data into training and testing portions. Afterwards, we normalize the values, because gradient descent converges more reliably when the inputs are on a comparable scale. (See also “feature scaling”.)
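The normalization used here is the common z-score: subtract the mean and divide by the standard deviation. The sketch below is a plain-Python version that assumes the population standard deviation (NumPy's default for std). The helper denormalize and the sample numbers are hypothetical additions for illustration.

```python
import statistics


def normalize(values):
    """Scale values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population std, like NumPy's default
    return [(v - mean) / std for v in values]


def denormalize(normed, mean, std):
    """Invert normalize() given the original mean and standard deviation."""
    return [v * std + mean for v in normed]


evidence = [12, 25, 31, 18, 44, 10, 37]  # made-up sample values
normed = normalize(evidence)
restored = denormalize(
    normed, statistics.fmean(evidence), statistics.pstdev(evidence)
)
print([round(v, 3) for v in normed])
print([round(v, 3) for v in restored])
```

Keeping the mean and standard deviation around matters later, because the fitted line has to be mapped back to the original units before plotting.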
Set up variables and operations for TensorFlow
# define placeholders and variables
tfEvid = tf.placeholder(tf.float32, name="Evid")
tfConvict = tf.placeholder(tf.float32, name="Convict")
tfEvidFactor = tf.Variable(np.random.randn(), name="EvidFactor")
tfConvictOffset = tf.Variable(np.random.randn(), name="ConvictOffset")

# define the operation for predicting the conviction based on evidence:
# multiply the evidence by the factor and add the offset
# define a loss function (half the mean squared error)
tfPredict = tf.add(tf.multiply(tfEvidFactor, tfEvid), tfConvictOffset)
tfCost = tf.reduce_sum(tf.pow(tfPredict - tfConvict, 2)) / (2 * numTrain)

# set a learning rate and a gradient descent optimizer
learningRate = 0.1
gradDesc = tf.train.GradientDescentOptimizer(learningRate).minimize(tfCost)
The pragmatic differences between tf.placeholder and tf.Variable are:
- placeholders are allocated storage for data, and initial values are not required
- variables are used for parameters to learn, and initial values are required. The values can be derived from training.
I call the TensorFlow operators explicitly, as in tf.add(…), instead of using the + operator, because this makes it clear which library performs the calculation.
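For readers who want to see what the prediction and cost operations compute, here is a plain-Python sketch of the same formulas. It is not how TensorFlow evaluates them internally. The gradients function is an assumption added for illustration: TensorFlow derives these derivatives automatically when minimize is called.

```python
def predict(factor, offset, x):
    """Linear model: factor * x + offset (mirrors tfPredict)."""
    return factor * x + offset


def cost(factor, offset, xs, ys, n):
    """Sum of squared errors divided by 2 * n (mirrors tfCost)."""
    return sum((predict(factor, offset, x) - y) ** 2 for x, y in zip(xs, ys)) / (2 * n)


def gradients(factor, offset, xs, ys, n):
    """Partial derivatives of cost with respect to factor and offset."""
    d_factor = sum((predict(factor, offset, x) - y) * x for x, y in zip(xs, ys)) / n
    d_offset = sum(predict(factor, offset, x) - y for x, y in zip(xs, ys)) / n
    return d_factor, d_offset


xs = [-1.0, 0.0, 1.0]   # made-up normalized evidence values
ys = [-2.0, 0.5, 2.0]   # made-up normalized conviction values
print(cost(0.0, 0.0, xs, ys, len(xs)))
print(gradients(0.0, 0.0, xs, ys, len(xs)))
```

The division by 2 * n is a convenience: the factor 2 cancels when the square is differentiated, which keeps the gradient expressions simple.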
Start the calculations with a TensorFlow session
# initialize variables
init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)

    # set up iteration parameters
    displayEvery = 2
    numTrainingSteps = 50

    # calculate the number of lines to animate
    # define variables for updating during animation
    numPlotsAnim = math.floor(numTrainingSteps / displayEvery)
    evidFactorAnim = np.zeros(numPlotsAnim)
    convictOffsetAnim = np.zeros(numPlotsAnim)
    plotIndex = 0

    # iterate through the training data
    for i in range(numTrainingSteps):

        # ======== Start training by running the session and feeding the gradDesc
        for (x, y) in zip(trainEvidNorm, trainConvictdNorm):
            sess.run(gradDesc, feed_dict={tfEvid: x, tfConvict: y})

        # Print status of learning
        if (i + 1) % displayEvery == 0:
            cost = sess.run(
                tfCost, feed_dict={tfEvid: trainEvidNorm, tfConvict: trainConvictdNorm}
            )
            print(
                "iteration #:",
                "%04d" % (i + 1),
                "cost=",
                "{:.9f}".format(cost),
                "evidFactor=",
                sess.run(tfEvidFactor),
                "convictOffset=",
                sess.run(tfConvictOffset),
            )

            # store the result of each step in the animation variables
            evidFactorAnim[plotIndex] = sess.run(tfEvidFactor)
            convictOffsetAnim[plotIndex] = sess.run(tfConvictOffset)
            plotIndex += 1

    # log the optimized result
    print("Optimized!")
    trainingCost = sess.run(
        tfCost, feed_dict={tfEvid: trainEvidNorm, tfConvict: trainConvictdNorm}
    )
    print(
        "Trained cost=",
        trainingCost,
        "evidFactor=",
        sess.run(tfEvidFactor),
        "convictOffset=",
        sess.run(tfConvictOffset),
        "\n",
    )
Now we come to the actual training and the most interesting part.
The graph is now executed in a tf.Session. I am using "feeding" as it lets you inject data into any Tensor in a computation graph. You can see more on reading data here.
Used in a with statement, tf.Session() creates a session that is automatically closed on exiting the context. The session is also closed when an uncaught exception is raised.
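The automatic closing comes from Python's context manager protocol. The sketch below uses a hypothetical DemoSession class to show that protocol. It is a stand-in written for illustration, not TensorFlow's implementation.

```python
class DemoSession:
    """Hypothetical stand-in for a session; not TensorFlow's implementation."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()   # runs on normal exit and when an exception propagates
        return False   # do not swallow the exception


with DemoSession() as sess:
    open_inside = not sess.closed
print("open inside the block:", open_inside)
print("closed after the block:", sess.closed)

failing = DemoSession()
try:
    with failing:
        raise ValueError("simulated failure inside the block")
except ValueError:
    pass
print("closed after an exception:", failing.closed)
```

This is why all code that calls sess.run has to stay indented inside the with block: once the block ends, the session is no longer usable.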
The tf.Session.run method is the main mechanism for running a tf.Operation or evaluating a tf.Tensor. You can pass one or more tf.Operation or tf.Tensor objects to tf.Session.run, and TensorFlow will execute the operations that are needed to compute the result.
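The idea that only the operations needed for a result are executed can be illustrated with a toy graph evaluator. This is a deliberately simplified sketch: the Node class and the run function are hypothetical and do not reflect how TensorFlow is implemented.

```python
class Node:
    """A node in a toy computation graph (illustrative only)."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op          # "placeholder", "variable", "add" or "multiply"
        self.inputs = inputs  # upstream nodes this node depends on
        self.value = value    # stored value, used by variables


def run(fetch, feed_dict, evaluated=None):
    """Evaluate `fetch`, visiting only the nodes it depends on."""
    evaluated = [] if evaluated is None else evaluated
    if fetch.op == "placeholder":
        result = feed_dict[fetch]   # the value must be fed at run time
    elif fetch.op == "variable":
        result = fetch.value        # the value lives in the graph
    else:
        left, right = (run(node, feed_dict, evaluated) for node in fetch.inputs)
        result = left + right if fetch.op == "add" else left * right
    evaluated.append(fetch)
    return result


evid = Node("placeholder")
factor = Node("variable", value=2.0)
offset = Node("variable", value=0.5)
product = Node("multiply", (factor, evid))
prediction = Node("add", (product, offset))
unrelated = Node("add", (offset, offset))  # never fetched below

visited = []
print(run(prediction, {evid: 3.0}, visited))  # computes 2.0 * 3.0 + 0.5
print("unrelated node evaluated:", unrelated in visited)
```

The placeholder only gets a value through feed_dict, while the variables carry their values inside the graph, which mirrors the distinction between tf.placeholder and tf.Variable.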
First, we run the gradient descent training while feeding it the normalized training data. After that, we calculate the loss.
We repeat this process for a fixed number of training steps (50 here), after which the improvement per step should be very small. Keep in mind that the tf.Variables (the parameters) have been adapted throughout and should now be close to an optimum.
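The structure of the training loop can be reproduced without TensorFlow. The sketch below assumes the same cost as above, evaluated one sample at a time with the gradients written out by hand. The function name train, the fixed random seed and the sample data are hypothetical.

```python
import random


def train(xs, ys, learning_rate=0.1, steps=50):
    """Per-sample gradient descent on cost = (w * x + b - y) ** 2 / (2 * n)."""
    n = len(xs)
    rng = random.Random(0)
    # random starting values, in the spirit of np.random.randn()
    factor, offset = rng.gauss(0, 1), rng.gauss(0, 1)
    history = []
    for _ in range(steps):
        for x, y in zip(xs, ys):
            error = factor * x + offset - y
            factor -= learning_rate * error * x / n  # d cost / d factor
            offset -= learning_rate * error / n      # d cost / d offset
        # cost over the whole training set after this pass
        total = sum((factor * x + offset - y) ** 2 for x, y in zip(xs, ys)) / (2 * n)
        history.append(total)
    return factor, offset, history


xs = [-1.5, -0.5, 0.0, 0.5, 1.5]   # made-up normalized inputs
ys = [0.9 * x for x in xs]         # targets on a line with slope 0.9
factor, offset, history = train(xs, ys)
print("first cost:", history[0], "last cost:", history[-1])
print("factor:", factor, "offset:", offset)
```

Printing the cost history is a quick way to check whether 50 steps are enough: if the last few values still change noticeably, more steps or a different learning rate may help.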
Visualize the result and process
    # (still inside the "with tf.Session() as sess:" block)
    # de-normalize variables to be plottable again
    trainEvidMean = trainEvid.mean()
    trainEvidStd = trainEvid.std()
    trainConvictMean = trainConvict.mean()
    trainConvictStd = trainConvict.std()
    xNorm = trainEvidNorm * trainEvidStd + trainEvidMean
    yNorm = (
        sess.run(tfEvidFactor) * trainEvidNorm + sess.run(tfConvictOffset)
    ) * trainConvictStd + trainConvictMean

    # Plot the result graph
    plt.figure()

    plt.xlabel("Number of Evidence")
    plt.ylabel("Number of Convictions")

    plt.plot(trainEvid, trainConvict, "go", label="Training data")
    plt.plot(testEvid, testConvict, "mo", label="Testing data")
    plt.plot(xNorm, yNorm, label="Learned Regression")
    plt.legend(loc="upper left")

    plt.show()

    # Plot an animated graph that shows the process of optimization
    fig, ax = plt.subplots()
    line, = ax.plot(numEvid, numConvict)

    plt.rcParams["figure.figsize"] = (10, 8)  # adding fixed size parameters to keep animation in scale
    plt.title("Gradient Descent Fitting Regression Line")
    plt.xlabel("Number of Evidence")
    plt.ylabel("Number of Convictions")
    plt.plot(trainEvid, trainConvict, "go", label="Training data")
    plt.plot(testEvid, testConvict, "mo", label="Testing data")

    # define an animation function that changes the ydata
    def animate(i):
        line.set_xdata(xNorm)
        line.set_ydata(
            (evidFactorAnim[i] * trainEvidNorm + convictOffsetAnim[i]) * trainConvictStd
            + trainConvictMean
        )
        return (line,)

    # Initialize the animation with zeros for y
    def initAnim():
        line.set_ydata(np.zeros(shape=numConvict.shape[0]))
        return (line,)

    # call the animation
    ani = animation.FuncAnimation(
        fig,
        animate,
        frames=np.arange(0, plotIndex),
        init_func=initAnim,
        interval=200,
        blit=True,
    )

    plt.show()
To understand the training, it helps to plot the result and, where possible, the optimization process itself.
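Because the model is trained on normalized values, the learned factor and offset describe a line in normalized space. Here is a sketch of how they could be converted into a slope and intercept in the original units. The function name to_original_units and the numbers are hypothetical.

```python
def to_original_units(factor, offset, x_mean, x_std, y_mean, y_std):
    """Convert a line fitted on normalized data into original units.

    Derived from y = y_mean + y_std * (factor * (x - x_mean) / x_std + offset).
    """
    slope = factor * y_std / x_std
    intercept = y_mean + offset * y_std - slope * x_mean
    return slope, intercept


# made-up statistics and parameters, for illustration only
slope, intercept = to_original_units(
    factor=0.9, offset=0.1, x_mean=30.0, x_std=11.5, y_mean=600.0, y_std=128.0
)
print("slope:", slope)
print("intercept:", intercept)
```

The plotting code takes the equivalent two-step route: it evaluates the line on the normalized inputs and then rescales the result with the training mean and standard deviation.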
Check out this Pluralsight course which helped me a lot to get started. :)
Thanks for reading my article! Feel free to leave any feedback!
Daniel is an LL.M. student in business law, working as a software engineer and organizer of tech-related events in Vienna. His current personal learning efforts focus on machine learning.
Connect on: LinkedIn, GitHub, Medium, Twitter, Steemit, Hashnode
Translated from: https://www.freecodecamp.org/news/tensorflow-starter-on-law-and-statistics-646072b93b5a/