Implementing Polynomial Regression in Python

Video Link

You can view the code used in this Episode here: SampleCode

Importing our Data

The first step is to import our data into Python.

We can do that by going to the following link: Data

Click on “Code” and choose “Download ZIP”.

Locate WeatherDataP.csv and copy it to your local disk, into a new folder called ProjectData.

Note: WeatherData.csv and WeatherDataM.csv were used in Simple Linear Regression and Multiple Linear Regression.

Now we are ready to import our data into our Notebook:

Instructions for setting up a new Notebook can be found at the start of Episode 4.3.

Note: Keep this Medium post open on a split screen so you can read along and implement the code yourself.

# Import the pandas library, used for data manipulation
# Import matplotlib, used to plot our data
# Import numpy for linear algebra operations
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# Import our WeatherDataP.csv and store it in the variable weather_data_p
# (raw string, so the backslashes in the Windows path are not read as escapes)
weather_data_p = pd.read_csv(r"D:\ProjectData\WeatherDataP.csv")

# Display the data in the notebook
weather_data_p

Plotting our Data

In order to check what kind of relationship Pressure forms with Humidity, we plot our two variables.

# Set our input X to Pressure; use [[]] to keep it 2D, which is the shape the model expects
X = weather_data_p[["Pressure (millibars)"]]
y = weather_data_p.Humidity

# Produce a scatter graph of Humidity against Pressure
plt.scatter(X, y, c="black")
plt.xlabel("Pressure (millibars)")
plt.ylabel("Humidity")

Here we see that Humidity plotted against Pressure forms a bowl-shaped relationship, reminding us of the function y = x^2.

Preprocessing our Data

This is the additional step we apply in polynomial regression, where we add the feature x^2 to our model.

# Import the PolynomialFeatures class from sklearn, to preprocess our data
# Import the LinearRegression model from sklearn
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

# Set PolynomialFeatures to degree 2 and store it in the variable pre_process
# Degree 2 preprocesses x to 1, x and x^2
# Degree 3 preprocesses x to 1, x, x^2 and x^3
# and so on...
pre_process = PolynomialFeatures(degree=2)

# Transform our x input to 1, x and x^2
X_poly = pre_process.fit_transform(X)

# Show the transformation in the notebook
X_poly

The e+.. suffix is scientific notation: the number after the e tells you how many places to move the decimal point.

E.g.
1.0e+00 = 1.0 [keep the decimal point where it is]
1.0144e+03 = 1014.4 [move the decimal point 3 places to the right]
1.02900736e+06 = 1029007.36 [move the decimal point 6 places to the right]
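To see how this notation behaves, here is a small sketch in plain Python. It only uses the three numbers from the example and does not depend on the weather data.

```python
# Scientific notation is just another way of writing a float literal.
values = [1.0e+00, 1.0144e+03, 1.02900736e+06]

# Each literal equals the plain decimal number it stands for.
assert values[0] == 1.0
assert values[1] == 1014.4
assert values[2] == 1029007.36

# Format a number back into scientific notation with 4 decimal places.
formatted = f"{1014.4:.4e}"
print(formatted)
```

If you would rather see plain decimals when a NumPy array is displayed in the notebook, np.set_printoptions(suppress=True) is one option to try.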

— — — — — — — — — — — — — — — — — —

The code above converts each pressure value x into three columns: 1, x and x^2.

Notice that there is a column of 1’s, which can be thought of as the variable associated with θ0. Since θ0 × 1 = θ0, this column is often left out when the model is written down.
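To make the conversion concrete, the sketch below rebuilds the same three columns by hand in plain Python. The helper name poly_features is made up for illustration and is not part of sklearn; it only mirrors the row that PolynomialFeatures(degree=2) produces when there is a single input column.

```python
def poly_features(x, degree=2):
    """Return [1, x, x^2, ..., x^degree] for one input value.

    Hypothetical helper for illustration only; it mimics the row that
    PolynomialFeatures builds when there is a single input feature.
    """
    return [x ** power for power in range(degree + 1)]


# Two pressure readings in millibars, used here purely as sample inputs.
rows = [poly_features(x) for x in (1014.4, 1007.0)]

for row in rows:
    print(row)
```

The first entry of every row is x^0 = 1, which is the column of 1’s described above.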

— — — — — — — — — — — — — — — — — —

Implementing Polynomial Regression

The method here remains the same as for multiple linear regression in Python, but this time we fit our regression model to the preprocessed data:

pr_model = LinearRegression()
# Fit the polynomial regression model to our preprocessed data
pr_model.fit(X_poly, y)
# Store our predicted Humidity values in the variable y_pred
y_pred = pr_model.predict(X_poly)
# Plot our model on our data
plt.scatter(X, y, c="black")
plt.xlabel("Pressure (millibars)")
plt.ylabel("Humidity")
plt.plot(X, y_pred)

We can extract θ0, θ1 and θ2 using the following code:

theta0 = pr_model.intercept_
_, theta1, theta2 = pr_model.coef_
theta0, theta1, theta2

A “_” is used to ignore the first value in pr_model.coef_. That value is the coefficient of the column of 1’s, which comes out as 0 because LinearRegression fits the intercept separately and stores it in pr_model.intercept_. The other two coefficients are labelled theta1 and theta2 respectively.

This gives our polynomial regression model the form Humidity = θ0 + θ1 × Pressure + θ2 × Pressure^2, using the three values extracted above.
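As a sanity check on what these coefficients mean, the sketch below uses NumPy only, on synthetic data rather than WeatherDataP.csv. It generates points from a known quadratic (the values in true_theta are invented for illustration) and recovers θ0, θ1 and θ2 with np.polyfit, which is a different routine from the sklearn model used in this episode but solves the same least-squares problem.

```python
import numpy as np

# Synthetic data: a known quadratic, so we can see what the fit should recover.
# These coefficients are invented for illustration, not taken from the weather data.
true_theta = (0.5, -0.2, 0.3)  # theta0, theta1, theta2
x = np.linspace(-3.0, 3.0, 50)
y = true_theta[0] + true_theta[1] * x + true_theta[2] * x ** 2

# np.polyfit returns the highest power first, so unpack in reverse order.
theta2, theta1, theta0 = np.polyfit(x, y, deg=2)

print(theta0, theta1, theta2)
```

Because the data contains no noise, the recovered values should match true_theta up to floating-point rounding.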

Using our Regression Model to make predictions

# Predict humidity for a pressure of 1007 millibars
# Transform 1007 to 1, 1007, 1007^2 suitable for input, using
# pre_process.fit_transform
y_new = pr_model.predict(pre_process.fit_transform([[1007]]))
y_new

Here we expect a Humidity value of 0.7164631 for a pressure reading of 1007 millibars.

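Under the hood, the prediction is just the quadratic evaluated at the new pressure. Here is a minimal sketch in plain Python, with placeholder coefficients that are NOT the fitted values from this episode (those appear only in the notebook output):

```python
def predict_quadratic(x, theta0, theta1, theta2):
    """Evaluate theta0 + theta1 * x + theta2 * x^2 for a single input x."""
    return theta0 + theta1 * x + theta2 * x ** 2


# Placeholder coefficients, invented for illustration only.
theta0, theta1, theta2 = 1.0, 2.0, 3.0

# 1 + 2*2 + 3*2^2 = 17
print(predict_quadratic(2.0, theta0, theta1, theta2))
```

With the real theta0, theta1 and theta2 extracted from pr_model, calling this function with 1007 should agree with pr_model.predict up to floating-point rounding.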

We can plot this point on our data plot using the following code:

plt.scatter(1007, y_new, c = "red")

Evaluating our Model

To evaluate our model we are going to use mean squared error (MSE), discussed in the previous episode. The function can easily be imported from sklearn.

from sklearn.metrics import mean_squared_error
mean_squared_error(y, y_pred)

The mean squared error for our regression model is 0.003358...
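For reference, this is what the MSE computes: the average of the squared differences between the observed and predicted values. A sketch in plain Python on a tiny made-up example (the function name mean_squared_error_manual is ours, chosen to avoid clashing with the sklearn import):

```python
def mean_squared_error_manual(y_true, y_pred):
    """Average of the squared differences between observed and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(squared_errors) / len(squared_errors)


# Tiny made-up example: the errors are 0.0, 0.5 and -1.0.
observed = [1.0, 2.0, 3.0]
predicted = [1.0, 1.5, 4.0]

result = mean_squared_error_manual(observed, predicted)
print(result)
```

The squared errors are 0, 0.25 and 1, so the result is 1.25 / 3.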

If we want to change our model to include x^3, we can do so by simply changing PolynomialFeatures to degree 3:

pre_process = PolynomialFeatures(degree=3)

Let’s check if this has decreased our mean squared error:

Indeed it has.

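Note that changing pre_process on its own does not update the model: the inputs have to be transformed again and the regression refitted before the MSE is recomputed. Here is a sketch of that full loop, assuming scikit-learn is installed and using synthetic data in place of WeatherDataP.csv (the helper name fit_and_score is made up for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in for the weather data: a noisy bowl-shaped curve.
rng = np.random.default_rng(0)
X = np.linspace(-3.0, 3.0, 80).reshape(-1, 1)
y = 0.3 * X.ravel() ** 2 + rng.normal(scale=0.2, size=80)


def fit_and_score(degree):
    """Transform X, refit the model and return the training MSE for this degree."""
    pre_process = PolynomialFeatures(degree=degree)
    X_poly = pre_process.fit_transform(X)
    model = LinearRegression().fit(X_poly, y)
    return mean_squared_error(y, model.predict(X_poly))


mse_by_degree = {degree: fit_and_score(degree) for degree in (1, 2, 3)}
for degree, mse in mse_by_degree.items():
    print(degree, mse)
```

On the data the model was fitted to, the MSE can only stay the same or go down as the degree increases, which is why a lower MSE alone is not enough to choose a degree.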

You can change the degree used in PolynomialFeatures to anything you like and see for yourself what effect this has on our MSE.

Ideally we want to choose the model that:

  • Has the lowest MSE

  • Does not over-fit our data

It is important that we plot our model on our data to ensure we don’t end up with a model like the one shown at the end of Episode 4.6, which had an extremely low MSE but over-fitted our data.
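In addition to plotting, one common way to spot over-fitting numerically is to hold some data back and compare the MSE on points the model has not seen. This goes a little beyond what the episode does; the sketch uses NumPy's polyfit on synthetic data, and the split and the degrees tried are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic bowl-shaped data (not the weather data).
x = rng.uniform(-1.0, 1.0, size=60)
y = x ** 2 + rng.normal(scale=0.05, size=60)

# Hold back the last 20 points; fit only on the first 40.
x_train, y_train = x[:40], y[:40]
x_test, y_test = x[40:], y[40:]


def mse(y_true, y_hat):
    """Mean squared error between two NumPy arrays."""
    return float(np.mean((y_true - y_hat) ** 2))


results = {}
for degree in (1, 2, 9):
    coeffs = np.polyfit(x_train, y_train, deg=degree)
    train_mse = mse(y_train, np.polyval(coeffs, x_train))
    test_mse = mse(y_test, np.polyval(coeffs, x_test))
    results[degree] = (train_mse, test_mse)
    print(degree, train_mse, test_mse)
```

A training MSE that keeps falling while the held-back MSE stops improving, or gets worse, is a hint that the extra degrees are fitting noise.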

If you have any questions, please leave them below!

Translated from: https://medium.com/ai-in-plain-english/implementing-polynomial-regression-in-python-d9aedf520d56
