How to Implement a Polynomial Regression Model in Python


Let’s start with an example. We want to predict the Price of a home based on its Area and Age. The function below was used to generate the home prices; we can pretend this is “real-world data” and that our “job” is to create a model that predicts the Price from the Area and Age:

Price = -3*Area - 10*Age + 0.033*Area^2 - 0.0000571*Area^3 + 500
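As a minimal sketch, the generating function can be written in Python and used to build a synthetic dataset. The sample size, random seed, and input ranges (Area between 50 and 350, Age between 0 and 25) are assumptions for illustration; the article does not state them.

```python
import numpy as np


def home_price(area, age):
    """Generating function from the text: cubic in Area, linear in Age."""
    return -3 * area - 10 * age + 0.033 * area**2 - 0.0000571 * area**3 + 500


# Assumed, illustrative input ranges and sample size (not given in the article).
rng = np.random.default_rng(seed=0)
area = rng.uniform(50, 350, size=200)
age = rng.uniform(0, 25, size=200)
price = home_price(area, age)

print(price[:5])
```

With these assumed ranges every generated Price stays positive, which matters later because Mean Relative Error divides by the true value.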

Figure: Home Prices vs Area & Age

Linear Model

Let’s suppose we just want to create a very simple Linear Regression model that predicts the Price using slope coefficients c1 and c2 and the y-intercept c0:

Price = c1*Area + c2*Age + c0

We’ll load the data and fit Scikit-Learn’s Linear Regression. Behind the scenes, the model coefficients (c0, c1, c2) are computed by minimizing the sum of squared errors between the target variable y and the model’s predictions.
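The linear fit might look like the sketch below. It regenerates synthetic data from the generating function with assumed input ranges and seed, so the resulting error will not necessarily equal the article's 9.5%. The Mean Relative Error is computed here as the mean of |y - prediction| / |y|, which is an assumed definition.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data from the article's generating function.
# The ranges, sample size, and seed are assumptions for illustration.
rng = np.random.default_rng(seed=0)
area = rng.uniform(50, 350, size=200)
age = rng.uniform(0, 25, size=200)
price = -3 * area - 10 * age + 0.033 * area**2 - 0.0000571 * area**3 + 500

# Price = c1*Area + c2*Age + c0, fitted by ordinary least squares.
X = np.column_stack([area, age])
linear_model = LinearRegression().fit(X, price)
c1, c2 = linear_model.coef_
c0 = linear_model.intercept_

# Mean Relative Error (assumed definition: mean of |y - y_hat| / |y|).
pred = linear_model.predict(X)
mre = np.mean(np.abs((price - pred) / price))

print(f"c1={c1:.4f}, c2={c2:.4f}, c0={c0:.4f}")
print(f"Mean Relative Error: {mre:.2%}")
```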

But as you can see, this model doesn’t do a very good job.

Figure: Simple Linear Regression Model (Mean Relative Error: 9.5%)

Polynomial Regression Model

Next, let’s implement the Polynomial Regression model, because it’s the right tool for the job. Rewriting the function used to generate the home prices with x1 = Area and x2 = Age, we get the following:

Price = -3*x1 - 10*x2 + 0.033*x1^2 - 0.0000571*x1^3 + 500

So now, instead of fitting the Linear model (Price = c1*x1 + c2*x2 + c0) directly, Polynomial Regression requires that we first transform the variables x1 and x2. For example, to fit a 2nd-degree polynomial, the input variables are transformed into the following features:

1, x1, x2, x1^2, x1*x2, x2^2

Our 3rd-degree polynomial version, however, uses:

1, x1, x2, x1^2, x1*x2, x2^2, x1^3, x1^2*x2, x1*x2^2, x2^3

Then we can apply the Linear model to the polynomially transformed input features, which gives a Polynomial Regression model of the form:

Price = 0*1 + c1*x1 + c2*x2 + c3*x1^2 + c4*x1*x2 + … + cn*x2^3 + c0

(The 0*1 term corresponds to the bias (1s) column. Its coefficient is 0 here because the Linear model already fits the intercept c0 separately.)

After training the model on the data, we can check the coefficients and see whether they match the original function used to generate the home prices:

Original function:

Price = -3*x1 - 10*x2 + 0.033*x1^2 - 0.0000571*x1^3 + 500

Polynomial Regression model coefficients:

Figure: Polynomial Regression model coefficients

And indeed they match!

Now you can see that the model does a much better job.

Figure: Polynomial Regression Model (Mean Relative Error: 0%)

And there you have it: now you know how to implement a Polynomial Regression model in Python. The entire code is linked in the original article.

Closing remarks

  • If this were a real-world ML task, we would split the data into training and testing sets and evaluate the model on the testing set.
  • It’s better to use other error metrics such as RMSE, because MRE is undefined if any of the y values is 0.
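Both remarks can be sketched together: hold out a test set, fit on the training set only, and report RMSE on the held-out data. The 80/20 split, the seed, and the synthetic data ranges are assumptions for illustration, and the pipeline is one possible way to chain the two steps rather than the article's own code.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data from the article's generating function.
# The ranges, sample size, and seed are assumptions for illustration.
rng = np.random.default_rng(seed=0)
area = rng.uniform(50, 350, size=200)
age = rng.uniform(0, 25, size=200)
price = -3 * area - 10 * age + 0.033 * area**2 - 0.0000571 * area**3 + 500
X = np.column_stack([area, age])

# Hold out 20% of the samples; the model never sees them during fitting.
X_train, X_test, y_train, y_test = train_test_split(
    X, price, test_size=0.2, random_state=0
)

# The pipeline applies the polynomial expansion, then the linear fit.
model = make_pipeline(PolynomialFeatures(degree=3), LinearRegression())
model.fit(X_train, y_train)

# RMSE stays well defined even when some target values are 0.
y_pred = model.predict(X_test)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
print(f"Test RMSE: {rmse:.6f}")
```

On this noise-free synthetic data the test RMSE is close to zero; on real data it would give an honest estimate of out-of-sample error.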

Translated from: https://medium.com/@nikola.kuzmic945/how-to-implement-a-polynomial-regression-model-in-python-6250ce96ba61
