Multi-Logistic Regression with Probabilistic Programming

There is an interesting dichotomy in the world of data science between machine learning practitioners (increasingly synonymous with deep learning practitioners) and classical statisticians (both Frequentists and Bayesians). There is generally little overlap between the techniques used in the two camps. However, some interesting tools and libraries are trying to bridge the gap, especially by using Bayesian inference techniques to estimate the uncertainty of deep learning models. See this post and this paper to learn more about the historical and recent trends in this exciting new area. The biggest benefit of adopting Bayesian thinking is that it forces us to explicitly lay out all the assumptions that go into the model; it is hard to perform Bayesian inference without being fully aware of all the modeling choices along the way. The biggest downside to Bayesian inference is the time needed to run even moderately sized models.

There are several probabilistic programming languages/frameworks that are becoming more popular thanks to recent advances in computing hardware. The most common and mature language is Stan, which has APIs for other common programming languages such as Python (PyStan) and R (RStan). There are also some newer players in the field like PyMC3 (Theano), Pyro (PyTorch), and Turing (Julia). Of these, Turing, written in Julia, seems like a particularly interesting option. It brings with it all the advantages of Julia, and combining it with Flux can, in theory, make it “easy” to estimate the uncertainties of any deep learning model.

There are some amazing books to get you up and running with Bayesian data analysis, and the bible of the field is definitely the book by the great Andrew Gelman. He also writes short articles/opinions on his blog, which is worth following. I personally think the book “Statistical Rethinking” by Richard McElreath is the best introduction to the field for any newcomer. He walks you from the garden of forking paths all the way to multi-level models. He even has his entertaining and engaging lectures up on YouTube! No reason not to get your daily dose of Bayesian 😄

In this blog post, I just wanted to get my feet wet with Julia and Turing. I will use both PyStan and Turing to build multi-category logistic models that predict the species of penguins from features like bill length, island, and sex. This is similar to the Iris dataset that is used so commonly in data science tutorials. For more details on the Palmer penguin dataset, see here.

PyStan

First, let's use PyStan to build a multi-logit model. Code for the Stan model looks like this:

data {
  int N; // the number of training observations
  int N2; // the number of test observations
  int D; // the number of features
  int K; // the number of classes
  int y[N]; // the response
  matrix[N, D] x; // the model matrix
  matrix[N2, D] x_new; // the matrix for the predicted values
}
parameters {
  matrix[D, K] beta; // the regression parameters
}
model {
  matrix[N, K] x_beta = x * beta;
  to_vector(beta) ~ normal(0, 1);
  for (n in 1:N)
    y[n] ~ categorical_logit(x_beta[n]');
}

This closely follows the example in Stan’s documentation. We use a standard normal prior on all parameters. In the case of our penguin dataset, we have a total of 9 features: four continuous features, namely bill length, bill depth, flipper length, and body mass, plus five one-hot encoded features for the island and sex categorical values. Therefore, there are 9 parameters to estimate per category, and since we have 3 categories, that makes 27 parameters in total. For each category k, a score is computed for each observation i as the sum of the products of the coefficients and the feature values:

s_{ik} = \sum_{d=1}^{D} x_{id} \beta_{dk}

The final category for each data point is computed using softmax:

P(y_i = k \mid x_i) = \frac{\exp(s_{ik})}{\sum_{j=1}^{K} \exp(s_{ij})}

We could also have fixed the parameters of one category at zero and estimated only the remaining 9 × 2 parameters. This is the same idea as in binary classification models, where only one set of coefficients is present:

P(y_i = 1 \mid x_i) = \frac{\exp(x_i \beta)}{1 + \exp(x_i \beta)}

I will show what that looks like when we get to the Julia code using the Turing library.
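
Before sampling, the feature matrix has to be assembled. Here is a minimal sketch of how the 9-column model matrix could be built in Python; this is an assumption-laden sketch rather than the repo’s exact preprocessing (it uses the copy of the dataset that ships with seaborn, and the variable names are mine):

import pandas as pd
import seaborn as sns
from sklearn.model_selection import train_test_split

# Load the Palmer penguins data (seaborn ships a copy) and drop incomplete rows
penguins = sns.load_dataset("penguins").dropna()

# The 4 continuous features, standardized so the normal(0, 1) prior is sensible
continuous = ["bill_length_mm", "bill_depth_mm", "flipper_length_mm", "body_mass_g"]
X_cont = (penguins[continuous] - penguins[continuous].mean()) / penguins[continuous].std()

# 3 island + 2 sex indicator columns -> 5 one-hot features, 9 features in total
X_cat = pd.get_dummies(penguins[["island", "sex"]])
X = pd.concat([X_cont, X_cat], axis=1).to_numpy(dtype=float)

# categorical_logit expects integer classes in 1..K, so shift the 0-based codes
y = penguins["species"].astype("category").cat.codes.to_numpy() + 1

X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)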

Now that we have the model ready, let's go ahead and perform sampling to get the posteriors for all the parameters:

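A minimal sketch of what the sampling call could look like with the PyStan 2 API, assuming the Stan program above is stored in the string model_code and using the train/test split from the earlier sketch:

import pystan

stan_data = {
    "N": X_train.shape[0], "N2": X_test.shape[0],
    "D": X_train.shape[1], "K": 3,
    "y": y_train, "x": X_train, "x_new": X_test,
}

model = pystan.StanModel(model_code=model_code)  # compiles the Stan program
fit = model.sampling(
    data=stan_data,
    iter=1000,  # 500 warmup + 500 sampling iterations
    warmup=500,
    chains=4,
    control={"max_treedepth": 10},
)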

These are the parameters for sampling:

Algorithm: No-U-Turn Sampler (NUTS)
Warmup: 500 iterations
Samples: 500 iterations
Chains: 4
Max Tree Depth: 10
Time elapsed per chain: ~140 seconds

[Figure: posterior distributions for some parameters and their trace plots over 500 iterations. The samples are too unstable to be reliable.]

The chains show poor mixing and stability, and the recommendation from Stan is to go higher with the max tree depth for the NUTS sampler to get better stability within and across chains.

[Figure: summary of samples for some parameters. Rhat is definitely too high for the samples to be useful.]

The poor stability of the chains is also reflected in the number of effective samples (n_eff), which is quite low for some parameters. Rhat is significantly above the recommended value of 1.05 for most parameters.

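To inspect n_eff and Rhat yourself, the fit object can be printed directly or handed to ArviZ for trace plots like the ones shown above; a sketch, assuming the fit object from the earlier call:

print(fit)  # per-parameter summary table, including n_eff and Rhat

import arviz as az

idata = az.from_pystan(posterior=fit)
az.plot_trace(idata, var_names=["beta"])      # posterior densities + trace plots
print(az.summary(idata, var_names=["beta"]))  # includes ess and r_hat columns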

In practice, though, this is often not an issue, and the samples are usable, as shown below for predicting the training- and test-set classes.

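Here is a sketch of how class predictions can be recovered from the posterior draws, assuming the fit object and the split from the sketches above; averaging the per-draw softmax probabilities approximates the posterior predictive distribution:

import numpy as np
from scipy.special import softmax

beta = fit.extract()["beta"]  # posterior draws of the coefficients, shape (S, D, K)

# Per-draw category scores for each test row: shape (S, N2, K)
scores = np.einsum("nd,sdk->snk", X_test, beta)

# Average the per-draw class probabilities over draws, then pick the mode
probs = softmax(scores, axis=-1).mean(axis=0)
y_pred = probs.argmax(axis=1) + 1  # back to Stan's 1-based class labels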

[Figure: training set predictions]

[Figure: test set predictions]

Now, let's increase the maximum tree depth for the NUTS sampler from 10 to 12. This increases the time taken for each chain to converge:

Max Tree Depth: 12
Time elapsed per chain: ~570 seconds
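
The only change relative to the earlier sampling sketch is the control dictionary:

fit = model.sampling(
    data=stan_data, iter=1000, warmup=500, chains=4,
    control={"max_treedepth": 12},  # up from the default of 10
)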

[Figure: posterior distributions for some parameters and their trace plots over 500 iterations.]

The chains show much better mixing and stability, and we could still go higher with the max tree depth for the NUTS sampler to get even better stability within and across chains.

[Figure: summary of samples for some parameters. Rhat is on the higher end.]

As we can see, the number of effective samples (n_eff) has also increased considerably for some parameters, and Rhat is approaching the recommended value of 1.05 for some parameters. As expected, these samples provide good classification predictions:

[Figure: training set predictions]

[Figure: test set predictions]

Increasing the max tree depth further to 15 significantly improves chain stability (data not shown) but also increases the computational time about 25-fold.

The code for running the above models is here. For the full project, which includes setup for AWS, SageMaker, and XGBoost models, refer to my earlier blog post and GitHub repo.

Julia

Now, I will show you the equivalent model using Julia and Turing. The code can be found here in the main project repo. The model is defined like so:

@model logistic_regression(x, y, n, σ) = begin
    intercept_Adelie ~ Normal(0, σ)
    intercept_Gentoo ~ Normal(0, σ)
    intercept_Chinstrap ~ Normal(0, σ)

    bill_length_mm_Adelie ~ Normal(0, σ)
    bill_length_mm_Gentoo ~ Normal(0, σ)
    bill_length_mm_Chinstrap ~ Normal(0, σ)

    bill_depth_mm_Adelie ~ Normal(0, σ)
    bill_depth_mm_Gentoo ~ Normal(0, σ)
    bill_depth_mm_Chinstrap ~ Normal(0, σ)

    flipper_length_mm_Adelie ~ Normal(0, σ)
    flipper_length_mm_Gentoo ~ Normal(0, σ)
    flipper_length_mm_Chinstrap ~ Normal(0, σ)

    body_mass_g_Adelie ~ Normal(0, σ)
    body_mass_g_Gentoo ~ Normal(0, σ)
    body_mass_g_Chinstrap ~ Normal(0, σ)

    island_Biscoe_Adelie ~ Normal(0, σ)
    island_Biscoe_Gentoo ~ Normal(0, σ)
    island_Biscoe_Chinstrap ~ Normal(0, σ)
    island_Dream_Adelie ~ Normal(0, σ)
    island_Dream_Gentoo ~ Normal(0, σ)
    island_Dream_Chinstrap ~ Normal(0, σ)
    island_Torgersen_Adelie ~ Normal(0, σ)
    island_Torgersen_Gentoo ~ Normal(0, σ)
    island_Torgersen_Chinstrap ~ Normal(0, σ)

    sex_female_Adelie ~ Normal(0, σ)
    sex_female_Gentoo ~ Normal(0, σ)
    sex_female_Chinstrap ~ Normal(0, σ)
    sex_male_Adelie ~ Normal(0, σ)
    sex_male_Gentoo ~ Normal(0, σ)
    sex_male_Chinstrap ~ Normal(0, σ)

    for i = 1:n
        v = softmax([intercept_Adelie +
                     bill_length_mm_Adelie * x[i, 1] +
                     bill_depth_mm_Adelie * x[i, 2] +
                     flipper_length_mm_Adelie * x[i, 3] +
                     body_mass_g_Adelie * x[i, 4] +
                     island_Biscoe_Adelie * x[i, 5] +
                     island_Dream_Adelie * x[i, 6] +
                     island_Torgersen_Adelie * x[i, 7] +
                     sex_female_Adelie * x[i, 8] +
                     sex_male_Adelie * x[i, 9],
                     intercept_Gentoo +
                     bill_length_mm_Gentoo * x[i, 1] +
                     bill_depth_mm_Gentoo * x[i, 2] +
                     flipper_length_mm_Gentoo * x[i, 3] +
                     body_mass_g_Gentoo * x[i, 4] +
                     island_Biscoe_Gentoo * x[i, 5] +
                     island_Dream_Gentoo * x[i, 6] +
                     island_Torgersen_Gentoo * x[i, 7] +
                     sex_female_Gentoo * x[i, 8] +
                     sex_male_Gentoo * x[i, 9],
                     intercept_Chinstrap +
                     bill_length_mm_Chinstrap * x[i, 1] +
                     bill_depth_mm_Chinstrap * x[i, 2] +
                     flipper_length_mm_Chinstrap * x[i, 3] +
                     body_mass_g_Chinstrap * x[i, 4] +
                     island_Biscoe_Chinstrap * x[i, 5] +
                     island_Dream_Chinstrap * x[i, 6] +
                     island_Torgersen_Chinstrap * x[i, 7] +
                     sex_female_Chinstrap * x[i, 8] +
                     sex_male_Chinstrap * x[i, 9]])
        y[i, :] ~ Multinomial(1, v)
    end
end;

I used the default HMC sampler as recommended in the Turing tutorial. One thing I noticed is the much better stability of the chains when using the HMC sampler from Turing:

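For reference, a sketch of what the sampling call might look like; the step size (0.05) and the number of leapfrog steps (10) follow the Turing logistic-regression tutorial and are assumptions, as are the train_x/train_y names:

using Turing

# HMC(step_size, n_leapfrog_steps); 1000 samples, matching the trace plots below
chain = sample(logistic_regression(train_x, train_y, size(train_x, 1), 1.0),
               HMC(0.05, 10), 1000)

describe(chain)  # per-parameter summaries, including r_hat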

[Figure: posterior distributions for some parameters and their trace plots over 1000 iterations.]

And the summary of the samples:

[Figure: summary of samples for some parameters. The r_hat values look better.]

Overall, the HMC samples from Turing seem to do a lot better than the NUTS samples from PyStan. Of course, this is not an apples-to-apples comparison, but these are interesting results. In addition, the HMC sampler was also much faster than the max_tree_depth=12 PyStan run shown above. This is something to dig into more.

The predictions from Turing are perfect on both the training and test sets, as expected, since this is an easy prediction problem.

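A sketch of how those predictions can be computed from the chain, using the posterior mean of each coefficient for simplicity (rather than averaging predictions over draws). It assumes the chain from the sampling sketch above and that a softmax function is in scope (e.g., from StatsFuns); the helper itself is hypothetical:

using Statistics

function predict_species(x, chain)
    m(name) = mean(chain[Symbol(name)])  # posterior mean of one coefficient
    preds = Vector{Int}(undef, size(x, 1))
    for i in 1:size(x, 1)
        scores = [m("intercept_$sp") +
                  m("bill_length_mm_$sp") * x[i, 1] +
                  m("bill_depth_mm_$sp") * x[i, 2] +
                  m("flipper_length_mm_$sp") * x[i, 3] +
                  m("body_mass_g_$sp") * x[i, 4] +
                  m("island_Biscoe_$sp") * x[i, 5] +
                  m("island_Dream_$sp") * x[i, 6] +
                  m("island_Torgersen_$sp") * x[i, 7] +
                  m("sex_female_$sp") * x[i, 8] +
                  m("sex_male_$sp") * x[i, 9]
                  for sp in ("Adelie", "Gentoo", "Chinstrap")]
        preds[i] = argmax(softmax(scores))  # 1 = Adelie, 2 = Gentoo, 3 = Chinstrap
    end
    return preds
end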

In conclusion, I like Julia and Turing so far! Another great (and fast) tool for Probabilistic Programming!

Some good things:

  1. Turing is fast! (at least in this example with default samplers)
  2. Julia and Turing use 1-based indexing, which matches Stan's 1-based indexing; Python's 0-based indexing makes coordinating with Stan harder
  3. Symbolic math ability with Turing and Julia

Some disadvantages compared to PyStan:

  1. Not enough libraries to make pre-processing easy
  2. Stan has a more parsimonious model-declaration syntax than Turing (probably just my ignorance of Turing)
  3. No straightforward way to combine with Python (PyJulia is an option worth exploring; see the sketch below)
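
On the last point, a minimal sketch of what bridging from Python via PyJulia could look like; the file name and helper function are hypothetical:

from julia import Main

Main.include("penguin_model.jl")        # file containing the Turing model above
chain = Main.eval("sample_penguins()")  # a Julia helper assumed to wrap the sample() call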

*****************************************************************

[Image: https://www.azquotes.com/quote/655174]

Translated from: https://medium.com/swlh/multi-logistic-regression-with-probabilistic-programming-db9a24467c0d
