Python visualization tools
Disclaimer: I work for Datapane
Motivation
There are amazing articles on data visualization on Medium every day. Although this comes at the cost of information overload, it shouldn’t prevent you from exploring interesting articles since you can learn many new techniques for creating effective visualizations for your projects.
I have gone through the recent Medium posts on Python visualization and put together the best ones — with the hope that it will make it easier for you to explore them yourself. I’ve submitted these to the Datapane gallery, which is hosting them for us.
If you don’t know Datapane already, it is an open-source framework for people who analyze data in Python and need a way to share their results. Datapane hosts a free public platform with a gallery and community of people who share and collaborate on Python data visualization techniques.
In this article, I will use plots from the gallery as examples to show what makes a plot effective, introduce different kinds of plots, and show how to create them yourself!
Principal Component Analysis and SVM in a Pipeline with Python
Interesting Idea
In this article, Saptashwa Bhattacharyya combines SVM, PCA, and Grid-search Cross-Validation to create a pipeline to find the best parameters for binary classification. He then plots a decision boundary to present how well our algorithm has performed.
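If you want a feel for what such a pipeline looks like, here is a minimal sketch. It assumes scikit-learn's built-in breast cancer dataset, an RBF kernel, and a small hand-picked parameter grid; none of these choices are taken from the article's own code.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: the built-in breast cancer dataset stands in for the article's data.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Scale, reduce to two principal components, then classify with an SVM.
pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("pca", PCA(n_components=2)),
    ("svm", SVC(kernel="rbf")),
])

# Assumption: an illustrative grid; the step name prefixes the parameter name.
param_grid = {"svm__C": [0.1, 1, 10], "svm__gamma": [0.01, 0.1, 1]}

# 5-fold cross-validated grid search over the whole pipeline.
search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X_train, y_train)

best_params = search.best_params_
test_accuracy = search.score(X_test, y_test)
print(best_params, round(test_accuracy, 3))
```

Putting the scaler and PCA inside the pipeline means they are refit on each training fold, so the cross-validation score is not inflated by information from the held-out fold.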
Impressive Visualization
Joint-plots
A joint plot is really helpful for showing both the distributions of two variables and the relationship between them in one plot. The darker the hexagon, the more points (observations) fall in that region.
Contour plot and Kernel density estimation: KDE (Kernel density estimation) is a useful statistical tool that lets you create a smooth curve given a set of data. This can be useful if you want to visualize just the “shape” of some data (instead of the discrete histogram).
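As a rough illustration of what KDE does, the sketch below fits a Gaussian KDE with SciPy to a made-up bimodal sample and evaluates the smooth curve on a grid. The sample and the grid range are assumptions chosen for the example.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Assumption: a synthetic bimodal sample, purely for illustration.
rng = np.random.default_rng(0)
samples = np.concatenate([
    rng.normal(loc=-2.0, scale=0.5, size=300),
    rng.normal(loc=1.0, scale=1.0, size=700),
])

# Fit the estimator and evaluate the smooth density on an evenly spaced grid.
kde = gaussian_kde(samples)
grid = np.linspace(-5, 5, 200)
density = kde(grid)

# A density should integrate to roughly 1 over a range that covers the data.
area = density.sum() * (grid[1] - grid[0])
print(round(area, 3))
```

Plotting `density` against `grid` gives the smooth "shape" of the data that a histogram only approximates with discrete bars.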
Pair plots: By looking at the pair plots, it is much easier to compare the correlation among different pairs of variables. In the plot below, you can see that the mean area has a strong correlation with the mean radius. The difference in color also helps you see how each label behaves in each pair. It is really clear!
Contour plot of SVM: This contour plot is really helpful for estimating how likely it is that a point lying in a given region actually belongs to the class assigned to that region. I especially like the contour plot below because I can see in which regions the malignant cells, benign cells, and support vectors lie, and it helps me understand how SVM works.
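A contour plot like this can be sketched as follows. This version draws the SVM's decision function (a signed distance-like score, not a calibrated percentage) over two principal components of scikit-learn's breast cancer dataset; the dataset, kernel, and the values of `C` and `gamma` are all assumptions rather than the article's settings.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: built-in dataset, projected to 2D so it can be drawn.
X, y = load_breast_cancer(return_X_y=True)
X2 = make_pipeline(StandardScaler(), PCA(n_components=2)).fit_transform(X)

# Assumption: illustrative hyperparameters, not tuned values.
clf = SVC(kernel="rbf", C=1.0, gamma=0.1).fit(X2, y)

# Evaluate the decision function on a regular grid covering the data.
xx, yy = np.meshgrid(
    np.linspace(X2[:, 0].min() - 1, X2[:, 0].max() + 1, 200),
    np.linspace(X2[:, 1].min() - 1, X2[:, 1].max() + 1, 200),
)
zz = clf.decision_function(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

fig, ax = plt.subplots()
contour = ax.contourf(xx, yy, zz, levels=20, cmap="RdBu", alpha=0.6)
ax.scatter(X2[:, 0], X2[:, 1], c=y, cmap="RdBu", edgecolors="k", s=15)
# Circle the support vectors so their position relative to the boundary is visible.
ax.scatter(*clf.support_vectors_.T, facecolors="none", edgecolors="gold",
           s=60, label="support vectors")
ax.legend()
fig.colorbar(contour, ax=ax, label="decision function")
fig.savefig("svm_contour.png")
```

The zero level of the decision function is the decision boundary; values further from zero mean the point sits deeper inside one class's region.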
3D SVM plot: Even though we often see 2D SVM plots, most of the time, data is multi-dimensional. Seeing this 3D plot is really helpful to understand how SVM works in multi-dimensional space.
Resources to Explore
Medium article
Run the code on Binder
Visualize Gapmind and Basketball Dataset with Plotly
Interesting Idea
A good plot is not only beautiful but also conveys the right message to viewers. Without the right proportions in the graph, viewers will interpret the message in a different way.
That is why having the right proportions is so important. In this article, JP Hwang shows you how to create effective plots with the right proportions.
Impressive Visualization
Plots that show change over time
This plot effectively represents each continent's share of the world's population at a single point in time:
But how do you create plots that show how the population shares of different continents change over time?
The author shows that you can do exactly that with the plots below.
As you can see from the plots above, the time dimension is added to the plot. Now the plots show not only the distribution but also how the overall number and the distribution change over time! Neat!
Bubble graph
As the number of data points grows across both dimensions, a bar chart or stacked bar chart becomes harder to read because the individual bars become too small to convey any meaningful information.
That is why it is clever of the author to use a bubble chart, an effective way to see many more data points in one chart but still clearly show the change in proportion over time.
Resources to Explore
Medium article
Run the code on Binder
Altair plot deconstruction: visualizing the correlation structure of weather data
Interesting Idea
You might know how a heatmap is used to show the correlation between different variables in the data, but what if you want to get more insight out of the number .71 in the heatmap? In this article, Paul Hiemstra shows how you could do that by combining a heatmap and a 2D histogram to explore the structure of a weather dataset.
Impressive Visualization
Linked plots with Altair
Heatmaps and 2D histograms are both effective in showing the correlation, but they are much more effective when you combine them. The linked plots below do just that.
As you click each square in the heatmap, you will see the 2D histogram for that square on the right-hand side!
What does a correlation of .98 look like in a 2D histogram? We expect a linear relationship, and the plot on the right confirms it. In contrast, the 2D histogram for the correlation of .12 shows no apparent pattern. Very intuitive and easy to understand.
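The numbers behind both views can be sketched with NumPy alone: the heatmap cell is a correlation coefficient, and the 2D histogram is a grid of counts for the same pair of variables. The data below is synthetic and the helper `correlated_with` is a hypothetical name for this example; it is not the weather dataset or the Altair code from the article.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)

def correlated_with(x, rho, rng):
    """Return a variable whose expected correlation with x is rho (hypothetical helper)."""
    noise = rng.normal(size=x.size)
    return rho * x + np.sqrt(1 - rho**2) * noise

results = {}
for rho in (0.98, 0.12):
    y = correlated_with(x, rho, rng)
    # The single number a heatmap cell would display.
    r = np.corrcoef(x, y)[0, 1]
    # The grid of counts a 2D histogram would display for the same pair.
    counts, x_edges, y_edges = np.histogram2d(x, y, bins=20)
    results[rho] = {"r": r, "counts": counts}
    print(f"target {rho}: measured r = {r:.2f}")
```

For the high correlation, the non-empty bins of `counts` cluster along a diagonal; for the low one, they spread out into a roughly round blob.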
Resources to Explore
Medium article
Run the code on Binder
Sankey Diagram Basics with Python’s Plotly
Interesting Idea
How do you visualize a network with different sources of inflow and outflow? For example, which services are the government’s revenues, such as taxes and utilities, spent on? And how does the expenditure on one service compare, as a percentage, with the expenditure on other services?
In this article, Thiago Carvalho shows how to visualize such networks effectively with Sankey Diagrams.
Impressive Visualization
Sankey Diagrams
If you click on each network of the diagram, you can see clearly which services the revenues are spent on. If you look solely at the nodes on the left-hand side, you can compare the proportions between different revenues. And you can do the same thing with nodes on the right-hand side. This technique is particularly useful for mapping processes — such as a sales pipeline, or the paths visitors take on your website. It is amazing how one diagram can convey so much information.
Resources to Explore
Medium article
Run the code on Binder
COVID-19’s Impact on U.S. Unemployment Rate of Different Social Groups
Interesting Idea
You might know how badly COVID-19 affects the economy, but do you know how it affects different social groups? This article by Shinichi Okada aims to answer that question with an animated bar chart.
Impressive Visualization
Animated Bar Chart
The most common way to see how a bar chart changes over time is to use a slider that you drag to see the change yourself. But the speed at which you drag varies, so you will not see the change unfold at a constant rate. That is why the animated bar chart is so effective.
Click the play button on the left-hand side to see how the bar chart changes over time! Now you can clearly see how COVID-19 affects different social groups differently in different time periods.
Resources to Explore
Medium article
Run the code on Binder to create the bar chart yourself!
Conclusion
I hope this article gives you a good start for exploring interesting Medium articles on visualization. The best way to learn anything is to try it yourself. Pick one visualization, run the code, and observe the magic!
I like to write about basic data science concepts and play with different algorithms and data science tools. You can connect with me on LinkedIn and Twitter.
Star this repo if you want to check out the code for all of the articles I have written. Follow me on Medium to stay informed about my latest data science articles like these.
Originally published at https://datapane.com on March 16, 2020.
Translated from: https://towardsdatascience.com/best-python-visualizations-on-medium-a04921f61559