Adding Data Labels in Seaborn: Quick Guide to Labelling Data for Common Seaborn Plots

While exploring data, I often find myself looking at plots like the one below. They are great for observing trends, but they make it difficult to tell where each data point sits and what its value is.

Figure: A line plot showing the total number of passengers yearly. How many passengers were there in 1956?

This article is a quick guide to labelling common seaborn graphs used in data exploration. All the code used accompanies the original article.

Set-Up

Seaborn's flights dataset, which has the columns year, month and passengers, is used for demonstration.

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# Jupyter magic: render plots inside the notebook (omit when running as a script)
%matplotlib inline

# load dataset
flights = sns.load_dataset('flights')
flights.head()

Figure: First 5 rows of the data in flights.

To make some of the plots easier to create, a few additional data frames are set up.

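# set up flights by year dataframe
# select the numeric column first: summing the categorical month column fails in recent pandas
year_flights = flights.groupby('year')['passengers'].sum().reset_index()
year_flights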

Figure: Total number of passengers for each year.

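# set up average number of passengers by month dataframe
# observed=True keeps only the month categories present in the data and avoids a pandas FutureWarning
month_flights = flights.groupby('month', observed=True).agg({'passengers': 'mean'}).reset_index()
month_flights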

Figure: Average number of passengers for each month.
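
To see what the two aggregations above produce without downloading the dataset, here is a minimal sketch on a four-row frame typed in by hand. The names toy, toy_year and toy_month are assumptions made for this illustration and are not part of the original code.

```python
import pandas as pd

# small hand-typed frame standing in for flights (illustration only)
toy = pd.DataFrame({
    "year": [1949, 1949, 1950, 1950],
    "month": ["Jan", "Feb", "Jan", "Feb"],
    "passengers": [112, 118, 115, 126],
})

# total passengers per year: one row per year
toy_year = toy.groupby("year")["passengers"].sum().reset_index()

# average passengers per month across the years: one row per month
# sort=False keeps the months in order of first appearance
toy_month = toy.groupby("month", sort=False).agg({"passengers": "mean"}).reset_index()

print(toy_year)   # 1949 -> 230, 1950 -> 241
print(toy_month)  # Jan -> 113.5, Feb -> 122.0
```

Both results have one row per group, which is the shape the plotting calls in the following sections expect.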

Line Plot

Plotting a graph of passengers per year:

# plot line graph
sns.set(rc={'figure.figsize': (10, 5)})
ax = sns.lineplot(x='year', y='passengers', data=year_flights, marker='*', color='#965786')
ax.set(title='Total Number of Passengers Yearly')

# label points on the plot
for x, y in zip(year_flights['year'], year_flights['passengers']):
    # the position of the data label relative to the data point can be adjusted
    # by adding/subtracting a value from the x and/or y coordinates
    plt.text(x=x,                   # x-coordinate position of data label
             y=y - 150,             # y-coordinate position of data label, adjusted to be 150 below the data point
             s='{:.0f}'.format(y),  # data label, formatted to ignore decimals
             color='purple')        # colour of the label text

Figure: Line plot showing the total number of passengers yearly, with data labels.
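
The offset of 150 in the loop above is in data units, so it has to be retuned whenever the scale of the y-axis changes. One possible alternative, sketched here with plain matplotlib and a few illustrative yearly totals in place of year_flights, is ax.annotate with textcoords="offset points", which shifts the label by a fixed distance on screen. The lists years and totals are assumptions for this sketch. Seaborn's axes-level functions return a matplotlib Axes, so the same call should work on the ax returned by sns.lineplot.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# illustrative values standing in for year_flights
years = [1949, 1950, 1951, 1952]
totals = [1520, 1676, 2042, 2364]

fig, ax = plt.subplots(figsize=(10, 5))
ax.plot(years, totals, marker="*", color="#965786")

labels = []
for x, y in zip(years, totals):
    label = ax.annotate(
        "{:.0f}".format(y),          # data label, formatted to ignore decimals
        xy=(x, y),                   # the data point being labelled
        xytext=(0, -12),             # shift the text 12 points straight down
        textcoords="offset points",  # interpret xytext as an offset from xy, in points
        ha="center",
        va="top",
        color="purple",
    )
    labels.append(label)
```

Because the offset is measured in points rather than passengers, the labels keep the same distance from the markers whether the totals are in the thousands or the millions.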

At times the data labels need to be more visible, which can be achieved by giving them a background colour:

# add set_backgroundcolor('color') after plt.text('…')
plt.text(x, y-150, '{:.0f}'.format(y), color='white').set_backgroundcolor('#965786')

Figure: Line plot showing the total number of passengers yearly, with data labels that have a background colour.
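
If more control over the background is wanted (rounded corners, padding, no border), matplotlib's text functions also accept a bbox dictionary. The following is a minimal sketch under that assumption, using two illustrative points rather than the flights data. The styling values are arbitrary choices, not the original author's.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([1949, 1950], [1520, 1676], marker="*", color="#965786")

# bbox draws a box behind the text; the keys are passed on to the box patch
label = ax.text(
    1949, 1520, "1520",
    color="white", ha="center", va="center",
    bbox=dict(facecolor="#965786", edgecolor="none", boxstyle="round,pad=0.3"),
)
```

set_backgroundcolor is the shorter route when a plain rectangle is enough, while bbox is worth the extra line when the box itself needs styling.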

Histogram

Plotting a histogram of the passenger counts (each row of the dataset is one month):

# plot histogram
# histplot replaces distplot, which is deprecated in recent seaborn; no KDE curve is drawn by default
ax = sns.histplot(flights['passengers'], color='#9d94ba', bins=10)
ax.set(title='Distribution of Passengers')

# label each bar in histogram
for p in ax.patches:
    height = p.get_height()  # get the height of each bar
    # adding text to each bar
    ax.text(x=p.get_x() + (p.get_width() / 2),  # x-coordinate position of data label, padded to be in the middle of the bar
            y=height + 0.2,                     # y-coordinate position of data label, padded 0.2 above bar
            s='{:.0f}'.format(height),          # data label, formatted to ignore decimals
            ha='center')                        # sets horizontal alignment (ha) to center

Figure: Histogram showing the distribution of passenger counts, with a data label on each bar.
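
When the data has gaps, the loop above also writes a "0" over every empty bin. A small variation, sketched below with plain matplotlib and made-up values, skips bars of zero height. The list values and the choice of 5 bins are assumptions for this illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# made-up values with a gap in the middle, so that two of the five bins stay empty
values = [1, 1, 2, 2, 2, 9, 10]

fig, ax = plt.subplots()
ax.hist(values, bins=5, range=(0, 10), color="#9d94ba")

labels = []
for p in ax.patches:
    height = p.get_height()
    if height == 0:
        continue  # nothing to label on an empty bin
    labels.append(
        ax.text(p.get_x() + p.get_width() / 2,  # middle of the bar
                height + 0.05,                  # slightly above the bar
                "{:.0f}".format(height),
                ha="center")
    )
```

Here the bins hold 2, 3, 0, 0 and 2 values, so only three labels are drawn.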

Another piece of information that may be worth showing in the graph is the mean of the dataset, drawn as a vertical line:

# plot histogram
# …

# adding a vertical line for the mean passenger count
plt.axvline(flights['passengers'].mean(), color='purple', label='mean')

# adding data label to mean line
plt.text(x=flights['passengers'].mean() + 3,                   # x-coordinate position of data label, adjusted to be 3 right of the mean line
         y=max([h.get_height() for h in ax.patches]),         # y-coordinate position of data label, to take max height
         s='mean: {:.0f}'.format(flights['passengers'].mean()),  # data label
         color='purple')                                      # same colour as the vertical mean line

# label each bar in histogram
# …

Figure: Histogram showing the distribution of passenger counts, with a vertical line indicating the mean.
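
The mean label above is positioned at the height of the tallest bar, which requires looping over the patches. A possible alternative, sketched here with made-up values, is to give the label a blended transform so that x is read in data units and y as a fraction of the axes height. The list values is an assumption for this sketch. ax.get_xaxis_transform() is the matplotlib call that provides that blend.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

values = [100, 150, 200, 250, 300]  # made-up passenger counts
mean = sum(values) / len(values)

fig, ax = plt.subplots()
ax.hist(values, bins=5, color="#9d94ba")
ax.axvline(mean, color="purple", label="mean")

# x is in data coordinates (the mean), y is in axes coordinates (0 = bottom, 1 = top)
mean_label = ax.text(
    mean, 0.95, " mean: {:.0f}".format(mean),
    transform=ax.get_xaxis_transform(),
    color="purple", va="top",
)
```

With this approach the label stays near the top of the plot regardless of how tall the bars are.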

Bar Plot

Vertical Bar Plot

Plotting the total number of passengers for each year:

# plot vertical barplot
sns.set(rc={'figure.figsize': (10, 5)})
ax = sns.barplot(x='year', y='passengers', data=year_flights)
ax.set(title='Total Number of Passengers Yearly')  # title barplot

# label each bar in barplot
for p in ax.patches:
    # get the height of each bar
    height = p.get_height()
    # adding text to each bar
    ax.text(x=p.get_x() + (p.get_width() / 2),  # x-coordinate position of data label, padded to be in the middle of the bar
            y=height + 100,                     # y-coordinate position of data label, padded 100 above bar
            s='{:.0f}'.format(height),          # data label, formatted to ignore decimals
            ha='center')                        # sets horizontal alignment (ha) to center

Figure: Bar plot with vertical bars showing the total number of passengers yearly.
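
Newer matplotlib versions (3.4 and later) also offer ax.bar_label, which does the same job as the loop above in one call. The sketch below uses plain matplotlib bars and illustrative totals. On a seaborn bar plot the bars are normally reachable as ax.containers[0], though that may depend on the seaborn version and on whether hue is used, so treat it as an assumption to check.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# illustrative values standing in for year_flights
years = ["1949", "1950", "1951"]
totals = [1520, 1676, 2042]

fig, ax = plt.subplots(figsize=(10, 5))
ax.bar(years, totals)

# label every bar in the first container; padding is in points, not data units
bar_labels = ax.bar_label(ax.containers[0], fmt="%.0f", padding=3)
```

The manual loop remains useful when each label needs its own position or colour, or when an older matplotlib is in use.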

Horizontal Bar Plot

Plotting the average number of passengers on flights each month:

# plot horizontal barplot
sns.set(rc={'figure.figsize': (10, 5)})
ax = sns.barplot(x='passengers', y='month', data=month_flights, orient='h')
ax.set(title='Average Number of Flight Passengers Monthly')  # title barplot

# label each bar in barplot
for p in ax.patches:
    height = p.get_height()  # height of each horizontal bar is the same
    width = p.get_width()    # width (average number of passengers)
    # adding text to each bar
    ax.text(x=width + 3,                   # x-coordinate position of data label, padded 3 to right of bar
            y=p.get_y() + (height / 2),    # y-coordinate position of data label, padded to be in the middle of the bar
            s='{:.0f}'.format(width),      # data label, formatted to ignore decimals
            va='center')                   # sets vertical alignment (va) to center

Figure: Bar plot with horizontal bars showing the average number of passengers for each month.
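
The vertical and horizontal loops differ only in which dimension of the bar holds the value. As a sketch, they can be folded into one helper. The function label_bars and its parameters are hypothetical names introduced here, and the monthly averages are illustrative values drawn with plain matplotlib.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt


def label_bars(ax, orient="v", pad=0.0, fmt="{:.0f}"):
    """Hypothetical helper: label every bar on ax, for vertical ('v') or horizontal ('h') bars."""
    texts = []
    for p in ax.patches:
        if orient == "v":
            value = p.get_height()               # bar height holds the value
            x = p.get_x() + p.get_width() / 2    # middle of the bar
            y = value + pad                      # just above the bar
            texts.append(ax.text(x, y, fmt.format(value), ha="center"))
        else:
            value = p.get_width()                # bar width holds the value
            x = value + pad                      # just right of the bar
            y = p.get_y() + p.get_height() / 2   # middle of the bar
            texts.append(ax.text(x, y, fmt.format(value), va="center"))
    return texts


# illustrative monthly averages standing in for month_flights
months = ["Jan", "Feb", "Mar"]
averages = [241.75, 235.0, 270.17]

fig, ax = plt.subplots()
ax.barh(months, averages)
texts = label_bars(ax, orient="h", pad=3)
```

Passing orient="v" with a padding suited to the y-axis scale would reproduce the labelling from the vertical bar plot.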

Notes on Usage

Adding data labels can benefit some plots, especially bar plots. It is worth experimenting with different configurations, such as labelling only certain meaningful points instead of everything, and taking care not to overdo the labelling when there are many points. A clean and informative graph is usually preferable to a cluttered one.

# only labelling some points on graph

# plot line graph
sns.set(rc={'figure.figsize': (10, 5)})
ax = sns.lineplot(x='year', y='passengers', data=year_flights, marker='*', color='#965786')

# title the plot
ax.set(title='Total Number of Passengers Yearly')

mean = year_flights['passengers'].mean()

# label points on the plot only if they are higher than the mean
for x, y in zip(year_flights['year'], year_flights['passengers']):
    if y > mean:
        plt.text(x=x,                   # x-coordinate position of data label
                 y=y - 150,             # y-coordinate position of data label, adjusted to be 150 below the data point
                 s='{:.0f}'.format(y),  # data label, formatted to ignore decimals
                 color='purple')        # colour of the label text

Figure: Line plot showing the total number of passengers yearly, with labels only on the points above the mean.
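
Filtering on the mean is only one way of choosing which points deserve a label. Another option, sketched below with made-up totals and plain matplotlib, is to label just the first point, the last point and the maximum. The lists years and totals and the choice of those three points are assumptions for this illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# made-up yearly totals with a peak in the middle
years = [1949, 1950, 1951, 1952, 1953]
totals = [1520, 1676, 2042, 1900, 1850]

fig, ax = plt.subplots(figsize=(10, 5))
ax.plot(years, totals, marker="*", color="#965786")

# indices worth labelling: first point, last point and the maximum
i_max = max(range(len(totals)), key=totals.__getitem__)
to_label = sorted({0, len(totals) - 1, i_max})

labels = [
    ax.annotate("{:.0f}".format(totals[i]),
                xy=(years[i], totals[i]),
                xytext=(0, 8),               # 8 points above the marker
                textcoords="offset points",
                ha="center",
                color="purple")
    for i in to_label
]
```

Using a set for the indices means a point is labelled only once even when the maximum happens to be the first or last value.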

Translated from: https://medium.com/swlh/quick-guide-to-labelling-data-for-common-seaborn-plots-736e10bf14a9
