Matplotlib Visualization
Nowadays, everyone is surrounded by data from news sources, cellphones, laptops, workplaces, and so on. Data carries a great deal of information in variables of different types, such as dates, strings, numbers, and geographical formats. How can we effectively extract the core value from a huge dataset in a form that users can easily interpret? One answer is Exploratory Data Analysis (EDA). EDA is a process for visualizing and analyzing data to extract insights from a dataset. By summarizing the dataset's important characteristics, EDA gives viewers a better understanding of it.
In this article, you will learn:
(1) Dynamic geographical plot with Geopandas and Bokeh
(2) Analytics on worldwide dataset from 2016 to 2019
(3) Visualization in Matplotlib and Bokeh
Dynamic Choropleth Plot
A choropleth map shades or patterns geographic areas (e.g. countries) according to a measured value, which makes it a good way to show how a measurement varies across regions. To create a global choropleth map, we'll focus on the World Happiness Report, a survey of the state of global happiness released at the United Nations that ranks 155 countries by their happiness levels. The data is available on the Kaggle website: World Happiness Report. The plot is created with the Python libraries Pandas, Geopandas and Bokeh.
Download World Map File
To render a world map, we need a shapefile with world coordinates. Natural Earth is a great source of geospatial data, offering a variety of public domain map datasets. For a dynamic geographical plot, the 1:110m small scale data is a good choice.
Convert shp File into Geopandas Dataframes
Geopandas can read almost any vector-based spatial data format, including ESRI shapefiles, with its read_file function, which returns a GeoDataFrame object. You can select the columns you need when reading the dataset with geopandas.
Static Choropleth Map for 2015
First, we create a data frame from the World Happiness Report and select the year 2015. The resulting data frame, df_2015, can then be merged into the GeoDataFrame gdf. To create the visualization with Bokeh later, we need GeoJSON data as the plotting source. GeoJSON represents points, lines, and polygons as a collection of features. We therefore convert the merged data frame to JSON and then to a string-like object.
The merged result is a GeoDataFrame object that can be rendered with the geopandas module. However, since we want interactive data visualization, we will use the Bokeh library. Bokeh consumes the GeoJSON format, which represents geographical features with JSON. GeoJSON describes points, lines, and polygons (called Patches in Bokeh) as a collection of features. We therefore convert the merged data to the GeoJSON format.
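The merge and conversion described above can be sketched roughly as follows. The names gdf and df_2015 come from the text; the toy geometries, the scores, and the column names Country, Year and Happiness_Score are assumptions for illustration and may not match the actual Kaggle files.

```python
import json

import geopandas as gpd
import pandas as pd
from shapely.geometry import Polygon

# Toy stand-ins (hypothetical data) for the world map and the happiness report.
gdf = gpd.GeoDataFrame(
    {"country": ["Squareland", "Triangleland"]},
    geometry=[
        Polygon([(0, 0), (1, 0), (1, 1), (0, 1)]),
        Polygon([(2, 0), (3, 0), (2.5, 1)]),
    ],
    crs="EPSG:4326",
)
report = pd.DataFrame(
    {
        "Country": ["Squareland", "Triangleland", "Squareland"],
        "Year": [2015, 2015, 2016],
        "Happiness_Score": [7.1, 4.3, 7.2],
    }
)

# Keep only the 2015 rows, then attach them to the geometries.
df_2015 = report[report["Year"] == 2015]
merged = gdf.merge(df_2015, left_on="country", right_on="Country", how="left")

# to_json() already returns a GeoJSON string; the loads/dumps round trip
# lets us inspect the parsed structure before handing a string to Bokeh.
merged_json = json.loads(merged.to_json())
json_data = json.dumps(merged_json)

print(merged_json["type"], len(merged_json["features"]))
```

A left merge keeps every country shape on the map even when the report has no row for it, so missing countries show up as gaps in the data and not as holes in the geometry.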
Then, we are ready to create a static choropleth map with Bokeh. We first read in the GeoJSON data with the GeoJSONDataSource class. Next, we choose the 'YlGnBu' color palette and reverse its order so that the darkest color corresponds to the highest happiness score. Then, we apply custom tick labels to the color bar. For the color bar, we pass the color mapper, orientation, and tick labels to the ColorBar class.
We create the figure object and set the plot height and width. Then, we add patches to the figure using the x and y coordinates, and specify the field and transform in the fill_color parameter. To display the Bokeh plot in a Jupyter notebook, we call the output_notebook() function and then display the figure with the show() function.
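A minimal sketch of the static map, under a few assumptions: the GeoJSON is hand-written toy data with a property named score, the 0 to 8 color range is picked for illustration, and the figure uses the height and width keywords of recent Bokeh releases (older releases used plot_height and plot_width).

```python
import json

from bokeh.models import ColorBar, GeoJSONDataSource, LinearColorMapper
from bokeh.palettes import brewer
from bokeh.plotting import figure

# Hand-written GeoJSON with two toy "countries" (hypothetical data).
json_data = json.dumps({
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"country": "Squareland", "score": 7.1},
            "geometry": {
                "type": "Polygon",
                "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]],
            },
        },
        {
            "type": "Feature",
            "properties": {"country": "Triangleland", "score": 4.3},
            "geometry": {
                "type": "Polygon",
                "coordinates": [[[2, 0], [3, 0], [2.5, 1], [2, 0]]],
            },
        },
    ],
})

geosource = GeoJSONDataSource(geojson=json_data)

# Bokeh lists this palette dark to light; reversing it maps high scores to dark.
palette = brewer["YlGnBu"][8][::-1]
color_mapper = LinearColorMapper(palette=palette, low=0, high=8)

# Custom tick labels for the color bar.
tick_labels = {str(i): str(i) for i in range(0, 9)}
color_bar = ColorBar(
    color_mapper=color_mapper,
    orientation="horizontal",
    major_label_overrides=tick_labels,
    label_standoff=8,
    width=500,
    height=20,
    location=(0, 0),
)

p = figure(title="Happiness score, 2015 (toy data)", height=500, width=900)
renderer = p.patches(
    "xs",
    "ys",
    source=geosource,
    fill_color={"field": "score", "transform": color_mapper},
    line_color="black",
    line_width=0.25,
)
p.add_layout(color_bar, "below")

# In a notebook: from bokeh.io import output_notebook, show
# then call output_notebook() once and show(p).
```

The field/transform dictionary tells Bokeh to look up each patch's score and pass it through the color mapper, so the same mapper drives both the map colors and the color bar.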
Analytics:
From the plot below, we see that countries like Canada, Mexico, and Australia have higher happiness scores. For South American and European countries, the overall score clusters around 5 and 6. In contrast, African countries like Niger, Chad, Mali, and Benin show a much lower happiness index.

Interactive Choropleth Map from 2015 to 2019
The interactive choropleth map adds two parts. One is a hover tool, for which we assign the columns whose information is displayed on the graph. The other is a callback function. For the plot interaction, we select the year with a slider to update the data: the slider value is passed to the callback, which adjusts the data. Then, we wrap the slider in a widgetbox and place it in a Bokeh column layout together with the figure. Finally, we add the layout to the document returned by the curdoc function, which lets us create interactive web applications that connect front-end UI events to real, running Python code.
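The hover tool, slider callback and curdoc wiring can be sketched as below. This is an illustration under assumptions, not the article's script: the per-year scores are toy values, json_data is a hypothetical helper, and the slider is placed directly in column on the assumption that a recent Bokeh release is in use, where widgetbox may not be available.

```python
import json

from bokeh.io import curdoc
from bokeh.layouts import column
from bokeh.models import GeoJSONDataSource, HoverTool, Slider
from bokeh.plotting import figure

# Toy scores per year (hypothetical data) for a single square "country".
SCORES = {2015: 7.1, 2016: 7.2, 2017: 7.0, 2018: 7.3, 2019: 7.4}


def json_data(year):
    """Return the GeoJSON string for the selected year (hypothetical helper)."""
    feature = {
        "type": "Feature",
        "properties": {"country": "Squareland", "score": SCORES[year]},
        "geometry": {
            "type": "Polygon",
            "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]],
        },
    }
    return json.dumps({"type": "FeatureCollection", "features": [feature]})


geosource = GeoJSONDataSource(geojson=json_data(2015))

# Hover tool: "@name" refers to a property of the GeoJSON features.
hover = HoverTool(tooltips=[("Country", "@country"), ("Score", "@score")])

p = figure(title="Happiness score, 2015", height=400, width=600, tools=[hover])
p.patches("xs", "ys", source=geosource, fill_color="steelblue", line_color="black")


def update_plot(attr, old, new):
    """Slider callback: swap in the GeoJSON for the chosen year."""
    year = int(slider.value)
    geosource.geojson = json_data(year)
    p.title.text = f"Happiness score, {year}"


slider = Slider(title="Year", start=2015, end=2019, step=1, value=2015)
slider.on_change("value", update_plot)

# Attach the layout to the current document so a Bokeh server can serve it.
curdoc().add_root(column(p, slider))
```

Because the callback is ordinary Python, it only fires when the script runs under a Bokeh server; a static HTML export would show the slider but not update the map.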
If you get an error running the choropleth map in a Jupyter notebook, an alternative is to run the script from the terminal:
bokeh serve --show EDA_Plot.py
Analytics Plots on World Happiness Report from 2015 to 2019
Scatter Plot of GDP & Happiness_Score Index in 2016
Analytics:
We look into the correlation between GDP growth and happiness score in 2016. Because the countries are color-coded by region, we can see that southeast countries have lower GDP growth together with lower happiness scores. Most countries in Central and Eastern Europe have GDP growth between 0.8 and 1.4 with a happiness score between 5 and 6. Countries in Western Europe tend to show higher economic growth along with a higher happiness index.
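A region-colored scatter plot of this kind could be drawn in Matplotlib roughly as follows. The numbers are made-up toy values, and the column names Region, GDP and Happiness_Score are assumptions about the dataset layout.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

# Toy rows (hypothetical values, not taken from the real report).
df_2016 = pd.DataFrame(
    {
        "Region": [
            "Western Europe", "Western Europe",
            "Central and Eastern Europe", "Central and Eastern Europe",
            "Southeastern Asia", "Southeastern Asia",
        ],
        "GDP": [1.45, 1.50, 1.00, 1.20, 0.60, 0.75],
        "Happiness_Score": [7.3, 7.0, 5.4, 5.8, 4.4, 5.0],
    }
)

fig, ax = plt.subplots(figsize=(8, 5))

# One scatter call per region gives each region its own color and legend entry.
for region, group in df_2016.groupby("Region"):
    ax.scatter(group["GDP"], group["Happiness_Score"], label=region, alpha=0.8)

ax.set_xlabel("GDP")
ax.set_ylabel("Happiness score")
ax.set_title("GDP vs. happiness score, 2016 (toy data)")
ax.legend(title="Region")

# Pearson correlation between the two numeric columns.
corr = df_2016["GDP"].corr(df_2016["Happiness_Score"])
print(f"correlation: {corr:.2f}")
```

Printing the correlation next to the plot gives a single number to check the visual impression against.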

Top and Bottom 10 Countries of Economy Index (GDP per capita)
Analytics:
Among the top 10 countries by economic trend, 'United Arab Emirates' shows the strongest increase, with 0.68 growth in the economy index from 2015 to 2018. 'Myanmar', the only Asian country in the group, rose by 0.41 in GDP per capita. Surprisingly, Sub-Saharan African countries like 'Malawi', 'Guinea', and 'Tanzania' are among the top 5 countries with an upward economic trend.
We can see that countries with a declining economic trend are mostly in Africa. The bottom 5 countries, 'Libya', 'Yemen', 'Kuwait', 'Jordan', and 'Sierra Leone', saw their Economy Index fall from 2015 to 2018. Four of those countries are located in the Middle East and Northern Africa.
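One way to build such a top-and-bottom comparison is a diverging bar chart of the change in the index between two years. The sketch below uses placeholder countries and made-up values, and picks the top and bottom 3 instead of 10 to stay short; the column names are assumptions.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder countries with made-up index values (hypothetical data).
df = pd.DataFrame(
    {
        "Country": [f"Country {c}" for c in "ABCDEF"],
        "GDP_2015": [1.00, 0.30, 0.50, 0.90, 1.20, 0.80],
        "GDP_2018": [1.70, 0.70, 0.60, 0.80, 0.90, 0.30],
    }
)

# Change in the index over the period; positive means an upward trend.
df["change"] = df["GDP_2018"] - df["GDP_2015"]

n = 3
top = df.nlargest(n, "change")
bottom = df.nsmallest(n, "change")
trend = pd.concat([top, bottom]).sort_values("change")

# Diverging colors: red for a decline, green for growth.
colors = ["tab:red" if c < 0 else "tab:green" for c in trend["change"]]

fig, ax = plt.subplots(figsize=(8, 4))
ax.barh(trend["Country"], trend["change"], color=colors)
ax.axvline(0, color="black", linewidth=0.8)
ax.set_xlabel("Change in economy index, 2015 to 2018")
ax.set_title("Top and bottom countries by economic trend (toy data)")
```

Sorting by the change before plotting puts the steepest decline at the bottom and the strongest growth at the top, which is what makes the diverging layout easy to read.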

UAE Yearly GDP Change
Analytics:
Having seen the top and bottom 10 countries by Economy Index (GDP per capita growth), we look more closely at the United Arab Emirates' economic trend. In 1980, the UAE shows its maximum GDP growth value of the 40-year period. However, growth turns negative from 1982 to 1986. Over the next 10 years, the UAE shows fairly stable GDP growth, rising by around 0.1 to 0.2. In 2009, GDP growth plunges following the impact of the financial crisis.
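A yearly line graph like this one depends on clean dates, so the sketch below parses the year column, drops rows with a missing date or value, and then plots the series. The values and the column names Year and GDP_Growth are made up for illustration.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

# Toy yearly series (hypothetical values) with a missing year and a missing value.
gdp = pd.DataFrame(
    {
        "Year": ["1980", "1981", None, "1983", "1984"],
        "GDP_Growth": [0.25, 0.05, -0.02, None, -0.04],
    }
)

# Parse the years; anything unparseable or missing becomes NaT.
gdp["Year"] = pd.to_datetime(gdp["Year"], format="%Y", errors="coerce")

# Drop rows that lack either the date or the value, then order by date.
clean = gdp.dropna(subset=["Year", "GDP_Growth"]).sort_values("Year")

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(clean["Year"], clean["GDP_Growth"], marker="o")
ax.axhline(0, color="gray", linewidth=0.8)  # reference line for negative growth
ax.set_xlabel("Year")
ax.set_ylabel("GDP growth")
ax.set_title("Yearly GDP change (toy data)")
```

Dropping incomplete rows is the simplest choice; interpolating the gaps would be an alternative when the missing years matter for the story.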

In Conclusion:
- To create a choropleth map, geopandas can convert shp files into a data frame object. For the visualization itself, Bokeh works well with the geopandas package. However, keep in mind that country names in the shapefile need to be matched with those in the external data when merging the two datasets.
- Matplotlib and Bokeh are two great packages for visualization in Python. A scatter plot is well suited to showing the correlation between two numeric variables. A diverging plot is well suited to showing the downward and upward trends in a dataset. For a DateTime variable, take care of dates with missing values before creating the plot. A line graph shows the trend in time series data clearly.
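The country-matching caveat in the first point can be checked before merging. The sketch below uses pandas' merge indicator to list names that appear in only one of the two tables; the scores are placeholders, and the name mismatch shown is an illustrative assumption about how the two sources might spell a country.

```python
import pandas as pd

# Names as they might appear in the shapefile and in the report (illustrative).
shape_countries = pd.DataFrame(
    {"country": ["United States of America", "Canada", "Mexico"]}
)
report = pd.DataFrame(
    {
        "Country": ["United States", "Canada", "Mexico"],
        "Happiness_Score": [1.0, 2.0, 3.0],  # placeholder values
    }
)

# An outer merge with indicator=True flags rows found in only one table.
check = shape_countries.merge(
    report, left_on="country", right_on="Country", how="outer", indicator=True
)
unmatched = check[check["_merge"] != "both"]
print(unmatched[["country", "Country", "_merge"]])

# Fix the mismatches with an explicit mapping (hypothetical), then re-check.
name_fixes = {"United States": "United States of America"}
report["Country"] = report["Country"].replace(name_fixes)

rechecked = shape_countries.merge(
    report, left_on="country", right_on="Country", how="outer", indicator=True
)
all_matched = bool((rechecked["_merge"] == "both").all())
print("all matched:", all_matched)
```

Running this check once per data source makes silent gaps on the map much less likely, since every unmatched name is listed explicitly.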
Translated from: https://towardsdatascience.com/eda-visualization-in-geopandas-matplotlib-bokeh-9bf93e6469ec