Using Basemap and geonamescache to Plot K-Means Clusters


This is the third of four stories that aim to address the issue of identifying disease outbreaks by extracting news headlines from popular news sources.

This article aims to provide an easy way to view the clusters determined in the second article on both a global and a US-level scale. First, a list of large cities is gathered and placed, with their corresponding latitudes and longitudes, inside a dataset. Next, a function is written that plots the cluster points on a map, with a different color for each cluster. Lastly, the function is called for the points in the United States, the centers of the clusters in the United States, the points globally, and the centers of the clusters globally.

A detailed explanation is shown below for how this is implemented:

Step 1: Compiling a List of the Largest Cities in the US

First, the city name, latitude, longitude, and population are extracted from ‘largest_us_cities.csv’, a file containing the US cities with a population over 30,000. Cities with a population over 200,000 were added to a dictionary, excluding Anchorage and Honolulu because they skewed the positioning of the map. Next, the haversine formula, which computes the great-circle distance between two points, was used to find pairs of cities close to one another; within each such pair, a population heuristic determined which city should be kept.

from math import radians, sin, cos, asin, sqrt
import numpy as np

# Read US cities (population over 30,000); fields are ';'-separated,
# with the population second-to-last and "lat,long" last
file2 = open('largest_us_cities.csv', 'r')
large_cities = file2.readlines()
large_city_data = {}
for i in range(1, len(large_cities)):
    large_city_values = large_cities[i].strip().split(';')
    lat_long = large_city_values[-1].split(',')
    if ((int(large_city_values[-2]) >= 200000) and
            (large_city_values[0] not in ("Anchorage", "Honolulu", "Greensboro"))):
        large_city_data[large_city_values[0]] = [lat_long[0], lat_long[1], large_city_values[-2]]

def haversine(point_a, point_b):
    # Great-circle distance in kilometers between two (lat, long) points
    lat1, lon1 = point_a
    lat2, lon2 = point_b
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
    c = 2 * asin(sqrt(a))
    r = 6371  # Earth's radius in km
    return c * r

# For each pair of cities under 80 km apart, keep only the more populous one
for i in list(large_city_data.keys()):
    for j in list(large_city_data.keys()):
        if ((i != j) and haversine((float(large_city_data[i][0]), float(large_city_data[i][1])),
                                   (float(large_city_data[j][0]), float(large_city_data[j][1]))) < 80.0):
            # Compare populations numerically (they are stored as strings)
            if (int(large_city_data[j][2]) > int(large_city_data[i][2])):
                large_city_data[i] = [np.nan, np.nan, large_city_data[i][2]]
            else:
                large_city_data[j] = [np.nan, np.nan, large_city_data[j][2]]
large_city_data['Chicago'] = [41.8781136, -87.6297982, 2718782]
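As a quick sanity check on the haversine helper (the city coordinates below are approximate and only for illustration), the distance between New York City and Philadelphia should come out to roughly 130 km:

```python
from math import radians, sin, cos, asin, sqrt

def haversine(point_a, point_b):
    # Great-circle distance in kilometers between two (lat, long) points
    lat1, lon1 = point_a
    lat2, lon2 = point_b
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
    a = sin((lat2 - lat1) / 2)**2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2)**2
    return 2 * asin(sqrt(a)) * 6371  # Earth's radius in km

nyc = (40.7128, -74.0060)           # approximate coordinates
philadelphia = (39.9526, -75.1652)  # approximate coordinates
print(round(haversine(nyc, philadelphia)))  # roughly 130 km
```

This is well under the 80 km threshold used above only for much closer pairs, so both cities would survive the deduplication.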

Step 2: Plotting K-Means Clusters and Cluster Centers Using Basemap

First, a function is created with seven parameters: df1, num_cluster, typeof, path, size, add_large_city, and figsize. Depending on the typeof parameter, the Basemap library generates a geographic model of either the US or the world, and the figsize parameter controls the size of the figure. A dictionary is then created whose keys are the cluster labels, subdivided by latitude and longitude; the values hold the latitude and longitude of every headline in each cluster.

A list of colors is initialized, and a specific color is assigned to each cluster label. The latitude and longitude points are plotted in these colors on the geographic model built above. If the add_large_city parameter is true, the largest cities are also added to the plot. Finally, the figure is saved to a “.png” file at the location given by the path parameter.

from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt
import numpy as np

def print_k_means(df1, num_cluster, typeof, path, size, add_large_city, figsize):
    if (typeof == "US"):
        map_plotter = Basemap(projection='lcc', lon_0=-95, llcrnrlon=-119, llcrnrlat=22,
                              urcrnrlon=-64, urcrnrlat=49, lat_1=33, lat_2=45)
    else:
        map_plotter = Basemap()
    if (figsize):
        fig = plt.figure(figsize=(24, 16))
    else:
        fig = plt.figure(figsize=(12, 8))
    # Group the coordinates of each headline by cluster label
    cluster_vals = {}
    for i in range(num_cluster):
        cluster_vals[str(i) + "_long"] = []
        cluster_vals[str(i) + "_lat"] = []
    for index in df1.index:
        cluster_vals[str(df1['cluster_label'][index]) + '_long'].append(float(df1['longitude'][index]))
        cluster_vals[str(df1['cluster_label'][index]) + '_lat'].append(float(df1['latitude'][index]))
    num_list = [i for i in range(num_cluster)]
    color_list = ['rosybrown', 'lightcoral', 'indianred', 'brown',
        'maroon', 'red', 'darksalmon', 'sienna', 'chocolate', 'sandybrown', 'peru',
        'darkorange', 'burlywood', 'orange', 'tan', 'darkgoldenrod', 'goldenrod', 'gold', 'darkkhaki',
        'olive', 'olivedrab', 'yellowgreen', 'darkolivegreen', 'chartreuse',
        'darkseagreen', 'forestgreen', 'darkgreen', 'mediumseagreen', 'mediumaquamarine',
        'turquoise', 'lightseagreen', 'darkslategrey', 'darkcyan',
        'cadetblue', 'deepskyblue', 'lightskyblue', 'steelblue', 'lightslategrey',
        'midnightblue', 'mediumblue', 'blue', 'slateblue', 'darkslateblue', 'mediumpurple', 'rebeccapurple',
        'thistle', 'plum', 'violet', 'purple', 'fuchsia', 'orchid', 'mediumvioletred', 'deeppink', 'hotpink',
        'palevioletred']
    colors = [color_list[i] for i in range(num_cluster)]
    # Plot each cluster's points in its own color
    for target, color in zip(num_list, colors):
        map_plotter.scatter(cluster_vals[str(target) + '_long'], cluster_vals[str(target) + '_lat'],
                            latlon=True, s=size, c=color)
    map_plotter.shadedrelief()
    if (add_large_city):
        # Label the large cities that survived the deduplication in Step 1
        for index in list(large_city_data.keys()):
            if (large_city_data[index][0] is not np.nan):
                x, y = map_plotter(float(large_city_data[index][1]), float(large_city_data[index][0]))
                plt.plot(x, y, "ok", markersize=4)
                plt.text(x, y, index, fontsize=16)
    plt.show()
    fig.savefig(path)
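The grouping step inside print_k_means can be seen in isolation with a small, made-up input (the sample points and labels here are hypothetical, not from the headline data):

```python
# Hypothetical sample: three points in two clusters
latitudes = [40.7, 34.0, 41.9]
longitudes = [-74.0, -118.2, -87.6]
cluster_labels = [0, 1, 0]
num_cluster = 2

# Same keying scheme as print_k_means: "<label>_lat" / "<label>_long"
cluster_vals = {}
for i in range(num_cluster):
    cluster_vals[str(i) + "_long"] = []
    cluster_vals[str(i) + "_lat"] = []
for lat, lon, label in zip(latitudes, longitudes, cluster_labels):
    cluster_vals[str(label) + "_lat"].append(lat)
    cluster_vals[str(label) + "_long"].append(lon)

# One color per cluster, paired up exactly as the scatter loop does
colors = ['red', 'blue']
for target, color in zip(range(num_cluster), colors):
    print(color, cluster_vals[str(target) + "_lat"], cluster_vals[str(target) + "_long"])
```

Each cluster label thus gets one color and one pair of coordinate lists, which is what the scatter call consumes.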

Step 3: Running the Function

The print_k_means function is run on the df_no_us dataframe to make a scatterplot of the latitudes and longitudes of headlines pertaining to the US. Next, the geographic center of each cluster is computed and stored in another dataframe called df_center_us. The print_k_means function is then run on df_center_us with large cities added, to identify the cities closest to the disease-outbreak centers; the marker size is also increased for readability. A similar process is run for df_no_world. Each of the dataframes is stored in a “.csv” file.

import pandas as pd

print_k_means(df_no_us, us_clusters, "US", "corona_disease_outbreaks_us.png", 50, False, False)
df_no_us.to_csv("corona_disease_outbreaks_us.csv")

# Compute the geographic center of each US cluster with at least 20
# headlines, averaging the latitudes and longitudes of its members
df_center_us = {'latitude': [], 'longitude': [], 'cluster_label': []}
for i in range(us_clusters):
    df_1 = df_no_us.loc[df_no_us['cluster_label'] == i]
    df_1 = df_1.reset_index(drop=True)
    df_1['latitude'] = [float(v) for v in df_1['latitude']]
    df_1['longitude'] = [float(v) for v in df_1['longitude']]
    if (len(df_1['latitude']) >= 20):
        df_center_us['latitude'].append(df_1['latitude'].sum() / len(df_1['latitude']))
        df_center_us['longitude'].append(df_1['longitude'].sum() / len(df_1['longitude']))
        df_center_us['cluster_label'].append(i)
df_center_us = pd.DataFrame(data=df_center_us)
# Relabel the surviving clusters 0..n-1 so one color maps to each center
for index in df_center_us.index:
    df_center_us.loc[index, 'cluster_label'] = index
print_k_means(df_center_us, len(df_center_us['latitude']), "US",
              "corona_disease_outbreaks_us_centers.png", 500, True, True)
df_center_us.to_csv("corona_disease_outbreaks_us_centers.csv")

# Repeat for the world-level clusters (minimum of 10 headlines)
df_center_world = {'latitude': [], 'longitude': [], 'cluster_label': []}
for i in range(world_clusters):
    df_1 = df_no_world.loc[df_no_world['cluster_label'] == i]
    df_1 = df_1.reset_index(drop=True)
    df_1['latitude'] = [float(v) for v in df_1['latitude']]
    df_1['longitude'] = [float(v) for v in df_1['longitude']]
    if (len(df_1['latitude']) >= 10):
        df_center_world['latitude'].append(df_1['latitude'].sum() / len(df_1['latitude']))
        df_center_world['longitude'].append(df_1['longitude'].sum() / len(df_1['longitude']))
        df_center_world['cluster_label'].append(i)
df_center_world = pd.DataFrame(data=df_center_world)
for index in df_center_world.index:
    df_center_world.loc[index, 'cluster_label'] = index
print_k_means(df_center_world, len(df_center_world['latitude']), "world",
              "corona_disease_outbreaks_world_centers.png", 500, False, True)
df_center_world.to_csv("corona_disease_outbreaks_world_centers.csv")
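The per-cluster centering logic above reduces to a simple centroid (mean latitude, mean longitude) with a minimum-size filter. A minimal pure-Python sketch, using hypothetical points rather than the real headline data:

```python
# Hypothetical (lat, long, cluster_label) points
points = [(40.0, -74.0, 0), (42.0, -76.0, 0), (41.0, -75.0, 0),
          (34.0, -118.0, 1)]
min_size = 3  # the script uses 20 for the US clusters and 10 for the world

centers = {}
for label in {p[2] for p in points}:
    members = [p for p in points if p[2] == label]
    if len(members) >= min_size:  # skip clusters with too few headlines
        centers[label] = (sum(p[0] for p in members) / len(members),
                          sum(p[1] for p in members) / len(members))

print(centers)  # cluster 1 is dropped: only one member
```

Note that averaging raw longitudes is an approximation that works for regional clusters but would misbehave for clusters straddling the antimeridian.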

Click this link for access to the Github repository for a detailed explanation of the code: Github.

Translated from: https://medium.com/@neuralyte/using-basemap-and-geonamescache-to-plot-k-means-clusters-995847513fc2
