MongoDB data visualization
Using Python to connect to the Taiwan government's PM2.5 open data API and schedule real-time data updates into MongoDB, Part 2
Goal
This time I'm using the same PM2.5 open data API as in Part 1 to show how to refresh real-time data into MongoDB every 2 minutes (the interval at which the government portal refreshes its API). MongoDB's strength is that it is simple to use, especially with data in JSON document format, which makes working with open data much easier. We can also show real-time data changes directly from the database using its Charts & Dashboard features.
How convenient!
The demo below uses Taipei City (the capital of Taiwan) as an example:
Skills covered:
- Connect to the API with the required parameters to select the data from all sensors in Taipei City
- Insert the first batch of data into MongoDB
- Set a schedule to pull new batches of PM2.5 data from the API into MongoDB
- Create charts on a dashboard
So, let’s get started.
Process
Import all required libraries:
# connect to mongoDB cloud cluster
import pymongo
from pymongo import MongoClient
# convert timezone
import pytz, dateutil.parser
# connect to government open data API
import requests
Connect to the API with the required parameters to select the data from all sensors in Taipei City. The raw data looks like this (100 sensors in total):

All data is stored in the “first_batch” variable:
# Parameters: the latest data, observation value > 0, PM2.5 data only, Taipei city
# https://sta.ci.taiwan.gov.tw/STA_AirQuality_EPAIoT/v1.0/Datastreams?$expand=Thing,Observations($top=1)&$filter=name eq'PM2.5' and Observations/result gt 0 and Thing/properties/city eq '臺北市'&$count=true
def API_data():
    API_URL = "https://sta.ci.taiwan.gov.tw/STA_AirQuality_EPAIoT/v1.0/Datastreams?$expand=Thing,Observations($top=1)&$filter=name%20eq%27PM2.5%27%20and%20Observations/result%20gt%200%20and%20Thing/properties/city%20eq%20%27%E8%87%BA%E5%8C%97%E5%B8%82%27&$count=true"
    total = requests.get(API_URL).json()
    data = total['value']
    first_batch = []
    for item in data:
        dic = {}
        dic['_id'] = item['Thing']['properties']['stationID']
        dic['name'] = item['name']
        dic['areaDescription'] = item['Thing']['properties']['areaDescription']
        dic['city'] = item['Thing']['properties']['city']
        dic['township'] = item['Thing']['properties']['township']
        dic['observedArea'] = item['observedArea']
        dic['iso8601_UTC_0'] = item['Observations'][0]['phenomenonTime']
        UTC_0 = dateutil.parser.parse(dic['iso8601_UTC_0'])
        dic['UTC_0'] = str(UTC_0)
        UTC_8 = UTC_0.astimezone(pytz.timezone("Asia/Taipei"))
        dic['UTC_8'] = str(UTC_8)
        dic['result'] = item['Observations'][0]['result']
        dic['unitOfMeasurement'] = item['unitOfMeasurement']['symbol']
        first_batch.append(dic)
    return first_batch

first_batch = API_data()
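The percent-encoded URL inside API_data() is the readable query from the comment with its $filter expression encoded. As a minimal sketch, the same string can be rebuilt with the standard library; the helper name build_filter_url and the BASE_URL constant are hypothetical and not part of the original code:

```python
from urllib.parse import quote

BASE_URL = "https://sta.ci.taiwan.gov.tw/STA_AirQuality_EPAIoT/v1.0/Datastreams"

def build_filter_url(city: str) -> str:
    """Hypothetical helper: rebuild the article's query URL for a given city."""
    # Readable filter, copied from the comment in the article's code.
    filter_expr = (
        "name eq'PM2.5' and Observations/result gt 0 "
        f"and Thing/properties/city eq '{city}'"
    )
    # quote() percent-encodes spaces, quotes and non-ASCII text; "/" is kept.
    encoded = quote(filter_expr, safe="/")
    return (
        f"{BASE_URL}?$expand=Thing,Observations($top=1)"
        f"&$filter={encoded}&$count=true"
    )

url = build_filter_url("臺北市")
print(url)
```

Keeping the filter readable in one place makes it easier to change the city or the threshold later without hand-editing escape sequences.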
The first item in the “first_batch” list is the reading from one sensor station:
print(first_batch[0])
# output:
{'_id': '10189360662', 'name': 'PM2.5', 'areaDescription': '營建混合物土資場', 'city': '臺北市', 'township': '北投區', 'observedArea': {'type': 'Point', 'coordinates': [121.4871916, 25.121195]}, 'iso8601_UTC_0': '2020-08-20T05:22:58.000Z', 'UTC_0': '2020-08-20 05:22:58+00:00', 'UTC_8': '2020-08-20 13:22:58+08:00', 'result': 22.0, 'unitOfMeasurement': 'μg/m3'}
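The “UTC_8” field in this output is the “iso8601_UTC_0” timestamp shifted to Taipei time. The following is a minimal sketch of that conversion using only the standard library rather than dateutil and pytz; it assumes a fixed UTC+8 offset is an adequate stand-in for the "Asia/Taipei" zone, and the helper name to_taipei is hypothetical:

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC+8 offset stands in for pytz's "Asia/Taipei" zone.
TAIPEI = timezone(timedelta(hours=8))

def to_taipei(iso8601_utc: str) -> datetime:
    """Parse a timestamp like '2020-08-20T05:22:58.000Z' and convert to UTC+8."""
    # %z accepts the trailing "Z" as UTC (Python 3.7+).
    utc_time = datetime.strptime(iso8601_utc, "%Y-%m-%dT%H:%M:%S.%f%z")
    return utc_time.astimezone(TAIPEI)

local = to_taipei("2020-08-20T05:22:58.000Z")
print(str(local))  # 2020-08-20 13:22:58+08:00
```

The printed value matches the “UTC_8” string in the sample document above.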
Then connect to my MongoDB Atlas cluster and insert the first batch of data:
# connect to my mongoDB cloud cluster
cluster = MongoClient("mongodb+srv://<username>:<password>@cluster0.dd7sd.mongodb.net/<dbname>?retryWrites=true&w=majority")
# my database name
db = cluster["test"]
# my collection's name
collection = db["test2"]
results = collection.insert_many(first_batch)
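Because each document uses the station ID as its “_id”, MongoDB rejects a second document with the same “_id”. insert_many would therefore fail with a duplicate-key error if a batch repeated a station, or if this step were run twice against the same collection. A minimal sketch of a pre-check follows; the helper name duplicate_ids is hypothetical, and the sample documents are illustrative rather than real API output:

```python
from collections import Counter

def duplicate_ids(batch):
    """Hypothetical helper: return the "_id" values that occur more than once."""
    counts = Counter(doc["_id"] for doc in batch)
    return sorted(_id for _id, n in counts.items() if n > 1)

# Illustrative documents only; the second ID is made up for this example.
sample_batch = [
    {"_id": "10189360662", "result": 22.0},
    {"_id": "00000000000", "result": 18.0},
    {"_id": "10189360662", "result": 23.0},
]

print(duplicate_ids(sample_batch))  # ['10189360662']
```

An empty result means the batch is safe to insert into an empty collection.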
Next, set up a scheduler that pulls the latest PM2.5 readings from the API every 2 minutes, stops at a time of our choosing, and updates each document in MongoDB by its “_id”, i.e. the “stationID” of each station:
import schedule
import time
import datetime
import sys

def update_content():
    # get a new batch
    new_batch = API_data()
    for item in new_batch:
        update_data = {"iso8601_UTC_0": item['iso8601_UTC_0'], "UTC_0": item['UTC_0'], "UTC_8": item['UTC_8'], "result": item['result']}
        results = collection.update_one({"_id": item['_id']}, {"$set": update_data}, upsert=True)

def stop_update():
    sys.exit()

schedule.every(2).minutes.do(update_content)
schedule.every(5).minutes.do(stop_update)

while True:
    schedule.run_pending()
    time.sleep(1)
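To see what the update_one call with "$set" and upsert=True does to each document, here is a minimal in-memory sketch. It uses a plain dict keyed by “_id” as a stand-in for the collection, not a real database; apply_update, the new reading of 25.0, and the "new-station" ID are all hypothetical:

```python
def apply_update(store, _id, fields):
    """Mimic update_one({"_id": _id}, {"$set": fields}, upsert=True) on a dict."""
    # Existing document: only the listed fields are overwritten.
    # Missing document: a new one is created with the _id plus those fields.
    doc = store.setdefault(_id, {"_id": _id})
    doc.update(fields)
    return doc

# In-memory stand-in for the collection, keyed by "_id" (not a real database).
store = {"10189360662": {"_id": "10189360662", "township": "北投區", "result": 22.0}}

apply_update(store, "10189360662", {"result": 25.0})  # illustrative new reading
apply_update(store, "new-station", {"result": 12.0})  # hypothetical unseen station

print(store["10189360662"])
print(store["new-station"])
```

The existing station keeps its other fields such as “township” and only the updated fields change. A station first seen during an update would be created with just the fields in the update, so it would lack the name and location fields written by the first batch.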
In MongoDB it will look like this:


Lastly, we created each chart on the dashboard as follows:







Conclusion
With the above 4 charts, our dashboard is ready:


We can further adjust the colors according to the intensity levels set by the government; in Taiwan, for example, 0–30 μg/m3 is low, 30–50 μg/m3 is medium, and so on. The clip below shows, on both maps, how PM2.5 intensity changed slightly across different sensors in Taipei City within 5 minutes. This clip was recorded later than the previous demo, around 19:00–19:30 on the same day.
The bottom-left corner of the scatter plot shows how much time is left before MongoDB refreshes the data again. Or just stare at the clip below for 10 seconds and you may spot the difference :D

That’s it. Hope you find this helpful.
Have a wonderful day!
Translated from: https://medium.com/li-ting-liao-tiffany/visualize-open-data-using-mongodb-in-real-time-2cca4bcca26e