Visual EDA in Python
I like to learn about different tools and technologies that are available to accomplish a task. When I decided to explore data regarding COVID-19 (Coronavirus), I knew that I would want the ability to present visualizations interactively. After all, the Coronavirus pandemic is tracked, monitored, and reported daily, from all over the world. Data science and analysis projects that involve temporal data lend themselves well to interactive plotting and timeline animation.
To support the desired interactive capabilities, notebooks for this project were composed in Deepnote, an online, Jupyter-style environment that enables the publishing of complete Python notebooks that retain interactive outputs. The Plotly Express library was used to produce interactive plot objects. Finally, the embedding of those individual visualizations in this article is made possible by the Datapane library for Python.
This article presents a brief overview of the project, including the following.
- Motivations for the project
- Methods of investigation
- Summary highlights and representative, interactive plots
Note: While this article includes interactive examples of cell outputs from project notebooks, we will not walk through the project's code here. You can, however, find the related repository on GitHub, linked below.
Overview and Motivation
Effective July 1, 2020, the state of Virginia entered the third phase of the “Forward Virginia” plan to gradually ease restrictions in place for COVID-19. On July 28, additional restrictions were imposed on restaurants and bars in the Hampton Roads area of Southeastern Virginia (Schneider, Gregory S., “Virginia governor adds restrictions in Hampton Roads region after surge in coronavirus cases,” The Washington Post, July 28, 2020).
This project is inspired in part by a subsequent interest in comparing the severity of the later outbreaks in the Hampton Roads region with the number and proportion of cases in other areas of the state. In other words, in areas where cases, hospitalizations, or deaths were decreasing, were the numbers higher or lower than in the recently restricted areas?
Of course, the goal of the project was not to perform a full medical study. Along with comparing aggregated case data for various localities, the project was strongly motivated by an interest in exploring the options available for publishing relatively simple but informative animated plots.

The Datasets
Coronavirus data for this exploration is sourced from the Virginia Department of Health (VDH). The particular copy of the Virginia public COVID-19 cases dataset used in this repository was last updated on July 30, 2020. VDH is itself a robust source of data and visualizations related to this health crisis. Their dataset continues to be updated regularly.
Each row in the dataset reports the overall counts of COVID-19 cases, hospitalizations, and deaths for one Virginia locality as of a given report date, counted since reporting began.
As we progress through the project, we bring in population data for additional context and insight.
Population estimates were sourced from the Demographics Research Group at the University of Virginia’s Weldon Cooper Center for Public Service and were published on January 27, 2020. The group notes that estimates are population approximations “based on a variety of observed administrative record data, such as births, deaths, school enrollment, and residential housing construction.” The group’s site happens to include a handy, interactive map that highlights the relevant row of population data as the cursor moves over a locality.
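To make the role of the population data concrete, here is a minimal sketch of how case counts might be merged with population estimates and converted to rates per 1,000 residents. This is not the project's code: the column names (`report_date`, `locality`, `total_cases`, `hospitalizations`, `deaths`, `population`) are assumptions rather than the actual VDH or Weldon Cooper schemas, and every number is invented purely for illustration.

```python
import pandas as pd

# Hypothetical case rows: one row per locality per report date.
# Column names and values are illustrative assumptions, not real VDH data.
cases = pd.DataFrame({
    "report_date": pd.to_datetime(
        ["2020-07-29", "2020-07-30", "2020-07-29", "2020-07-30"]
    ),
    "locality": ["Richmond City", "Richmond City", "Norfolk", "Norfolk"],
    "total_cases": [100, 110, 200, 230],
    "hospitalizations": [10, 12, 8, 9],
    "deaths": [1, 1, 2, 2],
})

# Hypothetical population estimates: one row per locality.
population = pd.DataFrame({
    "locality": ["Richmond City", "Norfolk"],
    "population": [20000, 50000],
})

# Attach each locality's population to every one of its case rows.
# validate="many_to_one" raises if a locality appears twice in `population`.
merged = cases.merge(population, on="locality", how="left", validate="many_to_one")

# Express each measure as a rate per 1,000 residents.
for col in ["total_cases", "hospitalizations", "deaths"]:
    merged[f"{col}_per_1k"] = merged[col] / merged["population"] * 1000

# Keep only the most recent report for each locality.
latest = (
    merged.sort_values("report_date")
    .groupby("locality")
    .tail(1)
    .set_index("locality")
)
print(latest[["total_cases_per_1k", "hospitalizations_per_1k", "deaths_per_1k"]])
```

A left merge keeps every case row even when a locality has no population estimate, which leaves the rate as `NaN` instead of silently dropping the locality. The sketch also assumes the counts are running totals, so a locality's figure "as of" a date is read from the row for that date and not summed across dates.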
Methods
To gauge how the Hampton Roads numbers compare to other areas of Virginia, such as the state’s capital city of Richmond, this study primarily investigates data using interactive plotting. This approach enables visualization of data for multiple localities on a single figure, with the option to hover a cursor over the plot for detail.

The covered time period spans two to four months. We include a few static plots for the ten localities with the highest reported numbers in each statistical area, but expecting readers to take in multiple measures for multiple areas over 60–120 days using only static plots seemed like an unrealistic ask. Interactive plots help viewers quickly understand how the data changes over time, or easily isolate features of the dataset at a particular point within the context of a broader time frame.
The project is not a predictive analysis. Instead, it serves a comparative purpose for a limited subset of relevant data. Of course, it is topical, as we move into the 2020–2021 school year and take into account the precautions required for a safe and effective educational environment.
Observations
Let’s review some of our project discoveries:
- Cases in some Northern Virginia localities exceeded those in Southeastern Virginia localities many times over. The Fairfax locality, to the west of Washington, D.C., exceeds Southeastern Virginia localities in total cases, hospitalizations, and deaths throughout our timeframe. Total hospitalizations in Fairfax between the middle of March and the end of July 2020 number 138,320. Chesapeake’s total for the same period is 11,378.

- For a more balanced comparison, we narrow our broad, preliminary view to focus on the state capital of Richmond and how it compares to select independent cities and counties of the Hampton Roads region.

Note: Each of the following plot animations may be played by selecting the triangle at the start of the timeline.
- Among the localities of interest, Richmond led in total cases from March until July, when it was surpassed by Norfolk and Virginia Beach.
- An animated plot highlights that Richmond reported a greater number of hospitalizations due to Coronavirus, even as Virginia Beach eventually surpassed it in related cases and deaths.
- Similarly, by the end of our timeline, Richmond reports higher rates of hospitalization and mortality per 1,000 residents than each of the other localities.
A Step Back Before the Walkthrough
We will break here.
This article previewed our process for working with Pandas datasets in Deepnote’s online, interactive notebook environment. We also explored using Plotly and Datapane to create interactive plots that we were then able to embed in this article.
In addition to interactive deployment, the full project benefits from the following:
- The merging of multiple data sources into Pandas dataframes
- Transformation of raw data, for comparison as a proportion of the population
- The ability to be time-scaled and to limit or expand location scope
Though we avoided the use of interactive choropleth maps in this project, Plotly offers significant potential for additional geospatial analysis using state- or county-level maps, latitude/longitude coordinates, and/or GeoJSON data.
You can follow me here to be notified when I publish new articles. In the meantime, you can find code and links to interactive notebooks in my GitHub repository.
Translated from: https://medium.com/the-innovation/publishing-interactive-plots-86a637c9fb74