What is the data analytics stack?
A poor craftsman blames his tools. But if all you have is a hammer, everything looks like a nail.
It’s common for web developers or database administrators to refer to the “stack” of tools they use to do the job, but I’ve never heard this moniker used for data analysts. So it got me thinking: what is the data analytics stack?
Data analysts make use of a wide variety of software for a wide variety of tasks. When a solution comes up short, the focus ought not to be on “blaming” tools for their shortcomings, but on possessing alternatives and choosing a better one (or ones) for the given scenario.
That is, it’s better to think of these tools as “slices” of the same stack to be used concurrently, rather than as misfits to be entirely discarded.
To imagine what the analytics stack might look like, I took the data products Venn diagram below and placed the logos of popular data analytics tools in their respective segments.

After stepping back from my marked-up Venn diagram, four categories or “slices” of the stack appeared to me. Let’s get to them below; but first, a caveat.
Staying vendor agnostic
Some vendors have packaged their own “stack” of tools for data analysis; for example, Microsoft’s Power Platform or Google Data Studio. I am keeping my overview of the stack vendor-agnostic.
While you may learn that some slices fit better together, it’s better to start from which category of tool to use, and when, rather than which vendor. I will, however, provide a brief industry landscape of these products below, along with suggestions for future learning.
Spreadsheets
Reports of the death of spreadsheets are greatly exaggerated. For their ease of use and flexibility, spreadsheets are an excellent choice for back-of-the-envelope calculations and prototyping.
However, spreadsheets do have their limitations. They can lack data integrity, storage and delivery functionality. These limitations are often what cause pundits to give spreadsheets their last rites. But this misses the point of “the stack” entirely: those tasks aren’t the proper context for spreadsheets in the first place.
The major spreadsheet applications are Microsoft Excel and Google Sheets. I won’t tell you outright my preference, but you may find out if you follow me on social media for long.
Databases
Databases are a relatively old technology in the analytics space, but show no signs of slowing. They offer more reliable and extensible methods for data storage and integrity, but the amount of analysis that can easily be done directly inside a database is limited.
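To make the integrity point concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `orders` table, its columns, and the sample rows are all made up for illustration; the point is only that a database can refuse bad rows that a spreadsheet would typically accept without complaint.

```python
import sqlite3

# Hypothetical table for illustration only; names and values are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        region   TEXT NOT NULL,
        amount   REAL NOT NULL CHECK (amount >= 0)
    )
    """
)

def try_insert(row):
    """Attempt an insert and report whether the database accepted it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO orders (order_id, region, amount) VALUES (?, ?, ?)",
                row,
            )
        return True
    except sqlite3.IntegrityError:
        return False

accepted = [
    try_insert((1, "East", 120.0)),   # valid row
    try_insert((2, None, 75.0)),      # rejected: region is NOT NULL
    try_insert((3, "West", -10.0)),   # rejected: amount fails the CHECK
    try_insert((1, "North", 30.0)),   # rejected: duplicate primary key
]
row_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(accepted, row_count)
```

Only the first row makes it into the table. The rules live with the data itself, so they apply no matter which tool or person is doing the inserting.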
Structured query language, or SQL, is the language used to interact with relational database management systems. While many SQL platforms exist, the types of read-only operations necessary for most data analysts won’t change across them.
For data analysts new to SQL, I suggest SQLite or Microsoft Access as lightweight tools for learning SQL.
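As a rough idea of the kind of read-only query an analyst might write, the sketch below runs a `SELECT` with filtering, grouping and sorting against an in-memory SQLite database, driven from Python's built-in `sqlite3` module. The `sales` table and its rows are invented for this example; the query itself uses only standard clauses that should carry over to most SQL platforms.

```python
import sqlite3

# In-memory database with a small, made-up sales table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("East", "Widget", 100.0),
        ("East", "Gadget", 50.0),
        ("West", "Widget", 200.0),
        ("West", "Gadget", 25.0),
    ],
)

# A typical read-only query: filter rows, aggregate by group, sort the result.
query = """
    SELECT region, SUM(amount) AS total
    FROM sales
    WHERE amount >= 50
    GROUP BY region
    ORDER BY total DESC
"""
totals = conn.execute(query).fetchall()
print(totals)  # [('West', 200.0), ('East', 150.0)]
```

Nothing here modifies the stored data after the initial setup, which is the mode most analysts will spend their time in.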
Business intelligence & dashboard platforms
This is a broad swathe of tools and it’s likely the most ambiguous slice of the stack, but here I mean enterprise tools that allow users to gather, model and display data.
Data warehousing tools like MicroStrategy and SAP BusinessObjects straddle the line here, since they are tools designed for self-service data gathering and analysis. But these often include only limited visualization and interactive report-building.
That’s where tools like Power BI, Tableau and Looker come in. These tools allow users to build data models, dashboards and reports with minimal coding. Importantly, they make it easy to disseminate and update information across an organization.
However, these tools tend to be inflexible in the way they handle and visualize data. They can also be expensive, with single-user annual licenses running to several hundred or even several thousand dollars.
Data programming languages
While many vendor tools are moving to a place where coding is not as essential to the data workflow, I still think it’s a good idea to learn programming. This helps sharpen understanding of how data processing works, and gives users fuller control of their workflow than a graphical user interface (GUI) does.
For data analytics, two open-source programming languages are good fits: R and Python. Each includes a dizzying universe of free packages made to help with everything from social media automation to geospatial analysis. Learning these tools also opens the door to advanced analytics and data science.
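For a small taste of what this looks like in practice, here is a sketch in Python that reads a few rows of CSV text and summarizes them by group. It sticks to the standard library so it runs without installing anything; in real work you would more likely reach for one of those free packages. The data and column names are invented for the example.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Made-up data standing in for a CSV export.
raw = """region,amount
East,100
East,50
West,200
West,30
"""

# Group the amounts by region, one explicit step at a time.
amounts_by_region = defaultdict(list)
for row in csv.DictReader(io.StringIO(raw)):
    amounts_by_region[row["region"]].append(float(row["amount"]))

# Summarize each group: row count, total and average.
summary = {
    region: {"n": len(vals), "total": sum(vals), "mean": mean(vals)}
    for region, vals in amounts_by_region.items()
}
print(summary)
```

Every step that a GUI would hide behind a menu is written out here, which is what I mean by fuller control: the whole workflow can be read, rerun and changed line by line.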
However, this slice could have the steepest learning curve in the stack, and many analysts may struggle to see the benefit of learning to code, when they can do most of what they need easily enough from a GUI.
Not better or worse, just different
Seen in the light of a “stack,” it makes little sense to compare any of these slices, or to claim that one is inferior to another. They are meant to be complementary.
Data analysts often wonder which tool they should focus on learning or becoming the expert in. I would suggest not becoming the expert in any single one, but learning each slice of the stack well enough to contextualize and choose between them.
Entering the stack
Learning one data tool is daunting. Learning a whole “stack” of them can seem impossible. However, this cross-training can expedite growth, as connections are made across platforms in how to use data effectively.
What data tools do you use? How do they fit together? Any other thoughts on the idea of an “analytics stack”? Let’s discuss in the comments.
Originally published at https://georgejmount.com on August 8, 2020.
Translated from: https://medium.com/@georgemount/what-is-the-data-analytics-stack-7c87e4d4c2e