Which of the Six Major Data Personas Are You?

When you were growing up, did you ever play the name game? The modern data organization has something similar, and it’s called the “Bad Data Blame Game.” Unlike the name game, however, the Bad Data Blame Game is played when data downtime strikes and no amount of rhyming and dancing can save the day.

Data downtime refers to periods of time when your data is partial, erroneous, missing, or otherwise inaccurate, and nine times out of ten, you have no idea what caused it. All you know is that it’s 3 a.m., your CEO is pissed, your dashboards are wrong, and you need to fix it — stat.

After speaking to over 200 data teams, we’ve identified the major data personas involved in the Bad Data Blame Game. Maybe you recognize one or two?

In this article, we’ll introduce these roles, zero in on their hopes, dreams, and fears, and share our approach to conquering data reliability at your company.

Chief Data Officer

Image courtesy of Javier Sierra on Unsplash.

This is Ophelia, your Chief Data Officer (CDO). Although she’s probably not (wo)manning your company’s data pipelines or Looker dashboards, Ophelia’s impact is hitched to the consistency, accuracy, relevance, interpretability, and reliability of the data her team provides.

Ophelia wakes up every day and asks herself two things. First, are different departments getting the data they need to be effective? And second, are we managing risk around that data effectively?

She would sleep much easier with a clear, bird’s-eye view showing that her data ecosystem is operating as it should. At the end of the day, if bad data gets in front of the CEO, out to the public, or to any other data consumer, she’s on the line.

Business Intelligence Analyst

Image courtesy of Christina on Unsplash.

Betty, the business intelligence lead or data analyst, wants a punchy and insightful dashboard she can share with her stakeholders in marketing, sales, and operations to answer their multifarious questions about how their business functions are performing. When things go wrong at the practitioner level, Betty is the first one paged.

To ensure reliable data, she needs to answer these questions:

  • Are we translating data into metrics and insights that are meaningful to the business?
  • Are we confident that the data is reliable and means what we think it means?
  • Is it easy for others to access and understand these insights?

Null values and duplicated entries are Betty’s arch-nemeses, and she’s a fan of anything that can prevent data downtime from compromising her peace of mind. She’s fatigued by business stakeholders who ask her to investigate a funny value in a report: it’s a long process to chase the data upstream and validate whether it’s right!
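
To make the null-value and duplicate-entry problem concrete, here is a minimal Python sketch of the kind of check that could run before a report reaches stakeholders. The function name `profile_rows`, the field names, and the sample rows are all hypothetical and only illustrate the idea; a real team would more likely run an equivalent query in its warehouse.

```python
from collections import Counter


def profile_rows(rows, key_fields):
    """Count null values per field and find duplicated keys in a list of dict rows."""
    null_counts = Counter()
    key_counts = Counter()
    for row in rows:
        for field, value in row.items():
            # Treat None and empty strings as "null" for this sketch.
            if value is None or value == "":
                null_counts[field] += 1
        key_counts[tuple(row.get(f) for f in key_fields)] += 1
    # A key that appears more than once is a duplicated entry.
    duplicates = {key: n for key, n in key_counts.items() if n > 1}
    return dict(null_counts), duplicates


# Hypothetical report rows, invented for illustration.
rows = [
    {"order_id": 1, "region": "EMEA", "revenue": 120.0},
    {"order_id": 2, "region": None, "revenue": 80.0},
    {"order_id": 2, "region": None, "revenue": 80.0},
    {"order_id": 3, "region": "APAC", "revenue": None},
]

null_counts, duplicates = profile_rows(rows, key_fields=["order_id"])
print("nulls per field:", null_counts)
print("duplicated keys:", duplicates)
```

Running a check like this at the source of a dashboard is one way to catch a "funny value" before a stakeholder does, instead of chasing it upstream afterwards.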

Data Scientist

Image courtesy of Tim van der Kuip on Unsplash.

Sam, the data scientist, studied forestry in undergrad but decided to make the jump to industry to pay off his student loans. Somewhere between a line of Python code and a data visualization, he fell in love with data science. The rest is history.

To do his job well, Sam needs to know 1) where the data comes from and 2) that it’s reliable, because if it’s not, his team’s A/B tests won’t work and all downstream consumers (analysts, managers, executives, and customers) will suffer.

Sam’s team spends roughly 80 percent of their time scrubbing, cleaning, and understanding the context of the data, so they need tools and solutions that can make their lives easier.

Data Governance Lead

Image courtesy of GAGA on Unsplash.

Proud owner of a seven-month-old puppy, Gerald is the company’s very first data governance specialist. He started off on the legal team and then, when GDPR and CCPA entered the picture, focused his efforts exclusively on data compliance. It’s a novel role, but it is becoming increasingly important as the organization grows.

When it comes to data reliability, Gerald cares about 1) unified definitions of data and metrics across the company and 2) understanding who has access and visibility to what data.

For Gerald, bad data can mean costly fines, erosion of customer trust, and lawsuits. Despite the criticality of his role, he sometimes jests that it’s like accounting: “you’re only front and center if something has gone wrong!”

Data Engineer

Image courtesy of Christina on Unsplash.

When it comes to data reliability, Emerson, the data engineer, is at the crux of the equation.

Emerson started out as a full-stack developer at a small e-commerce startup, but as the company grew, so did its data needs. Before she knew it, she was responsible not just for building the company’s data product but also for integrating the data sources the team relies on to make decisions about the business. Now, she’s a Snowflake expert, Power BI guru, and general data tooling whiz.

Emerson and her team are the glue that holds the company’s data ecosystem together. They implement technologies that monitor the reliability of the company’s data, and if something goes awry, she’s the one who’s paged by the analytics team at 3 a.m. to fix it. Like Betty, she’s lost countless hours of sleep because of this.

To be successful at her job, Emerson must tackle a lot of things, including:

  • Designing a data platform solution that scales
  • Ensuring that data ingestion is reliable
  • Making the platform accessible to other teams
  • Being able to fix data downtime quickly when it happens
  • And above all else, making life sustainable for the entire data organization
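
As a rough illustration of what "ensuring that data ingestion is reliable" can look like in practice, the Python sketch below flags a table whose latest load is stale or unusually small. The function `check_ingestion`, its thresholds (`max_age`, `min_ratio`), and the sample numbers are assumptions made for this example, not a description of any particular monitoring product.

```python
from datetime import datetime, timedelta, timezone
from statistics import mean


def check_ingestion(last_loaded_at, row_counts, now,
                    max_age=timedelta(hours=24), min_ratio=0.5):
    """Return alert messages for a table's latest load.

    row_counts is a non-empty list of per-load row counts, oldest first;
    the final entry is the latest load.
    """
    alerts = []
    # Freshness: has the table been loaded recently enough?
    if now - last_loaded_at > max_age:
        alerts.append("stale: last load is older than max_age")
    # Volume: did the latest load shrink sharply against recent history?
    *history, latest = row_counts
    if history and latest < min_ratio * mean(history):
        alerts.append("volume drop: latest load is well below the recent average")
    return alerts


# Hypothetical values, invented for illustration.
now = datetime(2024, 1, 2, 3, 0, tzinfo=timezone.utc)
alerts = check_ingestion(
    last_loaded_at=datetime(2023, 12, 31, 3, 0, tzinfo=timezone.utc),
    row_counts=[1000, 1100, 950, 300],
    now=now,
)
for alert in alerts:
    print(alert)
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the check easy to test; the thresholds would need tuning per table.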

Data Product Manager

Image courtesy of Elizeu Dias on Unsplash.

This is Peter. He’s a data product manager. Peter got his start as a back-end developer, but made the jump to product management a few years ago. Like Gerald, he’s the company’s first-ever hire in this role, which is simultaneously exciting and challenging.

He’s up to date on all the latest data engineering and data analytics solutions, and is often called upon to make decisions on what offerings his organization needs to invest in to be successful. He knows firsthand how automation and self-serve tooling make all the difference when it comes to delivering an accessible, scalable data product.

All other data stakeholders, from analysts to social media managers, depend on him to build a platform that ingests and unifies data from a myriad of sources and makes it accessible to consumers all over the business. Oh, and did we mention that this data must be compliant with GDPR, CCPA, and other industry regulations? It’s a challenging role, and it’s difficult to keep everyone happy; it seems like his platform is always one transformation away from what BI actually wanted.

Who is responsible for data reliability?

So, who in your data organization owns the reliability piece of your data ecosystem?

As you can imagine, the answer isn’t simple. From your company’s CDO to your data engineers, it’s ultimately everyone’s responsibility to ensure data reliability. And although nearly every arm of every organization at every company relies on data, not every data team has the same structure, and various industries have different requirements. (For instance, it’s the norm for financial institutions to hire entire teams of data governance experts, but at a small startup, not so much. And for those startups that do — we commend you!)

Below, we outline our approach to mapping data responsibilities, from accessibility to reliability, across your data organization using the RACI (Responsible, Accountable, Consulted, and Informed) matrix guidelines:

Diagram courtesy of Monte Carlo.
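
For teams that want to keep a RACI mapping in a machine-checkable form, the Python sketch below shows one possible representation. The assignments in it are placeholders invented for illustration and do not reproduce the contents of the diagram; only the six role names and the accessibility-to-reliability framing come from this article. The check that each responsibility has exactly one Accountable role and at least one Responsible role reflects a common RACI convention, not a rule stated here.

```python
ROLES = {
    "Chief Data Officer",
    "Business Intelligence Analyst",
    "Data Scientist",
    "Data Governance Lead",
    "Data Engineer",
    "Data Product Manager",
}

# PLACEHOLDER assignments for illustration only; not taken from the diagram.
RACI = {
    "accessibility": {
        "Data Product Manager": "A",
        "Data Engineer": "R",
        "Business Intelligence Analyst": "C",
        "Chief Data Officer": "I",
    },
    "reliability": {
        "Data Product Manager": "A",
        "Data Engineer": "R",
        "Data Governance Lead": "C",
        "Data Scientist": "I",
    },
}


def validate_raci(matrix, roles):
    """Return a list of problems found in a {task: {role: letter}} RACI matrix."""
    problems = []
    for task, assignments in matrix.items():
        for role, letter in assignments.items():
            if role not in roles:
                problems.append(f"{task}: unknown role {role!r}")
            if letter not in {"R", "A", "C", "I"}:
                problems.append(f"{task}: invalid letter {letter!r} for {role}")
        letters = list(assignments.values())
        # Common convention: exactly one Accountable per responsibility.
        if letters.count("A") != 1:
            problems.append(f"{task}: expected exactly one Accountable role")
        # Common convention: at least one Responsible per responsibility.
        if "R" not in letters:
            problems.append(f"{task}: no Responsible role assigned")
    return problems


problems = validate_raci(RACI, ROLES)
print(problems or "matrix is consistent")
```

Keeping the matrix as data makes it straightforward to review when roles change, for example when a jack-of-all-trades data person splits into separate engineering and product roles.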

At companies that ingest and transform terabytes of data (like Netflix or Uber), we’ve found that it is common for data engineers and data product managers to tackle the responsibility of monitoring and alerting for data reliability issues.

Barring these behemoths, the responsibility often falls on data engineers and product managers. They must balance the organization’s demand for data with what can be provided reliably. Notably, the brunt of any bad choices made here is often borne by the BI analysts, whose dashboards may wind up containing bad information or break from silent changes. In very early data organizations, these roles are often combined into a jack-of-all-trades data person or a product manager.

Regardless of your team’s situation, you’re not alone.

Fortunately, there’s a better way to start trusting your data: data observability. It’s an approach that’s taking off with the most innovative companies, no matter who is ultimately responsible for ensuring data reliability in your organization.

In fact, with the right data reliability strategy, the Bad Data Blame Game is a thing of the past and full end-to-end observability is in sight.

Interested in learning more? Reach out to Barr Moses, Will Robins, and the rest of the Monte Carlo team.

This article was written by Barr Moses and Will Robins.

Translated from: https://towardsdatascience.com/which-of-the-six-major-data-personas-are-you-8dbf434b7c9e
