When you were growing up, did you ever play the name game? The modern data organization has something similar, and it’s called the “Bad Data Blame Game.” Unlike the name game, however, the Bad Data Blame Game is played when data downtime strikes and no amount of rhyming and dancing can save the day.
Data downtime refers to periods of time when your data is partial, erroneous, missing, or otherwise inaccurate, and nine times out of ten, you have no idea what caused it. All you know is that it’s 3 a.m., your CEO is pissed, your dashboards are wrong, and you need to fix it — stat.
After speaking to over 200 data teams, we’ve identified the major data personas involved in the Bad Data Blame Game. Maybe you recognize one or two?
In this article, we’ll introduce these roles, zero in on their hopes, dreams, and fears, and share our approach to conquering data reliability at your company.
Chief Data Officer
This is Ophelia, your Chief Data Officer (CDO). Although she’s probably not (wo)manning your company’s data pipelines or Looker dashboards, Ophelia’s impact is hitched to the consistency, accuracy, relevance, interpretability, and reliability of the data her team provides.
Ophelia wakes up every day and asks herself two things. First, are different departments getting the data they need to be effective? And second, are we managing risk around that data effectively?
She would sleep much easier with a clear, bird’s-eye view showing that her data ecosystem is operating as it should. At the end of the day, if bad data gets in front of the CEO, out to the public, or to any other data consumer, she’s on the line.
Business Intelligence Analyst
Betty, the business intelligence lead or data analyst, wants a punchy and insightful dashboard she can share with her stakeholders in marketing, sales, and operations to answer their multifarious questions about how their business functions are performing. When things go wrong at the practitioner level, Betty is the first one paged.
To ensure reliable data, she needs to answer these questions:
- Are we translating data into metrics and insights that are meaningful to the business?
- Are we confident that the data is reliable and means what we think it means?
- Is it easy for others to access and understand these insights?
Null values and duplicated entries are Betty’s arch-nemeses, and she’s a fan of anything that can prevent data downtime from compromising her peace of mind. She’s fatigued by business stakeholders who ask her to investigate a funny value in a report: chasing the data upstream and validating whether it’s right is a long process!
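To make Betty’s two arch-nemeses concrete, here is a minimal sketch of a check for null values and duplicated entries. It assumes report rows arrive as plain Python dictionaries; the function name `find_nulls_and_duplicates`, the field names, and the sample rows are all hypothetical and not taken from any particular tool.

```python
from collections import Counter


def find_nulls_and_duplicates(rows, key_fields):
    """Count empty values per field and list keys that appear more than once.

    rows       -- list of dicts, one per record (an assumed input shape)
    key_fields -- fields that together should identify a record uniquely
    """
    null_counts = Counter()
    key_counts = Counter()
    for row in rows:
        for field, value in row.items():
            # Treat None and the empty string as "null" for this sketch.
            if value is None or value == "":
                null_counts[field] += 1
        key_counts[tuple(row.get(f) for f in key_fields)] += 1
    duplicates = [key for key, n in key_counts.items() if n > 1]
    return dict(null_counts), duplicates


# Hypothetical rows feeding a dashboard.
rows = [
    {"order_id": 1, "region": "EMEA", "revenue": 120.0},
    {"order_id": 2, "region": None, "revenue": 80.0},
    {"order_id": 2, "region": None, "revenue": 80.0},
    {"order_id": 3, "region": "APAC", "revenue": None},
]
nulls, dupes = find_nulls_and_duplicates(rows, key_fields=["order_id"])
print(nulls)  # fields with missing values and how many
print(dupes)  # keys that occur more than once
```

Running a check like this before a report goes out is one way to catch a "funny value" before a stakeholder does, rather than tracing it upstream afterwards.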
Data Scientist
Sam, the data scientist, studied forestry as an undergrad, but decided to make the jump to industry to pay off his student loans. Somewhere between a line of Python code and a data visualization, he fell in love with data science. The rest is history.
To do his job well, Sam needs to know 1) where the data comes from and 2) that it’s reliable, because if it’s not, his team’s A/B tests won’t work and all downstream consumers (analysts, managers, executives, and customers) will suffer.
Sam’s team spends roughly 80 percent of their time scrubbing, cleaning, and understanding the context of the data, so they need tools and solutions that can make their lives easier.
Data Governance Lead
Proud owner of a seven-month-old puppy, Gerald is the company’s very first data governance specialist. He started off on the legal team and then, when GDPR and CCPA entered the picture, eventually focused his efforts exclusively on data compliance. It’s a novel role, but one that is becoming increasingly important as the organization grows.
When it comes to data reliability, Gerald cares about 1) unified definitions of data and metrics across the company and 2) understanding who has access to, and visibility into, which data.
For Gerald, bad data can mean costly fines, erosion of customer trust, and lawsuits. Despite the criticality of his role, he sometimes jests that it’s like accounting: “you’re only front and center if something has gone wrong!”
Data Engineer
When it comes to data reliability, Emerson, the data engineer, is at the crux of the equation.
Emerson started out as a full-stack developer at a small e-commerce startup, but then as the company grew, so too did their data needs. Before she knew it, she was responsible not just for building their data product but also integrating the data sources the team relies on to make decisions about the business. Now, she’s a Snowflake expert, PowerBI guru, and general data tooling whiz.
Emerson and her team are the glue that holds the company’s data ecosystem together. They implement technologies that monitor the reliability of their company’s data, and if something goes awry, Emerson is the one who’s paged by the analytics team at 3 a.m. to fix it. Like Betty, she’s lost countless hours of sleep because of this.
To be successful at her job, Emerson must tackle a lot of things, including:
- Designing a data platform solution that scales
- Ensuring that data ingestion is reliable
- Making the platform accessible to other teams
- Being able to fix data downtime quickly when it happens
- And above all else, making life sustainable for the entire data organization
Data Product Manager
This is Peter. He’s a data product manager. Peter got his start as a back-end developer, but made the jump to product management a few years ago. Like Gerald, he’s the company’s first-ever hire in this role, which is simultaneously exciting and challenging.
He’s up to date on all the latest data engineering and data analytics solutions, and is often called upon to make decisions on what offerings his organization needs to invest in to be successful. He knows firsthand how automation and self-serve tooling make all the difference when it comes to delivering an accessible, scalable data product.
All other data stakeholders, from analysts to social media managers, depend on him to build a platform that ingests and unifies data from a myriad of sources and makes it accessible to consumers all over the business. Oh, and did we mention that this data must be compliant with GDPR, CCPA, and other industry regulations? It’s a challenging role, and it’s difficult to keep everyone happy; it seems like his platform is always one transformation away from what BI actually wanted.
Who is responsible for data reliability?
So, who in your data organization owns the reliability piece of your data ecosystem?
As you can imagine, the answer isn’t simple. From your company’s CDO to your data engineers, it’s ultimately everyone’s responsibility to ensure data reliability. And although nearly every arm of every organization at every company relies on data, not every data team has the same structure, and various industries have different requirements. (For instance, it’s the norm for financial institutions to hire entire teams of data governance experts, but at a small startup, not so much. And for those startups that do — we commend you!)
Below, we outline our approach to mapping data responsibilities, from accessibility to reliability, across your data organization using the RACI (Responsible, Accountable, Consulted, and Informed) matrix guidelines:
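For readers who want to keep such a mapping in a machine-checkable form, here is a minimal sketch of a RACI matrix stored as a dictionary, with a small validator. The activities and the letter assigned to each persona are hypothetical placeholders, not the authors’ matrix. The rule that each activity has exactly one Accountable role and at least one Responsible role is a common RACI convention that this sketch assumes.

```python
VALID_CODES = {"R", "A", "C", "I"}  # Responsible, Accountable, Consulted, Informed


def validate_raci(matrix):
    """Return a list of human-readable problems found in a RACI mapping.

    matrix -- dict of activity -> {role: code}; an assumed shape for this sketch
    """
    problems = []
    for activity, assignments in matrix.items():
        codes = list(assignments.values())
        unknown = set(codes) - VALID_CODES
        if unknown:
            problems.append(f"{activity}: unknown codes {sorted(unknown)}")
        if codes.count("A") != 1:
            problems.append(
                f"{activity}: expected exactly one Accountable, found {codes.count('A')}"
            )
        if "R" not in codes:
            problems.append(f"{activity}: no Responsible role")
    return problems


# Hypothetical assignments, for illustration only.
matrix = {
    "monitor data reliability": {
        "Data Engineer": "R",
        "Data Product Manager": "A",
        "BI Analyst": "C",
        "Chief Data Officer": "I",
    },
    "maintain metric definitions": {
        "Data Governance Lead": "R",
        "BI Analyst": "R",
        "Chief Data Officer": "I",
    },
}
problems = validate_raci(matrix)
for problem in problems:
    print(problem)
```

In this made-up example, the second activity is flagged because nobody is Accountable for it, which is exactly the kind of gap that starts a round of the Bad Data Blame Game.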

At companies that ingest and transform terabytes of data (like Netflix or Uber), we’ve found that it is common for data engineers and data product managers to tackle the responsibility of monitoring and alerting for data reliability issues.
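As a rough illustration of what monitoring and alerting for data reliability can look like at its simplest, the sketch below checks a single table for freshness and volume. The function name `check_table_health`, the 24-hour staleness limit, and the 50 percent volume tolerance are assumptions chosen for this example; real thresholds depend on the pipeline.

```python
from datetime import datetime, timedelta, timezone


def check_table_health(last_loaded_at, row_count, expected_rows, now,
                       max_staleness=timedelta(hours=24), tolerance=0.5):
    """Return the names of the checks that failed for one table.

    last_loaded_at -- timestamp of the most recent successful load
    row_count      -- rows in the latest load
    expected_rows  -- typical row count, e.g. a trailing average (assumed known)
    now            -- current time, passed in so the check is easy to test
    """
    alerts = []
    # Freshness: has the table been updated recently enough?
    if now - last_loaded_at > max_staleness:
        alerts.append("freshness")
    # Volume: is the row count far from what we usually see?
    if expected_rows > 0 and abs(row_count - expected_rows) / expected_rows > tolerance:
        alerts.append("volume")
    return alerts


now = datetime(2024, 1, 2, 3, 0, tzinfo=timezone.utc)
stale = check_table_health(now - timedelta(hours=30), 400, 1000, now)
healthy = check_table_health(now - timedelta(hours=2), 980, 1000, now)
print(stale)    # both checks fail for the stale, undersized load
print(healthy)  # no alerts
```

Whoever owns a check like this, the point is that it runs before the dashboard is opened, so the first person to learn about data downtime is not the CEO.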
Barring these behemoths, the responsibility often falls on data engineers and product managers. They must balance the organization’s demand for data with what can be provided reliably. Notably, the brunt of any bad choices made here is often borne by the BI analysts, whose dashboards may wind up containing bad information or break from silent changes. In very early data organizations, these roles are often combined into a jack-of-all-trades data person or a product manager.
Regardless of your team’s situation, you’re not alone.
Fortunately, there’s a better way to start trusting your data: data observability. It’s an approach that’s taking off with the most innovative companies, no matter who is ultimately responsible for ensuring data reliability in your organization.
In fact, with the right data reliability strategy, the Bad Data Blame Game is a thing of the past and full end-to-end observability is in sight.
Interested in learning more? Reach out to Barr Moses, Will Robins, and the rest of the Monte Carlo team.
This article was written by Barr Moses and Will Robins.
Translated from: https://towardsdatascience.com/which-of-the-six-major-data-personas-are-you-8dbf434b7c9e