How to Calculate the Cost of Data Downtime

In addition to wasted time and sleepless nights, data quality issues lead to compliance risks, lost revenue to the tune of several million dollars per year, and erosion of trust — but what does bad data really cost your company? I’ve created a novel data downtime calculator that will help you measure the true financial impact of bad data on your organization.

What’s big, scary, and keeps even the best data teams up at night?

If you guessed the ‘monster under your bed,’ nice try, but you’d be wrong. The answer is far more real, all-too-common, and you’re probably already experiencing it whether or not you realize it.

The answer? Data downtime. Data downtime refers to periods of time when your data is partial, erroneous, missing, or otherwise inaccurate, ranging from a few null values to completely outdated tables. These data fire drills are time-consuming and costly, corrupting otherwise excellent data pipelines with garbage data.

The True Cost of Bad Data

One CDO I spoke with recently told me that his 500-person team spends 1,200 cumulative hours per week tackling data quality issues, time that could otherwise be spent on activities that drive innovation and generate revenue.

To demonstrate the scope of this problem, here are some fast facts about just how much time data teams waste on data downtime:

  • 50–80 percent of a data practitioner’s time is spent collecting, preparing, and fixing “unruly” data. (The New York Times)

  • 40 percent of a data analyst’s time is spent on vetting and validating analytics for data quality issues. (Forrester)

  • 27 percent of a salesperson’s time is spent dealing with inaccurate data. (ZoomInfo)

  • 50 percent of a data practitioner’s time is spent on identifying, troubleshooting, and fixing data quality, integrity, and reliability issues. (Harvard Business Review)

Based on these numbers, as well as interviews and surveys conducted with over 150 different data teams across industries, I estimate that data teams spend 30–40 percent of their time handling data quality issues instead of working on revenue-generating activities.

The cost of bad data is more than wasted time and sleepless nights; there are serious compliance, financial, and operational implications that can catch data leaders off guard, impacting both your team’s ROI and your company’s bottom line.

Compliance Risk

For several decades, the medical and financial services sectors, with their responsibility to protect personally identifiable information (PII) and their stewardship of sensitive customer data sources, were the poster children for compliance.

Now, with nearly every industry handling user data, companies from e-commerce sites to dog food distributors must follow strict data governance mandates, from GDPR to CCPA, and other privacy protection regulations.

Bad data can manifest in any number of ways, from a mistyped email address to misreported financials, and can have serious ramifications down the road. In Vermont, for instance, outdated information about whether or not a customer wants to renew their annual subscription to a service can spell the difference between a seamless user experience and a class action lawsuit. Such errors can lead to fines and steep penalties.

Lost Revenue

It’s often said that “time is money,” but for any company seeking the competitive edge, “data is money” is more accurate.

One of the most explicit links I’ve found between data downtime and lost revenue is in financial services. In fact, one data scientist at a financial services company that buys and sells consumer loans told me that a field name change can result in a $10M loss in transaction volume, or a week’s worth of deals.

Behind these numbers is the reality that firefighting data downtime incidents not only wastes valuable time but tears teams away from revenue-generating projects. Instead of making progress on building new products and services that can add material value for your customers, data engineering teams spend time debugging and fixing data issues. A lack of visibility into what’s causing these problems only makes matters worse.

Erosion of Data Trust

The insights you derive from your data are only as accurate as the data itself. In fact, it’s my firm belief that numbers can lie, and that using bad data is worse than having no data at all.

Data won’t hold itself accountable, but decision makers will, and over time, bad data can erode organizational trust in your data team as a revenue driver for the organization. After all, if you can’t rely on the data powering your analytics, why should your CEO? And for that matter, why should your customers?

To help you mitigate your data downtime problem, we put together a Data Downtime Cost Calculator that factors in how much money you’re likely to lose dealing with data downtime fire drills instead of working on revenue-generating activities.

Your Data Downtime Cost Calculator

The annual cost of your data downtime can be measured by the engineering time and other resources you need to spend to resolve it.

I’d propose that the right data downtime calculator factors in the cost of labor to tackle these issues, your compliance risk (in this case, we used the average GDPR fines), and the opportunity cost of losing stakeholder trust in your data. Per earlier estimates, you can assume that around 30 percent of an engineer’s time will be spent tackling data issues.

Bringing this all together, your Data Downtime Cost Calculator is:

Labor Cost: ([Number of Engineers] × [Annual Salary of Engineer]) × 30%

+

Compliance Risk: [4% of Your Revenue in 2019]

+

Opportunity Cost: [Revenue you could have generated if you had moved faster, released X new products, and acquired Y new customers]

= $ Annual Cost of Data Downtime

Keep in mind that this equation will vary by company, but we’ve found that our framework can get most teams started.
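
The formula above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class, function, and field names are hypothetical, the 30% labor share and the 4% compliance rate are taken directly from the formula, and the opportunity cost is a number you have to estimate yourself. Note that under GDPR, 4% of annual worldwide turnover is the ceiling for the most serious infringements, so you may want to lower that rate to match your own exposure.

```python
from dataclasses import dataclass


@dataclass
class DowntimeInputs:
    """Inputs to the calculator. All names here are illustrative."""
    num_engineers: int
    annual_salary: float          # average salary per engineer, in dollars
    annual_revenue: float         # the formula uses 2019 revenue
    opportunity_cost: float       # your own estimate of forgone revenue
    time_on_data_issues: float = 0.30   # share of engineer time, per the article
    compliance_rate: float = 0.04       # 4% of revenue, per the article


def annual_downtime_cost(inputs: DowntimeInputs) -> dict[str, float]:
    """Return each component of the formula plus the total."""
    labor = (inputs.num_engineers * inputs.annual_salary
             * inputs.time_on_data_issues)
    compliance = inputs.annual_revenue * inputs.compliance_rate
    total = labor + compliance + inputs.opportunity_cost
    return {
        "labor": labor,
        "compliance": compliance,
        "opportunity": inputs.opportunity_cost,
        "total": total,
    }


if __name__ == "__main__":
    # Made-up example figures, not data from the article.
    example = DowntimeInputs(
        num_engineers=10,
        annual_salary=120_000,
        annual_revenue=5_000_000,
        opportunity_cost=250_000,
    )
    result = annual_downtime_cost(example)
    print(f"Annual cost of data downtime: ${result['total']:,.0f}")
    # prints: Annual cost of data downtime: $810,000
```

Keeping the two percentages as parameters rather than constants makes it easy to rerun the estimate with the 30–40 percent range mentioned earlier, or with a compliance rate that better reflects your own regulatory situation.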

Measuring the cost of your data downtime is the first step towards fully understanding the implications of bad data at your company. Fortunately, data downtime is avoidable. With the right approach to data reliability, you can keep the cost of bad data at bay and prevent bad data from corrupting good pipelines in the first place.

Have another way to measure the impact of data downtime? I would love to hear from you!

Original article: https://towardsdatascience.com/how-to-calculate-the-cost-of-data-downtime-c0a48733b6f0
