The Arbiters of Truth

Coming out of college with a background in mathematics, I fell upward into the rapidly growing field of data analytics. It wasn’t until years later that I realized the incredible power that comes with the position. As Uncle Ben told Peter Parker (aka Spider-Man), “With great power comes great responsibility.” That proverb perfectly sums up an unspoken reality for data professionals of all levels and types. You have to wonder if Peter Parker’s real superpower was data expertise. Unlike Spider-Man’s, our enemies are not quite as obvious as a flying green monster. As data professionals, we must remain vigilant about data privacy, algorithmic bias, and the objective presentation of information.

Data Ethics in the Government

My first encounter with sensitive data came at the U.S. Census Bureau back in 2016. My team was responsible for compiling and disseminating the U.S. International Trade in Goods and Services report each month. The report shows how much of each commodity the U.S. imports from and exports to other countries. This might not affect the average person’s life, but to an investor the information is incredibly valuable.

Being an ambitious employee, I wanted to add a little pizzazz to the Bureau’s webpage. My plan was to display a fancy Tableau chart (yes, they were fancy back then) relating to the Trans-Pacific Partnership. This would be the equivalent of a news agency reporting the relevant facts for any major economic event. Sadly, I was shut down. I was told that the Census Bureau could not appear biased about the new free trade agreement. At the time, I did not quite understand. Looking back, however, I can fully appreciate the sensitivity. The Census Bureau controls incredibly valuable information that could have wide implications for the economy and its people. To be effective, it must remain non-partisan. Otherwise, the numbers become politicized and the truth becomes questionable.

Algorithmic Biases

“When a measure becomes a target, it ceases to be a good measure.” - Goodhart’s Law

I see the above statement quoted often, yet KPIs remain incredibly common in organizations. One of my previous digital transformation projects required my department to adopt a new CRM (Contact Relationship Management) system. Along with the new system, leadership requested KPIs to measure participation in the tool. Anyone who has rolled out a new system knows the challenges of culture change and adoption. The software and the process must go hand in hand to be successful. Therefore, we needed the best method for measuring and incentivizing user activity in the CRM.

In our system, users were expected to enter and update potential public policies that would impact the organization. We had users responsible for different regions around the globe. Some regions, such as Europe, had more policy activity than others. Some regions had more users to help keep the records up to date. Regions also varied in their importance from a financial perspective. In our CRM, you could measure logins, views, edits, added records, deleted records, and more. Each metric had an inherent bias in its calculation. To simplify things, assume that we can only calculate metrics at the region level, on a biweekly basis. Let’s take a look at some of the options and their implications.
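
To make the bias concrete, here is a minimal sketch of how a few of those options could be compared at the region level for one biweekly period. Everything in it is an assumption for illustration: the activity figures are made up, the regions other than Europe are hypothetical, and the names `RegionActivity`, `METRICS`, and `rankings` are not from our actual CRM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegionActivity:
    """Biweekly CRM activity for one region (all figures are made up)."""
    region: str
    users: int
    logins: int
    edits: int
    records_added: int


# Hypothetical numbers, chosen only to show how the rankings can diverge.
ACTIVITY = [
    RegionActivity("Europe", users=12, logins=90, edits=60, records_added=24),
    RegionActivity("Asia-Pacific", users=4, logins=40, edits=36, records_added=10),
    RegionActivity("Latin America", users=2, logins=30, edits=22, records_added=3),
]

# Each candidate KPI is just a different way of turning activity into a number.
METRICS = {
    "total edits": lambda r: r.edits,
    "edits per user": lambda r: r.edits / r.users,
    "logins per user": lambda r: r.logins / r.users,
    "records added": lambda r: r.records_added,
}


def rank_regions(activity, metric):
    """Return region names ordered from highest to lowest metric value."""
    return [r.region for r in sorted(activity, key=metric, reverse=True)]


def rankings(activity=ACTIVITY, metrics=METRICS):
    """Map each metric name to the region ranking it produces."""
    return {name: rank_regions(activity, fn) for name, fn in metrics.items()}


if __name__ == "__main__":
    for name, order in rankings().items():
        print(f"{name:>16}: {' > '.join(order)}")
```

With these invented figures, raw counts put the largest team first, while per-user rates put the smallest team first. Neither ranking accounts for how much policy activity a region actually has or how much it matters financially, so whichever metric becomes the KPI also decides which behavior gets rewarded.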

When designing the appropriate KPIs for this new system, there were biases, assumptions, and incentives at play no matter which metric we chose. While mindlessly scrolling through Twitter, I recently came upon a quote that perfectly sums up the above process.

“The very act of turning something into a number is an assumption.” - Kareem Carr

Integrity is a Must

A few months back, I was working with a colleague who needed some assistance with the analysis and presentation of information that would be available to the public. As soon as they hear the words “public data,” any data professional’s mind will immediately gravitate towards data security. Fortunately, this was not an issue.

My colleague proceeded to explain what data we had (i.e. very little) and the purpose of the presentation. After some exploration, I realized that we could not provide any summary statistics at the requested level of detail. We could only provide an estimate of the overall total. This was insufficient for their project. There was pressure to “make some magic happen,” especially if I wanted to impress a few senior-level colleagues. In the short term, giving in would have boosted my reputation, but over the long term it risked significant reputational damage for the organization (and for me).

Final Thoughts

As data becomes seamlessly woven into every process, it brings ethical risks that aren’t talked about enough. If you wait until data professionals start building black-box algorithms into your decision-making processes, it will be too late. Organizations need to instill a culture of ethical, data-driven decision making from the top.

As a data professional, you will frequently find yourself at the center of difficult decisions, especially if you work with colleagues who struggle with data and numbers. Your job is to bridge the gap between their subject matter expertise and the appropriate analysis or presentation of the information. In that gap lies an opportunistic, invisible enemy who wants you to take the shortcut. Follow in Spider-Man’s footsteps and proceed with integrity.

~ The Data Generalist

Translated from: https://towardsdatascience.com/the-arbiters-of-truth-d97ce1a4e4a6
