What Is Algorithmic Bias?
In the last article, we showed what happens when data strips the emotion out of an action.
In Part 1 of this series, we argued that data can turn anyone into a psychopath, and though that’s an extreme way of looking at things, it holds a certain amount of truth.
It’s natural to cheer at a newspaper headline proclaiming the downfall of a distant enemy stronghold, but is it ok to cheer while actually watching thousands of civilians inside that city die gruesome deaths?
No, it’s not.
But at the same time―if you cheer the headline showing a distant military victory, it means you’re a human, and not necessarily a psychopath.
The abstracted data of that headline strips the emotional currency of the event, and induces a psychopathic response from you.
That’s what headlines do, and they can induce a callous response from most anyone.
So if data can induce a state of momentary psychopathy, what happens when you combine data and algorithms?
Data can’t feel, and algorithms can’t feel either.
Is that a state of unfeeling multiplied by two?
Or is it a state of unfeeling squared?
Whatever the case, let’s not talk about the momentary psychopathy abetted by these unfeeling elements.
Let’s talk about bias.
Because if left unchecked, unfeeling algorithms can and will lead anyone into a state of bias, including you.
But before we try to understand algorithmic bias, we must take a moment to recognize how much we don’t understand our own algorithms.
Yes, humanity makes algorithms, and humanity relies upon them countless times every day, but we don’t understand them.

We no longer understand our own algorithms, no matter how much we think we do
At a high, high level, we could conceive of an algorithmic process as having three parts―an Input, the algorithm itself, and an Outcome.

But we are now far, far away from human-understandable algorithms like the Sieve of Eratosthenes, and though the above image might be great for an Introduction to Algorithms class―today’s algorithms can no longer be adequately described by the above three parts alone.
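For contrast, here is what a fully human-understandable algorithm looks like: a minimal Python sketch of the Sieve of Eratosthenes itself. Every line can be read, traced, and verified by a person, which is exactly the property today's production algorithms have lost.

```python
def sieve_of_eratosthenes(limit):
    """Return every prime number up to and including `limit`."""
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False  # 0 and 1 are not prime

    for n in range(2, int(limit ** 0.5) + 1):
        if is_prime[n]:
            # Cross out every multiple of n, starting at n * n
            for multiple in range(n * n, limit + 1, n):
                is_prime[multiple] = False

    return [n for n, prime in enumerate(is_prime) if prime]

print(sieve_of_eratosthenes(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

A modern recommendation or ranking system has no listing this short or this legible.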
The tech writer Franklin Foer describes one of the reasons for this in his book World Without Mind: The Existential Threat of Big Tech―
Perhaps Facebook no longer fully understands its own tangle of algorithms — the code, all sixty million lines of it, is a palimpsest, where engineers add layer upon layer of new commands. (This is hardly a condition unique to Facebook. The Cornell University computer scientist Jon Kleinberg cowrote an essay that argued, “We have, perhaps for the first time ever, built machines we do not understand. . . . At some deep level we don’t even really understand how they’re producing the behavior we observe. This is the essence of their incomprehensibility.” What’s striking is that the “we” in that sentence refers to the creators of code.)
At the very least, the algorithmic codes that run our lives are palimpsests―documents that are originally written by one group of people, and then written over by another group, and then a third, and then a fourth―until there is no single expert on the code itself, or perhaps not even one person who understands it.
And these algorithmic palimpsests are millions of lines of code long, or even billions.
Remember Mark Zuckerberg’s 2018 testimony before Congress?

That was the testimony of an individual who didn’t have the faintest understanding of about 99% of Facebook’s inner workings.
Because no one does.
Larry Page and Sergey Brin don’t understand Google as a whole.
Because no one does.
And the algorithms that define our daily lives?
No one understands them completely, nor does anyone understand the massive amounts of data that they take in.
So let’s update our algorithm diagram. We need to acknowledge that there are more Inputs than we can understand, and that the algorithms themselves are black boxes.
So here is a slightly more accurate, yet still high-level view of what is happening with our algorithms.

Again―there are more Inputs than we can understand, going into a black-box algorithm we do not fully understand.
And this can lead to many things, including bias.
A case study in algorithmic bias―a company is told to favor lacrosse players named Jared
A company recently ran a hiring algorithm, and the intent of the algorithm was to eliminate bias in hiring processes.
The algorithm’s purpose was to find the best candidates.
The company entered some training data into the algorithm based on past successful candidates, and then ran the algorithm again with a current group of candidates.
The algorithm, among other things, favored candidates named Jared who played lacrosse.

The algorithmic Output was biased, but not in the way anyone expected.
How could this have happened?
Algorithms are not compassionate, let alone sentient―but they are really good at finding patterns
In the above case, the algorithm found a pattern within the training data that lacrosse players named Jared tend to be good hires.
That’s a biased recommendation of course, and a faulty one.
Why did it occur?

Well, beyond us recognizing that we don’t understand the algorithm itself, we can cite thinkers like Dr. Nicol Turner Lee of Brookings, who explained on Noah Feldman’s Deep Background podcast that external sources of algorithmic bias are often manifold.
There might be bias in the training data, and quite often the data scientists who made the algorithm might come from a homogeneous group, which might in turn encourage the algorithm to suggest the hiring of more candidates like themselves.
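As a hypothetical sketch of how that can happen (all of the data below is invented), imagine training a simple logistic regression on historical "good hire" labels that happen to be skewed toward lacrosse-playing Jareds. The model has no idea what a name or a sport means; it simply latches onto the pattern.

```python
# Invented data: past "good hire" labels are skewed toward lacrosse-playing Jareds,
# so the model learns those features as signals, even though they are meaningless.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 500

named_jared      = rng.integers(0, 2, n)    # 1 if the candidate is named Jared
played_lacrosse  = rng.integers(0, 2, n)    # 1 if the candidate played lacrosse
years_experience = rng.integers(0, 15, n)   # the feature that *should* matter

# Biased historical labels: almost every lacrosse-playing Jared was marked a good hire
spurious  = named_jared * played_lacrosse
good_hire = np.maximum(spurious, (rng.random(n) < 0.1).astype(int))

X = np.column_stack([named_jared, played_lacrosse, years_experience])
model = LogisticRegression().fit(X, good_hire)

for feature, coef in zip(["named_jared", "played_lacrosse", "years_experience"],
                         model.coef_[0]):
    print(f"{feature:>17}: {coef:+.2f}")
# The strongly positive weights on named_jared and played_lacrosse are the learned bias.
```

In a real pipeline the features would be subtler, but the mechanism is the same: whatever correlates with the historical labels gets rewarded.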
And of course, there is societal and systemic bias, which will inevitably work its way into an unfeeling, pattern-recognizing algorithm.
So to update our algorithm chart once again―

There are faint echoes of Jared and lacrosse somewhere in the Inputs, and we certainly see them in the Outputs.
Of course, both the full scope of the Inputs and the algorithm itself remain a mystery.
The only thing we know for sure is that if your name is Jared, and you played lacrosse, you will have an advantage.
This was a humorous example―but what happens when the stakes are higher?
Hiring algorithms are relatively low stakes in the grand scheme of things, especially considering that virtually any rational company would take steps to eliminate a penchant for lacrosse-playing Jareds from its hiring process as soon as it could.
But what if the algorithm is meant to set credit rates?
What if the algorithm is meant to determine a bail amount?
What if this algorithm leads to a jail term for someone who should have been sent home instead?

If you are spending the night in jail only because your name isn’t Jared and you didn’t play lacrosse, your plight is no longer a humorous cautionary tale.
And when considering the Outcome of a single unwarranted night in jail, there is one conclusion―
An Outcome like that cannot be.
Even if a robotic algorithm leads to 100 just verdicts in a row, if the 101st leads to an unjust jail sentence, that cannot be.
There are protections against this of course―the legal system understands, in theory at least, that an unjust sentence cannot be.
But we’re dealing with algorithms here, and they often operate at a level far beyond our understanding of what can and cannot be.
A brief aside — Algorithms cannot technically show bias based on Constitutionally protected classes, but they often find ways to do this
It’s not just morality prohibiting bias in high-stakes algorithmic decisions, it’s the Constitution.
Algorithms are prohibited from showing bias―or preferences―based on ethnicity, gender, sexual orientation and many other things.

Those cannot be a factor, because they are Constitutionally protected classes.
But what about secondary characteristics that imply any of the above?
Again, algorithms are great at finding patterns, and even if they are told to ignore certain categories, they can―and will―find patterns that act as substitutes for those categories.
Consider these questions―
- What gender has a name like Jared?
- What kind of background suggests that a person played lacrosse in high school?
And going a bit further―
- What is implied by the zip code of the subject’s home address?
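A minimal sketch of that proxy effect, again with entirely invented data: the protected attribute is deliberately left out of the training features, yet because a zip-code feature is strongly correlated with it, the model's decisions still split along the protected line.

```python
# Invented data: the protected attribute is excluded from the features, but a
# correlated zip-code feature smuggles it back into the model's decisions.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 2000

protected = rng.integers(0, 2, n)                      # protected class, NOT a feature
zip_group = np.where(rng.random(n) < 0.9,              # residential patterns make the
                     protected, 1 - protected)         # zip code track the protected class
income    = rng.normal(60 - 10 * protected, 15, n)     # historical inequity in the data

# Biased historical outcomes (say, past approvals) that tracked the protected class
approved = (income + 20 * (1 - protected) + rng.normal(0, 10, n) > 60).astype(int)

X = np.column_stack([zip_group, income])               # protected attribute never included
model = LogisticRegression(max_iter=1000).fit(X, approved)

pred = model.predict(X)
print("approval rate, group 0:", round(pred[protected == 0].mean(), 2))
print("approval rate, group 1:", round(pred[protected == 1].mean(), 2))
# The rates still differ sharply: the zip-code proxy carries the bias through.
```

Nothing in the feature list names the protected class, but the Outcome still does.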
So no, an algorithm―particularly one born of a public institution like a courthouse―cannot show bias against Constitutionally protected classes.
But it might, and probably will if we are not vigilant.
Can algorithms make you biased? Considering algorithms are everywhere―the answer may be yes.
You don’t have to be an HR person at a tech company or a bail-setting judge to become biased by algorithms.
If you live in the modern world and―
engage in social media, read a news feed, go onto dating apps, or do just about anything online―that bias will be sent down to you.
Bias will influence the friends you choose, the beliefs you have, the people you date and everything else.

The average smartphone user engages with 9 apps per day, and spends about 2 hours and 15 minutes per day interacting with them.
And what are the inner workings of these apps?
That’s a mystery to the user.
What are the inner workings of the algorithms inside these apps?
The inner workings of the apps are a black box to both the user and the company that designed them.
And of course, the constant stream of algorithmic data can lead to the perpetuation of insidious, and often unseen, systemic bias
Dr. Lee gave this example on the podcast―
One thing for example I think we say in the paper which I think is just profound is that as an African-American who may be served more higher-interest credit card rates, what if I see that ad come through, and I click it just because I’m interested to see why I’m getting this ad, automatically I will be served similar ads, right? So it automatically places me in that high credit risk category. The challenge that we’re having now, Noah, is that as an individual consumer I have no way of recurating what my identity is.
Dr. Lee holds a doctorate, is a Senior Fellow at a prestigious institute, and has accomplished countless other things.
But if an algorithm sends her an ad for a high-interest credit card because of her profile, and she inadvertently clicks the ad, or even just hovers her mouse over it, that action is registered and added to her profile.

And then her credit is dinged, because another algorithm sees her as the type of person who clicks or hovers on ads for high-interest credit card rates.
And of course, if an algorithm sees that lacrosse-playing Jareds should be served ads for Individual Retirement Accounts, that may lead to a different Outcome.
Dr. Lee makes the point that this is no one’s fault per se, but systemic bias can certainly show up.
Every response you make to a biased algorithm is added to your profile, even if the addition is antithetical to your true profile.
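Here is a hypothetical sketch of that feedback loop (the category names and weights are made up for illustration): any interaction with an ad, even a curious click or a stray hover, is written back into the profile, and the profile then decides which ad gets served next.

```python
# Hypothetical sketch: one curious click writes a signal into the profile, and the
# profile then keeps selecting, and reinforcing, the same ad category.
from collections import Counter
import random

categories = ["high_interest_credit", "retirement_account", "travel"]
profile = Counter()                 # the platform's running score per ad category

def pick_ad(profile):
    """Serve the highest-scoring category; fall back to a random one for a blank profile."""
    if profile:
        return profile.most_common(1)[0][0]
    return random.choice(categories)

def register_interaction(profile, category, kind):
    """Clicks and even hovers both raise the category's score in the profile."""
    weight = {"click": 3, "hover": 1}[kind]
    profile[category] += weight

# One curious click on a high-interest credit-card ad...
register_interaction(profile, "high_interest_credit", "click")

# ...and every subsequent round serves, and further reinforces, the same category.
for _ in range(5):
    ad = pick_ad(profile)
    register_interaction(profile, ad, "hover")
    print(ad, dict(profile))
```

The loop never asks whether the first click reflected genuine interest; it only sees the pattern.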
And of course there is no way that any of us can know what our profile is, let alone recurate it.
So individuals and the system are unintentionally biased by algorithms―what do we do?
First of all, we don’t scrap the whole system.
Algorithms can make you biased, and as I showed in Part 1, data can lead you to a form of psychopathy.
But algorithms and data also improve our lives in countless other ways. They can cure diseases and control epidemics. They can improve the test scores of children from underserved communities.
Rockford, Illinois employed data and algorithms to end homelessness in their city.
They solved homelessness, and that is incredible.
So what do we do?
We tweak the system, and we tweak our own approach to it.
And we’ll do that in Part 3.
Stay tuned!
This article is Part 2 of a 3-part series — The Perils and Promise of Data
Part 1 of this series is here — 3 ways data can turn anyone into a psychopath, including you
Part 3 of this series — Coming Soon!
Jonathan Maas has a few books on Amazon, and you can contact him through Medium or Goodreads.com/JMaas.
Translated from: https://medium.com/predict/algorithms-can-leave-anyone-biased-including-you-f39cb6abd127