Four Lessons for Data Scientists from the UK's A-Levels Algorithm Debacle

In the UK, families, educators, and government officials are in an uproar about the effects of a new algorithm for scoring “A-levels,” the advanced level qualifications used to evaluate students’ knowledge of specific subjects in preparation for university study.

A-level courses usually culminate with exams conducted in testing centers. Because of COVID-19, this year’s exams were canceled. In lieu of those exams and their decisive scores, Ofqual, the government agency responsible for scoring students’ work in A-level classes, opted to use a new algorithm to grade students’ work by applying statistical models of school performance from earlier years. Teachers had already graded the students’ work as part of their coursework, but the algorithm overrode those grades, dropping final scores a full letter for 36% of entries and two full letters for 3% of entries. For thousands of students, A’s became B’s, B’s became C’s, and in a few cases, B’s became D’s. Some students failed their classes, because the algorithm determined that some students must fail. Meanwhile, 5% of well-off students attending private schools saw their scores increase.

The algorithm’s grading scheme affected disadvantaged students the most. As The Guardian notes:

“Pupils from disadvantaged backgrounds have been worst hit by the controversial standardisation process used to award A-level grades in England this year, while pupils at private schools benefited the most. Private schools increased the proportion of students achieving top grades — A* and A — twice as much as pupils at comprehensives. . . . Pupils in lower socioeconomic backgrounds were most likely to have the grades proposed by their teachers overruled, while those in wealthier areas were less likely to be downgraded, according to the analysis.”

The lower grades led some universities and medical schools to revoke students' acceptance letters. Affected students were crushed.

Now many universities are reversing those decisions, and the government is performing a “U-turn,” accepting teachers’ grades as the final A-level scores. Still not impressed, a senior Tory MP is now calling for the abolition of Ofqual itself.

Lessons for Data Scientists

Here are four lessons this debacle offers data scientists and data engineers.

1. If the results seem odd, double-check your algorithm.

If you’re developing an algorithm that lowers results for a significant number of entries — let alone 40% of entries — it’s time to re-evaluate your algorithm, especially if the results affect people in life-altering ways, such as denying them a mortgage or affecting which university they can attend.

Again, from The Guardian:

“Ofqual instead chose to focus on its own measure of accuracy — whether it was right ‘within a grade’. . . . But as any A-level student will tell you, accuracy ‘within a grade’ is meaningless. Ofqual may mark itself highly if it gives an A student a B, but for that student, the difference is life-changing.”

Keep your eyes open for shifts in data patterns, and understand what constitutes a significant change for the data science use case you’re working on.
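
One way to make such a check concrete is to compare the algorithm's output against a trusted baseline before release. The sketch below is only an illustration, not Ofqual's method: the function names, the grade scale ordering, and the 5% review threshold are all assumptions chosen for the example. It reports both exact agreement and the "within a grade" measure, so the gap between the two is visible.

```python
# Grades ordered from best to worst; the list index serves as a numeric rank.
# (Assumed scale for illustration.)
GRADES = ["A*", "A", "B", "C", "D", "E", "U"]
RANK = {grade: i for i, grade in enumerate(GRADES)}


def grade_shift_summary(teacher, model):
    """Summarise how model-assigned grades differ from teacher-assigned grades.

    A positive shift means the model grade is lower (worse) than the teacher's.
    """
    if len(teacher) != len(model):
        raise ValueError("teacher and model must have the same length")
    if not teacher:
        raise ValueError("at least one entry is required")

    shifts = [RANK[m] - RANK[t] for t, m in zip(teacher, model)]
    n = len(shifts)
    return {
        "exact_match": sum(1 for s in shifts if s == 0) / n,
        "within_one_grade": sum(1 for s in shifts if abs(s) <= 1) / n,
        "downgraded": sum(1 for s in shifts if s > 0) / n,
        "upgraded": sum(1 for s in shifts if s < 0) / n,
    }


def needs_review(summary, max_downgraded=0.05):
    """Flag the run when the downgraded share exceeds a hypothetical threshold."""
    return summary["downgraded"] > max_downgraded


if __name__ == "__main__":
    # Made-up example data, not real results.
    teacher = ["A", "A", "B", "B", "C", "A*", "B", "C", "D", "A"]
    model = ["B", "A", "C", "B", "C", "A*", "D", "C", "D", "A"]
    summary = grade_shift_summary(teacher, model)
    print(summary)
    print("needs review:", needs_review(summary))
```

On the made-up data, "within one grade" looks reassuring while the downgraded share does not, which is the distinction the Guardian quote draws. The right threshold depends on the use case and on what counts as a significant change for the people affected.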

2. If the results seem biased, triple-check your algorithm.

It’s one thing to produce unexpected results. It’s another thing to produce unexpected results that favor the wealthy and disadvantage everyone else. There’s a growing concern among data scientists and the public about the effects of bias in data science algorithms. If results are not only unexpected but clearly biased against an economic or racial cohort, the algorithm should be re-examined and corrected.
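
A simple first check for this kind of bias is to break the outcome down by cohort and compare rates. The following is a minimal sketch under stated assumptions: the record layout `(group, teacher_grade, model_grade)`, the group labels, and the helper names are hypothetical, and a raw gap in rates is only a starting point for investigation, not a full fairness analysis.

```python
from collections import defaultdict

# Grades ordered from best to worst (assumed scale for illustration).
GRADES = ["A*", "A", "B", "C", "D", "E", "U"]
RANK = {grade: i for i, grade in enumerate(GRADES)}


def downgrade_rate_by_group(records):
    """Return the share of downgraded entries for each group.

    records: iterable of (group, teacher_grade, model_grade) tuples.
    """
    totals = defaultdict(int)
    downgraded = defaultdict(int)
    for group, teacher, model in records:
        totals[group] += 1
        if RANK[model] > RANK[teacher]:  # model grade is worse than teacher's
            downgraded[group] += 1
    return {group: downgraded[group] / totals[group] for group in totals}


def max_rate_gap(rates):
    """Largest difference in downgrade rate between any two groups."""
    if not rates:
        raise ValueError("no groups to compare")
    return max(rates.values()) - min(rates.values())


if __name__ == "__main__":
    # Made-up example records, not real results.
    records = [
        ("comprehensive", "A", "B"),
        ("comprehensive", "B", "C"),
        ("comprehensive", "B", "B"),
        ("comprehensive", "C", "C"),
        ("private", "A", "A"),
        ("private", "A", "A"),
        ("private", "B", "C"),
        ("private", "A*", "A*"),
    ]
    rates = downgrade_rate_by_group(records)
    print(rates)
    print("largest gap:", max_rate_gap(rates))
```

A large gap between cohorts does not by itself prove the algorithm is unfair, but it is a clear signal that the model and its inputs deserve a closer look before results are published.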

3. Whenever possible, get help from experts.

In April 2020, the Royal Statistical Society (RSS), a charity that promotes statistics for the common good, offered Ofqual the assistance of two of its fellows: Guy Nason, professor of statistics at Imperial College London, and Paula Williamson, professor of medical statistics at the University of Liverpool. But Ofqual would accept their assistance only if they agreed to sign a five-year non-disclosure agreement. The professors understandably refused, so Ofqual ended up applying its scoring algorithm without their guidance.

Many projects can benefit from a fresh perspective and outside expertise. If you can get second or third opinions, do so.

4. Be transparent.

It’s troubling that a government agency would try to keep its grading algorithm secret — especially when that algorithm determines which students will end up attending which universities. One can’t help but wonder if Ofqual realized that its algorithm was biased and wished to conceal the details.

If data scientists want the public to trust the results of their algorithm, then it’s best to be open about how that algorithm works.

As the leadership of the RSS wrote in a letter to the Office for Statistics Regulation on August 14, 2020:

“One issue underpinning trustworthiness of statistics is their quality and accuracy, which is why we have summarised some of our technical concerns. But another element in trustworthiness is the transparency with which the statistics have been set out and considered, and the extent to which they meet public need.”

Transparency matters. People need to be able to understand how criteria are evaluated and how decisions are made. Critically, transparent discussions of algorithms should take place before analytical results are shared with the public. Transparency should help guide decision-making, not excuse it.

The Importance of Transparency, Fairness, and Ethics

Ultimately, data science involves more than statistics. It also requires ethics, an open mind, and a clear understanding of the results that algorithms can have on people’s lives.

Let’s close with these words from the RSS:

“The use of statistics for public good is based only partly on technical statistical issues. Some statistics are technically bad, wrong or worse than others because of the way that data are gathered, or the statistical modelling that takes place. But in many cases, statistics or statistical models are inadequate for the weight being put on them in decision-making, or embed various other judgements that need to be clear. . . . So while we continue to have concerns about various technical decisions made by the qualification regulators, we also believe that having a more open discussion about this well before individual results were announced would have resulted in more trust in, and more trustworthy, statistical choices, in part because there would have been greater understanding of the underlying principles being applied and more detailed justifications of them.”

They point out that fairness is more than a matter of statistics:

“‘Fairness’ is not of course a statistical concept. Different and reasonable people will have different judgements about what is ‘fair’, both in general and about this particular issue. . . . But a statistical procedure should be capable of being judged as ‘fair’ or ‘reasonable’ in advance of its being used or knowing which individuals may be affected.”

The importance of attention to detail, of openness to expert opinion, of transparency, and of a keen sense of what’s fair and how data science results affect real people — these are the lessons that data scientists can take away from the UK’s A-levels debacle.

On this occasion, even the manner of testing itself proves to be educational.

Translated from: https://medium.com/data-culpa/four-lessons-for-data-scientists-from-the-uks-a-levels-algorithm-debacle-e0e7ea41bd59
