Hadoop Will Die Out
Harvard Business Review marked the boom in Data Scientists with its famous 2012 article “Data Scientist: The Sexiest Job of the 21st Century”, followed by a decade of demand that supply could not meet. [3]
“..demand has raced ahead of supply. Indeed, the shortage of data scientists is becoming a serious constraint in some sectors.”
McKinsey & Co just published an article (Aug 2020) suggesting we rethink how many Data Scientists we really need in light of newer automation technologies (AutoML).[4]
“Over the long term, purely technical data scientists will still be needed, but simply far fewer than most currently predict.”

In every boom cycle there is a shortage of talent and an influx of impostors or simply less-qualified people (e.g., in the dot-com/Y2K era, if you could spell Java you were a software engineer). As domains mature, tools and automation weed out those who aren’t really qualified or aren’t doing high-value work. Data Science is no different.
The Dirty Secret

Unfortunately, Data Science secrets are not as exciting as celebrity sex secrets. Behind this “sexy” job is the large amount of grunt work that Data Science projects require, including:
- Data sourcing, validation and cleanup
- Trying feature combinations and engineered features
- Testing different models and model parameters
Most agree that data-prep work is 80% of any ML/DS project [1], which has given rise to the Data Engineer specialty [2]. The remaining time is spent trying out features and testing models to squeeze out a few percentage points of accuracy. It simply takes a lot of time, and while experience, intuition and luck allow a scientist to narrow down the scenarios, sometimes the best solution requires trying many extra atypical (almost random) scenarios. One solution is automation: spending brute-force compute cycles through a new breed of tools named AutoML.
AutoML: Is It Like Skynet?
Automated Machine Learning (AutoML) is software that automates much of the repetitive work for you in an organized way. (Get a demo of H2O or DataRobot and see for yourself.) Feed it the data, set the goal, and take a nap while it grinds through iterations of features, models, and parameters. While it lacks domain expertise and precision, it makes up for that with brute force and superb bookkeeping and reporting (with some logic and heuristics, of course).
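To make the "brute force plus bookkeeping" idea concrete, here is a minimal sketch in plain Python. It is not how H2O or DataRobot work internally; the toy dataset, the two tiny models, and every function name below are assumptions invented for illustration. It simply tries every feature subset with every model and parameter setting, scores each on a holdout set, and keeps a sorted leaderboard.

```python
import itertools
import random


def make_toy_data(n=200, seed=0):
    """Hypothetical dataset: the label depends on features 0 and 1; feature 2 is noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = [rng.uniform(-1, 1) for _ in range(3)]
        rows.append((x, 1 if x[0] + x[1] > 0 else 0))
    return rows


def knn_predict(train, x, cols, k):
    """Majority vote among the k nearest training rows, measured only on `cols`."""
    nearest = sorted(
        train, key=lambda r: sum((r[0][c] - x[c]) ** 2 for c in cols)
    )[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0


def centroid_predict(train, x, cols, _param):
    """Pick the class whose mean point (over `cols`) is closest to x."""
    best_label, best_dist = None, float("inf")
    for label in (0, 1):
        members = [r[0] for r in train if r[1] == label]
        dist = sum(
            (sum(m[c] for m in members) / len(members) - x[c]) ** 2 for c in cols
        )
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def brute_force_search(rows, holdout=50):
    """Try every feature subset x model x parameter; return a sorted leaderboard."""
    train, test = rows[:-holdout], rows[-holdout:]
    candidates = [("knn", knn_predict, k) for k in (1, 5, 15)]
    candidates.append(("centroid", centroid_predict, None))
    n_features = len(rows[0][0])
    leaderboard = []
    for size in range(1, n_features + 1):
        for cols in itertools.combinations(range(n_features), size):
            for name, predict, param in candidates:
                hits = sum(predict(train, x, cols, param) == y for x, y in test)
                # Bookkeeping: record every trial, not just the winner.
                leaderboard.append({
                    "features": cols,
                    "model": name,
                    "param": param,
                    "accuracy": hits / len(test),
                })
    leaderboard.sort(key=lambda trial: trial["accuracy"], reverse=True)
    return leaderboard


if __name__ == "__main__":
    for trial in brute_force_search(make_toy_data())[:3]:
        print(trial)
```

Even this toy version runs 28 trials for three features and four model settings, and the count grows exponentially with the number of features. That is why real tools layer heuristics on top of raw enumeration, and why a human who knows which features matter can still save a lot of compute.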
KDNuggets polled its readers five years ago on when, and whether, AutoML would replace Scientists [6]; recent thinking is that, for some of us, that time is very soon.

Not everyone agrees, of course.
Rachel Thomas of Fast.AI: “There are frequent media headlines about both the scarcity of machine learning talent and about the promises of companies claiming their products automate machine learning and eliminate the need for ML expertise altogether.” [7]
Dr. Thomas seems to feel AutoML is misconstrued and subject to a fair amount of hype. She makes compelling points that help us understand the full ML cycle and what AutoML is and isn’t. It does not replace the work of experts, but it does greatly augment their work. Not yet Skynet, but give it some time...
So Is My Job Going Away?
Google Brain co-founder Andrew Ng often voices concern about imminent job losses caused by AI and ML [5]; however, most analysis has focused on operational and blue-collar work. What about our cushy Data Science jobs? McKinsey describes the possible future awaiting us:

The bright side is that Data Scientists are not being fully replaced (graphic shows 29% … ), but let’s focus on McKinsey’s point: rethink the number and skillset of scientists needed. The number of scientists per project may drop as you add AutoML to your team (bots like TARS, R2D2 or HAL), but most research still suggests that aggregate demand for humans (scientists) will continue to increase for at least the next five years.
The bulk of online articles [9] make it clear that Data Scientists are not dead after all. But most agree AutoML has come of age and is already changing the makeup of projects and staffing. We all need to evolve, and as a Data Scientist you need to learn to leverage AutoML and related tech improvements or risk falling behind.
Automation is a good thing: we can focus on higher-value work and eliminate boring and repetitive tasks (albeit the boring, repetitive work paid pretty well …). I think we know it makes sense: why pay us when they can pay a cheaper robot? So next time you’re on a project, ask yourself: am I doing expert Data Scientist work, am I an impostor, or are my days numbered?
“Will the real data scientist please stand up?”
The net takeaway: the future of DS/ML is bright, but you need to embrace change or you’ll go from Data Scientist to Dead Scientist. “Resistance is Futile”, but in this case assimilating will pay off.
References and Inspirations
[1] Ruiz, “The 80/20 data science dilemma”: https://www.infoworld.com/article/3228245/the-80-20-data-science-dilemma.html
[2] Angelov, “Rise of the Data Engineer”: https://towardsdatasciencte.com/the-rise-of-the-data-strategist-2402abd62866?_branch_match_id=764068755630717009
[3] HBR’s “Sexiest Job” article: https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century
[4] McKinsey on Rethinking AI Talent: https://www.mckinsey.com/business-functions/mckinsey-analytics/our-insights/rethinking-ai-talent-strategy-as-automated-machine-learning-comes-of-age
[5] Andrew Ng’s thoughts on Jobs and AI: https://www.youtube.com/watch?v=aU4RQD--Lec
[6] Looking back at the 2015 Poll on AutoML: https://www.kdnuggets.com/2020/03/poll-automl-replace-data-scientists-results.html
[7] FastAI’s Rachel Thomas on the AutoML hype, what ML Scientists do and what AutoML can do: https://www.fast.ai/2018/07/12/auto-ml-1/
[8] Various references to Sci-Fi AI/robots: TARS from Interstellar, HAL from 2001, Borg assimilation from Star Trek, and of course Terminator’s Skynet.
[9] Various articles on AutoML vs. humans from KDNuggets, Wired, and Medium.
Translated from: https://towardsdatascience.com/data-scientists-adapt-or-die-2f009ebe4935