How Good Do a Data Scientist’s Programming Skills Need to Be?
I have held the title of data scientist in two industries. I’ve interviewed for more than 30 additional data science positions. I’ve been the CTO of a data-centric startup. I’ve done many hours of data science consulting.
With that background, you will hopefully realize that I’m not a data denier. I’m a firm believer in the power of statistics, machine learning, and all the tools in a data scientist’s toolbox. I know that data science is a powerhouse field filled with amazing people who are changing the world.
That being said, many companies don’t need a data scientist.
No, that wasn’t strong enough. Let me try again.
The vast majority of companies that are looking for a data scientist don’t need one.
Of all the companies I’ve worked or interviewed with as a data scientist, I’d say 80% of them were looking for the wrong role.
Some of them just needed a data analyst. Others needed a data engineer or a data architect. The rest didn’t have a data need at all.
What problem are you looking to solve?
I always ask this question when someone is looking to hire me. Originally, I asked what they were looking to do with their data, but I’ve since realized that the answer to that latter question doesn’t matter. The focus needs to be on the problem, not the solution. Companies hire to solve problems.
Good companies don’t hire for a position because it’s trendy to have around. They hire because, for every dollar that employee costs them, they are getting more than a dollar in return. It’s that simple. It’s all about ROI.
All companies understand that when it comes to positions like accounting and sales because they know how ROI works for accounting or sales. They know what problem needs to be solved and they know who can do it.
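The dollar-in, dollar-out test described above can be written as a quick calculation. This is only a minimal sketch: the function names and the dollar figures are hypothetical, chosen for illustration, and real hiring decisions involve costs and returns that are much harder to measure.

```python
def roi(value_returned: float, cost: float) -> float:
    """Return on investment as a fraction of cost: (value - cost) / cost."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (value_returned - cost) / cost


def hire_pays_off(value_returned: float, cost: float) -> bool:
    """A hire pays off when every dollar of cost returns more than a dollar."""
    return roi(value_returned, cost) > 0


# Hypothetical figures, for illustration only.
print(roi(180_000, 150_000))            # positive: the hire returns more than it costs
print(hire_pays_off(120_000, 150_000))  # False: the hire costs more than it returns
```

The point of the sketch is that the calculation needs a value for the problem being solved. Without a defined problem, there is nothing to plug in for the return.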
But data confuses companies. It especially confuses older companies, but startups are not immune. We’ve all been told that there’s gold in them thar data.
And who doesn’t love a good gold rush?
Just like the gold rush of old, most people don’t know where to look for the gold, many of them have fallen for fool’s gold, and no matter how much a vein has been picked clean, people keep coming back looking for scraps.
The underlying issue is that companies have been told their data is valuable. And it might be. But whether packaged for sale or used internally, data is a part of a solution, and every solution’s value is determined by the cost of the problem it is solving.
Without a problem, a solution is just an idea. And, as I’ve mentioned in multiple previous posts, ideas are worthless.
Data rushes happen because companies have a solution (data) and they are looking for a problem to apply it to. It’s a completely backward approach. You don’t decide to use screws because you have a screwdriver handy. You decide to use a screwdriver because you need to tighten a screw.
Data is a resource. So why is data not treated like any other resource?
Data is inherently different from other resources in one important way.
Let’s look at oil, a pretty standard resource. Unless you are The Beverly Hillbillies, you don’t just find oil lying around in your backyard. If you have thousands of tons of oil, you have it because you planned to have it for a specific purpose. And once you use it for that purpose, it’s gone.
But companies have exabytes of data. Maybe they had it for a purpose. Maybe there was a regulatory requirement for them to keep it. Maybe it was just easier to keep than to throw away.
Whatever the reason, they have it now, and they want to use it. They just don’t know what to use it for. And they often assume data scientists are the answer. After all, data is right there in the title, and scientists are smart.
S-c-i-e-n-t-i-s-t is not how you spell engineer

Let me give these companies the benefit of the doubt and say they actually do have problems that their data could solve. That still doesn’t necessarily make hiring a data scientist the correct next step.
Data scientists solve puzzles. They take billions of pieces of data and turn them into a single, cohesive picture. But they can’t do that if you don’t give them all the pieces.
If your data streams into ten different systems that don’t talk to each other, you are setting your data scientist up for failure. You need someone who can bridge those systems, bringing the data into a single place. That’s the job of a data engineer, not a data scientist. Depending on the situation, you may also need data architecture, data modeling, and database administration.
If you really want to, you can find a data scientist who can handle everything from the engineering to the DB admin work. I’ve been that data scientist. But my rate was much higher than what the company would have paid to just hire the correct person for the job.
Why did they overpay? Because they didn’t yet understand the current status of their data or what a data scientist actually does.
Why did I take the job? Because I was too naive to know better.
Everyone would have been better off if the company had hired a data engineer, waited 6–12 months, then brought on a data scientist when they were fully prepared.
Ready? Have an aim? Hire!
Has your company identified problems that you need data science to solve?
Is your data in a state that a data scientist can work with?
If you answered both of these with a definitive ‘yes’, then you may need a data scientist. Congratulations, your company is doing things right. Pat yourselves on the back no more than three times, then go do some amazing things.
If you answered either question with a ‘no’ or a general look of confusion, then save your money and a data scientist’s sanity by taking down that job posting you just put up. Maybe replace it with a posting for a data engineer or data analyst. Or maybe just be happy not to have to go through the hiring process.
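The two questions above amount to a small decision rule, which can be sketched in code. The function name and the exact wording of each recommendation are my own assumptions for illustration. In particular, mapping "problem identified but data not ready" to a data engineer is one reading of the advice above, not a rule the questions themselves spell out.

```python
def hiring_recommendation(has_identified_problem: bool, data_is_ready: bool) -> str:
    """Map the two readiness questions to a rough hiring suggestion.

    The mapping is an illustrative assumption, not a definitive rule.
    """
    if has_identified_problem and data_is_ready:
        return "You may need a data scientist."
    if has_identified_problem:
        # A real problem exists, but the data is not yet in a workable state.
        return "Take down the posting; consider a data engineer first."
    # No identified problem: hiring anyone is premature.
    return "Take down the posting; define the problem before hiring."


print(hiring_recommendation(True, True))
print(hiring_recommendation(True, False))
print(hiring_recommendation(False, True))
```

Note that only one of the four possible answer combinations leads to hiring a data scientist, and even that one is hedged with "may".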
Not sure what you need? Talk to a data consultant before you waste your money.
Like this advice? Take 0.001% of the money you just saved and buy me a drink someday.
Translated from: https://medium.com/swlh/do-we-need-data-scientists-8d8e8062688a