Big Data Business Study Notes: Learn the Business to Become a Great Data Scientist

Opinion

Many aspiring Data Scientists think that what they need to become a Data Scientist is:

  • Coding
  • Statistics
  • Math
  • Machine Learning
  • Deep Learning

And any other technical skills.

The above list is accurate; most of the qualifications you need as a Data Scientist right now are the ones listed above. This is unavoidable, as many job listings name these skills as prerequisites. Just look at the example of Data Scientist job requirements and preferences below.

[Image: example of Data Scientist job requirements and preferences, taken from indeed.com]

Most of the requirements sound technical: degree, coding, math, and stats. However, there is an underlying business understanding requirement that you might not notice at first in this job advertisement.

If you look closely, they require someone who has experience applying analytical methods to solve practical business problems. This implies that your everyday tasks would consist of solving business problems, which in turn means you need to understand what kind of business the company runs and how the process itself works.

You might ask, “Why do I need to understand it? Just create the machine learning model and the problem is solved, isn’t it?” Well, that line of thinking is dangerous, and I will explain why.

As a reminder, I would argue that what makes you great as a Data Scientist is not only how good your coding skills are, how well you understand statistical theory, or even how well you have mastered business understanding, but a combination of all of these.

Anybody, of course, may agree or disagree with my opinion, as I believe there is no single skill that makes you a great Data Scientist.

Data Scientist employment is hard. It is not easy to get into this field. With many applicants and many people with a similar set of skills, you need to stand out. Business understanding is the skill that would certainly separate you from all the other fish in the pond.

In my experience as a Data Scientist, no skill feels as underrated as business understanding. Early in my career, I even thought that you don’t need to understand the business. How wrong I was.

I am not ashamed to admit, though, that I did not consider the business aspect essential at first, because many data science courses and books did not even teach us about it.

So, why is it crucial to learn the business, and how does it impact your employment as a Data Scientist?

Just imagine this situation. You work in the data department of a food company whose main product is candy, and the company plans to release a new sour candy product. The company then asks the sales department to sell the product. Now, the sales department knows that the company has a data department and asks the data team to provide new leads for where they can sell the sour candy.

Before anybody complains that “This is not our job, we create machine learning models!” or “I work as a data scientist, not in the sales department”: no, this is precisely what data scientists do in a company; many projects involve working with another department to solve a company problem.

Back to our scenario: how do you correctly approach this problem? You might think, “Just create a machine learning model to generate the leads.” Yes, that is on the right track, but how exactly do you create the model? On what basis? Is the business question even viable to solve using a machine learning model?

You can’t just suddenly use a machine learning model, right? This is why business understanding is so crucial for a Data Scientist. You need to understand how the candy business works in more detail. Keep asking questions like:

  • “What kind of business question exactly do we want to solve?”
  • “Would we even need a machine learning model?”
  • “What kind of attributes are related to candy sales?”
  • “What are the candy selling strategies and practices within and outside of the company?”

And many more questions you could think of related to the business.

It is important to know what kind of business your company runs and everything related to it, because your work as a data scientist requires you to make sense of the data.

While it is easy to say that business understanding is an essential skill, it is not easy to gain.

Education is one route; for example, if your educational background is in communication, you might have a higher chance of standing out when applying for a data science position at a PR company than someone with a biology degree.

Work experience, though, quickly covers this. Work experience under another job title in a similar industry provides significant leverage, as you already understand the business process.

For a fresher, it might be a hard industry to break into, but being a fresher has many benefits as well. I remember Tyler Folkman’s LinkedIn post on why the industry should consider recent graduates, and I agree. A recent graduate could:

  1. Come with preparation
  2. Be hungry to learn about the business
  3. Make an impact

Freshers should be a target for companies that have established their data journeys. The company can teach them about the business more easily because freshers have no prior experience in the business world. In my opinion, never count out the freshers.

I will also tell you about my own experience. When I first got a data project, I was not thinking about the business at all and just tried to build the machine learning model. And how disastrous it turned out to be.

I presented the model to the related parties with hype in my head. My model results were good, I knew everything about the data, and I knew the theory behind the model I used. Easy peasy, right? So wrong. It turned out that the users did not care about the model I used. They were more interested in knowing whether I had already considered business approach “A”, or why I had used data that should not relate to the business at all. It ended with a discussion about my needing more business training.

It is embarrassing, but I am not ashamed at all to admit that it was my fault for not considering business understanding. I could be the best at model creation or statistics, but not knowing the business turned out to be a disaster. Since that day, I have tried to learn more about the business process itself, even before considering any of the technical things.

Conclusion

In my opinion, fresher or not, try to learn the business as much as possible.

Focus on one industry you feel interested in: finance, banking, credit, automotive, candy, oil, etc. Every single business has a different approach and strategy; you just need to focus on learning the industry you like.

Data scientist employment is hard. It is not easy to get into this field. With many applicants and many people with a similar set of skills, you need to stand out. Business understanding is the skill that will undoubtedly separate you from all the other fish in the pond.

Translated from: https://towardsdatascience.com/learn-the-business-to-become-a-great-data-scientist-635fa6029fb6

