7 Things You Must Know If You're Trying to Enter Data Science

No way. No freaking way am I getting into data science any time soon… That is exactly what I thought a year back.

A little bit about my data science story: I was a complete beginner in the data science field, and exactly 6 months back I was desperately looking to switch from digital marketing to data science. I assume you may want to ask: why desperately? Well, because I became overconfident in my job-hunting abilities and resigned from my previous job without a backup. I started panicking during the last few days of my notice period. The sheer number of courses and tutorials available online, and the vast number of topics I had to cover to get started in data science, were overwhelming. They say time flies, and boy do I agree! I am already half a year into my first data science job, and I cannot wait to share what I have learned and experienced with you. If you are currently in the same shoes as I was, keep reading for insights and motivation.

1. Practice more than you read:

I remember going through every single data science boot camp course available on Udemy and buying a couple of top-rated courses that covered Python, SQL, Tableau and machine learning. (Pro tip: don't go for generic "data science boot camps". These courses don't cover important topics in depth. Instead, try tool-specific boot camps such as a Python boot camp, a SQL boot camp or a deep learning boot camp.) The courses were all detailed and honestly very helpful. But even after 50+ hours of lectures and many assignments, I was still someone with no data science experience. Even the basic analysis tasks in the first month of my job were difficult for me. I was really struggling to meet deadlines.

Looking back, I feel that I focused more on learning and less on practicing. I listened to all the lectures, each of which covered new topics, did some teeny tiny assignments and thought I was doing it all the right way. I think of it very differently now. Learning should happen through practicing and implementing new ideas. That is when you make mistakes, observe new things, research how to code the solution in a better way and, you know, really learn. This certainly happened after I started my latest job, as I had to work on new ideas every day and implement them. Trust me, that is when I picked up actual skills. If you are in the online course phase, spare some time to build projects and implement the topics you have learned.

2. Coding skills:

Most people who try to enter this field have a slight misconception that data science involves relatively less coding than software engineering. There is a little bit of truth to it: if you take Python, which is the most widely used language in data science, there are ready-made libraries for almost all types of algorithms and operations. Though these libraries are very helpful, there is only so much they can do. I for one thought that data science was all about data analysis, plots, model fitting, prediction and accuracy metrics. These things are of course a part of it, but software engineering is another huge part. For example, when you want to build a production-level product recommendation engine pipeline, you will have to work on many things: SQL scripts, data sync, training, tuning, prediction, evaluation frameworks, unit testing, logging, dashboards, an admin panel, model deployment, version control and so much more. All of this combined involves a hell of a lot of critical thinking and coding. This is the kind of stuff you will work on in the long run, or maybe even in your first few months! I am not saying that you need to know everything about coding, but some level of coding proficiency will be needed and will be useful to you.
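To make the "software engineering around the model" point concrete, here is a minimal sketch in Python of what unit testing and logging might look like around one small piece of a recommendation pipeline. Everything in it is an assumption made for illustration: the function `recommend_top_n`, the logger name and the popularity-based logic are hypothetical and far simpler than a production engine.

```python
import logging
import unittest
from collections import Counter

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("recommender")  # hypothetical logger name


def recommend_top_n(purchases, n=2):
    """Return the n most frequently purchased product IDs.

    A deliberately naive, popularity-based recommender. `purchases` is a
    list of product IDs (a made-up input format for this sketch).
    """
    if n <= 0:
        raise ValueError("n must be positive")
    counts = Counter(purchases)
    logger.info("Ranking %d distinct products", len(counts))
    return [product for product, _ in counts.most_common(n)]


class RecommendTopNTest(unittest.TestCase):
    """Unit tests that pin down the behavior the rest of the pipeline relies on."""

    def test_returns_most_popular_first(self):
        purchases = ["a", "b", "a", "c", "a", "b"]
        self.assertEqual(recommend_top_n(purchases, n=2), ["a", "b"])

    def test_rejects_non_positive_n(self):
        with self.assertRaises(ValueError):
            recommend_top_n(["a"], n=0)
```

Saved as a module, the tests could be run with `python -m unittest`. The recommender itself is only a few lines; most of the code is the checks and logging around it, which is roughly the proportion you can expect in production work.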

3. No pressure to learn every single data science tool:

There are way too many data science tools on the market, and it can be quite confusing to figure out where to start. The best option is to learn one data-science-friendly programming language, one database tool and one visualization tool. This is a good way to begin and is the basic requirement for many entry-level roles. When you are just laying the foundation, don't pressure yourself to learn too many tools. Instead, take things slowly. Understand the basics and explore topics in depth in whatever tool you learn. You will eventually learn many more tools on the job because of project requirements, or while working on your passion projects.

I started with Python, SQL and Tableau when I was searching for a job. Nothing more. Now I also know how to work with a few other tools like Spark, HBase, Kibana, Dash, Elasticsearch and Logstash. I am sure I will have to learn new tools in the coming days. The point is: learn a tool with utmost clarity about how it will be useful for your requirement.

4. You are ready to take interviews:

Tell that to yourself whenever you feel like skipping an interview call or meeting because your brain is telling you that you are not going to make it. I have lost count of the number of times I learned something new while attending an interview, whether about the data science industry, a new product or just a concept. I am not suggesting that you attend interviews randomly to learn stuff; that would be an obvious waste of time for the poor interviewer. But data science is a vague term, and so are the job requirements for every data science role. You might never feel ready if you want to tick every single job requirement before attending an interview.

The preparation phase can be a long one too. It depends on your learning speed and prior knowledge. It is very easy to get stuck in that phase because there are too many topics to cover. Set goals for your interview preparation and, as you achieve those goals, start looking for interview opportunities. Every time you fail an interview, you will discover an area you need to improve or a new market requirement you need to learn. And that, my friend, will help you in the next interviews.

5. Ideal companies to apply to for data science roles:

Usually, beginners are flexible about roles and companies when applying for interviews. But if you are wondering what type of company you should apply to for a data science role, the answer is completely subjective. Let us talk about product-based and service-based companies from a data science perspective. Service companies usually work on one-time data analyses or prototypes, whereas product companies involve rigorous software development, of which data analysis is just one part. Python, R, PowerPoint and Excel will do the job for you on most days in service companies, whereas product companies will want you to work with whatever tool is required to do the job. Basically, product companies will involve a lot of software engineering in addition to data analysis.

Product companies work on projects that improve their existing products by incorporating data science, build new data-based products such as product recommendation engines and AI-based chatbots, or simply use analytics to make better decisions within the organization. Service companies work on analytics projects purely based on client requirements. So, like I said, it is up to your interests. Choose wisely!

6. Data Science can be frustrating:

Data-based problems are very interesting to work on, but some can be equally frustrating. One of the difficult aspects of your work will be patiently waiting for good results. Often you might not know whether you are going in the right direction. There are too many unknowns, and a lot of things in your project will require plain trial and error to arrive at an optimal solution. Like they say, it is all fun and games till you reach the hyper-parameter tuning part of your model :)
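As a rough illustration of what that trial and error can look like, here is a minimal sketch of a grid search in plain Python. The article does not name a particular tuning method, so grid search is my own choice here, and the toy data, the `fit_slope` helper and the candidate values are all made up for the example.

```python
from itertools import product


def fit_slope(xs, ys, learning_rate, epochs):
    """Fit y = w * x by plain gradient descent on mean squared error."""
    w = 0.0
    n = len(xs)
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad
    return w


def mse(w, xs, ys):
    """Mean squared error of the model y = w * x on the given points."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Toy data roughly following y = 3x (invented for illustration).
train_x, train_y = [1.0, 2.0, 3.0, 4.0], [3.1, 5.9, 9.2, 11.8]
valid_x, valid_y = [5.0, 6.0], [15.1, 17.9]

# Hypothetical hyper-parameter candidates to try.
grid = {"learning_rate": [0.001, 0.01, 0.05], "epochs": [10, 100]}

# Try every combination and record its error on the held-out validation data.
results = []
for lr, epochs in product(grid["learning_rate"], grid["epochs"]):
    w = fit_slope(train_x, train_y, lr, epochs)
    results.append((mse(w, valid_x, valid_y), lr, epochs))

best_error, best_lr, best_epochs = min(results)
print(f"best: learning_rate={best_lr}, epochs={best_epochs}, "
      f"validation MSE={best_error:.4f}")
```

Even on six combinations, some settings barely learn anything while others fit well, and nothing tells you which in advance. On a real model each trial can take minutes or hours, which is where the waiting and the frustration come from.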

Most of us do a proof of concept (POC) before implementing any solution. But sometimes even POCs fail to reveal certain hiccups you might face during the actual task. For example, once at work, we spent an entire month researching and implementing a solution for our pipeline. It eventually didn't work out. We had to start all over again, and this caused a huge delay in a project that had supposedly been performing well. The key takeaway from a couple of incidents like this is: always set clear goals, evaluate your POC thoroughly and, when you are stuck at a point for too long, remember to try fast, fail fast, evaluate fast and try again fast. Being fast is super important for good progress.

7. Your storytelling skills will matter a lot:

You will most likely be dealing with customers from non-technical backgrounds. Your organization's leaders may not be data scientists. Your own teammates might come from diverse backgrounds (pure mathematicians, API users and so on). These are the people who will recognize your work and add value to it.

It is so important that you communicate your thoughts, ideas, analyses and results to your audience in an interactive and understandable way. I clearly remember struggling in my first team meeting with the CEO, where we had to explain the progress of our projects and discuss use cases and future AI goals. That is when it hit me that sticking to numbers and analytical skills alone is not enough. A good story explaining the analysis can interest your manager. A story explaining how a particular data science solution can solve a pain point can interest your customer. Different stories have different impacts on different people. Frame your story carefully with data science elements like visualizations, dashboards and reports, and put your everything into it when you deliver it.

Final Thoughts:

Data science is no rocket science. If I can do it, then you can do it too! There is no better time than now to enter this fast-growing field. That being said, it definitely gets a little tough to keep up with all the new things happening in this field and with the competition. But what matters is that we learn, implement, make mistakes and grow consistently. Happy analyzing :)

Original article: https://medium.com/swlh/7-things-you-must-know-if-youre-trying-to-enter-data-science-2a9a531750e0
