What Is Data Science
No way. No freaking way to enter data science any time soon… That is exactly what I thought a year ago.
A little bit about my data science story: I was a complete beginner in the data science field, and exactly six months ago I was desperately looking to switch from digital marketing to data science. You may want to ask: why desperately? Well, because I became overconfident in my job-hunting abilities and resigned from my previous job without a backup. I started panicking during the last few days of my notice period. The sheer number of courses and tutorials available online, and the vast number of topics I had to cover to get started in data science, were overwhelming. They say time flies, and boy do I agree! I am already half a year into my first data science job, and I cannot wait to share everything I have learned with you. If you are currently in the same shoes as I was, keep reading for insights and motivation.
1. Practice more than you read:
I remember going through every single data science boot camp course available on Udemy and buying a couple of top-rated courses that covered Python, SQL, Tableau and machine learning. (Pro tip: don’t go for generic “Data Science boot camps”. These courses don’t cover important topics in depth. Instead, try tool-specific boot camps such as a Python boot camp, a SQL boot camp or a deep learning boot camp.) The courses were all detailed and honestly very helpful. But even after 50+ hours of lectures and many assignments, I was still someone with no data science experience. Even the basic analysis tasks in the first month of my job were difficult for me, and I struggled to meet deadlines.

Looking back, I feel that I focused more on learning and less on practicing. I listened to lecture after lecture, each covering a new topic, did some teeny tiny assignments and thought I was doing it all the right way. I see it very differently now. Learning should happen through practicing and implementing new ideas. That is when you make mistakes, observe new things, research how to code a solution in a better way and, you know, really learn. This certainly happened after I started my current job, because I had to work on new ideas every day and implement them. Trust me, that is when I picked up actual skills. If you are in the online course phase, spare some time to build projects and implement the topics you have learned.
2. Coding skills:

Most people who try to enter this field have a slight misconception that data science involves less coding than software engineering. There is a little bit of truth to it: if you take Python, which is widely used in data science, there are ready-made libraries for almost all types of algorithms and operations. Though these libraries are very helpful, there is only so much they can do. I, for one, thought that data science was all about data analysis, plots, model fitting, prediction and accuracy metrics. These things are of course a part of it, but software engineering is another huge part. For example, when you want to build a production-level product recommendation engine pipeline, you will have to work on many things: SQL scripts, data sync, training, tuning, prediction, evaluation frameworks, unit testing, logging, dashboards, an admin panel, model deployment, version control and so much more. All of this combined involves a hell of a lot of critical thinking and coding. This is the kind of stuff you will work on in the long run, or maybe even in your first few months! I am not saying that you need to know everything about coding, but some level of coding proficiency will be both necessary and useful.
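To make the software engineering side a bit more concrete, here is a minimal sketch of what one small, testable piece of a recommendation pipeline might look like. Everything in it is an assumption made for illustration: the function name `recommend_top_products`, the `(user_id, product_id)` input format and the popularity-based logic are hypothetical and not taken from any real project. It only shows how logging and a unit test sit next to the "data science" part.

```python
import logging
from collections import Counter

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("recommender")


def recommend_top_products(purchases, user_id, k=2):
    """Recommend the k most popular products the user has not bought yet.

    `purchases` is a list of (user_id, product_id) pairs; this input
    format is a hypothetical one chosen for the example.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    already_bought = {product for user, product in purchases if user == user_id}
    popularity = Counter(
        product for _, product in purchases if product not in already_bought
    )
    # Sort by descending count, then by product id so ties are deterministic.
    ranked = sorted(popularity.items(), key=lambda item: (-item[1], item[0]))
    recommendations = [product for product, _ in ranked[:k]]
    logger.info("user=%s recommendations=%s", user_id, recommendations)
    return recommendations


def test_recommend_excludes_owned_products():
    purchases = [
        ("a", "pen"), ("b", "pen"), ("b", "ink"), ("c", "ink"), ("c", "pad"),
    ]
    # User "a" already owns "pen", so only "ink" and "pad" can be suggested.
    assert recommend_top_products(purchases, "a", k=2) == ["ink", "pad"]


test_recommend_excludes_owned_products()
```

In a real pipeline the test would usually live in its own file and be run by a test runner, and the log lines would feed the dashboards mentioned above. The point of the sketch is only that the model logic is a small fraction of the code around it.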
3. No pressure to learn every single data science tool:
There are way too many data science tools on the market, and it can be quite confusing to figure out where to start. The best option is to learn one data-science-friendly programming language, one database tool and one visualization tool. This is a good way to begin, and it is the basic requirement for many entry-level roles. When you are just laying the foundation, don’t pressure yourself to learn too many tools. Instead, take things slowly. Understand the basics and explore topics in depth in whatever tool you learn. You will eventually pick up many more tools on the job because of project requirements, or while working on your passion projects.

I started with Python, SQL and Tableau when I was searching for a job. Nothing more. Now I also know how to work with several other tools such as Spark, HBase, Kibana, Dash, Elasticsearch and Logstash, and I am sure I will have to learn new ones in the coming days. The point is: learn a tool with a clear idea of how it will be useful for your requirements.
4. You are ready to take interviews:
Tell yourself that whenever you feel like skipping an interview call or meeting because your brain is telling you that you are not going to make it. I cannot count the number of times I learned something new while attending an interview, whether about the data science industry, a new product or just a concept. I am not suggesting that you attend interviews randomly just to learn stuff; that would be an obvious waste of time for the poor interviewer. But data science is a vague term, and so are the job requirements for every data science role. You might never feel ready if you want to tick every single job requirement before attending an interview.
The preparation phase can be a long one too. It depends on your learning speed and prior knowledge. It is very easy to get stuck in that phase because there are too many topics to cover. Set goals for your interview preparation and, as you achieve them, start looking for interview opportunities. Every time you fail an interview, you will discover an area you need to improve or a new market requirement you need to learn. And that, my friend, will help you in the next interviews.
5. Ideal companies to apply for data science roles
Usually, beginners are flexible about roles and companies when applying for interviews. But if you are wondering which type of company you should apply to for a data science role, the answer is completely subjective. Let us talk about product-based and service-based companies from a data science perspective. Service companies usually work on one-time data analyses or prototypes, whereas product companies do rigorous software development of which data analysis is just one part. Python, R, PowerPoint and Excel will do the job for you on most days in a service company, whereas a product company will want you to work with whatever tool is required to get the job done. Basically, product companies involve a lot of software engineering in addition to data analysis.
Product companies work on projects that improve their products by incorporating data science into them, or they build new data-based products such as product recommendation engines and AI-based chatbots, or they simply use analytics to make better decisions within the organization. Service companies work on analytics projects based purely on client requirements. So, like I said, it is up to your interests. Choose wisely!
6. Data Science can be frustrating:
Data-based problems are very interesting to work on, but some can be equally frustrating. One of the difficult aspects of your work will be simply waiting patiently for good results. Often you might not know whether you are going in the right direction. There are too many unknowns, and a lot of things in your project will require plain trial and error to arrive at an optimal solution. Like they say, it is all fun and games till you reach the hyper-parameter tuning part of your model :)
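For readers who have not met hyper-parameter tuning yet, the trial and error can be sketched in a few lines of plain Python. This is only an illustrative toy, not how any particular library does it: the model (fitting y = w * x by gradient descent), the data and the candidate values are all assumptions made up for the example. The idea is simply to train once per combination of settings and keep the one with the lowest validation error.

```python
from itertools import product


def train(xs, ys, learning_rate, epochs):
    """Fit y = w * x by gradient descent on the mean squared error."""
    w = 0.0
    n = len(xs)
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad
    return w


def mse(w, xs, ys):
    """Mean squared error of the model y = w * x on the given data."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def grid_search(train_set, valid_set, learning_rates, epoch_options):
    """Try every (learning_rate, epochs) pair and keep the best one."""
    best = None
    for lr, epochs in product(learning_rates, epoch_options):
        w = train(*train_set, lr, epochs)
        score = mse(w, *valid_set)
        if best is None or score < best["score"]:
            best = {"learning_rate": lr, "epochs": epochs, "w": w, "score": score}
    return best


# Toy data generated from y = 2 * x (hypothetical, for illustration only).
train_set = ([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
valid_set = ([4.0, 5.0], [8.0, 10.0])

best = grid_search(train_set, valid_set, [0.001, 0.01, 0.1], [10, 100])
print(best)
```

Even this toy needs six training runs for two settings with a handful of values each. With a real model, where each run can take minutes or hours, the waiting described above adds up quickly.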
Most of us do a proof of concept (POC) before implementing any solution. But sometimes even POCs fail to reveal the hiccups you might face during the actual task. For example, once at work we spent an entire month researching and implementing a solution for our pipeline. It eventually didn’t work out. We had to start all over again, and this caused a huge delay in a project that had supposedly been going well. The key takeaway from a couple of incidents like this is to always set clear goals and evaluate your POC thoroughly. And when you have been stuck at one point for too long, remember to try fast, fail fast, evaluate fast and try again fast. Being fast is super important for good progress.
7. Your storytelling skills will matter a lot:
You will most likely be dealing with customers from non-technical backgrounds. Your organization’s leaders may not be data scientists. Your own teammates might come from diverse backgrounds (pure mathematicians, API users and so on). These are the people who will recognize your work and add value to it.
It is therefore very important that you communicate your thoughts, ideas, analyses and results to your audience in an engaging and understandable way. I clearly remember struggling in my first team meeting with the CEO, where we had to explain the progress of our projects and discuss use cases and future AI goals. That is when it hit me that sticking to numbers and analytical skills alone is not enough. A good story explaining the analysis can interest your manager. A story explaining how a particular data science solution can solve a customer’s pain point can interest that customer. Different stories have different impacts on different people. Frame your story carefully with data science elements such as visualizations, dashboards and reports, and give it your all when you deliver it.
Final Thoughts:
Data science is no rocket science. If I can do it, then you can do it too! There is no better time than now to enter this fast-growing field. That being said, it definitely gets a little tough to keep up with the competition and with all the new things happening in the field. But what matters is that we learn, implement, make mistakes and grow consistently. Happy analyzing :)
Translated from: https://medium.com/swlh/7-things-you-must-know-if-youre-trying-to-enter-data-science-2a9a531750e0