Lessons from Learning Data Science
When you try to learn anything on your own, it is easy to lose motivation and get thrown off track.
In this article, I will share some tips that helped me stay focused on my data science journey.
The Problem

There are just too many resources available online.
Data science is a very deep field, with branches in statistics, mathematics, programming, and development.
Because of this, it is very easy to get sidetracked during the learning process.
There are so many online courses that promise to make you a data scientist in three months, and many students end up in tutorial hell.
However, taking ten of these online courses will not make you a data scientist. You will need to hone your skills in each of these areas, which means that you have to create personal projects and read more.
All of this takes time.
If you are working a full-time job or are a university student (I do both), you will need to find a way to manage your study time. If you don’t do this well, you will end up frustrated and eventually give up on trying to learn.
The Solution?
Have an End Goal

After trial and error, I have found that having an end goal really works for me.
I always make it a point to learn something new, and give myself a time frame to do it.
For example
I want to enhance my data collection skills and learn web scraping. I give myself one day to learn the basics of web scraping.
Then, I will allocate two days to create a complete web-scraping project.
In three days, I will have learnt something new: how to scrape data from the Internet. This will also deepen my knowledge of Python libraries.
When I didn’t give myself an end goal like this, I found my focus constantly shifting.
I used to get really excited about starting new things, but never ended up completing any of them. This made me feel like I wasn’t actually learning anything, and led to a lot of frustration.
Time Management
When I first started learning data science, I could spend around eight hours a day just studying.
However, university classes have started now. I am also doing a data science internship, which takes up a large portion of my time.
Even with all this, I still make it a point to put aside time to study and learn new things.
I like to put aside at least two hours a day on weekdays, and four hours a day on weekends, to study.
However, I found that I tend to waste a lot of this time because my focus jumps easily from one thing to another.
To prevent that from happening, and to make sure I’m actually getting things done every day, I use a Trello board to keep track of my tasks.
Here is an example, my Trello board for today:

I strongly suggest you create a Trello board, or simply write down a list of things you have to get done each day.
You might not end up finishing all of it, but it will help you keep track of how much you’ve achieved each day.
Remember, you don’t have to put too much pressure on yourself every day and burn yourself out. Even if you got one thing done today that you didn’t yesterday, it is progress.
Focus on Learning the Technique
I mention this a lot, but I think this point needs reiterating. Always focus on learning a technique, rather than the tools you can use to achieve it.
Let’s go back to the web scraping example.
There are many different languages that can be used to scrape the web. Even in Python, there is a large variety of libraries that can do the job: BeautifulSoup, Scrapy, Selenium, and others.
More important than the library you use, however, is the technique of web scraping. Once you learn the technique, you can quickly learn different tools to get the job done.
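To make the distinction concrete, here is a minimal sketch of the core scraping technique (parse the HTML, find the elements you care about, pull out their data) written with only Python’s standard library. The sample HTML, the `LinkCollector` class, and the `extract_links` helper are all made up for illustration; a real project would download the page over HTTP and would likely use one of the libraries named above.

```python
from html.parser import HTMLParser

# Hypothetical page content; a real scraper would download this over HTTP.
SAMPLE_HTML = """
<html><body>
  <h1>Top songs</h1>
  <a href="/song/1">First Song</a>
  <a href="/song/2">Second Song</a>
  <p>No link here</p>
</body></html>
"""


class LinkCollector(HTMLParser):
    """Collects (href, link text) pairs from every <a href=...> tag."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._current_href = None  # href of the <a> tag we are inside, if any
        self._text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._current_href = dict(attrs).get("href")
            self._text_parts = []

    def handle_data(self, data):
        # Only keep text that appears inside a link.
        if self._current_href is not None:
            self._text_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current_href is not None:
            text = "".join(self._text_parts).strip()
            self.links.append((self._current_href, text))
            self._current_href = None


def extract_links(html):
    """Parse an HTML string and return the links found in it."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


print(extract_links(SAMPLE_HTML))
```

The same three steps carry over to any scraping library; only the function names change.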
After learning the technique, make sure to apply it.
Create projects using the techniques you learnt. Use what you learnt in a variety of real-life scenarios, since this is where you will learn the most. Just following tutorials won’t get you very far, since you won’t really learn any topic in depth.
Learning is a Joy: Don’t Fear It

As mentioned above, data science is a very large field with branches in mathematics, statistics, and programming.
Learning even one of these topics can be daunting. There is just so much to know. People dedicate their entire lives to learning these individual topics.
As a beginner data scientist, it is easy to get overwhelmed by the sheer amount of things you need to know.
This can lead to anxiety, and the fear that you are never going to reach your end goal of learning data science.
To get over this fear, you first need to embrace the learning curve. Remember that everybody started somewhere.
Break down your end goal of learning data science into smaller chunks. Create a list of daily goals: things that you want to know by the end of each day.
When you do this, you will realize that you are making progress and learning something new every day. This will motivate you to continue your data science learning journey.
Breaking Down Large Tasks
If you have read any of my previous articles, you know that I always advise aspiring data scientists to create projects. Creating personal projects helps your resume stand out and shows your interest in the subject.
I always get questions from aspiring data scientists on getting started with data science projects, such as this one:
“How can I get started with creating data science projects? Online courses only teach us concepts. How do we apply these concepts in real-life scenarios and create an end-to-end project?”
If you have the same question, I understand exactly what you are going through.
I remember completing courses in statistics, data science, and programming, and thinking to myself: “I am now ready to start my own project.”
I was excited, and ready to apply what I learnt to real-life situations!
However, I didn’t know where to start. I searched for sample data science projects, and found some really amazing stuff on the Internet.
I saw people deploy complex machine learning algorithms with an interactive interface. I saw systems like a “fake news detector”: all you had to do was enter a URL, and it would predict whether or not the news was fake.
I was impressed, but wondered where they learnt how to do those things. I called it “real world stuff,” and was disappointed that there was no online course to teach me how to do them.
Over time, I learnt that the only way to start these projects was to just start.
Here are the steps I take when creating a data science project:
Have an idea: Come up with an idea first. Choose something that excites you, such as a Spotify music analysis project.
Break it down: Think of the different steps you will have to take to complete the project. In this case, they would be:
- Data Collection: Think about where you are going to get the data from. You might need to build a web scraper, or use an API. Google is your best friend in this case, and you will learn these skills along the way. Give yourself a deadline to complete this task.
- Data Analysis: Next, you will have to come up with a way to analyze this data. If it was scraped from the web, it is going to be messy. You will need to clean it first, and store it in a data frame.
- Data Visualization: Real-world datasets are very different from the kinds of data handed to you on Kaggle. Visualizing this data usually takes a bit of work. You will need to change variable types, and play around with different tools to get your desired result.
- Deployment: If you choose to deploy your project, you will need some knowledge of web development. There are a lot of tutorials out there on creating a user interface for your models and deploying them, and you will figure it out along the way.
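As a rough illustration of the cleaning and type-conversion work described in the analysis and visualization steps, here is a small sketch. The records, the field names, and the `clean_record` and `clean_rows` helpers are hypothetical, and plain Python dictionaries stand in for a data frame; in practice a library such as pandas would usually hold the cleaned rows.

```python
# Hypothetical scraped rows: every value arrives as an untidy string.
RAW_ROWS = [
    {"title": "  First Song ", "plays": "1,204", "duration": "3:25"},
    {"title": "Second Song", "plays": "n/a", "duration": "4:02"},
    {"title": "", "plays": "87", "duration": "2:59"},
]


def clean_record(row):
    """Return a cleaned copy of one row, or None if the row is unusable."""
    title = row["title"].strip()
    if not title:
        return None  # drop rows that have no title

    # Convert "1,204" to 1204; treat anything non-numeric as missing.
    plays_text = row["plays"].replace(",", "")
    plays = int(plays_text) if plays_text.isdigit() else None

    # Convert "m:ss" to a total number of seconds.
    minutes, seconds = row["duration"].split(":")
    duration_seconds = int(minutes) * 60 + int(seconds)

    return {"title": title, "plays": plays, "duration_seconds": duration_seconds}


def clean_rows(rows):
    """Clean every row and discard the ones that could not be used."""
    cleaned = (clean_record(row) for row in rows)
    return [row for row in cleaned if row is not None]


for row in clean_rows(RAW_ROWS):
    print(row)
```

Once the values have proper numeric types, sorting, aggregating, and plotting them becomes much more straightforward.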
My previous data science project was a simple movie recommender system and dashboard with a front-end UI. It started with an idea, and I drew out the flow of the project:

If you take a look at the end product, you will see that my project is pretty similar to what I drew.
This is because I first had an idea, and then proceeded to break it down into different parts. I drew those parts out, and gave myself some time to complete each of them.
Breaking down large projects into simple tasks is very important. This way, you work towards your end goal one part at a time. You have some kind of direction on how to proceed. Most importantly, you will not get overwhelmed by the task at hand.
I understand that it is easy to get sidetracked and lose motivation when learning a subject as deep as data science, especially if you’re already working a full-time job and have other commitments.
Venturing into a completely new field and having to learn everything on your own can be daunting.
If you feel overwhelmed, or feel like you are losing interest, remember why you started your data science journey in the first place.
That’s it!
You don’t learn to walk by following the rules. You learn by doing, and by falling over.
Translated from: https://medium.com/swlh/how-to-stay-motivated-when-learning-data-science-ccab719ae7c1