Data Science Project
In this article, I would like to showcase what might be my simplest data science project ever.
I have spent hours training much more complex models in the past, and struggled to find the right parameters to create machine learning pipelines.
Despite its simplicity, if I could only display one project on my resume, it would be this one.
Let me explain why.
Does the package determine the value of the gift?
As a child, I would always get excited about holidays because I could get gifts. (Just humour me here, I do have a point, I promise.) My aunt once gave me a beautiful dress, perhaps more beautiful than any other gift I received that day.
Here's the thing, though: I didn't even want to open it. She had shabbily wrapped it in newspaper, and the gift seemed to have lost half its value before I even saw what was inside.
To answer the question above: no. The package by no means determines the value of the gift.
However, it can greatly influence your expectation of what's inside and can change the way you perceive it.
The machine learning models you spend weeks training are great. Demonstrate that. Don't let them die in your Jupyter Notebook.
Recruiters have hundreds of resumes to read. It is almost impossible for them to read through all your code on GitHub and understand all your projects.
To stand out, you need to do something slightly different. Create an interface they can interact with, such as a live dashboard they can play around with.
Even if it's not the best dashboard or interface out there, it will create interest, because you created something they can actually use.
I wanted to do exactly that, which is why I came up with this portfolio project. In the next few sections, I will explain what I did without going too far into the technical details.
Aim
I aimed to display skills in the following areas:
- Data Collection
- Data Wrangling
- Data Visualization
- Machine Learning
- Web Development
To do so, I created the following components in my project:
- Front-End Interface
- Movie Dashboard
- Movie Recommender System
I will explain and demonstrate each component in turn.
Note: If you don't want to read through the entire article and just want to see the final product, scroll down to the 'Links' section.
Front-End Interface

In the past, I would create projects and let the code sit in my GitHub repository. I would write an occasional article on Medium explaining the project.
Here, I took a different approach.
I created a web page that explains the different components of my project. I wrote briefly about how users can interact with the systems I created, and put up links to my code and Medium article.
The entire project can be understood and accessed through just one page, which makes it much easier for people to engage with.
You can check the site out here (best viewed on a laptop or PC).
Movie Dashboard

Next, I created a movie dashboard with Tableau.
The steps involved:
Data Collection
I had to collect data from a variety of sources. I also wanted to visualize the Bechdel scores of these movies (a measure of female representation in Hollywood), so I used an API to get that data.
Data Wrangling
I cleaned the data and merged the datasets together. Once I was done, I could finally visualize it!
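As a rough illustration of the clean-and-merge step, here is a minimal sketch in plain Python. It is not the code used in the project: the sample records, the field names (`imdb_id`, `title`, `year`, `bechdel_score`) and the helper functions `clean_movies` and `merge_on_id` are all assumptions made up for this example.

```python
# Hypothetical sample records; the field names are assumptions for illustration.
movies = [
    {"imdb_id": "tt0000001", "title": "  Example Movie A ", "year": "2001"},
    {"imdb_id": "tt0000002", "title": "Example Movie B", "year": "1999"},
    {"imdb_id": "tt0000002", "title": "Example Movie B", "year": "1999"},  # duplicate row
    {"imdb_id": "tt0000003", "title": "Example Movie C", "year": ""},      # missing year
]
bechdel = [
    {"imdb_id": "tt0000001", "bechdel_score": 3},
    {"imdb_id": "tt0000002", "bechdel_score": 1},
]


def clean_movies(rows):
    """Trim titles, convert years to int, and drop duplicate or incomplete rows."""
    seen = set()
    cleaned = []
    for row in rows:
        if not row["year"] or row["imdb_id"] in seen:
            continue
        seen.add(row["imdb_id"])
        cleaned.append({
            "imdb_id": row["imdb_id"],
            "title": row["title"].strip(),
            "year": int(row["year"]),
        })
    return cleaned


def merge_on_id(left, right, key="imdb_id"):
    """Left join: keep every row of `left`, adding fields from `right` on a key match."""
    lookup = {row[key]: row for row in right}
    return [{**row, **lookup.get(row[key], {})} for row in left]


merged = merge_on_id(clean_movies(movies), bechdel)
for row in merged:
    print(row)
```

With a larger dataset, a library such as pandas would be the more usual choice; the dictionary lookup here just keeps the idea visible without extra dependencies.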
Data Visualization
Surprisingly, this took up a huge portion of my time compared to the other parts of this project.
I spent two days trying to create a visually appealing dashboard.
I first created one as a Python Dash app. I wasn't too satisfied with the layout, so I tried creating a Shiny web app in R instead.
It turned out better than my Dash app, and I loved the functionality. However, I simply didn't find the design appealing.
Finally, I decided to use Tableau. The dashboard only took me about an hour to create. If you want to get started with Tableau, you can read this tutorial I created.
You can view my dashboard here (best viewed on a laptop or PC).
Recommender System

Finally, machine learning!
I created a simple recommendation system with the same data I used for the dashboard and deployed it with a Dash app.
Just enter a movie name, and the back-end recommendation system generates movie suggestions for you.
In fact, I built this recommendation system when I was just starting to learn machine learning.
I found the code in my Jupyter Notebook, and decided to clean it up a bit to create this simple application.
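The article does not say which technique the recommender uses, so the following is only a sketch of one simple possibility: a content-based recommender that ranks movies by genre overlap (Jaccard similarity). The catalogue, the titles, and the function names `jaccard` and `recommend` are hypothetical and not taken from the project.

```python
# Hypothetical catalogue: titles and genre tags are made up for illustration.
CATALOGUE = {
    "Space Quest": {"sci-fi", "adventure"},
    "Star Drift": {"sci-fi", "adventure", "drama"},
    "Laugh Riot": {"comedy"},
    "Quiet Hearts": {"drama", "romance"},
}


def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(title, catalogue=CATALOGUE, top_n=2):
    """Return up to `top_n` other titles ranked by genre overlap with `title`."""
    if title not in catalogue:
        return []
    target = catalogue[title]
    scored = [
        (jaccard(target, genres), other)
        for other, genres in catalogue.items()
        if other != title
    ]
    # Highest similarity first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for score, other in scored if score > 0][:top_n]


print(recommend("Space Quest"))  # ['Star Drift']
```

In a deployed app, the text a user types would be passed to a function like `recommend`, and the returned list would be rendered on the page.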
You can take a look at the recommendation system here (best viewed on a laptop or PC).
That's it!
Links
Front-End Interface
Movie Dashboard
Recommender System
Code (I apologize: the code is pretty messy. I will clean it up and re-upload it soon.)
I hope you enjoyed this article and found the tips above helpful. Jupyter Notebooks are great, but don't let your projects just sit there.
Use your creativity to create something other people can interact with.
I've seen some incredible projects on GitHub with only one star. On the other hand, I've also seen some really simple projects gain a lot of attention just because of how they were presented.
Most importantly, though, create projects you like to work on and do what you find enjoyable!
Translated from: https://towardsdatascience.com/a-complete-data-science-portfolio-project-ebbced35ea84