A Complete Data Science Portfolio Project

In this article, I would like to showcase what might be my simplest data science project ever.

I have spent hours training much more complex models in the past, and struggled to find the right parameters for my machine learning pipelines.

Despite its simplicity, if I could only display one project on my resume, it would be this one.

Let me explain why.

Does the package determine the value of the gift?

As a child, I always got excited about holidays because they meant gifts. (Just humour me here; I do have a point, I promise.) On one such holiday, my aunt gave me a beautiful dress, perhaps more beautiful than any other gift I received that day.

Here’s the thing, though: I didn’t even want to open it. She had wrapped it shabbily in newspaper, and the gift seemed to have lost half its value before I even saw what was inside.

To answer the question above, no. The package by no means determines the value of the gift.

However, it can greatly influence your expectation of what’s inside and can change the way you perceive it.

The machine learning models you spend weeks training are great. Demonstrate that. Don’t let them die in your Jupyter Notebook.

Recruiters have hundreds of resumes to read. It is almost impossible for them to read through all your code on GitHub and understand all your projects.

To stand out, you need to do something slightly different. Create an interface they can interact with. Maybe a live dashboard they can play around with.

Even if it's not the best dashboard or interface out there, it will create interest, because you created something they can actually use.

I wanted to do exactly that, which is why I came up with this portfolio project. In the next few sections, I will explain what I did without going too deep into the technical details.

Aim

I aimed to display skills in the following areas:

  • Data Collection
  • Data Wrangling
  • Data Visualization
  • Machine Learning
  • Web Development

To do so, I created the following components in my project:

  • Front-End Interface
  • Movie Dashboard
  • Movie Recommender System

I will explain and demonstrate each component in detail.

Note: If you would rather skip the article and just look at the final product, scroll down to the ‘Links’ section.

Front-End Interface

In the past, I would create projects and let the code sit in my GitHub repository, occasionally writing an article on Medium to explain the project.

Here, I took a different approach.

I created a web page that explains the different components of my project. I wrote briefly about how users can interact with the systems I built, and added links to my code and Medium article.

The entire project can be understood and accessed through just one page, which makes it so much easier for people to engage with.

You can check the site out here. View it on a laptop or PC for a better UI experience.

Movie Dashboard

Next, I created a movie dashboard with Tableau.

The steps involved:

Data Collection

I had to collect data from several different sources. I also wanted to visualize the Bechdel scores of these movies (a measure of female representation in Hollywood), so I used an API to get that data.
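The article does not name the API, so the following is only a minimal sketch of what this collection step could look like. It assumes the public bechdeltest.com endpoint getMovieByImdbId and a JSON response with title, year, and rating fields. The endpoint, the field names, and the helper names build_url, parse_score, and fetch_score are all assumptions to check against the API's own documentation.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the article does not say which API was used.
BECHDEL_URL = "http://bechdeltest.com/api/v1/getMovieByImdbId"


def build_url(imdb_id: str) -> str:
    """Build the request URL for one movie, identified by its IMDb id."""
    return f"{BECHDEL_URL}?{urlencode({'imdbid': imdb_id})}"


def parse_score(payload: str) -> dict:
    """Keep only the fields a dashboard would need from one JSON record.

    The field names ("title", "year", "rating") are assumptions about
    the response format, not something the article confirms.
    """
    record = json.loads(payload)
    return {
        "title": record["title"],
        "year": int(record["year"]),
        "bechdel_score": int(record["rating"]),
    }


def fetch_score(imdb_id: str, timeout: float = 10.0) -> dict:
    """Download and parse the record for one movie (needs network access)."""
    with urlopen(build_url(imdb_id), timeout=timeout) as response:
        return parse_score(response.read().decode("utf-8"))
```

Keeping the parsing separate from the download means the parsing can be checked offline with a saved response before making any real requests.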

Data Wrangling

I cleaned the data and merged the datasets together. Once I was done, I could finally visualize it!
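As a rough illustration of cleaning and merging, here is a library-free sketch that joins Bechdel scores onto movie rows. The column names (title, year, bechdel_score) and the helpers clean_title and merge_movies are hypothetical. The article does not describe the actual datasets or tooling, and a library such as pandas would be a common choice for this step in practice.

```python
def clean_title(raw: str) -> str:
    """Normalise a title so the same film matches across sources."""
    return " ".join(raw.split()).casefold()


def merge_movies(movies: list, scores: list) -> list:
    """Left-join Bechdel scores onto movie rows, keyed by (title, year).

    Rows without a title or year are dropped, duplicates are skipped,
    and movies with no matching score get bechdel_score=None.
    """
    score_by_key = {
        (clean_title(s["title"]), int(s["year"])): int(s["bechdel_score"])
        for s in scores
    }
    merged = []
    seen = set()
    for row in movies:
        if not row.get("title") or not row.get("year"):
            continue  # cannot be matched without both key fields
        key = (clean_title(row["title"]), int(row["year"]))
        if key in seen:
            continue  # same film already kept from an earlier row
        seen.add(key)
        merged.append({
            **row,
            "title": row["title"].strip(),
            "year": key[1],
            "bechdel_score": score_by_key.get(key),
        })
    return merged
```

Matching on a normalised title plus the release year is a simple way to avoid joining different films that share a name. A stable identifier such as an IMDb id would be more robust if every source provides one.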

Data Visualization

Surprisingly, this took up a huge portion of my time compared to other parts of this project.

I spent two days trying to create a visually appealing dashboard.

I first built one as a Python Dash app. I wasn’t too satisfied with the layout, so I tried building a Shiny web app in R instead.

It turned out better than my Dash app, and I loved the functionality. However, I simply didn’t find the design appealing.

Finally, I decided to use Tableau. This only took me about an hour to create. If you want to get started with Tableau, you can read this tutorial I created.

You can view my dashboard here. View it on a laptop or PC for a better UI experience.

Recommender System

Finally, machine learning!

I created a simple recommendation system with the same data I used for the dashboard and deployed it with a Dash app.

Just enter a movie name, and it uses the back-end recommendation system to generate movie suggestions for you.
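The article does not say which recommendation technique sits behind the app, so the sketch below uses one simple option as a stand-in: content-based filtering with cosine similarity over genre sets. The catalogue, its made-up titles and genres, and the function names cosine_similarity and recommend are all hypothetical.

```python
from math import sqrt

# Hypothetical toy catalogue; titles and genres are made up for illustration.
MOVIES = {
    "Space Miners": {"horror", "sci-fi"},
    "Space Miners 2": {"action", "horror", "sci-fi"},
    "Neon Detective": {"sci-fi", "thriller"},
    "Paris in June": {"comedy", "romance"},
    "Wedding Season": {"comedy", "drama", "romance"},
}


def cosine_similarity(a: set, b: set) -> float:
    """Cosine similarity between two sets treated as binary vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def recommend(title: str, catalogue: dict = MOVIES, top_n: int = 3) -> list:
    """Return up to top_n titles most similar to the given movie name."""
    lookup = {name.casefold(): name for name in catalogue}
    key = lookup.get(title.strip().casefold())
    if key is None:
        return []  # unknown movie: nothing to base a suggestion on
    target = catalogue[key]
    scored = [
        (cosine_similarity(target, genres), name)
        for name, genres in catalogue.items()
        if name != key
    ]
    # Highest similarity first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in scored if score > 0][:top_n]
```

A web front end such as a Dash app would only need to pass the text the user typed to recommend and display the returned list.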

I actually built this recommendation system when I was just starting to learn machine learning.

I found the code in my Jupyter Notebook, and decided to clean it up a bit to create this simple application.

You can take a look at the recommendation system here. View it on a laptop or PC for a better UI experience.

That’s it!

Links

  • Front-End Interface
  • Movie Dashboard
  • Recommender System
  • Code (I apologize that the code is pretty messy; I will clean it up and re-upload it soon.)

I hope you enjoyed this article and found the tips above helpful. Jupyter Notebooks are great, but don’t let your projects just sit there.

Use your creativity to create something other people can interact with.

I’ve seen some incredible projects on GitHub with only one star. On the other hand, I’ve also seen some really simple projects gain a lot of attention just because of how they were presented.

Most importantly though, create projects you like to work on and do what you feel is enjoyable!

Translated from: https://towardsdatascience.com/a-complete-data-science-portfolio-project-ebbced35ea84
