Udacity open-source data
by David Venturi
Review: Udacity Data Analyst Nanodegree Program
Udacity’s Data Analyst Nanodegree program was one of the first online data science programs in the online education revolution. It aims to “ensure you master the exact skills necessary to build a career in data science.” Does it accomplish its goal? Is it the best option available?
I completed the program in Fall 2016. Drawing on Class Central’s open-source review template, here is my review of Udacity’s Data Analyst Nanodegree program.
UPDATE: The Data Analyst Nanodegree program was refreshed with new content and student services in September 2017. Details here. I was also brought on board to help recreate some of this new content. The majority of this review is unchanged. Factual updates are indicated by italic font.
Background information
What made me decide to take this program?
In early 2016, I started creating my own data science master’s program using online resources. (You can read about that here.) I enrolled in the Data Analyst Nanodegree program for a few reasons:
- I wanted a guide for my introduction to data science.
- I wanted a cohesive program instead of individual courses from a variety of providers.
- It received stellar reviews.
- I had taken a few Udacity courses before and I was a fan of their teaching style.
What were my goals?
Though the program can act as a bridge to a job (more on that later), I wanted to use the program as an introduction to more advanced material. This “more advanced material” applies to both subjects that are covered in the program and subjects that aren’t.
What is a Udacity Nanodegree program?
Udacity is one of the leading online education providers. Sebastian Thrun, ex-Stanford professor and Google X founder, founded the company and focuses on innovation at Udacity as president and chairman. Vish Makhijani is CEO.
Nanodegree programs are online credentials provided by Udacity. They are compilations of Udacity courses (some available for free, others not) that have projects attached to them, which are reviewed by Udacity’s paid project reviewers. They also come with a bunch of student services.
Slack is used as a community tool, where Udacity students can interact with other students as well as their program’s instructors and other Udacity staff. In most programs, students have assigned mentors and communicate with them through a private chat channel that is always available in the Udacity classroom.
The Data Analyst Nanodegree program was originally released in 2014. It was Udacity’s second Nanodegree program. Though it has undergone some changes over the years, the core of the program is intact.
Who are the instructors and what are their backgrounds?
Because the Data Analyst Nanodegree program is a compilation of Udacity courses (again, some free, others not), there are several instructors. Their resumes often include prestigious roles in major tech companies and degrees from top U.S. schools.
They aren’t “instructors” per se, but Udacity’s project reviewers, mentors, and student experience staff (who monitor Slack along with instructors) are among the people you interact with the most. They are so, so helpful. More on that later.
Cost
The program is split into two terms. The first term costs $499 USD. The second term costs $699 USD. If you have a strong grasp of the skills taught in the first term, you can skip it, complete the second term only, and still obtain the credential.
Recommended prerequisites
For Term 1, Udacity recommends that students be familiar with descriptive statistics and have some experience working with data in spreadsheets or SQL.
For Term 2, students should have experience analyzing data using Python, as well as a solid understanding of inferential statistics and its applications.
My background / skills entering the program
I started the program in May 2016 when I had a few months of programming experience, mostly in C and Python. The vast majority of this experience was from the bridging module for my data science master’s program, where I took Harvard’s CS50: Introduction to Computer Science and Udacity’s Intro to Programming Nanodegree program.
I had also finished my undergraduate chemical engineering program and had 24 months of quant-related job experience. This meant I had taken several statistics courses and was comfortable with data.
The Program
Structure
The Data Analyst Nanodegree program is split up into two terms. Each term has three courses and four projects (the extra project being an intro project that helps you get used to the Udacity learning environment). Mat Leonard, the program’s curriculum lead at the time of the refresh, is present throughout the program as he introduces each course, its purpose in the program, and its instructor(s).
Course content is made up of a combination of videos, text, and quizzes. Videos tend to range from 30 seconds to five minutes, as per Udacity’s style. Automatically graded quizzes often follow these short videos. These quizzes are usually multiple choice, fill-in-the-blank, or small programming tasks. Since Udacity acquired CloudLabs, these programming tasks have been carried out in Jupyter Notebook and SQL coding environments in the Udacity classroom.
Again, each section has a graded project. These projects and the feedback from Udacity’s paid project reviewers are where a lot of the value lies for students.
Syllabus
My edition of the program had seven parts:
- P1: Descriptive and Inferential Statistics
- P2: Intro to Data Analysis (with NumPy and pandas)
- P3: Data Wrangling with MongoDB (or SQL)
- P4: Exploratory Data Analysis (with R)
- P5: Intro to Machine Learning
- P6: Data Visualization with D3.js
- P7: Design an A/B Test
The new program’s first term is called Data Analysis with Python and SQL. The courses and projects include:
- Intro project: Explore Weather Trends. SQL and spreadsheets (or Python/R if you are already familiar) are used to analyze and visualize temperature data.
- Course: Introduction to Python. Project: Explore US Bikeshare Data.
- Course: Introduction to Data Analysis, which includes The Data Analysis Process and SQL for Data Analysis. Project: Investigate a Dataset.
- Course: Practical Statistics. Project: Analyze A/B Test Results.
The second term is called Advanced Data Analysis. The courses and projects include:
- Intro project: Test a Perceptual Phenomenon. Compute descriptive statistics and perform a statistical test on a dataset based on a psychological phenomenon called the Stroop Effect.
- Course: Data Wrangling (with Python). Project: Wrangle and Analyze Data. This is the course and project that I created.
- Course: Exploratory Data Analysis (with R). Project: Explore and Summarize Data.
- Course: Data Storytelling (with Tableau). Project: Create a Tableau Story.
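To give a feel for the kind of analysis the Stroop Effect intro project asks for, here is a minimal Python sketch of descriptive statistics plus a paired t-test. It is an illustration only, not the project's reference solution: the choice of a paired t-test assumes each participant is timed under both conditions, the function name `paired_t_statistic` is hypothetical, and the timing values are made up for the example rather than taken from the project dataset.

```python
import math
import statistics

# Made-up completion times (seconds) for the same eight participants.
# These are illustrative values, NOT the dataset used in the project.
congruent = [12.1, 16.8, 9.6, 8.6, 14.7, 12.2, 14.2, 18.5]
incongruent = [19.3, 18.7, 21.2, 15.7, 22.8, 20.9, 24.6, 25.3]


def paired_t_statistic(before, after):
    """Return (t, degrees_of_freedom) for a paired-samples t-test."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    differences = [a - b for a, b in zip(after, before)]
    n = len(differences)
    mean_diff = statistics.mean(differences)
    sd_diff = statistics.stdev(differences)  # sample standard deviation (n - 1)
    standard_error = sd_diff / math.sqrt(n)
    return mean_diff / standard_error, n - 1


# Descriptive statistics for each condition.
for label, sample in (("congruent", congruent), ("incongruent", incongruent)):
    print(f"{label}: mean={statistics.mean(sample):.2f}, "
          f"stdev={statistics.stdev(sample):.2f}")

# Inferential step: compare t against a critical value for the given
# degrees of freedom (looked up in a t-table or computed with a stats library).
t, dof = paired_t_statistic(congruent, incongruent)
print(f"t = {t:.2f} with {dof} degrees of freedom")
```

If SciPy is available, `scipy.stats.ttest_rel` performs the same paired test and also returns a p-value, which saves the table lookup.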
The big changes, with full details described in this blog post:
- Python is now taught in the program.
- Machine Learning and A/B Testing are now included as optional material and are no longer required to graduate from the program. Reasoning: “The focus of this program is to prepare you for data analyst jobs. Our research shows that machine learning is not a requirement for the vast majority of data analyst positions.” The basics of A/B testing are now covered in the new practical stats course, giving students the exposure that they’ll need on the job.
- New courses and projects. Specifically, Intro to Data Analysis (which includes Python for Data Analysis and SQL for Data Analysis), Practical Statistics (taught by Sebastian Thrun), and Data Wrangling.
Grading
Projects are graded on a pass/fail (officially, “meets specifications” and “requires changes”) basis according to a unique rubric. Your project must satisfy all sections of the rubric. If all of your projects meet specifications, you graduate. This means that the automatically-graded quizzes do not count towards your grade.
If a project submission requires changes, your project reviewer will give you actionable feedback. After you implement these changes, you can resubmit. There is no submission limit.
My experience
Timeline
Udacity’s estimated timeline for the Data Analyst Nanodegree program was 378 hours when I started, which meant students took 6–7 months on average to complete it. According to Toggl (a time tracking app), the whole program took me 369 hours over five months. This timeline included dedicating serious time to making my projects portfolio-quality, as opposed to producing the minimum to satisfy the pass/fail rubric.
The program was condensed in the Fall 2017 refresh. The new estimated timeline is 260 hours. Each term is paced at 10 hours per week over 13 weeks, though students are given 19 weeks to complete each term.
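The timeline figures can be sanity-checked with a few lines of arithmetic. The Python sketch below uses only numbers quoted in this review; the variable names are hypothetical, and the conversion from months to weeks (365.25 / 12 / 7 weeks per month) is an assumption made for the estimate.

```python
# Figures quoted in the review for the refreshed program.
HOURS_PER_WEEK = 10          # suggested pace
PACED_WEEKS_PER_TERM = 13    # weeks each term is paced over
ALLOWED_WEEKS_PER_TERM = 19  # weeks students are given per term
TERMS = 2

hours_per_term = HOURS_PER_WEEK * PACED_WEEKS_PER_TERM
total_hours = hours_per_term * TERMS
print(total_hours)  # 260, matching the new estimated timeline

# Weekly pace if a student spreads one term over all 19 allowed weeks.
relaxed_pace = hours_per_term / ALLOWED_WEEKS_PER_TERM
print(f"Relaxed pace: {relaxed_pace:.1f} hours/week")

# Old program: 369 tracked hours over five months.
# Assumption: an average month is 365.25 / 12 / 7 weeks long.
WEEKS_PER_MONTH = 365.25 / 12 / 7
author_pace = 369 / (5 * WEEKS_PER_MONTH)
print(f"Old program pace: {author_pace:.1f} hours/week")
```

Under these assumptions, the old program at portfolio-quality effort works out to noticeably more hours per week than the refreshed program's suggested pace.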
How was the course content?
For my edition of the program, the course content from P1 (Statistics), P2 (Intro to Data Analysis), P4 (Exploratory Data Analysis), P5 (Machine Learning), and P7 (A/B Testing) gets five stars out of five from me. P3 (Data Wrangling) and P6 (Data Visualization) get three-and-a-half stars.
The exploratory data analysis content with Facebook employees (P4) was so illuminating. The intro to machine learning course with Sebastian Thrun and Katie Malone (P5) was the most fun I’ve had in any online course. The A/B testing content with Google employees (P7) is so unique. I’d give those three courses six stars if I could.
The SQL and Data Wrangling content (P3) wasn’t amazing. Same with the data visualization content (P6), though that probably was because D3.js is super difficult to teach to JavaScript newbies. These opinions aren’t uncommon, according to Class Central’s reviews for those courses. Check them out here and here.
This “not amazing” content from the old program was removed in the Fall 2017 refresh. Revamped content for intro to data analysis, SQL, statistics, data wrangling, and data visualization is now included. The Practical Statistics content focuses on inferential statistics, with descriptive statistics being a prerequisite and taught in the Data Foundations Nanodegree program. The data visualization course is now taught with Tableau instead of D3.js.
How were the projects?
Again, projects are where Udacity sets themselves apart from the rest of the online education platforms. They invest in their project review process and it pays off. The Data Analyst Nanodegree program was no exception.
All of the projects reinforce the content you learned in the videos. The project reviewers know their stuff. They tell you where you succeeded and where your mistakes and/or omissions are. Supervised learning by doing. It works.
The forums and the forum mentors are especially helpful when you get stuck. Search the forums to see if your problem is a common one (it usually is). No luck? Post a new question yourself. There is one forum mentor, Myles Callan, who seems to know everything about everything and responds within hours. I have my doubts that he sleeps.
Though forums still exist and work, Slack and classroom mentors are now the recommended support avenues. Students can post questions, and answers are provided with the same or greater level of immediacy (within hours and often sooner). The Slack community is overseen by Udacity instructors as well as their student experience staff, who ensure that student questions, comments, etc. are addressed in a timely fashion. The famed Myles Callan is now a mentor.
If you’re curious to see what these projects look like, check out this GitHub repository.
How hard was it?
The statistics content was easy for me because I had taken several stats courses in undergrad. This would probably be true for every topic in the Nanodegree program if you had prior experience in it.
I’d categorize most of the program as intermediate difficulty. Lecture content that doesn’t have many quizzes (they often do, though) can be a breeze, which isn’t necessarily a bad thing. The projects exercise your brain. Each will probably take you more than twenty hours if you want to be thorough.
The Exploratory Data Analysis project was the most challenging to pass. It took me 3.5 submissions. Check out this Twitter thread for more details.
Can you apply for jobs immediately post-graduation?
You can. The program should equip you with the required skills for an entry-level data analyst role if you take it seriously. Eli Kastelein is a perfect example of that. You can read more about his story below.
How to Build a Career in Tech Without a CS Degree: “In the spring of 2014, I was a fresh college dropout on a Greyhound bus headed nowhere in particular.” (medium.com)
You can also continue onto more advanced courses, both for the subjects covered in the program and for other subjects. This is what I chose to do.
Final thoughts
Would I take the program again knowing what I know now?
Somewhere towards the end of the program, I started creating Class Central’s Data Science Career Guide. This entailed researching every single online course offered for every subject within data science.
Though I enjoyed the majority of courses within the Nanodegree program (update: new courses have replaced the courses I didn’t enjoy), there are courses from other providers that receive better reviews for certain subjects. Statistics, for example. If I had access to my guide back when I started, I would consider the separate-course-for-each-subject route. Udacity’s student services and project review process, however, are so effective for learning that I would take the Data Analyst Nanodegree program regardless.
If you’re the kind of person who wants a 100% custom online education experience but also wants to take advantage of Udacity’s projects and services, consider researching your favorite courses for each subject (I recommend using Class Central) and then enrolling in the Nanodegree program to complete the projects.
The alternatives
These are the five alternative programs that I was considering when I enrolled in the Data Analyst Nanodegree program:
- Johns Hopkins University’s Data Science Specialization on Coursera
- Microsoft’s Professional Program Certificate in Data Science on edX
- Wesleyan University’s Data Analysis and Interpretation Specialization on Coursera
- DataCamp’s Python and R tracks
- Dataquest’s Data Analyst and Data Scientist paths
Note: I have removed my comments on these programs due to Udacity policy regarding commenting on other providers.
Conclusion
Udacity’s Data Analyst Nanodegree program gives you the foundational skills you need for a career in data science. Post-graduation, you’ll be able to target your strengths and weaknesses, and supplement your learning where necessary. Plus, you’ll leave with a handful of portfolio-ready projects.
I loved it, as did others.
★★★★?
Translated from: https://www.freecodecamp.org/news/review-udacity-data-analyst-nanodegree-1e16ae2b6d12/