Practical Applications of Machine Learning
Some of my previous introductory posts on machine learning and data science were a bit technical. The purpose of this post, however, is to explain some practical use-cases of ML purely from the perspective of a non-technical layperson with no previous exposure to it. To satisfy your curiosity, I will also mention the specific ML algorithms generally applicable to each use-case, in case you want to learn more about them.
What types of problems does ML help us with? Irrespective of the specific domain, what answers or actionable insights does it offer? Instead of the ‘how’, our focus here will be on the ‘what’ and the ‘why’.
What is This? A or B?
This family of ML algorithms predicts which one of exactly two possible categories an observation belongs to. There is no third option. Suppose management wants to predict which of your existing customers will churn. The answer for a specific customer can only be that they will churn or that they will not. Other practical examples include:
- Is this email spam or not?
- Will this customer default or not?
- Do these symptoms indicate a specific disease or not?
- Will this customer go ahead with a purchase or not?
- Is this an image of a boy or a girl?
Formally, this is known as Binary Classification. Relevant algorithms include:
- Logistic Regression
- Support Vector Machine
- k-Nearest Neighbor
- Classification Decision Tree
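For readers who want to see what a yes/no prediction looks like in practice, here is a minimal from-scratch sketch of logistic regression applied to the churn question. It is an illustration under stated assumptions, not a production approach: the data (number of support complaints per customer) is made up, the function names are hypothetical, and a real project would normally rely on an established ML library.

```python
import math

def sigmoid(z):
    """Squash any real number into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic_regression(xs, ys, learning_rate=0.1, epochs=2000):
    """Fit one weight and one bias with plain gradient descent."""
    weight, bias = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Average gradient of the log-loss over all observations.
        errors = [sigmoid(weight * x + bias) - y for x, y in zip(xs, ys)]
        grad_weight = sum(e * x for e, x in zip(errors, xs)) / n
        grad_bias = sum(errors) / n
        weight -= learning_rate * grad_weight
        bias -= learning_rate * grad_bias
    return weight, bias

def predict(weight, bias, x, threshold=0.5):
    """Return 1 (will churn) or 0 (will stay): only two possible answers."""
    return 1 if sigmoid(weight * x + bias) >= threshold else 0

# Hypothetical data: support complaints per customer, and whether they churned.
complaints = [0, 1, 1, 2, 5, 6, 7, 8]
churned = [0, 0, 0, 0, 1, 1, 1, 1]

weight, bias = train_logistic_regression(complaints, churned)
print(predict(weight, bias, 0))  # a customer with no complaints
print(predict(weight, bias, 9))  # a customer with many complaints
```

The model outputs a probability, and the threshold turns it into one of the two categories. Moving the threshold trades missed churners against false alarms.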
What is This? A or B or C or D (Or Something Else)?
An extension of binary classification: here, the number of potential categories can be more than two. Suppose you are working on a face recognition model; the person in a given picture can be any of the individuals in your database. The number of possible answers is limited only by the categories present in the data used during model development. Other practical examples include:
- Optical Character Recognition: which character is this?
- Which animal is in this image?
- Which genre does this movie belong to?
- Sentiment Analysis: what is the feeling associated with this tweet?
- Whose voice is it in this audio recording?
Formally, this is known as Multi-Class Classification. Relevant algorithms include:
- Random Forests
- Classification Decision Tree
- XGBoost
- k-Nearest Neighbor
- Artificial Neural Networks
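To make the idea of more than two categories concrete, here is a small from-scratch sketch of k-Nearest Neighbor, one of the algorithms listed above. The animal measurements are invented for illustration, and the function and variable names are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Label a new observation by majority vote among its k closest examples."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical measurements: (weight in kg, height in cm) for known animals.
points = [
    (4, 25), (5, 23), (3.5, 24),           # cats
    (30, 60), (25, 55), (35, 65),          # dogs
    (500, 160), (450, 150), (550, 170),    # horses
]
labels = ["cat"] * 3 + ["dog"] * 3 + ["horse"] * 3

print(knn_predict(points, labels, (4.2, 24)))   # cat
print(knn_predict(points, labels, (28, 58)))    # dog
print(knn_predict(points, labels, (480, 155)))  # horse
```

The answer can only be one of the labels seen during training, which is why the set of possible categories depends on the data used to build the model. In practice the features would usually be rescaled first so that no single measurement dominates the distance.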
How Much or How Many of Something To Expect?
This family of ML algorithms predicts a quantity as a continuous number (i.e., the prediction can be any one of an unlimited number of possible outcomes). There is no fixed set of categories to choose from. Take predicting sales volume for the next quarter: the prediction can be 1,000 units, 10,000 units, 1,200 units, or any other positive real number.
The output of these algorithms can be any real number (positive, negative, zero, or fractional); your specific use-case determines whether negative or fractional values are meaningful and acceptable. For example, a sales forecast cannot be negative.
Other practical use-cases of this class of algorithms include:
- What will be tomorrow’s temperature?
- How many prospects can we sign up as customers in the next quarter?
- What will be our energy consumption next month?
- How long will it take for an event to occur?
Formally, this is known as Regression. Relevant algorithms include:
- Linear Regression
- Regression Decision Tree
- XGBoost
- Artificial Neural Networks
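As a rough illustration of predicting a number instead of a category, the sketch below fits a straight line (simple linear regression, using the standard least-squares formulas) to made-up quarterly sales figures and forecasts the next quarter. The data and the function names are assumptions for the example only.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

def forecast_sales(slope, intercept, quarter):
    """Predict units sold; clamp at zero because sales cannot be negative."""
    return max(0.0, slope * quarter + intercept)

# Hypothetical history: quarter number -> units sold.
quarters = [1, 2, 3, 4, 5]
units_sold = [1000, 1200, 1400, 1600, 1800]

slope, intercept = fit_line(quarters, units_sold)
print(forecast_sales(slope, intercept, 6))  # 2000.0
```

The raw line could in principle return a negative or fractional value, so the use-case decides how to post-process it. Here the forecast is simply clamped at zero.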
Is This Data Normal or Abnormal?
Often, we are more interested in whether a specific observation is atypical, abnormal, or anomalous, or whether it is merely a normal, usual observation. We may have historical observations already labelled as abnormal or not. Alternatively, no such historical labels may exist, in which case an ML algorithm is used to detect outliers.
Typical use-cases include:
- Is this purchase materially different from the customer’s past purchases?
- Is this traffic pattern from a computer network typical?
- Are these outputs from a piece of industrial equipment atypical?
Formally, this is known as Outlier or Anomaly Detection. Relevant algorithms include:
- Isolation Forest
- Density-Based Spatial Clustering of Applications with Noise (DBSCAN)
- Z-Scores (not technically an ML algorithm, but a statistical test to identify outliers)
- One-Class Support Vector Machine
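The z-score approach from the list above is simple enough to sketch in a few lines. The example below asks the first question in the use-case list: is a new purchase materially different from the customer's past purchases? The purchase amounts, the threshold of 3, and the function names are illustrative assumptions.

```python
import statistics

def z_score(value, history):
    """How many standard deviations `value` lies from the historical mean."""
    mean = statistics.mean(history)
    spread = statistics.stdev(history)
    if spread == 0:
        # All past values identical: any different value is infinitely unusual.
        return 0.0 if value == mean else float("inf")
    return (value - mean) / spread

def is_anomalous(value, history, threshold=3.0):
    """Flag values more than `threshold` standard deviations from the mean."""
    return abs(z_score(value, history)) > threshold

# Hypothetical past purchase amounts for one customer.
past_purchases = [20, 22, 19, 21, 23, 20, 18, 22, 21, 24]

print(is_anomalous(22, past_purchases))   # False: in line with past behaviour
print(is_anomalous(250, past_purchases))  # True: far outside the usual range
```

The new purchase is scored against the history rather than being included in it, so that a single extreme value cannot inflate the standard deviation and hide itself.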
How Can We Organize this Data?
Are there any underlying identifiable characteristics that can be used to organize data into specific groups (also known as clusters or segments)? These characteristics are not known to us in advance, and often even the number of potential clusters is unknown. Clustering your data may assist with further analysis or with developing cluster-specific strategies.
For example, we may segment our customers into distinct groups based on their age, gender, purchase history, etc. to devise segment-specific sales, marketing, or promotion strategies.
Other practical use-cases of this class of algorithms include:
- Which of our subscribers like similar movies or songs?
- How can we categorize several text documents or audio recordings?
- How can we better segment our products or services?
- Which model of a specific machine is more prone to breakdowns?
Formally, this is known as Clustering. Relevant algorithms include:
- k-Means Clustering
- Mean-Shift Clustering
- Density-Based Spatial Clustering of Applications with Noise (DBSCAN)
- Agglomerative Hierarchical Clustering
- Balanced Iterative Reducing and Clustering using Hierarchies (BIRCH)
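The customer segmentation example can be sketched with a bare-bones version of k-Means. This is a simplified illustration: the customer data is invented, the names are hypothetical, and the starting centroids are simply the first k points, whereas library implementations typically use smarter initialisation such as k-means++.

```python
import math

def k_means(points, k, max_iterations=100):
    """Group points into k clusters; returns (labels, centroids)."""
    centroids = [points[i] for i in range(k)]  # simplification: first k points
    labels = [0] * len(points)
    for _ in range(max_iterations):
        # Step 1: assign every point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(point, centroids[c]))
            for point in points
        ]
        # Step 2: move each centroid to the mean of its assigned points.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(sum(values) / len(members) for values in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # nothing moved, so the grouping is stable
        centroids = new_centroids
    return labels, centroids

# Hypothetical customers: (age, average monthly spend).
customers = [(22, 30), (50, 120), (25, 35), (55, 130), (24, 28), (52, 125)]

segments, centers = k_means(customers, k=2)
print(segments)  # [0, 1, 0, 1, 0, 1]
```

No labels were supplied: the algorithm discovers the two groups (younger low spenders and older high spenders) from the data alone. The number of clusters still has to be chosen by the analyst, which reflects the point above that it is often unknown in advance.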
What To Do Next?
This is where ML gets really interesting: the algorithm not only predicts but also tells us what to do given its prediction. This family of ML algorithms might not yet be mature enough for every use-case; however, substantial progress has been made recently thanks to advanced deep learning algorithms and the greater processing power available to us.
These algorithms rely on trial and error and repeated feedback loops, and depend less heavily on pre-collected data than other algorithms. They are mostly applicable to automated systems, where the recommended action is usually taken by the machine itself.
Formally, this is known as Reinforcement Learning, and it is usually implemented through deep neural networks.
Some practical applications of reinforcement learning include:
- What should a robot in an industrial setting do next, given its situation?
- Should we adjust the temperature or leave it untouched?
- How should a self-driving car react (accelerate, decelerate, apply brakes, etc.) given the hazard ahead?
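To give a feel for learning by trial and error, here is a toy sketch of the temperature question using tabular Q-learning. Note the substitution: it uses a small lookup table instead of the deep neural networks mentioned above, which is only workable because this made-up environment has five temperature levels and three actions. The environment, rewards, and parameter values are all assumptions for illustration.

```python
import random

ACTIONS = ["lower", "hold", "raise"]
EFFECT = {"lower": -1, "hold": 0, "raise": 1}
N_LEVELS = 5   # temperature levels 0..4 (hypothetical)
TARGET = 2     # the comfortable level we want to reach and keep

def step(level, action):
    """Apply an action; the reward is higher the closer we end up to TARGET."""
    next_level = min(N_LEVELS - 1, max(0, level + EFFECT[action]))
    reward = -abs(next_level - TARGET)
    return next_level, reward

def train(episodes=2000, steps_per_episode=20, alpha=0.5, gamma=0.9,
          epsilon=0.3, seed=0):
    """Learn action values purely from feedback, with no labelled dataset."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_LEVELS) for a in ACTIONS}
    for _ in range(episodes):
        level = rng.randrange(N_LEVELS)
        for _ in range(steps_per_episode):
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore: try something random
            else:
                action = max(ACTIONS, key=lambda a: q[(level, a)])  # exploit
            next_level, reward = step(level, action)
            best_next = max(q[(next_level, a)] for a in ACTIONS)
            # Nudge the estimate toward reward plus discounted future value.
            q[(level, action)] += alpha * (
                reward + gamma * best_next - q[(level, action)]
            )
            level = next_level
    return q

def greedy_policy(q):
    """For each temperature level, pick the action with the highest value."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_LEVELS)}

policy = greedy_policy(train())
print(policy)
```

The learned policy answers "adjust or leave untouched?" for every level: raise when too cold, lower when too hot, and hold at the target. Nobody told the program these rules; it inferred them from the rewards it received while experimenting.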
Conclusion
There you have it: a practical, no-nonsense introduction, in plain language, to the scenarios where ML assists us.
Feel free to comment or reach out to me if you would like to discuss anything further related to machine learning, data analytics, risk scoring, or financial analysis.
Till next time, code on!
Translated from: https://towardsdatascience.com/what-are-the-practical-benefits-of-machine-learning-c9820dbdd67c