Conditional Probability

If you’re currently in the job market or looking to switch careers, you’ve probably noticed the growing popularity of data science jobs. In 2019, LinkedIn ranked “data scientist” the No. 1 most promising job in the U.S. based on job openings, salary, and career advancement opportunities, and reported a 56% rise in job openings for data scientists over the previous year. Despite its popularity, however, data science can be a difficult field to enter, let alone to learn. In my own experience, the amount of statistics involved made it very challenging. Probability, in particular, can be quite complicated, but it is fundamental to many machine learning models such as decision tree learning. So the purpose of this article is to provide a rudimentary understanding of conditional probability.


How To Calculate Probability


Simply put, the probability of an event is the number of outcomes in which the event occurs divided by the total number of equally likely outcomes. For example, imagine you have a deck of cards and you want to calculate the probability that you’ll randomly pull a king from the deck. How would you calculate that? Well, since there are 4 kings in a deck of cards, there are 4 possible ways you can draw a king from the deck; and since there are 52 cards in the deck, there are 52 possible outcomes. So 4 divided by 52 is about .077, or a 7.7% chance that your card will be a king. Now say you want to figure out the probability of drawing another king. The answer will depend on how you handle replacement. Sampling with replacement means that you place the first card back into the deck, making the two events independent (the probability of drawing a king is the same on each draw). Sampling without replacement means you’re not placing the first card back, which affects the probability of drawing the second king (there are now 3 kings left and the total number of outcomes is 51). If event A is drawing the first king and event B is drawing the second king, then the probability of both A and B occurring is equal to the probability of event A multiplied by the probability of event B given that A has occurred.


Mathematical Notation
P(A and B) = P(A) x P(B|A) = 4/52 x 3/51 ≈ .0045, or about .45%
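
To make the multiplication rule concrete, here is a minimal Python sketch that computes the two-king probability both with and without replacement. The function name `p_two_kings` and its `replace` flag are my own illustrative choices, not something from the original article, and exact fractions are used only to avoid rounding.

```python
from fractions import Fraction


def p_two_kings(replace):
    """Probability of drawing two kings in a row from a standard 52-card deck.

    `replace` is a hypothetical flag: True means the first card goes back
    into the deck before the second draw.
    """
    kings, cards = 4, 52
    p_a = Fraction(kings, cards)  # P(A): the first card is a king
    if replace:
        # The deck is restored, so the draws are independent: P(B|A) = P(B)
        p_b_given_a = Fraction(kings, cards)
    else:
        # One king and one card fewer remain for the second draw
        p_b_given_a = Fraction(kings - 1, cards - 1)
    return p_a * p_b_given_a  # P(A and B) = P(A) x P(B|A)


without_replacement = p_two_kings(replace=False)
with_replacement = p_two_kings(replace=True)
print(without_replacement, float(without_replacement))
print(with_replacement, float(with_replacement))
```

Without replacement the result is 1/221, which matches the roughly .45% above; with replacement it is 1/169, slightly higher because the first king is still in the deck.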

Tree Diagram


Mathematics isn’t intuitive to everyone; it certainly wasn’t for me when I was just starting out in this field. Visualizations, however, can be a great tool when it comes to reinforcing complex topics. A tree diagram is one example that can help you break down a general problem into smaller components, which makes it perfect for probability problems that involve multiple events leading to a variety of outcomes. For example, take a look at the diagram I’ve created that helps answer the following question: If you have a bag of 23 marbles (5 green, 8 blue, and 10 red), what’s the probability that you’ll randomly pull out a blue marble and then a green marble, without putting the first one back? Let’s break it down.


  1. The probability of grabbing a blue marble is about 35%, because there are 8 ways you can get a blue marble and 23 total potential outcomes.

  2. Now, given that you pulled out a blue marble, the probability of grabbing a green marble from the bag is about 23%: 5 green marbles divided by 22 potential outcomes (notice how the total number of outcomes changes the second time, hence the change in probability).

  3. Finally, calculating the probability of both these events happening involves multiplying the two probabilities together (.35 x .23 ≈ 8%; with the unrounded fractions, 8/23 x 5/22 is about 7.9%).
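
The three steps above can be sketched in Python by building every branch of the tree for two draws without replacement. This is only an illustration under my own assumptions: the helper `two_draw_tree` and the dictionary layout are hypothetical names, not part of the original article or its diagram.

```python
from fractions import Fraction


def two_draw_tree(bag):
    """Return the probability of every (first, second) branch of the tree
    for two draws without replacement from `bag` (color -> count)."""
    total = sum(bag.values())
    tree = {}
    for first, n_first in bag.items():
        p_first = Fraction(n_first, total)  # first-level branch
        for second, n_second in bag.items():
            # If the same color was drawn first, one fewer of it remains
            remaining = n_second - (1 if second == first else 0)
            p_second_given_first = Fraction(remaining, total - 1)
            # Multiply along the path, as in step 3
            tree[(first, second)] = p_first * p_second_given_first
    return tree


bag = {"green": 5, "blue": 8, "red": 10}
tree = two_draw_tree(bag)

blue_then_green = tree[("blue", "green")]
print(blue_then_green, float(blue_then_green))

# Sanity check: the branches of a complete tree sum to 1
print(sum(tree.values()))
```

The blue-then-green branch comes out to 20/253, about 7.9%. If the order of the two marbles doesn’t matter, you would add the green-then-blue branch as well, which has the same probability.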

Conclusion


Hopefully this demonstration has given you a clearer mental picture of statistical probability. Even though conditional probability may seem elementary compared to the more advanced concepts in machine learning, having a solid understanding of the foundation on which data science is built is extremely important. So whenever you begin to learn something new, remember that no topic is too small and relearning is reinforcement.


Translated from: https://medium.com/swlh/conditional-probability-7f519a81655e
