How many of you use p = 0.05 as an absolute cutoff? p ≥ 0.05 means not significant. No evidence. Nada. And p < 0.05? Great, it's significant. This is a crude way of using p-values, and hopefully I will convince you of that.
What is a p-value?
A lot of us use p-values with this arbitrary cutoff without actually knowing the theory behind them. A p-value is the probability, under the null hypothesis, of observing data at least as extreme as the data actually observed. It is not, for example, the probability that some population parameter x = 0. In a frequentist setting, x either equals 0 or it does not.
So, the smaller the p-value, the more unlikely it is that this data would have been observed under the null hypothesis. In essence, the smaller the p-value, the stronger the evidence against the null hypothesis.
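To make the definition concrete, here is a minimal sketch using a made-up coin-flipping experiment (the coin, the counts, and the function name `binomial_upper_tail` are illustrative assumptions, not part of the original article). The null hypothesis is that the coin is fair, and the one-sided p-value is the probability of seeing at least as many heads as we observed if that were true.

```python
from math import comb

def binomial_upper_tail(successes: int, trials: int, p_null: float = 0.5) -> float:
    """One-sided p-value: P(X >= successes) for X ~ Binomial(trials, p_null)."""
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )

# Hypothetical data: 60 or 70 heads out of 100 flips of a supposedly fair coin.
p_60 = binomial_upper_tail(60, 100)
p_70 = binomial_upper_tail(70, 100)

print(f"60 heads in 100 flips: p = {p_60:.4f}")
print(f"70 heads in 100 flips: p = {p_70:.6f}")
```

The more extreme the observed count, the smaller the tail probability, which is exactly the sense in which a smaller p-value is stronger evidence against the null hypothesis.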
What affects p-values?
Mainly two things. The first is the strength of the effect: the greater the difference from the null hypothesis, the smaller the p-value will be.
The second is the sample size. The larger the sample, the smaller the p-value will be (if in fact the null hypothesis is false).
So, this means that if p ≥ 0.05, it could be because the effect isn’t that strong (or doesn’t exist) or that our sample is too small, resulting in our test being underpowered to detect a difference.
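Both influences can be seen in a small sketch. It assumes a two-sided one-sample z-test with a known standard deviation, which is a simplification chosen for illustration; the article does not name a specific test, and `two_sided_z_p_value` is a hypothetical helper.

```python
from math import erfc, sqrt

def two_sided_z_p_value(effect: float, n: int, sd: float = 1.0) -> float:
    """Two-sided p-value of a one-sample z-test for an observed mean
    difference `effect` from the null value, with known `sd` and sample size `n`."""
    z = effect / (sd / sqrt(n))
    # For a standard normal Z, P(|Z| >= |z|) equals erfc(|z| / sqrt(2)).
    return erfc(abs(z) / sqrt(2))

# Same sample size, growing effect: the p-value shrinks.
for effect in (0.05, 0.1, 0.2):
    print(f"effect={effect:<5} n=100   p={two_sided_z_p_value(effect, 100):.4f}")

# Same observed effect, growing sample: the p-value shrinks too.
for n in (25, 100, 400):
    print(f"effect=0.1   n={n:<5} p={two_sided_z_p_value(0.1, n):.4f}")
```

Under these assumptions the test statistic grows with both the effect and the square root of the sample size, so a large p-value can come from either a weak effect or a small sample.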
Some examples
A deadly drug
Suppose we are looking at adverse events of a new drug, and the test for whether the drug increases the death rate gives p = 0.051. If we use p = 0.05 as a cutoff, that's great: no evidence that the drug increases the death rate, so let's put it into production. Now imagine the test had given p = 0.049 instead. Oh no! There's evidence that the drug is harmful. Let's not put it into production.
Mathematically, there's not really a difference between the two. They are essentially the same. But by using this arbitrary cutoff we reach very different conclusions.
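One way to see how close the two results are is to convert each p-value back into the test statistic that would produce it. The sketch below assumes a two-sided test with a standard normal test statistic, which the article does not specify; `z_from_two_sided_p` is a hypothetical helper.

```python
from statistics import NormalDist

def z_from_two_sided_p(p: float) -> float:
    """|z| that yields two-sided p-value `p` under a standard normal null."""
    return NormalDist().inv_cdf(1 - p / 2)

z_not_significant = z_from_two_sided_p(0.051)
z_significant = z_from_two_sided_p(0.049)

print(f"p = 0.051 corresponds to |z| = {z_not_significant:.3f}")
print(f"p = 0.049 corresponds to |z| = {z_significant:.3f}")
print(f"difference in |z|: {z_significant - z_not_significant:.3f}")
```

Under that assumption the two test statistics differ by less than 0.02 standard errors, yet the cutoff puts them on opposite sides of the decision.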
Does this drug work?
Now imagine another drug. We've got a very large sample (n = 10,000) and we want to know whether this drug cures cancer. The test for a curative effect gives p = 0.049. Great! Significant evidence that this drug cures cancer. Let's give it to everyone.
But it's a large sample. If the drug really worked well, wouldn't we expect p to be much smaller? This isn't very strong evidence against the null hypothesis: if the drug did nothing, we would still see results at least this extreme about one time in twenty. Now suppose this drug is really expensive. Do we really want to start giving it out to everyone based on some fairly weak evidence? Probably not.
Of course, if p = 0.001, results at least this extreme would turn up only about one time in a thousand if the drug did nothing. That would be much stronger evidence that the drug works.
So how should we interpret p-values?
As a continuous scale. The smaller the p-value, the stronger the evidence. But you should take the sample size and effect size into account. You should also consider whether you are looking at something positive or negative. With something like our deadly drug example, we should be concerned even if the evidence is very weak. With something like wanting to know whether a drug works, however, we can afford to be much more sceptical about our result.
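One rough way to take sample size into account is to ask how large an effect a given p-value implies. The sketch below assumes a two-sided one-sample z-test on standardized data (standard deviation 1), which is an illustrative simplification rather than anything stated in the article; `implied_standardized_effect` is a hypothetical helper.

```python
from math import sqrt
from statistics import NormalDist

def implied_standardized_effect(p: float, n: int) -> float:
    """Observed effect, in standard deviation units, that would give
    two-sided p-value `p` in a one-sample z-test with sample size `n`."""
    z = NormalDist().inv_cdf(1 - p / 2)
    return z / sqrt(n)

small_study = implied_standardized_effect(0.049, 100)
large_study = implied_standardized_effect(0.049, 10_000)

print(f"p = 0.049 with n = 100:    effect of about {small_study:.3f} sd")
print(f"p = 0.049 with n = 10,000: effect of about {large_study:.4f} sd")
```

Under these assumptions, the same p-value in the larger study corresponds to an observed effect ten times smaller, which is why the p-value is best read together with the effect size and the sample size.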
So, hopefully in the future, you'll stop using p = 0.05 as some arbitrary threshold and treat the p-value as what it truly is: the weight of evidence against the null hypothesis. And, of course, if you don't have the evidence you need, that isn't necessarily because the effect doesn't exist; it could be that you lack the statistical power to detect it.
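Statistical power can be sketched in the same simplified setting: a two-sided one-sample z-test with known standard deviation 1. Both that setting and the helper `z_test_power` are assumptions made for illustration. The `alpha` argument is only the conventional level needed to define power, not a recommended cutoff.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(effect: float, n: int, alpha: float = 0.05) -> float:
    """Probability that a two-sided z-test at level `alpha` rejects the null
    when the true standardized effect is `effect` and the sample size is `n`."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect * sqrt(n)  # mean of the z statistic under the true effect
    # Reject when the statistic lands in either tail beyond z_crit.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)

for n in (50, 200, 500):
    print(f"true effect 0.2 sd, n = {n:<4} power = {z_test_power(0.2, n):.3f}")
```

With a true effect of 0.2 standard deviations, a study of 50 would miss it most of the time under these assumptions, so a large p-value from such a study says little about whether the effect exists.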
Translated from: https://towardsdatascience.com/stop-using-p-0-05-4a059e622c75