Thinking in Terms of Bits
Imagine you want to send the outcomes of 3 coin flips to your friend. Your friend knows you are going to send these messages, but all he can do is get answers to yes/no questions he has arranged in advance. Let's say the arranged question is: "Is it heads?" You send him a sequence of zeros and ones as answers to that question; each such answer is commonly known as a bit (binary digit). If zero represents "no" and one represents "yes", and the actual outcomes of the tosses were heads, heads, and tails, then you would send him 1 1 0 to convey your facts (information). So it costs 3 bits to send those messages.
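To make the encoding concrete, here is a minimal Python sketch (the outcome labels and variable names are illustrative, not from the original post) that maps the three tosses to the yes/no answers described above:

```python
# Encode coin-flip outcomes as answers to "Is it heads?" (1 = yes, 0 = no).
outcomes = ["H", "H", "T"]
bits = [1 if outcome == "H" else 0 for outcome in outcomes]

print(bits)               # [1, 1, 0]
print(len(bits), "bits")  # 3 bits
```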
How many bits does it take to make humans?
Our entire genetic code is contained in DNA as a sequence of four bases: T, A, G, and C. We need 2 bits (00, 01, 10, 11) to encode these four states. Multiplied by the roughly 6 billion letters of genetic code in the genome, that yields about 1.5 GB of information. So we could fit our entire genetic code on a single DVD.
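The arithmetic behind that estimate can be checked with a quick sketch (the figure of 6 billion bases and the use of decimal gigabytes are the assumptions stated above):

```python
# Back-of-the-envelope check of the genome estimate.
bases = 6_000_000_000
bits_per_base = 2                  # four states (T, A, G, C) need log2(4) = 2 bits
total_bits = bases * bits_per_base
total_gb = total_bits / 8 / 1e9    # bits -> bytes -> gigabytes (decimal)

print(total_gb)  # 1.5
```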
How Much Information?
Suppose I flip a fair coin with a 50% chance of getting heads and a 50% chance of getting tails. Now suppose that, instead of the fair coin, I flip a biased coin with heads on both sides. In which case do you think I am more certain about the outcome of the toss? Obviously the answer is the biased coin. With the fair coin I am uncertain about the outcome, because neither possibility is more likely than the other, while with the biased coin I am not uncertain at all, because I know I will get heads.
Let's look at it the other way. Would you be surprised if I told you that the coin with heads on both sides came up heads? No, because you did not learn anything new from that statement. The outcome did not give you any further information. On the other hand, when the coin is fair you have the least knowledge of what will happen next, so each toss gives you new information.
So the intuition behind quantifying information is the idea of measuring how much surprise there is in an event. Those events that are rare (low probability) are more surprising and therefore have more information than those events that are common (high probability).
Low Probability Event: High Information (surprising)
High Probability Event: Low Information (unsurprising)
As a prerequisite, if you want to learn about basic probability theory, I wrote about that here.
So information seems to be randomness: if we want to know how much information something contains, we need to know how random and unpredictable it is. Mathematically, the information gained by observing an event x that occurs with probability P(x) is given by:

I(x) = log₂(1 / P(x)) = -log₂ P(x)
By plugging values into the formula, we can clearly see that the information contained in a certain event, such as observing heads when tossing a coin with heads on both sides, is 0, while an uncertain event yields more information once it is observed. So this definition satisfies the basic requirement that information is a decreasing function of p.
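A minimal sketch of this formula in Python (the function name and the example probabilities are mine, chosen to mirror the coin examples above):

```python
import math

def information(p: float) -> float:
    """Bits of information gained by observing an event of probability p."""
    return math.log2(1 / p)

print(information(0.5))   # fair-coin heads        -> 1.0 bit
print(information(1.0))   # two-headed coin heads  -> 0.0 bits (no surprise)
print(information(0.01))  # a rare event           -> ~6.64 bits (very surprising)
```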
But you may have a couple of questions…
Why the logarithmic function?
And what is the base of the logarithm?
As an answer to the second question: you can use any base for the logarithm. In information theory we use base 2, in which case the unit of information is called a bit.
The answer to the first question is that the resulting definition has several elegant properties, and the logarithm is the simplest function that provides them. One of these properties is additivity. If you have two independent events (i.e., events that have nothing to do with each other), then the probability that both occur is equal to the product of the probabilities with which they each occur. What we would like is for the corresponding information to add up.
For instance, the event that it rained in Kathmandu yesterday and the event that this post reaches some number of views are independent. If I am told about both events, the amount of information I now have should be the sum of the information in being told of each event individually.
The logarithmic definition provides us with the desired additivity, because for two independent events A and B:

I(A and B) = log₂(1 / (P(A) · P(B))) = log₂(1 / P(A)) + log₂(1 / P(B)) = I(A) + I(B)
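A quick numerical check of that additivity (the probabilities below are made-up illustrative values, not figures from the original post):

```python
import math

p_rain, p_views = 0.3, 0.2     # made-up probabilities for two independent events
p_joint = p_rain * p_views     # independence: P(A and B) = P(A) * P(B)

info_joint = math.log2(1 / p_joint)
info_sum = math.log2(1 / p_rain) + math.log2(1 / p_views)

print(math.isclose(info_joint, info_sum))  # True
```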
Entropy
Entropy is the average (expected) amount of information conveyed by identifying the outcome of some random source:

H(X) = Σ P(x) · log₂(1 / P(x)), where the sum runs over all possible outcomes x.
It is simply the sum of several terms, each of which is the information of a given event weighted by the probability of that event occurring.
Like information, entropy is also measured in bits. If we use log₂ for our calculation, we can interpret entropy as the number of bits it would take us, on average, to encode our information.
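A small sketch of the entropy formula above (the function name and example distributions are mine; the last one mirrors the DNA example, assuming the four bases are equally likely):

```python
import math

def entropy(probs):
    """Expected information, in bits, of a distribution given as a list of probabilities."""
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))   # fair coin                        -> 1.0 bit per toss
print(entropy([1.0]))        # two-headed coin                  -> 0.0 bits
print(entropy([0.25] * 4))   # one of four equally likely bases -> 2.0 bits
```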
In the important special case of two mutually exclusive events (i.e., exactly one of the two events can occur), occurring with probabilities p and 1 - p respectively, the entropy is:

H(p) = -p · log₂ p - (1 - p) · log₂(1 - p)
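Evaluating this binary entropy at a few values of p shows the behaviour described earlier: the fair coin (p = 0.5) is maximally uncertain, while a certain outcome carries no information. The helper below is a sketch, not code from the original post:

```python
import math

def binary_entropy(p: float) -> float:
    """Entropy, in bits, of a two-outcome source with probabilities p and 1 - p."""
    if p in (0.0, 1.0):
        return 0.0  # a certain outcome carries no information
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

for p in (0.0, 0.1, 0.5, 0.9, 1.0):
    print(p, round(binary_entropy(p), 3))
# Maximum of 1.0 bit at p = 0.5; 0 bits at p = 0.0 and p = 1.0
```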
Takeaway
Information is randomness. The more uncertain we are about the outcome of an event, the more information we get after observing it.
- The range of entropy: 0 ≤ Entropy ≤ log(n), where n is the number of outcomes.
- The minimum entropy of 0 occurs when one of the probabilities is 1 and the rest are 0.
- The maximum entropy of log(n) occurs when all the probabilities have the equal value 1/n (a quick check in code follows below).
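A quick check of these bounds in bits (so log(n) becomes log₂(n)); the two distributions below are simply the extreme cases named in the list above:

```python
import math

def entropy(probs):
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

n = 4
print(entropy([1.0, 0.0, 0.0, 0.0]))  # minimum: 0.0
print(entropy([1 / n] * n))           # maximum: 2.0
print(math.log2(n))                   # log2(4) = 2.0
```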
Translated from: https://medium.com/@regmi.sobit/why-randomness-is-information-f2468966b29d