A Way Out of the Prisoner's Dilemma

You and your friend have committed a murder. A few days later, the cops pick the two of you up and put you in two separate interrogation rooms such that you have no communication with each other. You think your life is over, but the police offer up a deal:

Rat your friend out, and your sentence will be lighter.

The catch is that your friend is offered this deal, too.

More specifically, if you rat your friend out but your friend says nothing, they get a heavy sentence and you get a light one. If you rat each other out, there’s a heavy penalty for both. If both of you stay silent, the sentences are light for you and your friend.

The decisions must be made without communicating with each other, and both you and your friend only have two choices: “defect” and rat them out, or “withhold” information from the cops and stay silent.

Take a look at the following diagram that describes your choices and sentences:

Figure: A punishment diagram for you and your friend.

The matrix describes the number of years the two of you get depending on what you and your friend independently choose to do. The first number represents your prison time and the second represents your friend’s.

For example, if you withhold information from the cops, but your friend chooses to rat you out, you get five years in a maximum-security prison but your friend only gets one.
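
The matrix can also be written down as a small lookup table. The sketch below is only an illustration: the numbers come from the examples in this article (3 years each if both withhold, 5 and 1 if one withholds and the other defects, 10 each if both defect), while the names `SENTENCES` and `sentences` are made up for this example.

```python
# Years in prison as (your sentence, your friend's sentence).
# Values are taken from the examples in the text; the names are hypothetical.
SENTENCES = {
    ("withhold", "withhold"): (3, 3),
    ("withhold", "defect"): (5, 1),
    ("defect", "withhold"): (1, 5),
    ("defect", "defect"): (10, 10),
}


def sentences(your_choice, friend_choice):
    """Look up both sentences for one pair of independent choices."""
    return SENTENCES[(your_choice, friend_choice)]


# You stay silent, your friend rats you out.
yours, friends = sentences("withhold", "defect")
print(f"You: {yours} years, your friend: {friends} years")
```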

What would you do in this situation?

What should you do in this situation?

There's no easy solution to this problem, and it can be interpreted in many ways. A utilitarian might say withholding is the better option because it minimizes the total number of years the two prisoners spend in prison (5 + 1 or 3 + 3 as opposed to 10 + 10). What does probability say?

Solving the Dilemma

I created a population of 1000 prisoners, each with a 50–50 chance of being a defector or withholder. I then simulated an interaction for 500 pairs of prisoners and took note of the outcomes. I repeated this experiment 200 times.
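
The original simulation code is not shown here, so what follows is a minimal sketch of how such an experiment could be set up, not the author's implementation. It assumes each prisoner's strategy is drawn independently, that consecutive prisoners in the list form a pair, and that the sentences are the ones from the examples above. The names `run_experiment`, `simulate`, `p_defect` and the fixed seed are assumptions made for this illustration.

```python
import random

# (first prisoner's years, second prisoner's years), as described in the text.
SENTENCES = {
    ("withhold", "withhold"): (3, 3),
    ("withhold", "defect"): (5, 1),
    ("defect", "withhold"): (1, 5),
    ("defect", "defect"): (10, 10),
}


def run_experiment(rng, n_prisoners=1000, p_defect=0.5):
    """Pair up one random population and return mean years per strategy."""
    population = [
        "defect" if rng.random() < p_defect else "withhold"
        for _ in range(n_prisoners)
    ]
    years = {"defect": [], "withhold": []}
    # Strategies are drawn independently, so pairing neighbours is a random pairing.
    for a, b in zip(population[0::2], population[1::2]):
        years_a, years_b = SENTENCES[(a, b)]
        years[a].append(years_a)
        years[b].append(years_b)
    return {
        strategy: sum(values) / len(values) if values else float("nan")
        for strategy, values in years.items()
    }


def simulate(n_runs=200, n_prisoners=1000, p_defect=0.5, seed=0):
    """Repeat the experiment n_runs times with a seeded generator."""
    rng = random.Random(seed)
    return [run_experiment(rng, n_prisoners, p_defect) for _ in range(n_runs)]


results = simulate()
for strategy in ("defect", "withhold"):
    overall = sum(run[strategy] for run in results) / len(results)
    print(f"{strategy}: {overall:.2f} years on average")
```

Changing `p_defect` gives populations with a different mix of defectors and withholders.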

If the number of defectors and withholders is roughly the same, it's far better to withhold information. Most of the time, the payoff will be in your favor. The average number of years a defector spent in prison was 5.75, versus 4.01 for a withholder.

We can make this a bit more interesting by varying the proportion of defectors and withholders. Say you're in a country where people distrust one another: 75% of the population are defectors and 25% are withholders.

Your best bet is to withhold information from the police. This makes sense: if you were to defect as well, there's a much higher probability that you'll end up with the worst outcome in the game, 10 years apiece.
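
A back-of-the-envelope check of this, under the simplifying assumption that your partner is a defector with probability p independently of your own choice (the symbol p is introduced here for illustration; the sentences are the ones from the examples above):

```latex
\begin{align*}
E[\text{years} \mid \text{defect}]   &= 1\,(1 - p) + 10\,p = 1 + 9p \\
E[\text{years} \mid \text{withhold}] &= 3\,(1 - p) + 5\,p  = 3 + 2p
\end{align*}
% With p = 0.75: defecting costs 7.75 years on average, withholding 4.5.
% With p = 0.5:  defecting costs 5.5 years on average, withholding 4.0.
```

The p = 0.5 values are roughly in line with the simulated averages reported for the evenly split population.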

Let’s try one more. Say everyone is overly trusting (despite the fact that they’ve just committed a murder).

This one is more interesting. It seems that you'd be slightly better off withholding information, but the payoff is wildly inconsistent (and in some cases ends up being worse than defecting!). This is because the possible sentences for withholding are either three or five years, depending on what the other prisoner does, while the possible sentences for defecting are either one or ten years. The difference between 3 and 5 is a lot less than the difference between 1 and 10, so there's a lot more variance in the latter scenario.
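
To put rough numbers on that spread, here is a small sketch that computes the mean and standard deviation of a single sentence for each strategy. It assumes the partner defects with probability `p_defect`, independently of your own choice. The article does not state the proportion used for the trusting population, so the value 0.25 below is purely an assumption, as are the names `OUTCOMES` and `mean_and_std`.

```python
from math import sqrt

# Possible sentences per strategy: (partner withholds, partner defects).
OUTCOMES = {"withhold": (3, 5), "defect": (1, 10)}


def mean_and_std(strategy, p_defect):
    """Mean and standard deviation of one sentence for a two-outcome gamble."""
    if_withholds, if_defects = OUTCOMES[strategy]
    mean = if_withholds * (1 - p_defect) + if_defects * p_defect
    std = abs(if_defects - if_withholds) * sqrt(p_defect * (1 - p_defect))
    return mean, std


# Assumed proportion of defectors in a "trusting" population.
for strategy in OUTCOMES:
    mean, std = mean_and_std(strategy, p_defect=0.25)
    print(f"{strategy}: mean {mean:.2f} years, std {std:.2f} years")
```

Under these sentences the two means are equal at a defector share of 2/7, so in a mostly trusting population the averages sit close together while the defector's spread is several times larger. That is consistent with the outcome flipping from one run to the next.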

What would you do in this particular situation?

If you enjoyed this article, you can follow me on Medium for more content like this. Thanks for reading!

Translated from: https://towardsdatascience.com/a-computational-approach-to-the-prisoners-dilemma-837a799cedf0
