
Conference rusure::math

Title:                  Mathematics at DEC
Moderator:              RUSURE::EDP
Created:                Mon Feb 03 1986
Last Modified:          Fri Jun 06 1997
Last Successful Update: Fri Jun 06 1997
Number of topics:       2083
Total number of notes:  14613

518.0. "prisoner's dilemma" by LATOUR::JMUNZER () Tue Jun 17 1986 17:19

A fine article in today's NY Times ("Prisoner's Dilemma:  Why People
Cooperate", page C1) starts:

	You and your accomplice have been caught red-handed.  Together
	your best hope is to cooperate by remaining silent:  You will
	each get off with a 30-day sentence.  But either of you can do
	better for yourself:  Double-cross your partner and you will go
	free while he serves 10 years.  The problem is, if you each
	betray the other, you will both go to prison -- and not just for
	30 days, but for eight years.

It goes on to talk about the game below, strategies for it, computer
tournaments, and real-world applications.

Want to try strategies?  Or, as a spoiler, read the article.

John

 **  **  **  **  **  **  **  **  **  **  **  **  **  **  **  ** 

     \		If...
       \	    You				You
	 \	    cooperate			defect
and if...  \______________________________________________________
	   |	then...			| then...
Opponent   |	      You get 3 points	| 	You get 5 points
cooperates |	           and          |            and
	   |          Opponent gets	|       Opponent gets
	   |             3 points       |          0 points
	   |-------------------------------------------------------
	   |	then...			| then...
Opponent   |	      You get 0 points	| 	You get 1 point
defects    |	           and          |            and
	   |          Opponent gets	|       Opponent gets
	   |             5 points       |          1 point
           --------------------------------------------------------
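
The table above can be written down directly as a small payoff function. A minimal sketch, using the point values from the table; the move labels "C"/"D" and function names are mine, not from the article:

```python
# Payoff table from the base note: each entry maps
# (your move, opponent's move) to (your points, opponent's points).
# "C" = cooperate, "D" = defect.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def play_round(you, opponent):
    """Return (your points, opponent's points) for one round."""
    return PAYOFF[(you, opponent)]

print(play_round("D", "C"))  # (5, 0): you defect against a cooperator
```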
518.1. "Who needs rational decisions?" by MODEL::YARBROUGH () Tue Jun 17 1986 18:03
    The amusing thing about this game is that the best strategy is not
    (entirely) rational. If it were entirely rational, then both players
    would use the same strategy and make the same moves, since the game
    is completely symmetrical. But if both are going to make the same
    moves, then they should both cooperate. And cooperation is always
    the worst choice, as defecting will always be more profitable! So
    some irrational strategy must evolve from the dilemma.
    
    A random strategy can also be demonstrated to be bad, as any random
    selection to cooperate will be punished by the other player's choice
    of defecting.
    
    Typical game-theory considerations for this game break down because
    the game is non-zero-sum, i.e. it is not the case that your gain
    is the other player's loss.
518.2. by BEING::POSTPISCHIL (Always mount a scratch monkey.) Wed Jun 18 1986 13:15
    Re .1:
    
    > A random strategy can also be demonstrated to be bad, as any random
    > selection to cooperate will be punished by the other player's choice
    > of defecting.
    
    I'm not sure that's correct.  A selection to cooperate will not
    always be punished by the other player's choice of defecting, since
    the other player will not always choose that (assuming a good player).
    
    
    				-- edp
518.3. "one-shot vs serial" by MODEL::YARBROUGH () Wed Jun 18 1986 14:33
    Re .-1; What do you mean by 'always'? The problem was originally
    stated as a one-shot situation. Your remark makes sense if you assume
    the game might be played many times, but then we are into the area
    of comparing dynamic strategies, which is fun, but for which random
    strategies tend not to work well in comparison with others.
    
    I'm a little sorry I got wound up on this. The whole game is a great
    time sink.
518.4. by BEING::POSTPISCHIL (Always mount a scratch monkey.) Wed Jun 18 1986 16:50
    Re .3:
    
    I said "always" because we must consider all the possible variations
    on the play before concluding what would happen.  There is no
    difference in what you must decide to do if you are playing the
    game only once or are going to play it many times.
    
    Consider that zero-sum games have the same problem:  Selecting a
    certain strategy will often be disadvantageous if the opponent selects
    some strategy.  That doesn't affect the fact that a random choice
    is the best way to go in games without a minimax point.
    
    Also, this game has a minimax point.  Consider that your opponent
    will either cooperate or double-cross you.  If your opponent cooperates
    and you double-cross, you go free instead of receiving 30 days.
    If your opponent double-crosses you and you double-cross, you get
    eight years instead of ten years.  In both cases, double-crossing
    is the better choice.
    
    
    				-- edp
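
The dominance argument in .4 can be checked mechanically. A minimal sketch, restated in the point values from the table in .0 rather than the prison sentences; the labels are mine:

```python
# Check that defecting ("D") strictly dominates cooperating ("C"):
# whatever the opponent does, your payoff for D beats your payoff
# for C.  Your points, from the table in the base note.
MY_POINTS = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

for opp in ("C", "D"):
    assert MY_POINTS[("D", opp)] > MY_POINTS[("C", opp)]
print("defecting strictly dominates cooperating")
```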
518.5. "Nope, multi-games ARE different" by TLE::BRETT () Wed Jun 18 1986 17:46
    > There is no difference in what you must decide to do if you are
    > playing the game once or are going to play it many times.
    
    Totally wrong.  If you are going to play the game repeatedly, you
    can use your successive plays to, say, code messages to the other
    person.  If the game is such that cooperation benefits you both
    then such messages can be VERY useful.
    
    For example: Punishment - if you screw me in this game I will ALWAYS
    screw you in the next.
    
    Such a technique cannot be applied to "play once" games.  In fact
    the zero-sum, one-shot games are the most probabilistic.
    
    Consider this game instead...
    
                                Me
                          0          1
                       ----------------------
        You       0 |    1/0       -2/1000         Your points/my points
                    |
                  1 |   -1/0       -1/1
    
    Now, no matter what you do, I do best by playing a 1, so in a one-round
    game, I should play 1, in which case you should play a 1 also.  If
    you repeat this analysis for 10 games the total score is -10/10
    
    In a multi-round game I could play a 0 repeatedly until you switched
    to playing 0's also, and then every tenth game I could play a 1.
    In this case in 10 games the total score would be 7/1000 - we are
    both better off!
    
    /Bevin
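
The two 10-round scenarios in .5 can be tallied with a short sketch. This assumes the literal schedule described (nine rounds of mutual 0, then one defection); the function name is mine:

```python
# Payoff matrix from .5: (your move, my move) -> (your points, my points).
PAYOFF = {
    (0, 0): (1, 0),
    (0, 1): (-2, 1000),
    (1, 0): (-1, 0),
    (1, 1): (-1, 1),
}

def total(your_moves, my_moves):
    """Sum both players' points over a sequence of simultaneous moves."""
    yours = sum(PAYOFF[(y, m)][0] for y, m in zip(your_moves, my_moves))
    mine = sum(PAYOFF[(y, m)][1] for y, m in zip(your_moves, my_moves))
    return yours, mine

# Both play the one-shot "rational" move (1) every round:
print(total([1] * 10, [1] * 10))        # (-10, 10)

# Cooperative schedule: you play 0 throughout; I play 0 for nine
# rounds, then defect once on the tenth:
print(total([0] * 10, [0] * 9 + [1]))   # (7, 1000) - both better off
```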
518.6. by BEING::POSTPISCHIL (Always mount a scratch monkey.) Wed Jun 18 1986 21:32
    Re .5:
    
    The n-th case is still the same as the first case, assuming you
    know your partner/opponent is intelligent.  You don't know what the other
    player will do, and you must decide what to do now.  Unless there
    is a pure strategy, as in the original example, you must make a
    random selection, and the question is reduced to asking the
    probabilities you should assign to each choice.
    
    If the probability you (labeled "Me" in your diagram) choose 0 is p and
    the probability I choose 1 is q, it is simple to write our points as
    functions of p and q.  If p and q were independent, p would be 0 or 1,
    depending on the value of q.  But they are not independent, because if
    you made p 0 or 1, I might adjust q accordingly.  Finding the
    appropriate values of p and q then becomes a problem (for me) of
    selecting a function f such that q = f(p) and my points are maximized.
    You have the task of selecting a function g such that p = g(q) and your
    points are maximized.  This again brings us to a game, that of
    selecting functions.  The new game involves an infinite number of
    choices, but the strategy will be pure (because a random selection
    between two functions assigning probabilities will be equivalent to
    some single function assigning probabilities, and that function
    exists as a play in the new game).  How do we find those functions?
    
    
    				-- edp 
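
The expected-payoff functions of p and q that .6 calls "simple to write" can be sketched out explicitly, using the matrix from .5 and edp's labeling (p = probability "Me" plays 0, q = probability "You" plays 1); the function name is mine:

```python
# Payoff matrix from .5: (You's move, Me's move) -> (You's pts, Me's pts).
PAYOFF = {(0, 0): (1, 0), (0, 1): (-2, 1000), (1, 0): (-1, 0), (1, 1): (-1, 1)}

def expected(p, q):
    """Expected (You's points, Me's points) for mixed strategies:
    Me plays 0 with probability p; You plays 1 with probability q."""
    ev_you = ev_me = 0.0
    for you_move in (0, 1):
        for me_move in (0, 1):
            prob = (q if you_move == 1 else 1 - q) * \
                   (p if me_move == 0 else 1 - p)
            pts = PAYOFF[(you_move, me_move)]
            ev_you += prob * pts[0]
            ev_me += prob * pts[1]
    return ev_you, ev_me

print(expected(1.0, 0.0))  # both cooperate deterministically: (1.0, 0.0)
print(expected(0.0, 1.0))  # both play 1 deterministically: (-1.0, 1.0)
```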
518.7. "There is a better theoretical basis" by MODEL::YARBROUGH () Thu Jun 19 1986 14:43
    The arguments of .6 appear to me to be based on theory developed
    for two-person, zero-sum games, and appear to be correct for that
    basis; unfortunately, the  game in question is two-person NON-zero-sum,
    and the theory is not wholly applicable. In particular, the Prisoner's
    Dilemma game has a saddle point in its matrix, but there is another
    set of strategies for which the sum is significantly different,
    so there is motivation to find a way to influence the other player
    to choose a strategy with the largest SUM, not necessarily the largest
    immediate payoff.
    
    Another way of looking at the game is as a THREE-person zero-sum
    game in which only one of the players (the State) is allowed to
    offer coalitions with another player. In this interpretation, each
    of the other players (prisoners) can choose to accept the offered
    coalition, but if there is some way to encode by play selection
    an offer of the prisoner-prisoner coalition, the State loses and
    the prisoners each win more than with the State-prisoner coalition.
518.8. by BEING::POSTPISCHIL (Always mount a scratch monkey.) Thu Jun 19 1986 16:37
    Can somebody demonstrate a game with a scheme for cooperation such
    that the scheme produces better results than any assignment of
    probabilities to the plays?
    
    
    				-- edp
518.9. "The coalition game" by MODEL::YARBROUGH () Fri Jun 20 1986 12:59
    Well, there's the "Coalition" game: three-person, zero-sum. At each
    turn the majority players win, the minority loses. The only strategy
    in the game is to form a coalition with one other player. A secondary
    issue is whether or not you can pay one of the other players enough
    to break up an established (and very profitable) coalition. I don't
    see any way in which randomness is relevant; negotiation is everything.
518.10. by BEING::POSTPISCHIL (Always mount a scratch monkey.) Fri Jun 27 1986 16:34
    Okay, I concede.  Now, in games where cooperation is a benefit,
    what happens?  If there is no simple solution, such as in the first
    game where both players remain silent, I don't think we can do much
    analysis without more specification.  For example, if there is a
    choice of "best" plays of one player getting 8 and the other player
    getting 10 or vice-versa, which will be chosen?  In reality, it
    depends on the wills of the players and other factors.
    
    
    				-- edp
518.11. "tit for tat" by LATOUR::JMUNZER () Thu Jul 10 1986 21:31
From the article of .0:

	The basic prisoner's dilemma model is more than three decades old,
invented by two scientists at the Rand Corporation.  The modern era of
prisoner's dilemma research, however, began in 1978, when Dr. Axelrod decided
to test different strategies experimentally, by pitting them against each
other in a series of computer tournaments.  Contestants submitted tiny programs
that competed with one another in a computerized round-robin prisoner's
dilemma universe.
	The most fertile legacy of Dr. Axelrod's original tournaments was
the discovery of what is, in almost all circumstances, the best strategy
for playing prisoner's dilemma.  It is called "Tit for Tat," and it can
be summed up this way:  Do unto your opponent as he has just done unto you.
	...Tit for Tat did better than strategies that always betrayed,
or betrayed at random, or used sophisticated statistical strategies....
	Those strategies never lost when playing Tit for Tat -- they couldn't.
But when they were playing each other, they tended to do badly, because
they defected too much....

    John
518.12. by ERIS::CALLAS (Jon Callas) Fri Jul 11 1986 16:02
    No, tit for tat is not the best strategy. It is simply the best
    strategy that participated in Axelrod's tournaments. In "The Evolution
    of Cooperation," Axelrod's book on the subject, he mentions a couple
    times that there were strategies that would have done better than tit
    for tat. And it is certainly not proved that tit for tat is optimal.
    However, the strategies that do well are all related to tit for tat,
    and tit for tat will always do well. Its biggest advantage is that it
    is so simple. 

    	Jon
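
The behavior .11 and .12 describe can be illustrated with a minimal iterated-dilemma sketch: Tit for Tat never beats "always defect" head-to-head, yet mutual Tit for Tat scores far better. This uses the point values from .0; the strategy functions and structure are illustrative, not Axelrod's actual tournament code:

```python
# (my move, opponent's move) -> (my points, opponent's points), from .0.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(opp_history):
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not opp_history else opp_history[-1]

def always_defect(opp_history):
    return "D"

def match(a, b, rounds=10):
    """Play two strategies against each other; return their total scores."""
    score_a = score_b = 0
    hist_a, hist_b = [], []   # each side's record of the *other's* moves
    for _ in range(rounds):
        move_a, move_b = a(hist_a), b(hist_b)
        pts_a, pts_b = PAYOFF[(move_a, move_b)]
        score_a += pts_a
        score_b += pts_b
        hist_a.append(move_b)
        hist_b.append(move_a)
    return score_a, score_b

print(match(tit_for_tat, always_defect))  # (9, 14): TFT never "wins" a match
print(match(tit_for_tat, tit_for_tat))    # (30, 30): mutual cooperation
```

As .12 notes, the point is not that Tit for Tat wins individual matches (here it loses 9 to 14); it is that it does well in aggregate, since cooperating strategies rack up 3 points per round against each other while defectors grind out 1.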