
Conference rusure::math

Title:                  Mathematics at DEC
Moderator:              RUSURE::EDP
Created:                Mon Feb 03 1986
Last Modified:          Fri Jun 06 1997
Last Successful Update: Fri Jun 06 1997
Number of topics:       2083
Total number of notes:  14613

518.0. "prisoner's dilemma" by LATOUR::JMUNZER () Tue Jun 17 1986 17:19

A fine article in today's NY Times ("Prisoner's Dilemma:  Why People
Cooperate", page C1) starts:

	You and your accomplice have been caught red-handed.  Together
	your best hope is to cooperate by remaining silent:  You will
	each get off with a 30-day sentence.  But either of you can do
	better for yourself:  Double-cross your partner and you will go
	free while he serves 10 years.  The problem is, if you each
	betray the other, you will both go to prison -- and not just for
	30 days, but for eight years.

It goes on to talk about the game below, strategies for it, computer
tournaments, and real-world applications.

Want to try strategies?  Or, as a spoiler, read the article.

John

 **  **  **  **  **  **  **  **  **  **  **  **  **  **  **  ** 

     \		If...
       \	    You				You
	 \	    cooperate			defect
and if...  \______________________________________________________
	   |	then...			| then...
Opponent   |	      You get 3 points	| 	You get 5 points
cooperates |	           and          |            and
	   |          Opponent gets	|       Opponent gets
	   |             3 points       |          0 points
	   |-------------------------------------------------------
	   |	then...			| then...
Opponent   |	      You get 0 points	| 	You get 1 point
defects    |	           and          |            and
	   |          Opponent gets	|       Opponent gets
	   |             5 points       |          1 point
           --------------------------------------------------------
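
The table above can be written down directly as a small payoff function. A minimal sketch, using the point values from the table; the move labels "C"/"D" and function names are mine, not from the article:

```python
# Payoff table from the base note: each entry maps
# (your move, opponent's move) to (your points, opponent's points).
# "C" = cooperate, "D" = defect.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def play_round(you, opponent):
    """Return (your points, opponent's points) for one round."""
    return PAYOFF[(you, opponent)]

print(play_round("D", "C"))  # (5, 0): you defect against a cooperator
```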
518.1. "Who needs rational decisions?" by MODEL::YARBROUGH () Tue Jun 17 1986 18:03
    The amusing thing about this game is that the best strategy is not
    (entirely) rational. If it were entirely rational, then both players
    would use the same strategy and make the same moves, since the game
    is completely symmetrical. But if both are going to make the same
    moves, then they should both cooperate. And cooperation is always
    the worst choice, as defecting will always be more profitable! So
    some irrational strategy must evolve from the dilemma.
    
    A random strategy can also be demonstrated to be bad, as any random
    selection to cooperate will be punished by the other player's choice
    of defecting.
    
    Typical game-theory considerations for this game break down because
    the game is non-zero-sum, i.e. it is not the case that your gain
    is the other player's loss.
518.2. by BEING::POSTPISCHIL (Always mount a scratch monkey.) Wed Jun 18 1986 13:15
    Re .1:
    
    > A random strategy can also be demonstrated to be bad, as any random
    > selection to cooperate will be punished by the other player's choice
    > of defecting.
    
    I'm not sure that's correct.  A selection to cooperate will not
    always be punished by the other player's choice of defecting, since
    the other player will not always choose that (assuming a good player).
    
    
    				-- edp
518.3. "one-shot vs serial" by MODEL::YARBROUGH () Wed Jun 18 1986 14:33
    Re .-1; What do you mean by 'always'? The problem was originally
    stated as a one-shot situation. Your remark makes sense if you assume
    the game might be played many times, but then we are into the area
    of comparing dynamic strategies, which is fun, but for which random
    strategies tend not to work well in comparison with others.
    
    I'm a little sorry I got wound up on this. The whole game is a great
    time sink.
518.4. by BEING::POSTPISCHIL (Always mount a scratch monkey.) Wed Jun 18 1986 16:50
    Re .3:
    
    I said "always" because we must consider all the possible variations
    on the play before concluding what would happen.  There is no
    difference in what you must decide to do if you are playing the
    game only once or are going to play it many times.
    
    Consider that zero-sum games have the same problem:  Selecting a
    certain strategy will often be disadvantageous if the opponent selects
    some strategy.  That doesn't affect the fact that a random choice
    is the best way to go in games without a minimax point.
    
    Also, this game has a minimax point.  Consider that your opponent
    will either cooperate or double-cross you.  If your opponent cooperates
    and you double-cross, you go free instead of receiving 30 days.
    If your opponent double-crosses you and you double-cross, you get
    eight years instead of ten years.  In both cases, double-crossing
    is the better choice.
    
    
    				-- edp
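
The dominance argument in .4 can be checked mechanically. A minimal sketch, restated in the point values from the table in .0 rather than the prison sentences; the labels are mine:

```python
# Check that defecting ("D") strictly dominates cooperating ("C"):
# whatever the opponent does, your payoff for D beats your payoff
# for C.  Your points, from the table in the base note.
MY_POINTS = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

for opp in ("C", "D"):
    assert MY_POINTS[("D", opp)] > MY_POINTS[("C", opp)]
print("defecting strictly dominates cooperating")
```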
518.5. "Nope, multi-games ARE different" by TLE::BRETT () Wed Jun 18 1986 17:46
    > There is no difference in what you must decide to do if you are
    > playing the game once or are going to play it many times.
    
    Totally wrong.  If you are going to play the game repeatedly, you
    can use your successive plays to, say, code messages to the other
    person.  If the game is such that cooperation benefits you both
    then such messages can be VERY useful.
    
    For example: Punishment - if you screw me in this game I will ALWAYS
    screw you in the next.
    
    Such a technique cannot be applied to "play once" games.  In fact
    the zero-sum, one-shot games are the most probabilistic.
    
    Consider this game instead...
    
                                Me
                          0          1
                       ----------------------
        You       0 |    1/0       -2/1000         Your points/my points
                    |
                  1 |   -1/0       -1/1
    
    Now, no matter what you do, I do best by playing a 1, so in a one-round
    game, I should play 1, in which case you should play a 1 also.  If
    you repeat this analysis for 10 games the total score is -10/10
    
    In a multi-round game I could play a 0 repeatedly until you switched
    to playing 0's also, and then every tenth game I could play a 1.
    In this case in 10 games the total score would be 7/1000 - we are
    both better off!
    
    /Bevin
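
The two 10-round scenarios in .5 can be tallied with a short sketch. This assumes the literal schedule described (nine rounds of mutual 0, then one defection); the function name is mine:

```python
# Payoff matrix from .5: (your move, my move) -> (your points, my points).
PAYOFF = {
    (0, 0): (1, 0),
    (0, 1): (-2, 1000),
    (1, 0): (-1, 0),
    (1, 1): (-1, 1),
}

def total(your_moves, my_moves):
    """Sum both players' points over a sequence of simultaneous moves."""
    yours = sum(PAYOFF[(y, m)][0] for y, m in zip(your_moves, my_moves))
    mine = sum(PAYOFF[(y, m)][1] for y, m in zip(your_moves, my_moves))
    return yours, mine

# Both play the one-shot "rational" move (1) every round:
print(total([1] * 10, [1] * 10))        # (-10, 10)

# Cooperative schedule: you play 0 throughout; I play 0 for nine
# rounds, then defect once on the tenth:
print(total([0] * 10, [0] * 9 + [1]))   # (7, 1000) - both better off
```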
518.6. by BEING::POSTPISCHIL (Always mount a scratch monkey.) Wed Jun 18 1986 21:32
    Re .5:
    
    The n-th case is still the same as the first case, assuming you
    know your partner/opponent is intelligent.  You don't know what the other
    player will do, and you must decide what to do now.  Unless there
    is a pure strategy, as in the original example, you must make a
    random selection, and the question is reduced to asking the
    probabilities you should assign to each choice.
    
    If the probability you (labeled "Me" in your diagram) choose 0 is p and
    the probability I choose 1 is q, it is simple to write our points as
    functions of p and q.  If p and q were independent, p would be 0 or 1,
    depending on the value of q.  But they are not independent, because if
    you made p 0 or 1, I might adjust q accordingly.  Finding the
    appropriate values of p and q then becomes a problem (for me) of
    selecting a function f such that q = f(p) and my points are maximized.
    You have the task of selecting a function g such that p = g(q) and your
    points are maximized.  This again brings us to a game, that of
    selecting functions.  The new game involves an infinite number of
    choices, but the strategy will be pure (because a random selection
    between two functions assigning probabilities will be equivalent to
    some single function assigning probabilities, and that function
    exists as a play in the new game).  How do we find those functions?
    
    
    				-- edp 
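
The expected-payoff functions of p and q that .6 calls "simple to write" can be sketched out explicitly, using the matrix from .5 and edp's labeling (p = probability "Me" plays 0, q = probability "You" plays 1); the function name is mine:

```python
# Payoff matrix from .5: (You's move, Me's move) -> (You's pts, Me's pts).
PAYOFF = {(0, 0): (1, 0), (0, 1): (-2, 1000), (1, 0): (-1, 0), (1, 1): (-1, 1)}

def expected(p, q):
    """Expected (You's points, Me's points) for mixed strategies:
    Me plays 0 with probability p; You plays 1 with probability q."""
    ev_you = ev_me = 0.0
    for you_move in (0, 1):
        for me_move in (0, 1):
            prob = (q if you_move == 1 else 1 - q) * \
                   (p if me_move == 0 else 1 - p)
            pts = PAYOFF[(you_move, me_move)]
            ev_you += prob * pts[0]
            ev_me += prob * pts[1]
    return ev_you, ev_me

print(expected(1.0, 0.0))  # both cooperate deterministically: (1.0, 0.0)
print(expected(0.0, 1.0))  # both play 1 deterministically: (-1.0, 1.0)
```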
518.7. "There is a better theoretical basis" by MODEL::YARBROUGH () Thu Jun 19 1986 14:43
    The arguments of .6 appear to me to be based on theory developed
    for two-person, zero-sum games, and appear to be correct for that
    basis; unfortunately, the  game in question is two-person NON-zero-sum,
    and the theory is not wholly applicable. In particular, the Prisoner's
    Dilemma game has a saddle point in its matrix, but there is another
    set of strategies for which the sum is significantly different,
    so there is motivation to find a way to influence the other player
    to choose a strategy with the largest SUM, not necessarily the largest
    immediate payoff.
    
    Another way of looking at the game is as a THREE-person zero-sum
    game in which only one of the players (the State) is allowed to
    offer coalitions with another player. In this interpretation, each
    of the other players (prisoners) can choose to accept the offered
    coalition, but if there is some way to encode by play selection
    an offer of the prisoner-prisoner coalition, the State loses and
    the prisoners each win more than with the State-prisoner coalition.
518.8. by BEING::POSTPISCHIL (Always mount a scratch monkey.) Thu Jun 19 1986 16:37
    Can somebody demonstrate a game with a scheme for cooperation such
    that the scheme produces better results than any assignment of
    probabilities to the plays?
    
    
    				-- edp
518.9. "The coalition game" by MODEL::YARBROUGH () Fri Jun 20 1986 12:59
    Well, there's the "Coalition" game: three-person, zero-sum. At each
    turn the majority players win, the minority loses. The only strategy
    in the game is to form a coalition with one other player. A secondary
    issue is whether or not you can pay one of the other players enough
    to break up an established (and very profitable) coalition. I don't
    see any way in which randomness is relevant; negotiation is everything.
518.10. by BEING::POSTPISCHIL (Always mount a scratch monkey.) Fri Jun 27 1986 16:34
    Okay, I concede.  Now, in games where cooperation is a benefit,
    what happens?  If there is no simple solution, such as in the first
    game where both players remain silent, I don't think we can do much
    analysis without more specification.  For example, if there is a
    choice of "best" plays of one player getting 8 and the other player
    getting 10 or vice-versa, which will be chosen?  In reality, it
    depends on the wills of the players and other factors.
    
    
    				-- edp
518.11. "tit for tat" by LATOUR::JMUNZER () Thu Jul 10 1986 21:31
From the article of .0:

	The basic prisoner's dilemma model is more than three decades old,
invented by two scientists at the Rand Corporation.  The modern era of
prisoner's dilemma research, however, began in 1978, when Dr. Axelrod decided
to test different strategies experimentally, by pitting them against each
other in a series of computer tournaments.  Contestants submitted tiny programs
that competed with one another in a computerized round-robin prisoner's
dilemma universe.
	The most fertile legacy of Dr. Axelrod's original tournaments was
the discovery of what is, in almost all circumstances, the best strategy
for playing prisoner's dilemma.  It is called "Tit for Tat," and it can
be summed up this way:  Do unto your opponent as he has just done unto you.
	...Tit for Tat did better than strategies that always betrayed,
or betrayed at random, or used sophisticated statistical strategies....
	Those strategies never lost when playing Tit for Tat -- they couldn't.
But when they were playing each other, they tended to do badly, because
they defected too much....

    John
518.12. by ERIS::CALLAS (Jon Callas) Fri Jul 11 1986 16:02
    No, tit for tat is not the best strategy. It is simply the best
    strategy that participated in Axelrod's tournaments. In "The Evolution
    of Cooperation," Axelrod's book on the subject, he mentions a couple
    times that there were strategies that would have done better than tit
    for tat. And it is certainly not proved that tit for tat is optimal.
    However, the strategies that do well are all related to tit for tat,
    and tit for tat will always do well. Its biggest advantage is that it
    is so simple. 

    	Jon
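
The behavior .11 and .12 describe can be illustrated with a minimal iterated-dilemma sketch: Tit for Tat never beats "always defect" head-to-head, yet mutual Tit for Tat scores far better. This uses the point values from .0; the strategy functions and structure are illustrative, not Axelrod's actual tournament code:

```python
# (my move, opponent's move) -> (my points, opponent's points), from .0.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(opp_history):
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not opp_history else opp_history[-1]

def always_defect(opp_history):
    return "D"

def match(a, b, rounds=10):
    """Play two strategies against each other; return their total scores."""
    score_a = score_b = 0
    hist_a, hist_b = [], []   # each side's record of the *other's* moves
    for _ in range(rounds):
        move_a, move_b = a(hist_a), b(hist_b)
        pts_a, pts_b = PAYOFF[(move_a, move_b)]
        score_a += pts_a
        score_b += pts_b
        hist_a.append(move_b)
        hist_b.append(move_a)
    return score_a, score_b

print(match(tit_for_tat, always_defect))  # (9, 14): TFT never "wins" a match
print(match(tit_for_tat, tit_for_tat))    # (30, 30): mutual cooperation
```

As .12 notes, the point is not that Tit for Tat wins individual matches (here it loses 9 to 14); it is that it does well in aggregate, since cooperating strategies rack up 3 points per round against each other while defectors grind out 1.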