
The evolution of cooperation


Review notes

This was one of those books where you are already aware of most of the ideas, but reading it brings them together into a much more exciting whole.

UNDER WHAT CONDITIONS will cooperation emerge in a world of egoists without central authority? This question has intrigued people for a long time. And for good reason. We all know that people are not angels, and that they tend to look after themselves and their own first. Yet we also know that cooperation does occur and that our civilization is based upon it. But, in situations where each individual has an incentive to be selfish, how can cooperation ever develop?

This is what I was thinking about when I chose this book. Can a world of egoists be 'nice'? Do we need to reach for morality or force (government) to explain cooperation, or is there a game-theoretic reason it emerges by itself? Under what conditions can it emerge, and under what conditions will it break down?

The book explores the question by looking at which behaviours are the most successful in the iterated prisoner's dilemma. Defection is of course the correct choice in a one-off prisoner's dilemma, but when the game is repeated an unknown number of times it is much less obvious what you should do. (This is interesting because the iterated prisoner's dilemma happens to model many interpersonal interactions quite well. When you order a pizza, the pizzeria could defect by giving you an inferior product to save money and time. You could defect by not paying for your pizza. That both are rare is essential for the world to run smoothly.)

One surprising conclusion in the book is that tit-for-tat is a very robust way to play the game. Essentially: be nice to start with, but punish defection immediately. Forgive if the opponent begins to cooperate again.

This rule is hard to take advantage of (it punishes defection) while also being nice (allowing it to rack up a high score with other nice rules).
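The mechanics are simple enough to sketch in a few lines. This is my own minimal simulation, not Axelrod's tournament code; the payoffs T=5, R=3, P=1, S=0 are the standard values used in the book, and the strategy functions are my own naming.

```python
# Minimal sketch of the iterated prisoner's dilemma with Axelrod's standard
# payoffs: T=5 (temptation), R=3 (reward), P=1 (punishment), S=0 (sucker).

PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(history):
    """Cooperate on the first move, then copy the opponent's last move."""
    return "C" if not history else history[-1]

def always_defect(history):
    return "D"

def play(strat_a, strat_b, rounds):
    """Return the total scores of both players over a fixed number of rounds."""
    hist_a, hist_b = [], []  # each side's record of the *opponent's* moves
    score_a = score_b = 0
    for _ in range(rounds):
        a, b = strat_a(hist_a), strat_b(hist_b)
        pa, pb = PAYOFF[(a, b)]
        score_a, score_b = score_a + pa, score_b + pb
        hist_a.append(b)
        hist_b.append(a)
    return score_a, score_b

# Two tit-for-tat players settle into mutual cooperation (3 points a round);
# against a constant defector, tit for tat is exploited exactly once.
print(play(tit_for_tat, tit_for_tat, 10))    # (30, 30)
print(play(tit_for_tat, always_defect, 10))  # (9, 14)
```

Note that tit for tat loses (slightly) to the defector head-to-head; it wins tournaments not by beating opponents but by eliciting cooperation from everyone who can be elicited.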

The author also looks at different structural reasons that nice rules might be more or less successful. Things that can cause the un-nice rules to take over are:

  • short time horizon (you expect you'll leave your job soon so you don't bother working quite so hard or coming to work on time)
  • unlikely to interact again (people swear at you in London, not so much in a village in Midsomer)
  • hard to recognise players you interacted with before (anonymity in general, 4chan)

(In other words, to promote cooperation you want frequent interaction, with the same players, on a long time horizon. You can imagine people living their whole lives in a small village as the best example of this, and people who spend a few years in a big city as the worst! Of course, in cities, people find ways to join smaller sub-communities such as their company, clubs, and gyms.)

He also looks for ways cooperation could evolve from nothing, bootstrapping itself via cooperation amongst related organisms.

Some other interesting conclusions I took:

  • Forgiveness is important to avoid falling into a spiral of defections
  • However, you would be better off with a reputation for being horribly unforgiving, to avoid people trying to take advantage of you
  • Reputation is very important – if people know you are a tit-for-tat player, they will be incentivised to cooperate with you
  • This is how organised crime works (according to television) – your reputation is valuable, and you work hard to keep it
  • It is also how business works: trust and reputation are much cheaper than using the courts
  • Indeed, cooperation is stable without anything external to enforce it, as long as the structure of the game is correct – for example families cooperate naturally, whereas whole countries have difficulty.
  • A small cluster of nice rules can often invade a population of mean rules. They will gain enough when they cooperate with each other to offset their losses when interacting with the natives

It is a short book, but there is much more of interest that I have not mentioned here!

Highlights

Intro

THIS PROJECT began with a simple question: When should a person cooperate, and when should a person be selfish, in an ongoing interaction with another person?

loc: 45 iterated Prisoner’s Dilemma. The game allows the players to achieve mutual gains from cooperation, but it also allows for the possibility that one player will exploit the other, or the possibility that neither will cooperate. As in most realistic situations, the players do not have strictly opposing interests.

loc: 51 To my considerable surprise, the winner was the simplest of all the programs submitted, TIT FOR TAT. TIT FOR TAT is merely the strategy of starting with cooperation, and thereafter doing what the other player did on the previous move.

loc: 57 Something very interesting was happening here. I suspected that the properties that made TIT FOR TAT so successful in the tournaments would work in a world where any strategy was possible. If so, then cooperation based solely on reciprocity seemed possible. But I wanted to know the exact conditions that would be needed to foster cooperation on these terms. This led me to an evolutionary perspective: a consideration of how cooperation can emerge among egoists without central authority.

loc: 61 how can a potentially cooperative strategy get an initial foothold in an environment which is predominantly noncooperative? Second, what type of strategy can thrive in a variegated environment composed of other individuals using a wide diversity of more or less sophisticated strategies? Third, under what conditions can such a strategy, once fully established among a group of people, resist invasion by a less cooperative strategy?

loc: 94 As Darwinians, we start pessimistically by assuming deep selfishness at the level of natural selection, pitiless indifference to suffering, ruthless attention to individual success at the expense of others. And yet from such warped beginnings, something can come that is in effect, if not necessarily in intention, close to amicable brotherhood and sisterhood. This is the uplifting message of Robert Axelrod’s remarkable book.

loc: 126 I was invited by the world’s largest computer company to organize and supervise a whole day’s game of strategy among their executives, whose purpose was to bond them together in amicable cooperation. They were divided into three teams—the reds, the blues, and the greens—and the game was a variant on the prisoner’s dilemma game that is the central topic of this book. Unfortunately, the cooperative bonding that was the company’s goal failed to materialize—spectacularly. As Robert Axelrod could have predicted, the fact that the game was known to be coming to an end at exactly 4 p.m. precipitated a massive defection by the reds against the blues immediately before the appointed hour. The bad feeling generated by this sudden break with the previous day-long goodwill was palpable at the postmortem session that I conducted, and the executives had to have counseling before they could be persuaded to work together again.

loc: 154 The world’s leaders should all be locked up with this book and not released until they have read it. This would be a pleasure to them and might save the rest of us. The Evolution of Cooperation deserves to replace the Gideon Bible. RICHARD DAWKINS

CHAPTER 1 The Problem of Cooperation

loc: 174 UNDER WHAT CONDITIONS will cooperation emerge in a world of egoists without central authority? This question has intrigued people for a long time. And for good reason. We all know that people are not angels, and that they tend to look after themselves and their own first. Yet we also know that cooperation does occur and that our civilization is based upon it. But, in situations where each individual has an incentive to be selfish, how can cooperation ever develop?

> This is what I wanted to find an answer to when I chose this book. Can a world of egoists be 'nice'? Do we need to reach for altruism or force (government) to explain cooperation, or does it emerge by itself?

loc: 181 “solitary, poor, nasty, brutish, and short” (Hobbes 1651/1962, p. 100). In his view, cooperation could not develop without a central authority, and consequently a strong government was necessary. Ever since, arguments about the proper scope of government have often focused on whether one could, or could not, expect cooperation to emerge in a particular domain if there were not an authority to police the situation.

loc: 185 Today nations interact without central authority. Therefore the requirements for the emergence of cooperation have relevance to many of the central issues of international politics.

loc: 194 In everyday life, we may ask ourselves how many times we will invite acquaintances for dinner if they never invite us over in return. An executive in an organization does favors for another executive in order to get favors in exchange. A journalist who has received a leaked news story gives favorable coverage to the source in the hope that further leaks will be forthcoming. A business firm in an industry with only one other major company charges high prices with the expectation that the other firm will also maintain high prices—to their mutual advantage and at the expense of the consumer.

loc: 212 The approach of this book is to investigate how individuals pursuing their own interests will act, followed by an analysis of what effects this will have for the system as a whole. Put another way, the approach is to make some assumptions about individual motives and then deduce consequences for the behavior of the entire system

loc: 259 The definition of Prisoner’s Dilemma requires that several relationships hold among the four different potential outcomes. The first relationship specifies the order of the four payoffs. The best a player can do is get T, the temptation to defect when the other player cooperates. The worst a player can do is get S, the sucker’s payoff for cooperating while the other player defects. In ordering the other two outcomes, R, the reward for mutual cooperation, is assumed to be better than P, the punishment for mutual defection. This leads to a preference ranking of the four payoffs from best to worst as T, R, P, and S. The second part of the definition of the Prisoner’s Dilemma is that the players cannot get out of their dilemma by taking turns exploiting each other. This assumption means that an even chance of exploitation and being exploited is not as good an outcome for a player as mutual cooperation. It is therefore assumed that the reward for mutual cooperation is greater than the average of the temptation and the sucker’s payoff. This assumption, together with the rank ordering of the four payoffs, defines the Prisoner’s Dilemma.
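The definition above reduces to two inequalities, which can be checked mechanically. A small sketch of my own (the function name is mine; the conditions are the book's):

```python
def is_prisoners_dilemma(T, R, P, S):
    """True if the payoffs define a Prisoner's Dilemma in Axelrod's sense:
    T > R > P > S (the preference ordering), and
    R > (T + S) / 2 (taking turns exploiting each other beats nobody)."""
    return T > R > P > S and R > (T + S) / 2

print(is_prisoners_dilemma(5, 3, 1, 0))  # True  (the book's standard payoffs)
print(is_prisoners_dilemma(6, 3, 1, 0))  # False (alternating exploitation now
                                         #        averages 3, no worse than R)
```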

loc: 296 What makes it possible for cooperation to emerge is the fact that the players might meet again. This possibility means that the choices made today not only determine the outcome of this move, but can also influence the later choices of the players. The future can therefore cast a shadow back upon the present and thereby affect the current strategic situation. But the future is less important than the present—for two reasons. The first is that players tend to value payoffs less as the time of their obtainment recedes into the future. The second is that there is always some chance that the players will not meet again. An ongoing relationship may end when one or the other player moves away, changes jobs, dies, or goes bankrupt.

> Part of the reason they value the future payouts less is probably because they intuitively account for the possibility of not meeting again (among other good reasons – this is not necessarily irrational!)

loc: 309 In general, getting one point on each move would be worth 1 + w + w² + w³ + …. A very useful fact is that the sum of this infinite series for any w greater than zero and less than one is simply 1/(1 − w). To take another case, if each move is worth 90 percent of the previous move, a string of 1’s would be worth ten points because 1/(1 − w) = 1/(1 − .9) = 1/.1 = 10. Similarly, with w still equal to .9, a string of 3-point mutual rewards would be worth three times this, or 30 points.

> w is the discount parameter: the weight of the next move relative to the current one
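The arithmetic in that passage is easy to verify numerically. A quick sketch of my own, summing the series by brute force and comparing with the closed form:

```python
# Value of receiving the same payoff every move, discounted by w per move:
# the geometric series p + pw + pw^2 + ... = p / (1 - w) for 0 < w < 1.

def discounted_value(points_per_move, w, horizon=10_000):
    """Sum the series numerically (truncated; later terms are negligible)."""
    return sum(points_per_move * w**k for k in range(horizon))

print(discounted_value(1, 0.9))  # ~10.0, matching 1 / (1 - 0.9)
print(discounted_value(3, 0.9))  # ~30.0, a string of mutual-cooperation rewards
```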

loc: 338 the discount parameter, w, must be large enough to make the future loom large in the calculation of total payoffs. After all, if you are unlikely to meet the other person again, or if you care little about future payoffs, then you might as well defect now and not worry about the consequences for the future.

> this is why small communities 'work better' than large ones.

loc: 354 The very possibility of achieving stable mutual cooperation depends upon there being a good chance of a continuing interaction, as measured by the magnitude of w. As it happens, in the case of Congress, the chance of two members having a continuing interaction has increased dramatically as the biennial turnover rates have fallen from about 40 percent in the first forty years of the republic to about 20 percent or less in recent years

loc: 366 The payoffs certainly do not have to be symmetric. It is a convenience to think of the interaction as exactly equivalent from the perspective of the two players, but this is not necessary. One does not have to assume, for example, that the reward for mutual cooperation, or any of the other three payoff parameters, have the same magnitude for both players. As mentioned earlier, one does not even have to assume that they are measured in comparable units. The only thing that has to be assumed is that, for each player, the four payoffs are ordered as required for the definition of the Prisoner’s Dilemma.

loc: 407 four properties which tend to make a decision rule successful: avoidance of unnecessary conflict by cooperating as long as the other player does, provocability in the face of an uncalled for defection by the other, forgiveness after responding to a provocation, and clarity of behavior so that the other player can adapt to your pattern of action.

loc: 416 cooperation can get started even in a world of unconditional defection. The development cannot take place if it is tried only by scattered individuals who have virtually no chance to interact with each other. However, cooperation can evolve from small clusters of individuals who base their cooperation on reciprocity and have even a small proportion of their interactions with each other.

loc: 420 cooperation, once established on the basis of reciprocity, can protect itself from invasion by less cooperative strategies. Thus, the gear wheels of social evolution have a ratchet.

II The Emergence of Cooperation

CHAPTER 2 The Success of TIT FOR TAT in Computer Tournaments

loc: 486 The ubiquitous problems of collective action to produce a collective good are analyzable as Prisoner’s Dilemmas with many players (G. Hardin 1982).

> Reasons and Persons

loc: 548 Surprisingly, there is a single property which distinguishes the relatively high-scoring entries from the relatively low-scoring entries. This is the property of being nice, which is to say never being the first to defect.

loc: 552 Each of the eight top-ranking entries (or rules) is nice. None of the other entries is. There is even a substantial gap in the score between the nice entries and the others. The nice entries received tournament averages between 472 and 504, while the best of the entries that were not nice received only 401 points.

> !!

Yellow highlight | Page: 3 When a single defection can set off a long string of recriminations and counterrecriminations, both sides suffer. A sophisticated analysis of choice must go at least three levels deep to take account of these echo effects. The first level of analysis is the direct effect of a choice. This is easy, since a defection always earns more than a cooperation. The second level considers the indirect effects, taking into account that the other side may or may not punish a defection. This much of the analysis was certainly appreciated by many of the entrants. But the third level goes deeper and takes into account the fact that in responding to the defections of the other side, one may be repeating or even amplifying one’s own previous exploitative choice. Thus a single defection may be successful when analyzed for its direct effects, and perhaps even when its secondary effects are taken into account. But the real costs may be in the tertiary effects when one’s own isolated defections turn into unending mutual recriminations.

Yellow highlight | Page: 3 it was easy to find several rules that would have performed substantially better than TIT FOR TAT in the environment of the tournament. The existence of these rules should serve as a warning against the facile belief that an eye for an eye is necessarily the best strategy. There are at least three rules that would have won the tournament if submitted.

> however, in the next tournament these rules did not win; tit for tat won again.

Yellow highlight | Page: 3 The sample program defects only if the other player defected on the previous two moves. It is a more forgiving version of TIT FOR TAT in that it does not punish isolated defections. The excellent performance of this TIT FOR TWO TATS rule highlights the fact that a common error of the contestants was to expect that gains could be made from being relatively less forgiving than TIT FOR TAT, whereas in fact there were big gains to be made from being even more forgiving. The implication of this finding is striking, since it suggests that even expert strategists do not give sufficient weight to the importance of forgiveness.

Yellow highlight | Page: 4 These results from supplementary rules reinforce a theme from the analysis of the tournament entries themselves: the entries were too competitive for their own good. In the first place, many of them defected early in the game without provocation, a characteristic which was very costly in the long run. In the second place, the optimal amount of forgiveness was considerably greater than displayed by any of the entries (except possibly DOWNING). And in the third place, the entry that was most different from the others, DOWNING, floundered on its own misplaced pessimism regarding the initial responsiveness of the others.

Yellow highlight | Page: 7 For example, TIT FOR TWO TATS defects only after the other player defects on the preceding two moves. But TESTER never does defect twice in a row. So TIT FOR TWO TATS always cooperates with TESTER, and gets badly exploited for its generosity. Notice that TESTER itself did not do particularly well in the tournament. It did, however, provide low scores for some of the more easygoing rules.
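This exploitation is easy to reproduce. The alternator below is my own crude stand-in for TESTER (the real rule is more elaborate); all it shares with TESTER is the property that matters here, never defecting twice in a row.

```python
# How a rule that never defects twice in a row exploits TIT FOR TWO TATS.
# Axelrod's standard payoffs T=5, R=3, P=1, S=0.

PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_two_tats(history):
    """Defect only after two consecutive defections by the opponent."""
    return "D" if history[-2:] == ["D", "D"] else "C"

def alternator(history):
    """Hypothetical TESTER-like rule: defect on even-numbered moves only."""
    return "D" if len(history) % 2 == 0 else "C"

def play(strat_a, strat_b, rounds):
    hist_a, hist_b, scores = [], [], [0, 0]
    for _ in range(rounds):
        a, b = strat_a(hist_a), strat_b(hist_b)
        pa, pb = PAYOFF[(a, b)]
        scores[0] += pa
        scores[1] += pb
        hist_a.append(b)
        hist_b.append(a)
    return tuple(scores)

# TIT FOR TWO TATS never sees two defections in a row, so it never
# retaliates and is steadily milked for the sucker's payoff.
print(play(tit_for_two_tats, alternator, 10))  # (15, 40)
```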

Yellow highlight | Page: 9 Would the results of the second round have been much different if the distribution of entries had been substantially different? Put another way, does TIT FOR TAT do well in a wide variety of environments? That is to say, is it robust? A good way to examine this question is to construct a series of hypothetical tournaments, each with a very different distribution of the types of rules participating.

Yellow highlight | Page: 9 The results were that TIT FOR TAT won five of the six major variants of the tournament, and came in second in the sixth. This is a strong test of how robust the success of TIT FOR TAT really is.

Yellow highlight | Page: 9 To make this precise, we can say that the number of copies (or offspring) of a given entry will be proportional to that entry’s tournament score. We simply have to interpret the average payoff received by an individual as proportional to the individual’s expected number of offspring. For example, if one rule gets twice as high a tournament score in the initial round as another rule, then it will be twice as well-represented in the next round.

Yellow highlight | Page: 10 In human terms, a rule which was not scoring well might be less likely to appear in the future for several different reasons. One possibility is that a player will try different strategies over time, and then stick with what seems to work best. Another possibility is that a person using a rule sees that other strategies are more successful and therefore switches to one of those strategies. Still another possibility is that a person occupying a key role, such as a member of Congress or the manager of a business, would be removed from that role if the strategy being followed was not very successful. Thus, learning, imitation, and selection can all operate in human affairs to produce a process which makes relatively unsuccessful strategies less likely to appear later.

Yellow highlight | Page: 12 A good example of ecological extinction is provided by HARRINGTON, the only non-nice rule among the top fifteen finishers in the second round. In the first two hundred or so generations of the ecological tournament, as TIT FOR TAT and the other successful nice programs were increasing their percentage of the population, HARRINGTON was also increasing its percentage. This was because of HARRINGTON’S exploitative strategy. By the two hundredth generation or so, things began to take a noticeable turn. Less successful programs were becoming extinct, which meant that there were fewer and fewer prey for HARRINGTON to exploit. Soon HARRINGTON could not keep up with the successful nice rules, and by the one thousandth generation HARRINGTON was as extinct as the exploitable rules on which it preyed. The ecological analysis shows that doing well with rules that do not score well themselves is eventually a self-defeating process. Not being nice may look promising at first, but in the long run it can destroy the very environment it needs for its own success.

Yellow highlight | Page: 12 The overall record of TIT FOR TAT is very impressive. To recapitulate, in the second round, TIT FOR TAT achieved the highest average score of the sixty-two entries in the tournament. It also achieved the highest score in five of the six hypothetical tournaments which were constructed by magnifying the effects of different types of rules from the second round. And in the sixth hypothetical tournament it came in second. Finally, TIT FOR TAT never lost its first-place standing in a simulation of future generations of the tournament.

Yellow highlight | Page: 12 What can be said for the empirical successes of TIT FOR TAT is that it is a very robust rule: it does very well over a wide range of environments. Part of its success might be that other rules anticipate its presence and are designed to do well with it. Doing well with TIT FOR TAT requires cooperating with it, and this in turn helps TIT FOR TAT. Even rules like TESTER that were designed to see what they could get away with, quickly apologize to TIT FOR TAT. Any rule which tries to take advantage of TIT FOR TAT will simply hurt itself. TIT FOR TAT benefits from its own nonexploitability because three conditions are satisfied: The possibility of encountering TIT FOR TAT is salient. Once encountered, TIT FOR TAT is easy to recognize. Once recognized, TIT FOR TAT’s nonexploitability is easy to appreciate. Thus TIT FOR TAT benefits from its own clarity.

> Reputation of honesty

CHAPTER 3 The Chronology of Cooperation

Yellow highlight | Page: 14 the whole population can be imagined to be using a single strategy, while a single individual enters the population with a new strategy. The newcomer will then be interacting only with individuals using the native strategy. Moreover, a native will almost certainly be interacting with another native since the single newcomer is a negligible part of the population. Therefore a new strategy is said to invade a native strategy if the newcomer gets a higher score with a native than a native gets with another native.

Yellow highlight | Page: 14 A strategy is collectively stable if no strategy can invade it.

Yellow highlight | Page: 14 A warning is in order about this definition of a collectively stable strategy. It assumes that the individuals who are trying out novel strategies do not interact too much with one another.2 As will be shown further on, if they do interact in clusters, then new and very important developments are possible.

Yellow highlight | Page: 15 The significance of this proposition is that if everyone in a population is cooperating with everyone else because each is using the TIT FOR TAT strategy, no one can do better using any other strategy providing that the future casts a large enough shadow onto the present. In other words, what makes it impossible for TIT FOR TAT to be invaded is that the discount parameter, w, is high enough relative to the requirement determined by the four payoff parameters. For example, suppose that T=5, R=3, P=1, and S=0 as in the payoff matrix shown in figure 1. Then TIT FOR TAT is collectively stable if the next move is at least ⅔ as important as the current move. Under these conditions, if everyone else is using TIT FOR TAT, you can do no better than to do the same, and cooperate with them. On the other hand, if w falls below this critical value, and everyone else is using TIT FOR TAT, it will pay to defect on alternate moves. If w is less than ½, it even pays to always defect.

> Short term leads to defection
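The thresholds in that highlight can be checked directly from the geometric series. This is my own sketch: against a population of TIT FOR TAT players, I compare three ways to play (cooperate always, alternate defect/cooperate, defect always); the closed-form payoffs follow from summing the discounted series, and the specific w values are illustrative choices of mine.

```python
# Discounted payoffs against a TIT FOR TAT opponent, T=5, R=3, P=1, S=0.

T, R, P, S = 5, 3, 1, 0

def value_vs_tft(strategy, w):
    if strategy == "cooperate":  # R + wR + w^2 R + ... = R / (1 - w)
        return R / (1 - w)
    if strategy == "alternate":  # T + wS + w^2 T + ... = (T + wS) / (1 - w^2)
        return (T + w * S) / (1 - w**2)
    if strategy == "defect":     # T + wP + w^2 P + ... = T + wP / (1 - w)
        return T + w * P / (1 - w)

for w in (0.9, 0.6, 0.2):
    best = max(("cooperate", "alternate", "defect"),
               key=lambda s: value_vs_tft(s, w))
    print(w, best)
# w = 0.9 -> cooperating is best (above the 2/3 threshold from the book)
# w = 0.6 -> alternating defection pays (below 2/3)
# w = 0.2 -> always defecting is best of the three
```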

Yellow highlight | Page: 16 if the other player is unlikely to be around much longer because of apparent weakness, then the perceived value of w falls and the reciprocity of TIT FOR TAT is no longer stable. We have Caesar’s explanation of why Pompey’s allies stopped cooperating with him. “They regarded his [Pompey’s] prospects as hopeless and acted according to the common rule by which a man’s friends become his enemies in adversity”

Yellow highlight | Page: 16 The great enforcer of morality in commerce is the continuing relationship, the belief that one will have to do business again with this customer, or this supplier, and when a failing company loses this automatic enforcer, not even a strong-arm factor is likely to find a substitute.

Yellow highlight | Page: 17 It was the French practice to “let sleeping dogs lie” when in a quiet sector . . . and of making this clear by retorting vigorously only when challenged. In one sector which we took over from them they explained to me that they had practically a code which the enemy well understood: they fired two shots for every one that came over, but never fired first.

Yellow highlight | Page: 19 it pays to use a TIT FOR TAT strategy rather than be a meanie like the bulk of the population. And this will be true even if only 5 percent of the interactions of the TIT FOR TAT players are with other TIT FOR TAT players.8 Thus, even a small cluster of TIT FOR TAT players can get a higher average score than the large population of meanies they enter. Because the TIT FOR TAT players do so well when they do meet each other, they do not have to meet each other very often to make their strategy the superior one to use.
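The "5 percent" figure drops out of a short calculation. A sketch of my own, with the book's standard payoffs and w = 0.9: a TIT FOR TAT newcomer has proportion p of its interactions inside the cluster, while the native meanies (ALL D) interact almost entirely with each other.

```python
# Break-even cluster size for TIT FOR TAT invading a population of ALL D.
# Discounted scores with T=5, R=3, P=1, S=0 and w=0.9.

T, R, P, S, w = 5, 3, 1, 0, 0.9

tft_vs_tft = R / (1 - w)           # 30: mutual cooperation forever
tft_vs_alld = S + w * P / (1 - w)  # 9: exploited once, then mutual defection
alld_vs_alld = P / (1 - w)         # 10: what the natives score with each other

def cluster_score(p):
    """Average score of a TIT FOR TAT member whose proportion p of
    interactions is with other cluster members."""
    return p * tft_vs_tft + (1 - p) * tft_vs_alld

# The cluster invades as soon as its members outscore the natives' ~10:
for p in (0.03, 0.05, 0.10):
    print(p, cluster_score(p), cluster_score(p) > alld_vs_alld)
```

With these numbers the break-even proportion is 1/21 ≈ 4.8 percent, which is where the highlight's "even 5 percent" comes from.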

Yellow highlight | Page: 20 Proposition 7. If a nice strategy cannot be invaded by a single individual, it cannot be invaded by any cluster of individuals either.

III Cooperation Without Friendship or Foresight

CHAPTER 4 The Live-and-Let-Live System in Trench Warfare in World War I

Yellow highlight | Page: 22

  1. How could the live-and-let-live system have gotten started?
  2. How was it sustained?
  3. Why did it break down toward the end of the war?
  4. Why was it characteristic of trench warfare in World War I, but of few other wars?

Yellow highlight | Page: 24 Locally, the dilemma persisted: at any given moment it was prudent to shoot to kill, whether the other side did so or not. What made trench warfare so different from most other combat was that the same small units faced each other in immobile sectors for extended periods of time. This changed the game from a one-move Prisoner’s Dilemma in which defection is the dominant choice, to an iterated Prisoner’s Dilemma in which conditional strategies are possible. The result accorded with the theory’s predictions: with sustained interaction, the stable outcome could be mutual cooperation based upon reciprocity. In particular, both sides followed strategies that would not be the first to defect, but that would be provoked if the other defected.

Yellow highlight | Page: 25 It would be child’s play to shell the road behind the enemy’s trenches, crowded as it must be with ration wagons and water carts, into a bloodstained wilderness . . . but on the whole there is silence. After all, if you prevent your enemy from drawing his rations, his remedy is simple: he will prevent you from drawing yours. (Hay 1916, pp. 224-25) Once started, strategies based on reciprocity could spread in a variety of ways. A restraint undertaken in certain hours could be extended to longer hours. A particular kind of restraint could lead to attempting other kinds of restraint. And most importantly of all, the progress achieved in one small sector of the front could be imitated by the units in neighboring sectors.

Yellow highlight | Page: 26 During the periods of mutual restraint, the enemy soldiers took pains to show each other that they could indeed retaliate if necessary. For example, German snipers showed their prowess to the British by aiming at spots on the walls of cottages and firing until they had cut a hole (The War the Infantry Knew 1938, p. 98). Likewise the artillery would often demonstrate with a few accurately aimed shots that they could do more damage if they wished. These demonstrations of retaliatory capabilities helped police the system by showing that restraint was not due to weakness, and that defection would be self-defeating.

Yellow highlight | Page: 28 What is clear in retrospect is that the indirect effect of the raids was to destroy the conditions needed for the stability of the tacit restraints widely exercised on the Western Front. Without realizing exactly what they were doing, the high command effectively ended the live-and-let-live system by preventing their battalions from exercising their own strategies of cooperation based on reciprocity.

Yellow highlight | Page: 29 The cooperative exchanges of mutual restraint actually changed the nature of the interaction. They tended to make the two sides care about each other’s welfare. This change can be interpreted in terms of the Prisoner’s Dilemma by saying that the very experience of sustained mutual cooperation altered the payoffs of the players, making mutual cooperation even more valued than it was before.

Yellow highlight | Page: 29 the point is that not only did preferences affect behavior and outcomes, but behavior and outcomes also affected preferences.

CHAPTER 5 The Evolution of Cooperation in Biological Systems

Yellow highlight | Page: 33 Apart from being the solution in game theory, defection in a single encounter is also the solution in biological evolution.5 It is the outcome of inevitable evolutionary trends through mutation and natural selection: if the payoffs are in terms of fitness, and the interactions between pairs of individuals are random and not repeated, then any population with a mixture of heritable strategies evolves to a state where all individuals are defectors. Moreover, no single differing mutant strategy can do better than others when the population is using this strategy. When the players will never meet again, the strategy of defection is the only stable strategy.

Yellow highlight | Page: 34 an organism does not need a brain to employ a strategy. Bacteria, for example, have a basic capacity to play games in that (1) bacteria are highly responsive to selected aspects of their environment, especially their chemical environment; (2) this implies that they can respond differentially to what other organisms around them are doing; (3) these conditional strategies of behavior can certainly be inherited; and (4) the behavior of a bacterium can affect the fitness of other organisms around it, just as the behavior of other organisms can affect the fitness of a bacterium.

Yellow highlight | Page: 34 The discrimination of others may be among the most important of abilities because it allows one to handle interactions with many individuals without having to treat them all the same, thus making possible the rewarding of cooperation from one individual and the punishing of defection from another.

Yellow highlight | Page: 36 Close relatedness of players permits true altruism—sacrifice of fitness by one individual for the benefit of another. True altruism can evolve when the conditions of cost, benefit, and relatedness yield net gains for the altruism-causing genes that are resident in the related individuals (Fisher 1930; Haldane 1955; Hamilton 1963). Not defecting in a single-move Prisoner’s Dilemma is altruism of a kind (the individual is foregoing proceeds that might have been taken); so this kind of behavior can evolve if the two players are sufficiently related (Hamilton 1971; Wade and Breden 1980). In effect, recalculation of the payoffs can be done in such a way that an individual has a part interest in the partner’s gain (that is, reckoning payoffs in terms of what is called inclusive fitness). This recalculation can often eliminate the inequalities T > R and P > S, in which case cooperation becomes unconditionally favored.
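
The recalculation is easy to make concrete. A minimal sketch (my own illustration, not code from the book): with relatedness r, each player values the partner's payoff at a fraction r of its own, and cooperation becomes unconditionally favored once the reweighted payoffs satisfy R' > T' and S' > P'.

```python
# Inclusive-fitness recalculation (my illustration; payoffs are the standard
# T,R,P,S = 5,3,1,0, and the relatedness values r are my own examples).
T, R, P, S = 5, 3, 1, 0

def inclusive(own, partner, r):
    """Payoff reckoned in inclusive fitness: own score plus r times partner's."""
    return own + r * partner

def cooperation_dominates(r):
    # against a cooperator: compare cooperating (R, R) with defecting (T, S)
    vs_cooperator = inclusive(R, R, r) > inclusive(T, S, r)
    # against a defector: compare cooperating (S, T) with defecting (P, P)
    vs_defector = inclusive(S, T, r) > inclusive(P, P, r)
    return vs_cooperator and vs_defector

print(cooperation_dominates(0.5))   # False: sibling-level relatedness falls short here
print(cooperation_dominates(0.75))  # True: high enough r flips both inequalities
```

With these payoffs the crossover for R' > T' is r > (T − R)/(R − S) = 2/3, so full sibs (r = 1/2) fall just short.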

> !!

Yellow highlight | Page: 37 The chronological story that emerges from this analysis is the following. ALL D is the primeval state and is evolutionarily stable. But cooperation based on reciprocity can gain a foothold through two different mechanisms. First, there can be kinship between mutant strategies, giving the genes of the mutants some stake in each other’s success, thereby altering the payoff of the interaction when viewed from the perspective of the gene rather than the individual. A second mechanism to overcome total defection is for the mutant strategies to arrive in a cluster so that they provide a nontrivial proportion of the interactions each has, even if they are so few as to provide a negligible proportion of the interactions which the ALL D individuals have.
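
The cluster mechanism yields a striking number. A worked check (my arithmetic, following the book's setup with payoffs T,R,P,S = 5,3,1,0 and discount parameter w = 0.9):

```python
# Cluster-invasion check (my arithmetic; payoffs and w follow the book's setup).
# V(X|Y) denotes X's total discounted score when playing against Y.
T, R, P, S = 5, 3, 1, 0
w = 0.9  # probability the interaction continues for another move

V_tft_vs_tft = R / (1 - w)            # endless mutual cooperation
V_tft_vs_alld = S + w * P / (1 - w)   # suckered once, then mutual defection
V_alld_vs_alld = P / (1 - w)          # endless mutual defection

# A TFT cluster gains a foothold when a fraction p of a TFT player's
# interactions are with fellow cluster members and
#     p * V(TFT|TFT) + (1 - p) * V(TFT|ALLD) > V(ALLD|ALLD)
p_min = (V_alld_vs_alld - V_tft_vs_alld) / (V_tft_vs_tft - V_tft_vs_alld)
print(round(p_min, 3))  # ~0.048: a tiny cluster is enough with these numbers
```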

Yellow highlight | Page: 38 The basic idea is that an individual must not be able to get away with defecting without the other individuals being able to retaliate effectively. The response requires that the defecting individual not be lost in a sea of anonymous others.

> Applies to large human societies, the internet (apart from smaller niche communities), etc.

Yellow highlight | Page: 38 When an organism is not able to recognize the individual with which it had a prior interaction, a substitute mechanism is to make sure that all of its interactions are with the same player. This can be done by maintaining continuous contact with the other. This method is applied in most mutualisms, situations of close association of mutual benefit between members of different species. Examples include a hermit crab and its sea-anemone partner, a cicada and the varied colonies of microorganisms housed in its body, or a tree and its mycorrhizal fungi.

> Marriage, companies, clubs

Yellow highlight | Page: 39 These mechanisms could operate even at the microbial level. Any symbiont that still has a chance to spread to other hosts by some process of infection would be expected to shift from mutualism to parasitism when the probability of continued interaction with the original host lessened. In the more parasitic phase, it could exploit the host more severely by producing more of the forms able to disperse and infect. This phase would be expected when the host is severely injured, has contracted some other wholly parasitic infection that threatens death, or when it manifests signs of age. In fact, bacteria that are normal and seemingly harmless or even beneficial in the gut can be found contributing to sepsis in the body when the gut is perforated, implying a severe wound (Savage 1977). And normal inhabitants of the body surface (like Candida albicans) can become invasive and dangerous in either sick or elderly persons.

> Really?

IV Advice for Participants and Reformers

CHAPTER 6 How to Choose Effectively

Yellow highlight | Page: 41 The advice takes the form of four simple suggestions for how to do well in a durable iterated Prisoner’s Dilemma: 1. Don’t be envious. 2. Don’t be the first to defect. 3. Reciprocate both cooperation and defection. 4. Don’t be too clever.

Yellow highlight | Page: 43 TIT FOR TAT won the tournament because it did well in its interactions with a wide variety of other strategies. On average, it did better than any other rule with the other strategies in the tournament. Yet TIT FOR TAT never once scored better in a game than the other player! In fact, it can’t. It lets the other player defect first, and it never defects more times than the other player has defected. Therefore, TIT FOR TAT achieves either the same score as the other player, or a little less. TIT FOR TAT won the tournament, not by beating the other player, but by eliciting behavior from the other player which allowed both to do well.
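
This is easy to verify directly. A minimal sketch (my own code, not the tournament programs), with the standard payoffs T,R,P,S = 5,3,1,0:

```python
# Iterated Prisoner's Dilemma sketch: TIT FOR TAT never outscores its opponent
# in a single match, yet a pair of TFTs does far better than a pair of defectors.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def play(strategy_a, strategy_b, rounds=200):
    """Play an iterated PD; return (score_a, score_b)."""
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(history_a, history_b)
        move_b = strategy_b(history_b, history_a)
        pa, pb = PAYOFF[(move_a, move_b)]
        score_a, score_b = score_a + pa, score_b + pb
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b

def tit_for_tat(own, other):
    return other[-1] if other else "C"   # nice first move, then mirror

def all_d(own, other):
    return "D"

def all_c(own, other):
    return "C"

# TFT never beats any of these opponents head to head...
for rival in (all_d, all_c, tit_for_tat):
    tft_score, rival_score = play(tit_for_tat, rival)
    assert tft_score <= rival_score

# ...but two TFTs together do far better than two defectors together.
assert play(tit_for_tat, tit_for_tat)[0] > play(all_d, all_d)[0]
```

Against ALL D, TIT FOR TAT loses the first move and ties every move after, so it ends a few points behind; it wins tournaments on totals, not on head-to-head victories.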

Yellow highlight | Page: 43 So in a non-zero-sum world you do not have to do better than the other player to do well for yourself. This is especially true when you are interacting with many different players. Letting each of them do the same or a little better than you is fine, as long as you tend to do well yourself. There is no point in being envious of the success of the other player, since in an iterated Prisoner’s Dilemma of long duration the other’s success is virtually a prerequisite of your doing well for yourself.

Yellow highlight | Page: 49 There is yet a third way in which some of the tournament rules are too clever: they employ a probabilistic strategy that is so complex that it cannot be distinguished by the other strategies from a purely random choice. In other words, too much complexity can appear to be total chaos. If you are using a strategy which appears random, then you also appear unresponsive to the other player. If you are unresponsive, then the other player has no incentive to cooperate with you. So being so complex as to be incomprehensible is very dangerous.

CHAPTER 7 How to Promote Cooperation

Yellow highlight | Page: 50 The advice dealing with how this mutual cooperation can be promoted comes in three categories: making the future more important relative to the present; changing the payoffs to the players of the four possible outcomes of a move; and teaching the players values, facts, and skills that will promote cooperation.

Yellow highlight | Page: 54 Concentrating the interactions is one way to make two individuals meet more often. In a bargaining context, another way to make their interactions more frequent is to break down the issues into small pieces. An arms control or disarmament treaty, for example, can be broken down into many stages. This would allow the two parties to make many relatively small moves rather than one or two large moves. Doing it this way makes reciprocity more effective. If both sides can know that an inadequate move by the other can be met with a reciprocal defection in the next stage, then both can be more confident that the process will work out as anticipated.

> !!

Yellow highlight | Page: 55 What governments do is to change the effective payoffs. If you avoid paying your taxes, you must face the possibility of being caught and sent to jail. This prospect makes the choice of defection less attractive. Even quasi-governments can enforce their laws by changing the payoffs faced by the players. For example, in the original story of the Prisoner’s Dilemma, there were two accomplices arrested and interrogated separately. If they belonged to an organized gang, they could anticipate being punished for squealing. This might lower the payoffs for double-crossing their partner so much that neither would confess—and both would get the relatively light sentence that resulted from the mutual cooperation of their silence.
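
The gang example can be put in payoff terms. A sketch (the penalty value is mine): subtract a penalty g from every defecting move; once g exceeds both T − R and P − S, silence dominates squealing.

```python
# Effective payoffs after a gang penalty g for squealing (defecting).
# Standard payoffs T,R,P,S = 5,3,1,0; the penalty values are my own examples.
T, R, P, S = 5, 3, 1, 0

def effective_payoffs(g):
    """Each defecting move costs an extra g in expected punishment."""
    return {"T": T - g, "R": R, "P": P - g, "S": S}

eff = effective_payoffs(3)  # g = 3 exceeds both T - R = 2 and P - S = 1
assert eff["R"] > eff["T"] and eff["S"] > eff["P"]  # silence now dominates

weak = effective_payoffs(1)  # too small a penalty changes nothing
assert weak["T"] > weak["R"]  # squealing still dominates against a silent partner
```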

Yellow highlight | Page: 58 A community using strategies based upon reciprocity can actually police itself. By guaranteeing the punishment of any individual who tries to be less than cooperative, the deviant strategy is made unprofitable. Therefore the deviant will not thrive, and will not provide an attractive model for others to imitate. This self-policing feature gives you an extra private incentive to teach it to others—even those with whom you will never interact. Naturally, you want to teach reciprocity to those with whom you will interact so that you can build a mutually rewarding relationship. But you also have a private advantage from another person using reciprocity even if you never interact with that person: the other’s reciprocity helps to police the entire community by punishing those who try to be exploitive. And this decreases the number of uncooperative individuals you will have to deal with in the future.

V Conclusions

CHAPTER 8 The Social Structure of Cooperation

Yellow highlight | Page: 60 This chapter explores the consequences of additional forms of social structure. Four factors are examined which can give rise to interesting types of social structure: labels, reputation, regulation, and territoriality. A label is a fixed characteristic of a player, such as sex or skin color, which can be observed by the other player. It can give rise to stable forms of stereotyping and status hierarchies. The reputation of a player is malleable and comes into being when another player has information about the strategy that the first one has employed with other players. Reputations give rise to a variety of phenomena, including incentives to establish a reputation as a bully, and incentives to deter others from being bullies. Regulation is a relationship between a government and the governed. Governments cannot rule only through deterrence, but must instead achieve the voluntary compliance of the majority of the governed. Therefore regulation gives rise to the problems of just how stringent the rules and the enforcement procedures should be. Finally, territoriality occurs when players interact with their neighbors rather than with just anyone. It can give rise to fascinating patterns of behavior as strategies spread through a population.

Yellow highlight | Page: 62 A player’s reputation is embodied in the beliefs of others about the strategy that player will use. A reputation is typically established through observing the actions of that player when interacting with other players. For example, Britain’s reputation for being provocable was certainly enhanced by its decision to take back the Falkland Islands in response to the Argentine invasion. Other nations could observe Britain’s decisions and make inferences about how it might react to their own actions in the future.

Yellow highlight | Page: 63 Having a firm reputation for using TIT FOR TAT is advantageous to a player, but it is not actually the best reputation to have. The best reputation to have is the reputation for being a bully. The best kind of bully to be is one who has a reputation for squeezing the most out of the other player while not tolerating any defections at all from the other. The way to squeeze the most out of the other is to defect so often that the other player just barely prefers cooperating all the time to defecting all the time. And the best way to encourage cooperation from the other is to be known as someone who will never cooperate again if the other defects even once.
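
How often can the bully defect? A back-of-envelope check (my calculation, not the book's): if the bully punishes any defection by never cooperating again, the victim's alternative to full cooperation is the all-defect payoff P, so with T,R,P,S = 5,3,1,0 the bully can defect up to two moves in three.

```python
# Victim's choice against a bully who defects a fraction p of the time and
# never cooperates again after a single defection by the victim.
# Standard payoffs T,R,P,S = 5,3,1,0; the calculation is my own.
T, R, P, S = 5, 3, 1, 0

def victim_payoff_if_cooperating(p):
    """Victim's average per-move payoff when always cooperating."""
    return (1 - p) * R + p * S

# the victim tolerates the bully as long as this beats mutual defection (P)
threshold = (R - P) / (R - S)   # solve (1 - p) * R + p * S = P for p

print(threshold)                               # 2/3 with these payoffs
assert victim_payoff_if_cooperating(0.5) > P   # tolerated
assert victim_payoff_if_cooperating(0.8) < P   # victim would rather defect
```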

Yellow highlight | Page: 64 it is not easy to establish a reputation as a bully. To become known as a bully you have to defect a lot, which means that you are likely to provoke the other player into retaliation. Until your reputation is well established, you are likely to have to get into a lot of very unrewarding contests of will. For example, if the other player defects even once, you will be torn between acting as tough as the reputation you want to establish requires and attempting to restore amicable relations in the current interaction.

Yellow highlight | Page: 65 what is true for tax collection is also true for many forms of policing: the key to maintaining compliant behavior from the citizenry is that the government remains able and willing to devote resources far out of proportion to the stakes of the current issue in order to maintain its reputation for toughness.

Yellow highlight | Page: 65 In each case, the problem is to prevent challenges by maintaining a reputation for firmness in dealing with them. To maintain this reputation might well require meeting a particular challenge with a toughness out of all proportion to the stakes involved in that particular issue.

Yellow highlight | Page: 65 a government must elicit compliance from the majority of the governed. To do this requires setting and enforcing the rules so that it pays for most of the governed to obey most of the time.

Yellow highlight | Page: 66 The agency can adopt a strategy such as TIT FOR TAT which would give the company an incentive to comply voluntarily and thereby avoid the retaliation represented by the coercive enforcement policy. Under suitable conditions of the payoff and discount parameters, the relationship between the regulated and the regulator could be the socially beneficial one of repeated voluntary compliance and flexible enforcement.

> The UK generally acts like this: they will assume you are being good until they find evidence to the contrary (the most obvious and obnoxious exception is the TV license people, who will soon be irrelevant anyway)

Yellow highlight | Page: 66 To set a tough pollution standard, for example, would make the temptation to evade very great. On the other hand, to set a very lenient standard would mean more allowable pollution, thereby lessening the payoff from mutual cooperation which the agency would attain from voluntary compliance. The trick is to set the stringency of the standard high enough to get most of the social benefits of regulation, and not so high as to prevent the evolution of a stable pattern of voluntary compliance from almost all of the companies.

CHAPTER 9 The Robustness of Reciprocity

Yellow highlight | Page: 71 The individuals do not have to be rational: the evolutionary process allows the successful strategies to thrive, even if the players do not know why or how. Nor do the players have to exchange messages or commitments: they do not need words, because their deeds speak for them. Likewise, there is no need to assume trust between the players: the use of reciprocity can be enough to make defection unproductive. Altruism is not needed: successful strategies can elicit cooperation even from an egoist. Finally, no central authority is needed: cooperation based on reciprocity can be self-policing.

Yellow highlight | Page: 71 if the other player defects once, TIT FOR TAT will always respond with a defection, and then if the other player does the same in response, the result would be an unending echo of alternating defections. In this sense, TIT FOR TAT is not forgiving enough. But another problem is that TIT FOR TAT is too forgiving to those rules which are totally unresponsive, such as a completely random rule.

> Against a random player you should always defect: with T,R,P,S = 5,3,1,0, EV(defect) = 0.5·5 + 0.5·1 = 3 vs EV(cooperate) = 0.5·3 + 0.5·0 = 1.5
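
A quick simulation (my code) backs up that arithmetic: against a purely random opponent, always defecting clearly beats TIT FOR TAT.

```python
import random

# Against a purely random opponent (my simulation, standard payoffs
# T,R,P,S = 5,3,1,0), ALL D earns ~3.0 per move while TIT FOR TAT, which
# mirrors moves that carry no information, averages only ~2.25.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def score_against_random(strategy, rounds=20000, seed=0):
    """Average per-move payoff of `strategy` against a 50/50 random player."""
    rng = random.Random(seed)
    last_random_move = None
    total = 0
    for _ in range(rounds):
        random_move = rng.choice("CD")
        my_move = strategy(last_random_move)
        total += PAYOFF[(my_move, random_move)][0]
        last_random_move = random_move
    return total / rounds

def tft(last):
    return last if last else "C"   # mirror the opponent's previous move

def alld(last):
    return "D"

print(score_against_random(alld))  # ≈ 3.0  (0.5*5 + 0.5*1)
print(score_against_random(tft))   # ≈ 2.25 (average of all four payoffs)
```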

Yellow highlight | Page: 71 A common business attitude is expressed by a purchasing agent who said that “if something comes up you get the other man on the telephone and deal with the problem. You don’t read legalistic contract clauses at each other if you ever want to do business again” (Macaulay 1963, p. 61). This attitude is so well established that when a large manufacturer of packaging materials inspected its records it found that it had failed to create legally binding contracts in two-thirds of the orders from its customers (Macaulay 1963). The fairness of the transactions is guaranteed not by the threat of a legal suit, but rather by the anticipation of mutually rewarding transactions in the future.

> !!

Yellow highlight | Page: 71 To explore the implications of misperception, the first round of the tournament was run again with the modification that every choice had a 1 percent chance of being misperceived by the other player. As expected, these misunderstandings resulted in a good deal more defection between the players. A surprise was that TIT FOR TAT was still the best decision rule. Although it got into a lot of trouble when a single misunderstanding led to a long echo of alternating retaliations, it could often end the echo with another misperception.
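
The echo effect is easy to reproduce. A rough re-creation (my assumptions, not the actual tournament rerun): two TIT FOR TAT players where each move has a 1 percent chance of being misperceived by the other side.

```python
import random

# Two TIT FOR TAT players with a 1 percent misperception chance per move
# (my sketch; standard payoffs T,R,P,S = 5,3,1,0). One misperception starts
# an alternating echo of retaliation; a later one either restores cooperation
# or collapses the pair into mutual defection.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def flip(move):
    return "D" if move == "C" else "C"

def noisy_tft_match(rounds=10000, noise=0.01, seed=1):
    """Average per-move score of player A when both players run TIT FOR TAT."""
    rng = random.Random(seed)
    perceived_by_a = perceived_by_b = "C"   # both start as if the other cooperated
    total_a = 0
    for _ in range(rounds):
        move_a = perceived_by_a   # TFT: mirror the (perceived) previous move
        move_b = perceived_by_b
        total_a += PAYOFF[(move_a, move_b)][0]
        perceived_by_a = flip(move_b) if rng.random() < noise else move_b
        perceived_by_b = flip(move_a) if rng.random() < noise else move_a
    return total_a / rounds

print(noisy_tft_match(noise=0.0))   # 3.0: unbroken mutual cooperation
print(noisy_tft_match(noise=0.01))  # well below 3.0: echoes of retaliation
```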

Yellow highlight | Page: 71 one of my biggest surprises in working on this project has been the value of provocability. I came to this project believing one should be slow to anger. The results of the Computer Tournament for the Prisoner’s Dilemma demonstrate that it is actually better to respond quickly to a provocation. It turns out that if one waits to respond to uncalled for defections, there is a risk of sending the wrong signal. The longer defections are allowed to go unchallenged, the more likely it is that the other player will draw the conclusion that defection can pay. And the more strongly this pattern is established, the harder it will be to break it. The implication is that it is better to be provocable sooner, rather than later.

Yellow highlight | Page: 71 The speed of response depends upon the time required to detect a given choice by the other player. The shorter this time is, the more stable cooperation can be. A rapid detection means that the next move in the interaction comes quickly, thereby increasing the shadow of the future as represented by the parameter w. For this reason the only arms control agreements which can be stable are those whose violations can be detected soon enough. The critical requirement is that violations can be detected before they can accumulate to such an extent that the victim’s provocability is no longer enough to prevent the challenger from having an incentive to defect.
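
The discount parameter w makes this precise. Earlier in the book Axelrod shows that TIT FOR TAT is collectively stable exactly when w ≥ max((T − R)/(R − S), (T − R)/(T − P)); faster detection of the other side's moves raises the effective w past this threshold. A quick check with the standard payoffs (code mine):

```python
# Stability threshold for TIT FOR TAT from the book's condition
#     w >= max((T - R) / (R - S), (T - R) / (T - P))
# using the standard payoffs T,R,P,S = 5,3,1,0.
T, R, P, S = 5, 3, 1, 0

w_min = max((T - R) / (R - S), (T - R) / (T - P))
print(w_min)  # 2/3: the next move must matter at least two-thirds as much as this one
```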


Last modified 2019-08-16 Fri 16:27. Contact max@maxjmartin.com