October 14, 2004

Strong Reciprocity Rules OK

Via Stephen Laniel (in e-mail) and Discourse.net, I see that Slashdot is all agog about a new result on Prisoners' Dilemma, supposedly dethroning Tit For Tat:

Tit for Tat, the reigning champion of the Iterated Prisoner's Dilemma Competition, has been defeated by a group of cooperating programs from the University of Southampton. The Prisoner's Dilemma is a game with two players and two possible moves: cooperate or defect. If the two players cooperate, they both have small wins. If one player cooperates and the other defects, the cooperator has a big loss and the defector has a big win. If both players defect, they both have small losses. Tit for Tat cooperates in the first round and imitates its opponent's previous move for the rest of the game. Tit for Tat is similar to the Mutual Assured Destruction strategy used by the two nuclear superpowers during the Cold War. Southampton's programs executed a known series of 5 to 10 moves which allowed them to recognize each other. After recognition, the two Southampton programs became 'master and slave': one program would keep defecting and the other would keep cooperating. If a Southampton program determined that another program was non-Southampton, it would defect.
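For concreteness, here is a minimal sketch in Python of the game and of TFT as that summary describes them. The numerical payoffs are illustrative stand-ins (any values with temptation > reward > punishment > sucker's payoff will do); they are not the ones used in the actual competition.

```python
# Illustrative Prisoner's Dilemma payoffs, satisfying T > R > P > S.
# "C" = cooperate, "D" = defect. The tournament's actual values may differ.
PAYOFFS = {
    ("C", "C"): (3, 3),  # mutual cooperation: both have small wins
    ("C", "D"): (0, 5),  # cooperator takes a big loss, defector a big win
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),  # mutual defection: both have small losses
}

def tit_for_tat(my_history, their_history):
    """Cooperate in the first round; thereafter copy the opponent's last move."""
    return "C" if not their_history else their_history[-1]

def play(strategy_a, strategy_b, rounds=200):
    """Run an iterated game between two strategies; return the total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFFS[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b
```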

While this is a very clever trick (particularly the use of error-correcting codes), I'm a spiteful, critical, negative man, so I'm going to pour some cold water on it, or rather on what's being made of it. I'm sure the group at Southampton fully appreciate everything I'm about to say. (Here is the original Wired article, by Wendy Grossman; there doesn't seem to be a paper yet.)

It's been known for a long time that Tit For Tat (hereafter TFT) is not an evolutionarily stable strategy. This means that if you have a population of agents all playing TFT against each other, and then introduce a smaller number of agents playing another strategy, the newcomers will not necessarily do worse than the TFT players (and so be crowded out). For instance, there's a strategy called "forgiving Tit For Tat" or "Tit For Two Tats", where you have to see your opponent defecting twice before you retaliate. If play is slightly noisy, forgiving-TFT will actually invade a population which just consists of pure TFT, because it's less likely to get accidentally locked into a cycle of retaliation. Even if play isn't noisy, forgiving-TFT does just fine in a population that's mostly TFT. So the mere fact that there's some strategy which does better than TFT is unsurprising. The virtue of TFT, and its importance for the evolution of cooperation, lies not in its invincibility but in its robustness: it does pretty well in a huge range of circumstances. (Update, later that afternoon: Cris Moore reminds me of this paper by Kristian Lindgren, which, among many other neat things, shows how noise can favor the evolution of strategies which are nicer than TFT.)
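To make the noise argument concrete, here is a sketch (reusing PAYOFFS and the helpers from the block above) of Tit For Two Tats, together with a noisy match in which each intended move is flipped with some small probability. With pure TFT on both sides, a single accidental defection can set off an alternating echo of retaliation; the forgiving version absorbs it.

```python
import random

def tit_for_two_tats(my_history, their_history):
    """Retaliate only after two consecutive defections by the opponent."""
    if their_history[-2:] == ["D", "D"]:
        return "D"
    return "C"

def noisy_play(strategy_a, strategy_b, rounds=200, flip=0.05, seed=None):
    """Like play(), but each intended move is inverted with probability flip."""
    rng = random.Random(seed)
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        if rng.random() < flip:
            move_a = "D" if move_a == "C" else "C"
        if rng.random() < flip:
            move_b = "D" if move_b == "C" else "C"
        pay_a, pay_b = PAYOFFS[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b
```

Averaging noisy_play(tit_for_tat, tit_for_tat) against noisy_play(tit_for_two_tats, tit_for_two_tats) over many seeds shows the forgiving pairs keeping more of the cooperative payoff, which is the mechanism behind the invasion result.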

One illustration of this robustness comes from spatial evolutionary games (briefly mentioned by Grossman's informants). Spatial Prisoners' Dilemma can give you very nice pictures, where there are clusters of very forgiving, cooperative agents, all being nice to each other, surrounded by a periphery of unforgiving TFT players, with completely uncooperative, defection-prone hordes beyond. This works because TFT does pretty well against both the nasties and the good citizens; the good citizens prosper because they have the TFT types to ward off those who would take advantage of them. (As Karl Sigmund says someplace in his wonderful popular book about theoretical biology, Games of Life, this is the scenario of innumerable westerns and samurai movies.)
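Here is a minimal sketch of one common version of the spatial game (my assumptions, not any one specific model from the literature, and reusing play() from the first block): agents sit on a torus, each plays the iterated game against its four neighbors, and then each site imitates the best-scoring strategy in its neighborhood.

```python
def spatial_generation(grid, strategies, rounds=50):
    """One generation of spatial iterated PD with imitate-the-best updating.
    grid[i][j] holds a strategy name; strategies maps names to functions."""
    n = len(grid)
    def neighbors(i, j):  # von Neumann neighborhood on a torus
        return [((i - 1) % n, j), ((i + 1) % n, j),
                (i, (j - 1) % n), (i, (j + 1) % n)]
    # Each site's score is its total take against its four neighbors.
    scores = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for (k, l) in neighbors(i, j):
                mine, _ = play(strategies[grid[i][j]],
                               strategies[grid[k][l]], rounds)
                scores[i][j] += mine
    # Each site then copies the best-scoring strategy among itself
    # and its neighbors.
    new_grid = [row[:] for row in grid]
    for i in range(n):
        for j in range(n):
            sites = neighbors(i, j) + [(i, j)]
            bi, bj = max(sites, key=lambda s: scores[s[0]][s[1]])
            new_grid[i][j] = grid[bi][bj]
    return new_grid
```

Seeding such a grid with always-defect, TFT, and a forgiving strategy and iterating is one way to generate the frontier pattern described above, though the details depend on the payoff values and the update rule.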

The clever thing the Southampton group did was to engineer a situation that TFT couldn't cope with, namely collusion among the competing players. If, indeed, one agent is willing to be stomped on, forever, to the greater glory of another, without getting anything out of it, then its master will indeed get all the benefits that the dilemma is capable of providing. (At this point, you can add your own allusions to Hegel, or "safe, sane and consensual" jokes, as you prefer.) This does not seem to me at all an evolutionarily stable situation, however, since the slave agents have, by construction, exactly no incentive to participate in the arrangement. In fact, a mutant which used the coding scheme to recognize supposed masters and always defected against them, but played TFT with everyone else, should do better than a slave, and without slaves the master-type agents are not going to do well. (I will leave it to others [Bill? Gary? Tim?] to draw the obvious morals.) So I strongly doubt that in the wild, e.g., in actual social dilemmas, we will ever see Southampton-type strategies, which means that TFT should still be robust, and strong reciprocity is saved for another day.
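As a sketch of that mutant, suppose masters and slaves announce themselves with distinguishable opening sequences. The handshake codes below are hypothetical stand-ins (the real Southampton sequences haven't been published), and tit_for_tat is reused from the first block.

```python
# Hypothetical recognition codes; the real Southampton sequences differ.
MASTER_CODE = ["C", "D", "D", "C", "D"]
SLAVE_CODE  = ["D", "C", "C", "D", "C"]

def anti_master_mutant(my_history, their_history):
    """Open with the slave code so a master reveals itself, then defect
    against it instead of submitting; play TFT from then on against
    anyone who did not give the master code."""
    t = len(my_history)
    if t < len(SLAVE_CODE):
        return SLAVE_CODE[t]
    if their_history[:len(MASTER_CODE)] == MASTER_CODE:
        return "D"  # mutual defection beats being the sucker every round
    return tit_for_tat(my_history, their_history)
```

Against a master, this mutant collects the mutual-defection payoff rather than the sucker's payoff in every post-handshake round, so it does strictly better than a slave; once the slaves are gone, the masters have no one left to exploit.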

(For the really hard-core people: I can't begin to imagine Southampton strategies appearing in biological evolution. Case 1: slaves are not related to masters. Then slaves obviously go extinct, after which masters are not long for this world. If memory serves, Darwin, in the Origin, gives basically this case as an example of an observation which would refute evolution by natural selection. Case 2: slaves are related to masters. Then we've got the usual kin-selection case of, e.g., sterile castes in eusocial insects. But then the pay-off function has to be inclusive fitness (by Hamilton's rule, the slave's sacrifice is favored when the benefit to the master, discounted by their relatedness, exceeds the slave's cost), so we don't really have the Prisoners' Dilemma at all!)

Update, that evening: Crumb Trail draws some morals.

Update, 14 November: Someone else appears to have liked this post so much they've reproduced it almost verbatim as their own (twice, in fact). The sincerest form of flattery, I suppose.

Manual Trackback: Muck and Mystery.

Complexity; The Dismal Science; Biology

Posted at October 14, 2004 13:39 | permanent link
