**5.3 Heterogeneous situations with PD, AG, and CG players**

In real-world settings, groups may not be homogeneous as we have assumed so far. This certainly applies to communities with respect to the management of local-level natural resources. It is indeed often observed that members of a particular user group behave differently because they do not derive the same benefits from a given action. This may be due to a variety of reasons, including differential endowments, different characteristics in terms of the technique employed and the pattern of use of the resource concerned (think of nomadic herders and sedentary agriculturalists), different social identities, different exit possibilities, varying perceptions of the stake involved in resource preservation, etc.

In game-theoretical terms, we will say that, in this case, encounters are heterogeneous in the sense that different types of players have to deal with each other. The type of a player is characterized by a particular payoff vector, which may or may not be known by the other players. In the following, attention will be focused on heterogeneous games in which players with a payoff structure characteristic of the assurance game face players with a payoff structure characteristic of the prisoner's dilemma. These games are especially interesting because they portray a situation that has much relevance in many human encounters, namely that in which people who do not like to 'exploit' others meet with opportunists. The question that arises in such games is theoretically rich, in so far as it is not a priori clear who among the 'fair' players and the opportunists will determine the final outcome. Before turning to these games, however, mention will be made of two other kinds of heterogeneous encounters. First, we will consider a game in which the two players have an AG payoff structure, yet the benefits accruing to them are not identical. Second, a game in which a player with a chicken game (CG) structure encounters a player with an AG structure will be analysed.

*Encounters between two different AG players*

Let us assume that the two players who meet in a one-shot game have an AG payoff structure, implying that neither of them has an incentive to free-ride on the other's efforts. However, player A has a greater interest in joint co-operation than player B, as illustrated in Figure 5.14.

As usual, there are three Nash equilibria: (C,C), (D,D), and the mixed-strategy equilibrium in which the probability that A plays C is equal to 1/2 and the probability that B plays C is 1/4. Which of these equilibria will emerge depends on the expectations that the players hold about the likelihood that the other co-operates. Assuming that they both hold the same expectation, p, both players co-operate if p > 1/2 and both defect if p < 1/4. Clearly, there exists a range, 1/4 < p < 1/2, in which A co-operates while B abstains from doing so. Such an outcome, however, is not an equilibrium (player A will not accept being 'exploited' by player B). If it can arise, it is because there exists an inverse relationship between the size of the payoff accruing to a player in case of joint co-operation and the degree of trust required to prompt that player to co-operate.
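The threshold reasoning can be sketched numerically. Figure 5.14 is not reproduced in this text, so the payoff numbers below are hypothetical: they were chosen only so that both players have an assurance-game ordering and so that the thresholds 1/4 and 1/2 quoted above come out. The function names are illustrative as well.

```python
from fractions import Fraction

def cooperation_threshold(cc, cd, dc, dd):
    """Belief p (that the other co-operates) at which C and D pay the same.

    cc: own payoff when both co-operate
    cd: own payoff when co-operating alone (the 'sucker' payoff)
    dc: own payoff when defecting against a co-operator
    dd: own payoff when both defect
    Assumes an assurance-game ordering: cc > dc and dd > cd.
    """
    # C is preferred when p*cc + (1-p)*cd > p*dc + (1-p)*dd.
    gain_if_other_cooperates = Fraction(cc - dc)
    loss_if_other_defects = Fraction(dd - cd)
    return loss_if_other_defects / (gain_if_other_cooperates + loss_if_other_defects)

# Hypothetical payoffs (NOT taken from Figure 5.14).
threshold_a = cooperation_threshold(cc=4, cd=-1, dc=1, dd=0)  # player A
threshold_b = cooperation_threshold(cc=2, cd=-1, dc=1, dd=0)  # player B

def choices(p):
    """Actions of A and B when both hold the same expectation p."""
    return ("C" if p > threshold_a else "D", "C" if p > threshold_b else "D")

print(threshold_a, threshold_b)
print(choices(Fraction(1, 3)))
```

With these assumed numbers, the player with the larger payoff from joint co-operation (A) needs less trust before co-operating, which is the inverse relationship described in the text. Each player's threshold is also the probability with which the other player co-operates in the mixed-strategy equilibrium.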

**Fig. 5.14. A 2 x 2 asymmetrical assurance game**

**FIG. 5.15. A CG player meets an AG player**

**FIG. 5.16. A CG player meets an AG player and leads the sequential game**

*Encounters between AG and CG players*

An interesting situation emerges when an AG player faces a CG player, since it allows us to realize the importance of leadership in determining the equilibrium outcome. To start with, consider the one-shot game with simultaneous moves represented in Figure 5.15.

It is easy to check that, in this game, no equilibrium in pure strategies exists. There is only one equilibrium, in mixed strategies, where the probability of the CG player playing C is 2/3 and the probability of the AG player playing C is 4/5.
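The absence of a pure-strategy equilibrium can be checked by brute force. The sketch below does not use the payoffs of Figure 5.15, which are not reproduced in this text; it assumes hypothetical ordinal rankings (4 = best, 1 = worst) that follow the usual definitions of the chicken and assurance games. The mixed-strategy probabilities depend on the cardinal payoffs of the figure and are therefore not recomputed here.

```python
from itertools import product

ACTIONS = ("C", "D")

# Hypothetical ordinal payoffs, keyed by (own action, other's action).
# CG type: free-riding > joint co-operation > co-operating alone > mutual defection.
CG = {("D", "C"): 4, ("C", "C"): 3, ("C", "D"): 2, ("D", "D"): 1}
# AG type: joint co-operation > free-riding > mutual defection > co-operating alone.
AG = {("C", "C"): 4, ("D", "C"): 3, ("D", "D"): 2, ("C", "D"): 1}

def pure_nash_equilibria(row, col):
    """Return all action pairs from which neither player gains by deviating alone."""
    equilibria = []
    for a_row, a_col in product(ACTIONS, repeat=2):
        row_ok = all(row[(a_row, a_col)] >= row[(alt, a_col)] for alt in ACTIONS)
        col_ok = all(col[(a_col, a_row)] >= col[(alt, a_row)] for alt in ACTIONS)
        if row_ok and col_ok:
            equilibria.append((a_row, a_col))
    return equilibria

cg_meets_ag = pure_nash_equilibria(CG, AG)  # CG player meets AG player
ag_meets_ag = pure_nash_equilibria(AG, AG)  # two AG players, for comparison
print(cg_meets_ag)
print(ag_meets_ag)
```

Under these assumed rankings the CG-AG encounter has no pure-strategy equilibrium, whereas two AG players have the two familiar ones, (C,C) and (D,D).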

Equilibria in pure strategies are nevertheless possible as soon as the game is played sequentially. Yet, which equilibrium will arise depends on which player is in the first-mover position. Assume first that the CG player is the leader, as in the sequential game described in Figure 5.16.

It is immediately apparent from Figure 5.16 that joint co-operation will occur: it is in the interest of the CG player to start by co-operating so as to induce the AG player to follow suit. Indeed, if the CG player makes a non-co-operative first move, he is sure to bring about a situation of mutual defection, which he wants absolutely to avoid. In other words, when a party with a leadership role is keen that a collective action is undertaken, but preferably not by himself, whereas the other party tends to follow the leader's behaviour, but prefers bilateral co-operation to bilateral free-riding, joint co-operation will be established. This happy outcome entirely depends on the fact that the leader has a CG payoff structure. Indeed, had the leadership roles been inverted (the leader is the AG player and the follower is the CG player), universal co-operation would be prevented from arising. This is a straightforward conclusion from Figure 5.17 where the game is led by the AG player.

**Fig. 5.17. An AG player meets a CG player and leads the game**

As is evident from the figure, the collective action will also be undertaken but only by one of the players: having the right to the initial move, the AG player uses the advantage of knowing that the other player has a CG payoff structure to force him to bear the whole cost of this action. Notice carefully, however, that the AG player as a leader is unable to bring about the outcome which he best prefers (universal cooperation) since by co-operating he would incite the follower to defect. This frustration would not have occurred if both players had a CG payoff structure: forcing the other player to co-operate by defecting in the first place is then the ideal outcome which each player wishes for.

An interesting feature which emerges from any encounter between an AG player and a CG player is that both players have an interest in granting leadership to the latter: indeed, the outcome of the first game (Figure 5.16) dominates the outcome of the second game (Figure 5.17). This means that, in a more complex game in which the players would be invited to select the leader before deciding sequentially whether to co-operate or not, the unique subgame-perfect equilibrium path is as follows: the players select the CG player as the leader, thereafter this player co-operates and, in the final stage, the AG player responds by co-operating too. The lesson from such a three-stage game is that, by binding himself to the leadership position, the CG player commits himself to co-operation.
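The leadership argument is a backward-induction exercise and can be sketched in a few lines. The payoffs of Figures 5.16 and 5.17 are not reproduced in this text, so the sketch assumes hypothetical ordinal rankings (4 = best, 1 = worst) consistent with the verbal description of the two types; the function names are illustrative.

```python
ACTIONS = ("C", "D")

# Hypothetical ordinal payoffs, keyed by (own action, other's action).
CG = {("D", "C"): 4, ("C", "C"): 3, ("C", "D"): 2, ("D", "D"): 1}
AG = {("C", "C"): 4, ("D", "C"): 3, ("D", "D"): 2, ("C", "D"): 1}

def sequential_outcome(leader, follower):
    """Backward induction: return (leader action, follower action)."""
    def best_reply(first_move):
        # The follower observes the first move and picks his best answer.
        return max(ACTIONS, key=lambda reply: follower[(reply, first_move)])
    # The leader anticipates that answer when choosing his own move.
    first = max(ACTIONS, key=lambda move: leader[(move, best_reply(move))])
    return first, best_reply(first)

def payoffs_by_type(leader_name):
    """Payoffs to the CG and AG players when the named type leads."""
    if leader_name == "CG":
        cg_move, ag_move = sequential_outcome(CG, AG)
    else:
        ag_move, cg_move = sequential_outcome(AG, CG)
    return {"CG": CG[(cg_move, ag_move)], "AG": AG[(ag_move, cg_move)]}

cg_leads = sequential_outcome(CG, AG)
ag_leads = sequential_outcome(AG, CG)
print(cg_leads, payoffs_by_type("CG"))
print(ag_leads, payoffs_by_type("AG"))
```

Under these assumed rankings, CG leadership yields joint co-operation, while AG leadership yields defection by the AG leader and co-operation by the CG follower. Both types rank the first outcome above the second, which is why both would select the CG player as leader in the three-stage game.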

An example which illustrates the aforedescribed situation can again be borrowed from studies of irrigation management. Consider once more a situation in which water users are divided into two subgroups according to whether they are head-enders or tail-enders. Head-enders have a CG payoff structure since they are keen that maintenance of the water control infrastructure is undertaken, but would very much prefer that tail-enders do the work alone (something which may be technically possible, as we have pointed out in the case of the Thambesi irrigation system). On the other hand, tail-enders who are at a locational disadvantage entertain the fear that they may be excluded from decision processes that affect the flow of water reaching their fields (see above): this is why they are eager to participate in maintenance works alongside head-enders, yet would not like to be 'suckers' if head-enders refrain from such participation (they have an AG payoff structure). In these circumstances, as argued above, head-enders have an incentive to take the leadership so as to associate tail-enders in the maintenance works. This situation seems to fit rather well with the experience of the Pithuwa irrigation project reviewed by Ostrom and Gardner (1993: 105-6).

**FIG. 5.18. Head-enders deal with tail-enders in a CG-AG encounter**

Another, more plausible situation arises when tail-enders have a mixed CG-AG payoff structure in the following sense: if head-enders participate in the maintenance works, they want to join in order to avoid the aforementioned negative spillover effect but, if head-enders abstain from such participation, they prefer to undertake these works alone rather than leaving the system to fall into decay. If, moreover, as assumed in Figure 5.18, the tail-enders' payoff is higher when they are 'suckers' (6 units) than when they free-ride on the head-enders' maintenance efforts (3 units), because the price to be paid in terms of loss of reliable access to water is high when participation in these efforts is shunned, the following result obtains: whether the game is played simultaneously or sequentially and whether, in the latter case, leadership is exercised by tail-enders or head-enders, the equilibrium outcome is characterized by participation on the part of tail-enders and defection on the part of head-enders. In other words, even though they have the first move, the tail-enders are unable to take advantage of the chicken game structure of the head-enders' payoffs to make them co-operate, owing to the latter's critical control over the supply of water.

Note that, if the tail-enders' payoff when they are 'suckers' is 2 units instead of 6 units or if their payoff while free-riding on the head-enders' efforts is 6.5 instead of 3 units, it is easy to see that what has just been said is no longer true: if they have the first move, they are now in a position to force head-enders to undertake (alone) the maintenance works. In other words, if the cost of free-riding on maintenance efforts in terms of loss of reliable access to water is not too high for the tail-enders and if they hold a leadership position (two rather implausible assumptions), their leverage allows them to impose the cost of maintenance on the head-enders.

*Encounters between AG and PD players in small groups*

Let us now turn to the important set of situations in which AG players encounter PD players. There are numerous relevant cases which are worth considering. In order to help the reader to follow the arguments, these cases are presented in increasing order of complexity.

**FIG. 5.19. A PD player meets an AG player**

- To start with, let us examine the simple one-shot two-player game with perfect information in which an AG player meets a PD player. That in this case cooperation cannot occur is immediately evident from the payoff matrix depicted in Figure 5.19.

The non-co-operative outcome (D,D) is the only Nash equilibrium in this game. Note that defection is a dominant strategy for the PD player as a result of which there is no equilibrium in mixed strategy.

- What would happen if such a game were to be played in a sequential manner? The answer to that question is rather straightforward. If the AG player is in the first-mover position, he will be prompted to defect since he anticipates that the PD player defects in any event. In contrast, and rather unexpectedly, the reverse outcome obtains in the case where the PD player takes the lead: as a matter of fact, knowing that his opponent responds to co-operation by cooperation and to defection by defection, the PD leader has an incentive to start by co-operating. Because (D,D) is the subgame-perfect equilibrium of the former sequential game and (C,C) that of the latter, were both players allowed to choose their leader, both of them would concur in selecting the PD player. Note the similarity between this conclusion and that reached when analysing CG-AG player encounters.
- If the game is (finitely) repeated rather than being played sequentially, mutual defection appears as the only possible equilibrium, as observed in the finitely repeated PD game. Reasoning by backward induction makes this result clear. As a matter of fact, since the AG player knows that his opponent has a PD payoff structure, he can infer that the latter will surely defect in the last round of the game. His best reply is therefore also to defect in this last round. Being aware that the AG player is going to defect, the PD player has no incentive to build a reputation of 'co-operator' in the round (t - 1) and, as a consequence, he will also defect in that round. Since he knows that things will turn out that way, the AG player defects in the same round too. This reasoning can be pursued backwards till the very first round of the game.
- If the game is played in infinite time (or if its length is finite but indeterminate), the folk theorem applies and joint co-operation is a possible (subgame-perfect) equilibrium.
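The sequential case discussed in the points above can be traced step by step. The sketch below assumes the stage payoffs that the text quotes further on for the PD-AG encounter (2 for joint co-operation, 0 for joint defection, -1 for co-operating alone, and 3 or 1 for defecting against a co-operator for the PD and the AG player respectively). Whether Figure 5.19 uses exactly these numbers is an assumption, and the symbols u_AG and u_PD for the leader's payoff are introduced here for illustration.

```latex
\[
\begin{aligned}
&\textbf{AG player leads:}\\
&\quad \text{AG plays } C \;\Rightarrow\; \text{PD replies } D \ (3 > 2), \quad u_{AG} = -1,\\
&\quad \text{AG plays } D \;\Rightarrow\; \text{PD replies } D \ (0 > -1), \quad u_{AG} = 0,\\
&\quad 0 > -1 \;\Rightarrow\; \text{outcome } (D, D).\\[4pt]
&\textbf{PD player leads:}\\
&\quad \text{PD plays } C \;\Rightarrow\; \text{AG replies } C \ (2 > 1), \quad u_{PD} = 2,\\
&\quad \text{PD plays } D \;\Rightarrow\; \text{AG replies } D \ (0 > -1), \quad u_{PD} = 0,\\
&\quad 2 > 0 \;\Rightarrow\; \text{outcome } (C, C).
\end{aligned}
\]
```

Under these numbers each player earns 2 when the PD player leads and 0 when the AG player leads, which is why both would concur in selecting the PD player as leader.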

**FIG. 5.20. A 2 x 2 game with one-sided asymmetric information**

- By relaxing the assumption of perfect information, we can raise the question as to whether co-operation becomes a possible equilibrium outcome in a finitely repeated game. This question is worth raising bearing in mind the result achieved by Kreps and his associates in the framework of a finitely repeated PD game (see above). Let us first consider the case where uncertainty about the payoff structure of the other player is one-sided. More precisely, we assume the first player has a PD payoff structure, and this is common knowledge, while the type of the second player is not known with certainty: the first player assigns probability p to the possibility of the second player being an AG player and probability (1 - p) to the possibility that he is a PD player. We know from Kreps et al. (1982: 251) that in these circumstances defection will occur throughout the whole game: indeed, if the other player is a PD player, we know that defection is the only possible equilibrium and we have just shown that, if a PD player meets repeatedly an AG player, defection also occurs. There is therefore no reason to expect that a co-operative equilibrium can be generated when there is a doubt about whether the second player has an AG or a PD payoff structure.
- The next case to consider is that in which the first player is an AG player, and this is common knowledge, while there is doubt about the payoff structure of the second player. Is cooperation more likely to emerge in those more favourable conditions? The answer is a conditional 'yes'. More precisely, co-operation by both players till the last stage of the game may occur if the expectation held by the player with a certain AG payoff structure that the opponent is also an AG player exceeds a certain level. In case this expectation falls below that level, universal defection occurs throughout the game. To see this, consider a three-period game in standard form in which the Row player is known with certainty to be an AG player while there is doubt about whether the Column player is a PD or an AG player. The payoffs pertaining to the two possible kinds of encounters are given in Figure 5.20.

Thus, for instance, by defecting while his opponent co-operates, the PD player gets a payoff of 3 which is more than his payoff when both co-operate. By contrast, in the same circumstances, the AG player gets a payoff of only 1 which is less than his payoff when both cooperate. Let us now assume that an AG player follows the brave reciprocity strategy consisting of starting by co-operating and, thereafter, co-operating as long as the opponent is in 'good standing'. As for the PD player, he starts by co-operating, thereafter mimics what the opponent has done in the previous round till the last round where he defects. The question then is whether these two strategies are the best replies to one another, which would imply that mutual co-operation starts from the first round and continues till the last round when the Column player defects while the Row player co-operates till the very end of the game.

To begin with, consider which payoffs the Column player would earn, if he is of the AG type, by following various possible strategies in his encounters with the Row player who is known to be of the AG type and follows the aforedescribed strategy. If he plays a strategy of brave reciprocity (note that similar strategies such as tit for tat or unconditional co-operation also lead to the (C,C,C) sequence of actions), he gets a total payoff over the three periods equal to 6 (2 + 2 + 2). If he plays a strategy whereby, against the brave reciprocity strategy of Row, he co-operates in the first two rounds and defects in the last one, his payoff amounts to 5 (2 + 2 + 1). If he co-operates in the first round and defects in the last two rounds, he earns 3 (2 + 1 + 0) while, if he defects from beginning to end, he earns only 1 (1 + 0 + 0). It is therefore evident that unconditional co-operation dominates the other three strategies. In other words, if Column is of the AG type and Row is of the same type and follows a strategy which consists of starting by co-operating and, thereafter, mimicking what the opponent has done in the previous round, then Column's best reply to the latter is to co-operate throughout the whole game. Since the Row player has adopted the above strategy, he will also co-operate from the beginning to the end of the game. Mutual co-operation therefore occurs till the game ends. There is actually nothing surprising in this result which has already been accounted for at an earlier stage of our analysis.

If Column is of the PD type, his total payoffs while playing various possible strategies against Row are as follows: 6 (2 + 2 + 2) if he plays a co-operative strategy leading to the (C,C,C) sequence of actions; 7 (2 + 2 + 3) if he plays the fake strategy whereby he co-operates as long as his opponent co-operates and defects in the last round, thereby revealing his true type; 5 (2 + 3 + 0) if his strategy leads to a (C,D,D) sequence of moves; and 3 (3 + 0 + 0) if he plays unconditional defection. The fake strategy dominates the other strategies available to Column.
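The totals quoted for Column can be reproduced by replaying the three periods. The sketch below uses the stage payoffs that the text attributes to Figure 5.20 (2, 1 or 3, 0, and -1). It models Row's brave reciprocity as "co-operate until the opponent has defected once", which is one reading of the 'good standing' condition and is an assumption; for the four move sequences considered, a tit-for-tat reading gives the same totals.

```python
# Stage payoffs quoted in the text, keyed by (own action, other's action).
AG = {("C", "C"): 2, ("D", "C"): 1, ("D", "D"): 0, ("C", "D"): -1}
PD = {("C", "C"): 2, ("D", "C"): 3, ("D", "D"): 0, ("C", "D"): -1}

def brave_reciprocity(opponent_history):
    """Co-operate first, then only while the opponent has never defected."""
    return "C" if "D" not in opponent_history else "D"

def column_total(column_moves, column_payoffs):
    """Column's total payoff from a fixed move sequence against Row's brave reciprocity."""
    total, seen = 0, []
    for move in column_moves:
        row_move = brave_reciprocity(seen)  # Row reacts to Column's past moves only
        total += column_payoffs[(move, row_move)]
        seen.append(move)
    return total

SEQUENCES = ("CCC", "CCD", "CDD", "DDD")
ag_totals = {s: column_total(s, AG) for s in SEQUENCES}
pd_totals = {s: column_total(s, PD) for s in SEQUENCES}
print(ag_totals)
print(pd_totals)
```

The AG column does best with (C,C,C) and the PD column with (C,C,D), the fake strategy, in line with the figures given in the text.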

Let us now turn to the Row player in order to check whether the brave reciprocity strategy can be a best reply to brave reciprocity (or similar strategies) played by Column if of the AG type and to the fake strategy if of the PD type. Let p be the probability Row assigns to the possibility that Column is of the AG type and (1 - p) the probability that Column is of the PD type. If Row plays brave reciprocity, his payoff is 2 for the first period, 2 again for the second period, and (p × 2 + (1 - p)(-1)) for the third period, amounting to a total payoff of 3p + 3 for the three periods together. If, instead, he plays a safe strategy whereby he follows the strategy of brave reciprocity except in the last round when he ensures himself against being a 'sucker' by defecting, his payoff is 2 + 2 + (p × 1 + (1 - p) × 0) = p + 4. If he plays a strategy leading to the (C,D,D) sequence of actions, he earns 2 + 1 + 0 = 3 while, if he plays unconditional defection, he gets only 1 (1 + 0 + 0). Clearly, the latter two strategies are dominated by the first two. Whether the first strategy dominates the second strategy or is dominated by it hinges upon the value of p, that is, upon the expectation of Row regarding the payoff identity of Column. In more exact terms, brave reciprocity dominates the safe strategy if p > 1/2.
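The comparison between the two leading strategies of Row can be written out explicitly. The symbols π_brave and π_safe are introduced here for illustration and stand for Row's expected three-period totals under brave reciprocity and under the safe strategy.

```latex
\begin{align*}
\pi_{\text{brave}}(p) &= 2 + 2 + \bigl[2p + (-1)(1-p)\bigr] = 3p + 3,\\
\pi_{\text{safe}}(p)  &= 2 + 2 + \bigl[1\cdot p + 0\cdot(1-p)\bigr] = p + 4,\\
\pi_{\text{brave}}(p) > \pi_{\text{safe}}(p)
  &\iff 3p + 3 > p + 4 \iff 2p > 1 \iff p > \tfrac{1}{2}.
\end{align*}
```

Since p + 4 is at least 4 for any p between 0 and 1, the safe strategy alone already beats the totals of 3 and 1 earned with the (C,D,D) sequence and with unconditional defection.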

Row will therefore co-operate all throughout the game if he believes there is more than a 50 per cent chance that Column is of the same type as himself; otherwise he will stop co-operating after two periods. Since p is common knowledge (Column knows Row's expectation regarding his own, that is Column's, payoff structure), if p > 1/2, Column will continue to co-operate till the very end of the game in case he is of the AG type, and till the last period in case he is of the PD type. If p < 1/2, on the other hand, Column knows that Row will defect in the last (third) round of the game and, as a result, he has no incentive to co-operate in the second round to maintain a reputation of 'co-operator'. Applying the argument backwards, it is easy to see that, in these conditions, co-operation unravels and universal defection occurs from beginning to end. To sum up, either Row's expectation regarding the chance that Column is of the AG type is sufficiently high, and universal co-operation is sure to occur till at least the last stage of the game, or this expectation is too low and universal defection occurs throughout the game.

Clearly, co-operation is not doomed to failure because groups are heterogeneous in the sense that there is a non-negligible proportion of potential opportunists. As we have seen above and will continue to see in the three following points, co-operation is a serious possibility when expectations are favourable to it.

- In the two foregoing points, we have only considered situations of one-sided asymmetric information. It is tempting to examine now whether co-operation is a possible outcome when the imperfection of information is two-sided, that is, when the two players entertain mutual doubts about their respective payoff structure. An important but largely neglected result (see, however, Gibbons, 1992: 226) obtained by Kreps and his associates in their aforementioned, celebrated article (1982) is that extension of uncertainty about payoffs to the two players may increase the chance of co-operation. Remember that, as seen under point 4 above, co-operation is impossible when one player is of the PD type and doubts whether the other player is of the AG or the PD type. What Kreps et al. show, however, is that when the two players are of the PD type but believe that their opponent might perhaps be of the AG type, there can exist an equilibrium in which both players co-operate until the last few stages of the game (the end-game is rather complex). Yet, it deserves to be emphasized that this game admits (subgame-perfect Nash) equilibria in which long-run co-operation does not ensue. Co-operation actually requires a 'boot-strapping' operation (since there is obviously a trust problem): even if each side is certain that the other has an AG payoff structure, co-operation ensues only if each side hypothesizes that the other side will co-operate (Kreps et al., 1982: 251).

To see this possibility of co-operation when there is two-sided uncertainty about payoff structures, let us again use our simple three-period framework. Payoffs are assumed to be the same as in Figure 5.19. In the mind of Row, Column might be of the AG, rather than PD, type, an eventuality to which he assigns a probability p. On the other hand, Column entertains the hypothesis that Row is an AG player (with probability q) rather than a PD player (with probability (1 - q)). What we want to show is whether and under which conditions the two aforedescribed strategies ('start by co-operating and thereafter mimic what the opponent has done in the previous round', till the last stage of the game for the PD player and till the end of the game for the AG player) can be best replies to each other.

In actual fact, part of the preparatory work required to answer that question has already been done in the previous point while considering the decision problem of Row. Bear in mind, indeed, that Row's best strategies, when he is of the AG type, are a strategy of brave reciprocity, which yields him a total payoff of 3p + 3 over the three periods, and the safe strategy, which yields a payoff of p + 4. The former strategy dominates the latter if p > 1/2. When Row is of the PD type, on the other hand, his payoffs are as follows:

2 + 2 + [p × 2 + (1 - p)(-1)] = 3p + 3 if he plays brave reciprocity;

2 + 2 + [p × 3 + (1 - p) × 0] = 3p + 4 if he plays the fake strategy;

2 + 3 + 0 = 5 if he plays the (C,D,D) sequence of moves;

3 + 0 + 0 = 3 if he plays unconditional defection.

The strategies of unconditional defection and of brave reciprocity are clearly dominated. Whether the fake strategy is superior to the other (which leads to the (C,D,D) sequence of moves) depends on the value of p: the former dominates if p > 1/3. It is therefore apparent that, if Row expects with a probability higher than 1/3 that Column is of the AG type, he will cooperate till, at least, the last stage of the game. If this probability is higher than 1/2 and he is himself of the AG type, Row will even co-operate till the end of the game.

Exactly the same reasoning can be made with respect to Column. If Column expects with a probability higher than 1/3 (q > 1/3) that Row is of the AG type, he has an incentive to cooperate, at least till the last round of the game. We can conclude that, if expectations of both players regarding the chance that the opponent is of the AG type exceed 1/3, co-operation till at least the last stage of the game is an equilibrium outcome. If this expectation is higher than 1/2, both Row and Column will co-operate till the end of the game provided that they are of the AG type. If, say, the expectation of one player is more pessimistic and falls below the threshold level of 1/3, this is sufficient to destroy co-operation. Indeed, the opponent then knows that the pessimistic player is going to defect from as early as the second round—since the (C,D,D) sequence of moves then dominates the fake strategy—and, therefore, he himself has no incentive to co-operate in the second round nor actually in the first round (since it is of no use for him to build up a reputation of 'co-operator'). The pessimistic player, aware of this calculation made by his opponent, will also defect in the initial round. Universal defection occurs throughout the game.
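The two thresholds and the resulting pattern of play can be summarized in a short sketch. The linear payoff expressions are those given in the text; the function names, the string labels, and the treatment of exact ties (a belief exactly equal to 1/3 is counted as too pessimistic) are assumptions made for illustration.

```python
from fractions import Fraction

# Expected three-period payoffs as (slope, intercept) in the belief that the
# opponent is of the AG type, taken from the expressions in the text.
AG_STRATEGIES = {"brave": (3, 3), "safe": (1, 4)}
PD_STRATEGIES = {"fake": (3, 4), "C,D,D": (0, 5)}

def crossing(first, second):
    """Belief at which two linear expected payoffs are equal."""
    (s1, i1), (s2, i2) = first, second
    return Fraction(i2 - i1, s1 - s2)

AG_THRESHOLD = crossing(AG_STRATEGIES["brave"], AG_STRATEGIES["safe"])
PD_THRESHOLD = crossing(PD_STRATEGIES["fake"], PD_STRATEGIES["C,D,D"])

def predicted_pattern(p, q):
    """Pattern of play described in the text for Row's belief p and Column's belief q."""
    if min(p, q) <= PD_THRESHOLD:
        # One pessimistic player is enough to unravel co-operation.
        return "universal defection"
    if min(p, q) > AG_THRESHOLD:
        return "co-operation to the end by AG types, to the last round by PD types"
    return "co-operation until at least the last round"

print(AG_THRESHOLD, PD_THRESHOLD)
print(predicted_pattern(Fraction(2, 5), Fraction(9, 10)))
```

The classification uses the smaller of the two beliefs because, as argued above, a single pessimistic player suffices to destroy co-operation.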

- We will now extend the above analysis to games with many players. To keep things as simple as possible, consider a three-player game that is played over only two periods. The three players are uncertain about the payoff structure of the other two players; more specifically they entertain doubts about whether the other players are of the AG or PD type.

**Fig. 5.21. A three-player game with asymmetric information**

Let us consider the decision problem faced by player 3 as it is depicted in Figure 5.21. All players have a probability q of being AG and a probability (1 - q) of being PD. If player 3 is of the AG type, he faces the payoff numbers written in bold characters. For instance, if he defects while at least another player co-operates, he is less well-off than if he co-operates. If both other players defect, he prefers to defect too because he does not want to be a 'sucker'. In contrast, if player 3 is of the PD type, his payoffs are those indicated between brackets: defection is then a dominant strategy.

(i) Let us first assume that, if of the AG type, a player adopts a strategy of harsh punishment. In this case, he starts by co-operating and thereafter defects if one of the other two players defected in the previous round. Otherwise, he co-operates. Now, if a player is of the PD type, he follows a fake strategy (he mimics being an AG player by co-operating in the first round, and continues to co-operate as long as the other two players co-operate till the last round when he defects). The question is: are these two strategies best replies to one another?

To proceed with the analysis, we begin by examining the situation in which player 3 is of the AG type. If he plays the harsh punishment strategy, his total payoff over the two periods is:

This payoff is obviously identical to that which he would obtain were he to follow either an *unconditional co-operation strategy* or a *soft-punishment strategy*, since, in actual fact, he cannot know the other players' types by observing their first period's moves (since PD players fake till the last round). By *soft-punishment strategy*, we mean a strategy whereby he continues to co-operate as long as at least one other player has co-operated in the previous round (or, to put it in another way, he defects only if all other players have defected). We will return later to this particular strategy. To counter the difficulty that he will know the other players' types only in the last round, player 3 may choose to play a *safe strategy* (he starts by co-operating and defects in the last round):

If he plays other strategies (implying such sequences of actions as (D,D) or (D,C)), the payoffs will obviously be lower than when he plays the above two strategies. Whether the harsh-punishment strategy yields a higher payoff than the safe strategy obviously depends on the value of the probability q. More specifically, the former is superior to the latter under a condition implying that q > 0.18.

Consider now the alternative situation in which player 3 is of the PD type. If he plays the fake strategy, he gets the following total payoff:

If he, instead, plays unconditional defection strategy, he gets:

If he plays other strategies, implying in particular a co-operative move in the last round, his payoffs will obviously be lower than when he plays the above two strategies. Moreover, the fake strategy dominates unconditional defection under a condition implying that q > 0.17.

Note carefully that the critical value of q that induces an AG player to reject the safe strategy is actually greater than the value required to prompt a PD player to use the fake strategy, thereby making the latter condition redundant. We can therefore conclude that, if q, the probability that a player is of the AG type, is greater than 0.18, then the best reply to the harsh-punishment strategy adopted by AG players is faking for the PD player, and vice versa. This is an important result in so far as it shows that, even if in the one-period game the dominant strategy of a PD player is to defect, he may have an incentive, in a two-period game, to behave 'co-operatively', as though he were an AG player, till the second round of the game. This result can be extended to more periods: if his expectation that the other players are of the AG type is sufficiently high, the PD player has an incentive to start by co-operating and thereafter continue to co-operate as long as these other players co-operate, till the last round of the game when he defects. It is noteworthy that the critical values of q obtained in games that stretch over, say, three periods are precisely the same as those obtained in the two-period case. Finally, it should be emphasized that, as the above example shows, the critical values of q need not be very high. This obviously hinges upon the fact that, in this example, defection is not very rewarding for a PD player.

(ii) Let us now investigate the possibility of the AG players adopting a soft-punishment strategy. In these circumstances, the PD players know that their defection may not necessarily be retaliated against in the next round by a non-co-operative move of the AG players. This obviously depends on what the other PD players choose to do. Consider first the decision problem faced by player 3 if he is of the AG type. For a reason explained above, when opposed to a fake strategy, the payoffs associated with different strategies are exactly the same as those obtained under a harsh-punishment strategy. In particular, the soft-punishment strategy is superior to the safe strategy if q > 0.18. If player 3 is, instead, of the PD type, the fake strategy yields the following total payoff:

The payoff resulting from unconditional defection is:

Strategies that imply a co-operative move in the last round are clearly inferior. It is immediately apparent that playing unconditional defection is always more rewarding than playing the fake strategy. As a result, with a soft-punishment strategy, it is impossible that all types of players always co-operate in the first round. In the above, we have assumed that the other players, if of the PD type, start by co-operating and defect in the second round. We now have to check whether this is really the most sensible strategy for such players given that the third player, when PD, replies by always defecting. To carry out this check, let us examine whether unconditional defection is the best strategy for all the PD players simultaneously. The payoff obtained by player 3 when he always defects against the other players who, if of the PD type, are also unconditional defectors, is the following:

If, instead, he plays the fake strategy, he gets:

As can easily be seen, unconditional defection always dominates the fake strategy. This, however, is a result that pertains to a border case since the relevant quadratic equation has a unique root equal to 0.5. When q is just equal to 50 per cent, player 3 is thus indifferent between the two strategies whereas, for all other values of q, he prefers unconditional defection. By altering the payoffs given in Figure 5.21, it is possible to construct a more general case in which the fake strategy is the best reply of the third player, if PD, to unconditional defection by other PD players and the soft-punishment strategy by the AG players. (Presumably, there is an interval for q such that PD players will adopt a mixed strategy which consists of randomizing between the fake strategy and unconditional defection and such that AG players prefer soft punishment to harsh punishment.) To conclude the analysis based on the payoff matrix given in Figure 5.21, there still remains the question as to whether the soft-punishment strategy is the best reply of an AG player to the unconditional defection strategy adopted by the PD players. To see this, let us consider the payoffs which would accrue to an AG player when he, alternatively, chooses to play soft punishment, harsh punishment, or a strategy of cautious reciprocity (start by defecting and co-operate only if at least one other player has co-operated in the first round). The payoffs associated with these strategies are, respectively:

From a comparison of the above payoffs, it is evident that the harsh-punishment strategy is dominated by the soft-punishment strategy. On the other hand, the strategy of cautious reciprocity is superior to the latter when the probability of meeting AG players is very low (below 0.08 approximately).

To conclude, there are plausible conditions, implying a sufficient probability of meeting other players of the AG type, under which AG players follow a strategy of soft punishment while PD players unconditionally defect.