Co-operation and limited rationality




The complexity involved in the analysis of the repeated play of such a simple game as the PD game is considerable. First, there are innumerable possible strategies. Indeed, a repeated game is characterized by the fact that the players may devise strategies which make their actions in a given round conditional on their opponents' actions in previous rounds. Given this possibility, the number of conceivable strategies increases explosively with the number of rounds that may be played (Sugden, 1986: 107-8).

Second, and related to the above feature, is the aforementioned fact that repeated games are characterized by a 'profusion' (to use Kreps's word) of possible (Nash) equilibria. As has been shown, non-co-operative equilibria are as likely as co-operative ones: for instance, unconditional defection is an equilibrium strategy. Moreover, as long as everyone follows cautious strategies of one kind or another ('start by defecting, and thereafter co-operate if your opponent has co-operated in some previous rounds'), these strategies produce the same result: no one ever co-operates. An obvious consequence of this profusion of equilibria is that the predictive power of game theory is very much restricted when agents interact frequently and repeatedly. This is in stark contrast to its predictive power in one-period situations such as that described by the one-shot PD game.

Third, underlying repeated game theory are two strong assumptions about human behaviour. On the one hand, agents are supposed to be hyper-rational, since they must be able to conceive, implement, and choose among numerous and complex strategies. In particular, they must be able to anticipate all the possible reactions of the other players, even in a distant future, and to decide their current moves accordingly. If such requirements of computational ability were met in reality, in any chess contest, the whites would never lose. On the other hand, once they have identified the best strategy, agents never make mistakes, in the sense that they unremittingly act according to this strategy.

It is in view of these difficulties that some authors have chosen to develop alternative approaches to strategic interactions that take into account explicitly the possibility of mistakes and the limitations of the human brain's computational ability. The latter constraint is reflected in the fact that human beings can only devise a restricted number of rather simple or basic strategies.

One strategy which has drawn a lot of attention in this respect is the aforementioned tit-for-tat strategy which emerged as a winner from computer tournaments conducted by Axelrod. In these simulation experiments, the participants, among whom were many game theorists, were invited to propose strategies without there being any arbitrary limit imposed by the analyst on the players' possible strategic choices. These strategies were then compared against each other in pairwise encounters with repeated plays. (The median length of games was 200 rounds.) On average, tit for tat yielded the highest payoff and further analysis suggested that, in repeated tournaments, 'tit for tat would continue to thrive, and that eventually it might be used by virtually everyone' (Axelrod, 1984: 55). Moreover, it appeared that tit for tat is 'a very robust rule' that does very well over a wide range of environments (ibid. 53).
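To fix ideas, the mechanics of such a pairwise repeated encounter can be sketched in a few lines of Python. The payoff numbers below are illustrative assumptions satisfying the usual PD ordering (they are not Axelrod's), and the match length of 200 rounds simply mirrors the median length of his tournament games:

    # A minimal sketch of a pairwise repeated PD encounter.
    # C = co-operate, D = defect; payoffs are illustrative only.
    PAYOFF = {('C', 'C'): 3, ('C', 'D'): 0, ('D', 'C'): 5, ('D', 'D'): 1}

    def tit_for_tat(own_history, opp_history):
        # Co-operate in the first round, then copy the opponent's last move.
        return 'C' if not opp_history else opp_history[-1]

    def always_defect(own_history, opp_history):
        return 'D'

    def play_match(strat_a, strat_b, rounds=200):
        # Returns the cumulative payoffs of the two strategies.
        hist_a, hist_b, score_a, score_b = [], [], 0, 0
        for _ in range(rounds):
            move_a = strat_a(hist_a, hist_b)
            move_b = strat_b(hist_b, hist_a)
            score_a += PAYOFF[(move_a, move_b)]
            score_b += PAYOFF[(move_b, move_a)]
            hist_a.append(move_a)
            hist_b.append(move_b)
        return score_a, score_b

    print(play_match(tit_for_tat, tit_for_tat))    # mutual co-operation throughout
    print(play_match(tit_for_tat, always_defect))  # exploited once, then mutual defection

Against itself, tit for tat co-operates in every round; against unconditional defection it is exploited in the first round only. This combination is what allows it to do well on average across a varied population of opponents.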

It has been shown earlier that the tit-for-tat strategy is defective in the sense that it relies upon non-credible threats (it is not subgame-perfect) so that a mistake triggers off an endless chain of retaliation and counter-retaliation. A reasonable way of cutting short the above chain would be for A, who defected in round i, to refrain from retaliating in round i + 2, that is, to admit that B's defection in i + 1 was somehow justified and therefore to agree to co-operate in i + 1 and i + 2 without regard to B's (understandable) defection in i + 1 (which is another way of showing that tit for tat is not subgame-perfect). This is precisely the intuition behind the subgame-perfect variant of tit for tat denoted T1 by Sugden (1986) or called 'getting even' by Myerson (1991: 326-7):

T1 starts from a concept of being in good standing. The essential idea is that a player who is in good standing is entitled to the co-operation of his opponent. At the start of the game both players are treated as being in good standing. A player remains in good standing provided that he always co-operates when T1 prescribes that he should. If in any round a player defects when T1 prescribes that he should co-operate, he loses his good standing; he regains his good standing after he has co-operated in one subsequent round. (This is why I call this strategy T1; if it took two rounds of co-operation to regain good standing, the strategy would be T2, and so on.) Given all this, T1 can be formulated as follows: 'Co-operate if your opponent is in good standing, or if you are not. Otherwise, defect.' (Sugden, 1986: 112)

As expected, T and T1 are perfectly equivalent strategies in a game where players never make mistakes. The only difference between these two strategies concerns the moves of a player after he has defected by mistake. Such an event makes him lose his good standing: in accordance with T1, he should co-operate in the subsequent round i + 1 while his opponent may defect without losing his good standing. Whatever the opponent does in round i + 1, T1 therefore requires that the first player (the one who made the mistake) co-operates in round i + 2 also (Sugden, 1986: 113).
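Sugden's verbal rule is precise enough to be written down mechanically. The following sketch is one illustrative reading of T1 (the function and variable names are invented, not Sugden's); it reproduces the sequence just described, with a mistaken defection by player A followed by a single round of punishment-cum-reparation and a return to mutual co-operation:

    def t1_move(own_good, opp_good):
        # Sugden's T1: 'Co-operate if your opponent is in good standing,
        # or if you are not. Otherwise, defect.'
        return 'C' if opp_good or not own_good else 'D'

    def update_standing(good, move, prescribed):
        # Standing is lost by defecting when T1 prescribes co-operation,
        # and regained after one subsequent round of co-operation.
        if move == 'D' and prescribed == 'C':
            return False
        if move == 'C':
            return True
        return good

    def simulate_t1(rounds=8, mistake_round=2):
        # Two T1 players; A defects by mistake in `mistake_round`.
        good = {'A': True, 'B': True}  # both start in good standing
        history = []
        for t in range(rounds):
            prescribed = {'A': t1_move(good['A'], good['B']),
                          'B': t1_move(good['B'], good['A'])}
            moves = dict(prescribed)
            if t == mistake_round:
                moves['A'] = 'D'  # the unintended defection
            for p in ('A', 'B'):
                good[p] = update_standing(good[p], moves[p], prescribed[p])
            history.append((moves['A'], moves['B']))
        return history

    # Expected pattern: (C,C), (C,C), (D,C), (C,D), then (C,C) for ever after.
    print(simulate_t1())

Requiring r consecutive rounds of co-operation before standing is restored would turn the same sketch into the strategy Tr discussed below.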

T1 is clearly a strategy of reciprocity since a person's co-operation is conditional on the others' co-operation. But it is also a strategy of punishment. As a matter of fact, when T1 is followed by two players, the reversal of the combination of moves from (D,C)—the first player defects, by mistake, and the second one co-operates—to (C,D) (the first player co-operates while the second one defects) may be interpreted as a punishment for the faulty player. This player suffers the worst possible outcome in the 'punishing' round i + 1 and, in so far as he has chosen to follow T1, it may be said that he accepts punishment (with a view to preventing the adverse long-term consequences of his mistake from arising). Of course, there is the other side of the story. In the 'punishing' round, the other player receives the best possible outcome (even better than the outcome of a round of mutual co-operation). So not only punishment but also reparation takes place: strategy T1 prescribes that, to regain his entitlement to the co-operation of others, the faulty player must perform an act of reparation during one round following his defection (Sugden, 1986: 114-15).

Now, the question may be raised as to why reparation ought to be confined to one round. After all, as pointed out by Sugden, 'this reparation is insufficient to compensate the injured party fully for the losses he has suffered from the other player's breach of the convention' (1986: 115). Indeed, using the utility indices of Figure 4.2, the breach imposes a cost of b on the injured party whereas the act of reparation in the subsequent round allows him to save c. Since b is greater than c and since, in addition, the cost saving c must be discounted to allow for the possibility that the round following the breach will never be played, the injured party receives only partial compensation for the mistake of his opponent.

The answer provided by Sugden is that the extent of reparation is a matter of convention: 'the injured party demands just as much reparation as he expects his opponent to concede, and his opponent offers just as much as he expects the first player to insist on' (Sugden, 1986). A strategy T2 can thus be imagined that prescribes two rounds of reparation for each unjustified defection or, alternatively, a strategy T3 prescribing three rounds of reparation (or, in general, any strategy Tr prescribing r rounds of reparation). The more r increases, the more vengeful are the people who have been 'suckers' during one round and the less forgiving the reciprocity strategy. Quite evidently, there is a limit to how forgiving such a strategy may be if it is to be an equilibrium: 'reparations must be sufficiently burdensome to deter deliberate defections' (ibid.). Yet, on the other hand, there is also a danger in being too vengeful: after having made a mistake, a player is under no compulsion to accept punishment since he may possibly resign himself to the loss of his good standing and continue to defect. It is clear that the less forgiving the opponents' strategy, the more attractive this second option becomes. Moreover, it will also be all the more attractive as the value of π (the probability that the game continues for another round) is lower: 'the sooner the game is likely to end, the less there is to gain from being in good standing' (ibid.).

In actual fact, tit-for-tat strategies—whether of the T or of the Tr types—are members of a much larger class which Sugden has called strategies of brave reciprocity. These strategies have two defining characteristics. In the words of Sugden:

First, against an opponent who defects in every round, these strategies defect in every round except the first (provided no mistakes are made). Second, if two players following strategies of brave reciprocity meet, they both co-operate in every round (again, provided no mistakes are made). Notice that the two players need not be following the same strategy. (Sugden, 1986: 116)

As emphasized by Sugden, strategies can satisfy the second condition above only if they prescribe co-operation in the initial round. This is precisely why such strategies are brave: players are ready to co-operate in advance of any evidence that the other will reciprocate, thereby exposing themselves to the risk of being exploited by free-riders. It must be noted, however, that 'brave reciprocators' cannot be thus exploited during more than one round. The danger of exploitation appears as the necessary price to be paid for ensuring the possibility of repeated co-operation amongst players. Indeed, if both players are cautious (they will not co-operate until the other has already done so), they will never co-operate at all. Therefore, if a strategy is to co-operate with itself—that is, if it is such that when both players follow it they will jointly co-operate—it must be willing to co-operate at a positive risk for the players.

FIG. 4.4. The extended PD game where players are unconditional defectors or 'brave reciprocators'

(Expected payoffs over the whole game; the row player's payoff is listed first:)

             N          R
    N    0, 0       b, -c
    R    -c, b      (b - c)/(1 - π), (b - c)/(1 - π)

Evolution of co-operation and limited rationality

Pairwise interactions in an evolutionary framework
In keeping with the assumption of limited rationality, Sugden considers an extended PD game in which people have available to them only two kinds of strategies—strategies of brave reciprocity (henceforth denoted by R), and the 'nasty' strategy of unconditional defection (N). The issue which he then addresses is whether, in repeated pairwise encounters, we can expect spontaneous evolution to favour reciprocity. Three possibilities of encounter arise. First, two N-players meet. This can only lead to repeated joint defections from the very first round of the game: both players will therefore derive a utility of zero from the game. Second, an N-player meets an R-player (a 'brave reciprocator'). Clearly, the result is that in the first round the R-player co-operates and the N-player defects, while in all the subsequent rounds both players defect. There is thus an asymmetry in the payoffs received by the players over the whole game: the N-player gets b and the R-player gets -c. Third, two R-players meet. They co-operate in every round, obtaining (b - c) in each round: over the whole game, the expected value of their utility stream is equal to (b - c)/(1 - π).

These results are displayed in Figure 4.4.

We know (see above) that N is the best reply to N and that, provided π > c/b, R is the best reply to R. Consequently, the choice of strategy by each player will hinge upon the probability that his opponent will choose one strategy rather than the other. Let p, then, be the probability that a random opponent will follow strategy R. There is then some critical value of p, say p*, such that when p is greater than, or equal to, p*, the strategy R will prove more rewarding than, or just as rewarding as, the strategy N. This critical value is easily derived from the condition that the expected payoff from playing R equals that from playing N:

p*(b - c)/(1 - π) - (1 - p*)c = p*b

so that

p* = c(1 - π)/[π(b - c)]
It is immediately apparent that dp*/dπ < 0: if the probability of meeting an opponent again is high, the critical value of p can be quite low (for example, if b = 2, c = 1, and π = 0.98, so that the average game has fifty rounds, p* equals 0.02 only). In other words, if one player assesses even a small probability that the other will be a 'brave reciprocator', he will be induced to choose that same strategy instead of following the 'nasty' strategy, provided that the game comprises a sufficient number of rounds. Sugden's interpretation of this result is that playing R is a kind of risky investment: by incurring the risk of being exploited by an N-player in the first round, a player is able to co-operate with an R-player in every round. Therefore, 'the longer the game is likely to last, the more time there is to recoup on a successful investment' (Sugden, 1986: 117).
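The critical value is easy to check numerically. A short sketch (the function name is ours) evaluates p* = c(1 - π)/[π(b - c)] for Sugden's illustrative numbers:

    def critical_share(b, c, pi):
        # Critical probability p* of meeting a brave reciprocator above
        # which playing R pays at least as well as playing N; pi is the
        # probability that the game continues for another round.
        return c * (1 - pi) / (pi * (b - c))

    # b = 2, c = 1, pi = 0.98, i.e. an average game of 1/(1 - 0.98) = 50 rounds.
    print(critical_share(2, 1, 0.98))  # about 0.02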

Adopting an evolutionary perspective in which patterns of behaviour that have proved relatively successful in the past are more likely to survive and reproduce, Sugden reaches the conclusion that evolution (in repeated pairwise encounters) tends to favour co-operation: in games which on average have many rounds, 'a convention of brave reciprocity has a good chance of evolving' (Sugden, 1986: 116-20). In particular, when the probability of meeting an opponent again is fairly high, even if initially the great majority of players defect, they may do less well than the small minority who are following strategies of brave reciprocity, and there may then be a self-reinforcing tendency for the minority group to grow (ibid.).

Sugden obtains even stronger support for his thesis that evolution favours co-operation by allowing the players to choose between three strategies: brave reciprocity, the 'nasty' strategy, and the strategy of cautious reciprocity. The result arrived at can be formulated as follows: even if initially almost everyone defects, cautious reciprocators will gradually emerge and their presence will help to bring about conditions causing brave reciprocators eventually to invade and take over (Sugden, 1986: 118-20). It is noteworthy that, to obtain this result, Sugden had to assume that cautious strategies are tailored to the reparation rules which prevail among the 'brave reciprocators'. Thus, if the latter follow T1 (there is only one round of reparation for each unjustified defection), a cautious player's best plan is to co-operate for the next two rounds which follow his initial defection, and then play tit for tat. Mutual co-operation can thus emerge from the third round onwards, because the cautious player's defection in the first round is treated as though it had been a mistake. If, on the other hand, 'brave reciprocators' follow the strategy T2 (there are, by convention, two rounds of reparation), a cautious player must co-operate during rounds 2, 3, and 4 to make his strategy successful (mutual co-operation would then start in round 4). More generally, if R-players follow Tj, a cautious player must co-operate during the j + 1 successive rounds after his initial defection, and thereafter play the simple tit-for-tat strategy.

However, as pointed out by Binmore, these results obtain only in so far as attention is restricted to a few strategies (see Binmore, 1992: 434). Strictly speaking, the strategy of brave reciprocity is not evolutionarily stable, that is, it is not 'a pattern of behaviour such that if it is generally followed by the population, any small number of people who deviate from it will do less well than the others' (Bardhan, 1993b: 635): in other words, it is not true to say that no 'mutant' strategy can 'invade' the population. For instance, if the mutant strategy is 'always co-operate' (or any other strategy of conditional co-operation, for example, a strategy with several rounds of reparation), this strategy played against brave reciprocators does as well as the latter's strategy. Note, moreover, that it is important to distinguish between the evolutionary stability and the viability of a strategy. While the former concept refers to the capacity of a strategy to defend itself against invasion, the latter means that it can invade a large population of non-co-operators. So far, we have contended that the strategy of brave reciprocity is not, strictly speaking, evolutionarily stable, yet it is a best reply to itself (it is 'collectively stable', to use Axelrod's terminology). Now, it has to be added that, for such a strategy to be viable, the assumption that agents are randomly paired for contests has to be given up: for instance, we have to allow for the fact that agents are more likely to be paired with others adopting the same strategy (Axelrod and Hamilton, 1981; Bardhan, 1993b: 635). This result can be interpreted as implying that the chances of success of conditional co-operation are all the greater as more encounters take place between homogeneous agents.

It also bears emphasis that the kind of analysis proposed by Axelrod and Sugden is somewhat misleading in so far as it suggests that the possibility of co-operation depends on the presence in the population of at least a small number of players who start by co-operating. That this is not true can easily be seen by considering the so-called tat-for-tit strategy (Binmore, 1992: 434), which does not satisfy this requirement and yet leads to co-operation in the long run if adopted by the two players. This strategy is: (i) start by defecting; (ii) co-operate if both players have co-operated, or if they have both defected, in the previous round; and (iii) otherwise defect.

This strategy leads to a Nash equilibrium in which both players defect in the first round and then co-operate forever. If component (i) of the above strategy is replaced by (i') 'defect for n periods and then follow rules (ii)-(iii)', the resulting strategy, when chosen by both players, supports a Nash equilibrium in which co-operation follows n rounds of defection.
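An illustrative transcription of tat-for-tit (the names are ours, not Binmore's) makes the claim easy to verify: two players following it produce (D,D) in the first round and (C,C) in every round thereafter.

    def tat_for_tit(own_history, opp_history):
        # (i) start by defecting; (ii) co-operate if both players made
        # the same move in the previous round; (iii) otherwise defect.
        if not own_history:
            return 'D'
        return 'C' if own_history[-1] == opp_history[-1] else 'D'

    hist_a, hist_b = [], []
    for _ in range(5):
        a = tat_for_tit(hist_a, hist_b)
        b = tat_for_tit(hist_b, hist_a)
        hist_a.append(a)
        hist_b.append(b)
    print(list(zip(hist_a, hist_b)))  # [('D','D'), ('C','C'), ('C','C'), ...]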

Multiple interactions in a learning model of strategy revision

While in most evolutionary models (including the above model of Sugden) pairs of players are selected randomly from a 'large' population to play the given game once, and are thereafter returned to the population, in learning models each player is assumed to interact with many varying opponents using a fixed strategy, and to modify that strategy thereafter on the basis of his cumulative experience (Bendor et al., 1994: 2). What deserves to be noted is that in both classes of models, 'the "fitness" of a given strategy at any stage of the game depends on its average payoff, achieved against the rest of the population'. This implies that 'a strategy revision by any single player evokes no response from the other players, as it does not appreciably affect their average payoffs' (ibid.).

Recently, however, economists have paid increasing attention to learning models where the strategy revisions of a given player generate substantial feedback effects by affecting the other players' payoffs, thereby inducing the latter also to revise their strategies subsequently. Models of strategy learning embodying such feedback effects differ according to the degree of rationality assumed for the players. On the one hand, we find the 'myopic best-response' models (such as 'fictitious play') that Selten (1991) considers under the heading of 'belief learning': this class of models presumes that each player observes the past moves of other players, forms beliefs about their strategies in the next iteration of the game, and then calculates a best response. On the other hand, we have the models belonging to the so-called 'stimulus learning' approach. They are much less demanding in terms of rationality assumptions since they presume a rather high degree of bounded rationality characterized by severe limits on information-gathering or cognitive abilities. They actually apply to players ignorant of payoff functions and of opponents' past choices, and, moreover, they do not require them to solve maximization problems (Bendor et al., 1994: 4-5; see also Fudenberg and Kreps, 1992, and Kalai and Lehrer, 1993, for recent surveys of game-theoretical learning models).

One particular model based on the latter approach is grounded in the 'satisficing' principle and has been proposed by Bendor et al. (1994). In this model, the strategy revisions made by the players after every interaction are based on a comparison between a given aspiration level and the payoff actually obtained in the current period. More precisely, the state of any player at any stage t is represented by a probability vector over his set of pure actions, and these probabilities can be interpreted as reflecting his relative inclination to select different actions. Such a state is updated in the following manner: if the payoff realized from the action chosen at t exceeds an aspiration level, the weight on that action is increased at the following stage, with compensating adjustments in the weights on the other available actions; and vice versa if the payoff falls short of the aspiration level. (Note that a player's state will remain unchanged if the achieved payoff exactly equals the aspiration level.)
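The update rule can be sketched as follows for the two-action case of the PD. This is only an illustrative rendering of the 'satisficing' dynamic, not Bendor et al.'s own specification: the step size, the bounds that keep a little experimentation alive, and the payoff numbers (chosen to match the chapter's b and c, with b = 2 and c = 1) are all assumptions:

    import random

    B, C = 2.0, 1.0  # assumed values of the chapter's b and c
    PAYOFF = {('C', 'C'): B - C, ('C', 'D'): -C, ('D', 'C'): B, ('D', 'D'): 0.0}

    def update(probs, action, payoff, aspiration, step=0.1):
        # Raise the weight on the chosen action if its payoff beats the
        # aspiration level, lower it if it falls short, leave it unchanged
        # if the two coincide; the other action's weight compensates.
        p = probs[action]
        if payoff > aspiration:
            p += step * (1 - p)
        elif payoff < aspiration:
            p -= step * p
        p = min(max(p, 0.01), 0.99)  # keep a little experimentation alive
        other = 'D' if action == 'C' else 'C'
        return {action: p, other: 1 - p}

    def simulate(rounds=2000, aspiration=B - C - 0.1, seed=0):
        # Two satisficing players with aspirations just below the mutual
        # co-operation payoff; returns the overall frequency of co-operation.
        rng = random.Random(seed)
        probs = [{'C': 0.5, 'D': 0.5}, {'C': 0.5, 'D': 0.5}]
        coop = 0
        for _ in range(rounds):
            moves = ['C' if rng.random() < pr['C'] else 'D' for pr in probs]
            coop += moves.count('C')
            for i in range(2):
                probs[i] = update(probs[i], moves[i],
                                  PAYOFF[(moves[i], moves[1 - i])], aspiration)
        return coop / (2 * rounds)

    print(simulate())  # should be well above one half

With aspirations near the co-operative payoff, unilateral defection is self-correcting in the manner described in the paragraphs that follow: the exploited player turns to defection, mutual defection disappoints both, and both drift back to co-operation.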

In this framework, different players 'adapt without explicitly co-ordinating with one another and without devoting any cognitive effort to predicting the choices of others and choosing appropriate responses to such predictions' (Bendor et al., 1994: 6). As Bendor et al. have shown, stable long-run outcomes (that is, equilibria with consistent aspirations) in such repeated games with feedback effects need not be Nash equilibria of the one-shot game. This is essentially because feedback effects operate in a manner that resembles 'punishments' imposed on unilateral deviations in repeated games.

For example, mutual co-operation in a PD game of the above kind is always an equilibrium outcome with consistent aspirations, despite the fact that mutual defection is a dominant-strategy equilibrium in the one-shot game. The interesting question is of course what prevents the players from 'learning' the payoff advantage of defection. Following the aforementioned study, let us start with aspirations near the (mutual) co-operative payoff (Bendor et al., 1994: 6). Suppose that, in a two-player game, player 1 experiments with defection at stage t. Since player 2 continues to co-operate, player 1 obtains a payoff higher than his aspiration, thereby making him even more inclined to deviate at t + 1. As for player 2, he ended up with a payoff below his aspiration at t, and this also makes him more inclined to deviate at t + 1, though for entirely different reasons. Then both players receive below-aspiration payoffs at t + 1, thus tending to induce both of them to return to co-operation at t + 2. In the words of Bendor et al.: 'once the players arrive at a state where both defect with substantial probability, both will indeed defect simultaneously, beginning the process of a simultaneous return to co-operation. Hence, the mutual co-operation outcome is stable with respect to periodic random switches to defection by either player' (ibid. 6). Clearly, assuming that players behave myopically, i.e. try to increase current payoffs, the degree of rationality that is conducive to co-operation is low in some well-defined sense (ibid. 26).

It is worth noting that, in the random-matching framework of most evolutionary models, such an outcome could not survive in the long run since a deviation by a single player will not have an appreciable impact on the 'fitness' of any other strategy in the population, and will therefore not evoke any feedback effect. To put it another way, the initial benefits of the deviation are not 'undone' by the reactions of other players, and the player can sustain the benefits of the unilateral deviation, as a result of which the 'fitness' of the deviating strategy will be enhanced, at the eventual expense of other strategies in the population (Bendor et al., 1994).

As is evident from the foregoing discussion, the main result achieved by the above kind of learning model, namely that the intrinsic structure of interactive play may prevent each player from choosing a best response to the other's strategy (non-Nash play can arise), pertains only to environments where current experience is paramount in determining current strategy and where there are only a small number of players. For one thing, in 'learning' games (even with a few players) where the past is given the same weight as the present but where play is otherwise myopic, the tendency to co-operate is weakened: 'A current deviation does not evoke a large reaction from the opponent, who considers his rival's entire history of play' (Bendor et al., 1994: 26). For another thing, in games with many players, co-operation is more problematic for two reasons. First, 'in the event of a defection the co-ordination required to restore co-operation is of an order of magnitude that is exponential in the number of players' and, second, any single player's deviation has a smaller effect, thereby dampening the reactions of the other players. In particular, with a continuum of players co-operation is impossible: limit play must be Nash (ibid.).

All this being said, stress must be laid on the fact that 'there is scope for considerable multiplicity of equilibria' in the above 'satisficing' model of strategy learning, a feature that follows directly from the possibility of varying the initial aspiration level. Thus, if the agents' aspirations are near the (mutual) non-co-operative payoff instead of being near the (mutual) co-operative payoff as previously assumed, it is easy to see that, after a non-co-operative random move by one of them at stage t, they will both receive their aspiration payoff at t + 1. As a result, they will have no incentive to return to co-operation at later stages. There is evidently a 'self-fulfilling' property in the 'satisficing' model: if members of a given society have low aspirations to start with—say, because of a disappointing experience in a previous game—they will behave in such a way as to repeat this negative experience in the present.

Given the obvious importance of group size for the prospects of co-operation, as again illustrated above in the case of a 'stimulus-learning' model, it is useful to summarize the main sources of the advantage of small groups. This is done below by way of a conclusion to this chapter.

Co-operation and group size: some reflections

  1. As shown above, an important result of repeated PD game theory is the following: if individuals know one another well, can observe one another's behaviours, and are in continuous interaction with one another, then any pattern of collective behaviour which makes each individual better off than under universal defection, including co-operation, can be sustained. This is because each player's plan of action can be made dependent on the others' past actions, given that they are all easily observable either directly or indirectly. Indirect observability is possible in as much as, even though individual actions are not directly observable, their impact on the collective performance can be unambiguously assessed so that the number of defections is easily inferred. Notice carefully that it is precisely because a co-operative move is observable that it can be interpreted as a sign of goodwill and purposefully used as such.

These particular conditions of perfect information and repeated interactions obviously correspond to small-group settings in the real world. When they are satisfied, individuals have a strong incentive to consider the more indirect and long-term consequences of their choices instead of paying exclusive attention to immediate costs and benefits. In other words, given the long time-horizon of the game and the easy observability of each other's actions, they are incited to care about their reputation. Over two centuries ago, Hume (and Adam Smith as well, who spoke about the 'discipline of continuous dealings') had already well understood that, in small-scale social settings, considerations of what is sometimes called reciprocal altruism but really amounts to selfishness with foresight should lead people to co-operate. The same considerations help explain why they are quite reliable about keeping promises when no legal sanction requires them to do so and when keeping them is inconvenient or costly:

We can better satisfy our appetites in an oblique and artificial manner, than by their headlong and impetuous motion. Hence I learn to do a service to another, without bearing him any real kindness; because I foresee that he will return my service, in expectation of another of the same kind, and in order to maintain the same correspondence of good offices with me or with others. And accordingly, after I have serv'd him, and he is in possession of the advantage arising from my action, he is induc'd to perform his part, as foreseeing the consequences of his refusal.... After these signs [i.e. promises] are instituted, whoever uses them is immediately bound by his interest to execute his engagements, and must never expect to be trusted any more, if he refuse to perform what he promis'd. (Hume, 1740: Bk. III, pt. II, sect. V, 521-2)

  2. Small groups or communities are generally characterized not only by the continuous, but also by the multiplex pattern of their members' interrelationships, a feature which follows from the socially 'embedded' nature of many micro-societies. This means that the sectors of social life in which individuals interact are numerous and can never be neatly separated in the minds of the people. If interests are so tightly intertwined, it becomes difficult to conceive of their interactions with respect to resource management as an isolated 'game'. The common property resource problem becomes part of a repeated multiple prisoners' dilemma corresponding to many social and economic activities at the village level (or even, as we shall see below, to an assurance game). Speaking of social life in a rural county in present-day California, Ellickson (1991) has recently captured this essential characteristic of village communities in a vivid manner. We cannot resist the temptation to quote him at some length:

Shasta County norms entitle a farmer in that situation to keep track of those minor losses in a mental account, and eventually to act to remedy the imbalance. A fundamental feature of rural society makes this enforcement system feasible: rural residents deal with one another on a large number of fronts, and most residents expect those interactions to continue far into the future.... They interact on water supply, controlled burns, fence repairs, social events, staffing the volunteer fire department, and so on.... Thus any trespass dispute with a neighbour is almost certain to be but one thread in the rich fabric of a continuing relationship. A person in a multiplex relationship can keep a rough mental account of the outstanding credits and debits in each aspect of that relationship. Should the aggregate account fall out of balance, tension may mount because the net creditor may begin to perceive the net debtor as an over-reacher. But as long as the aggregate account is in balance, neither party need be concerned that particular subaccounts are not. For example, if a rancher were to owe a farmer in the trespass subaccount, the farmer could be expected to remain content if that imbalance were to be offset by a debt he owed the rancher in, say, the water supply subaccount. (Ellickson, 1991: 55-6)

  3. When a group is small, it is less vulnerable to the problem of incentive dilution. Indeed, free riding is a strategy whereby an individual trades a reduction in his own effort, from which he alone benefits, for reductions in the income of the whole group, which are shared among all members. Therefore, as the size of the group increases, the terms of this exchange become more and more favourable to the free rider (since shares are diluted), and vice versa when the size of the group decreases. This is why 'large groups are less able to act in their common interest than small ones' (Olson, 1982: 31).

In Chapter 2, we have shown in the framework of a one-shot fishing game that the Nash equilibrium outcome tends to move away from the collectively rational outcome as the number of fishermen increases. This relationship between group size and the chance of co-operation was actually discovered by David Hume as early as the first half of the eighteenth century:

Two neighbours may agree to drain a meadow, which they possess in common; because 'tis easy for them to know each others mind; and each must perceive, that the immediate consequence of his failing in his part, is the abandoning the whole project. But 'tis very difficult, and indeed impossible, that a thousand persons shou'd agree in any such action; it being difficult for them to concert so complicated a design, and still more difficult for them to execute it; while each seeks a pretext to free himself of the trouble and expence, and wou'd lay the whole burden on others. (Hume, 1740: Bk. III, pt. II, sect. VII: 538)

As is evident from the above excerpt, the advantage of small groups is not only that they prevent incentives from being excessively diluted, but also that they allow for agreements to be reached among the people concerned at low negotiation costs, which include the costs of communication and bargaining as well as, possibly, those of creating and maintaining a formal organization (Olson, 1965). However, it bears emphasis that this consideration takes us away from the non-co-operative game-theoretical framework, a point to which we shall return in Chapter 8.

  4. A fundamental lesson of game theory is that, in a repeated PD game, there may exist a 'profusion' of equilibria. This means that, when interactions are repeated, many patterns of behaviour may get established and be stable. Unfortunately, perhaps, game theory has little to say about which equilibria will arise and in what manner. It is precisely with respect to this selection problem, which will be addressed at a later stage, that small groups may be at an advantage. As a matter of fact, by allowing individuals to reveal and signal their intended plans of action and to learn about others' intentions, pre-play communication in small-group settings may enable them to choose 'good' equilibria. Similarly, but in a less explicit way, shared experiences or beliefs and inherited patterns of behaviour may also play a role in such settings. For example, if there is a tradition of trust or a group-centred culture, members of a given community are more likely to give co-operative strategies a chance (more about this in Chapter 7).

Yet, at the same time, it must be stressed that small groups, precisely because the relationships among their members are highly personalized, are vulnerable to strong manifestations of envy and rivalry which may sometimes make co-operation very difficult. In other words, it is not because people can easily communicate within small groups that co-operative equilibria will necessarily emerge. An atmosphere of distrust can lead to 'bad equilibria' even within such groups. We shall return in Chapter 12 to this important but systematically neglected aspect of the issue under consideration.

  5. The feeling of sameness or togetherness which may permeate the culture of small groups may also promote co-operation in an unexpected, irrational way. Indeed, as pointed out by Elster on the basis of empirical evidence, people may be easily led into 'magical thinking', that is, they may believe (or act as though they believed) that their co-operation can cause others to co-operate even though such a causal relationship is evidently absent (Elster, 1989a: 195-200). In other words, people are sometimes prone to believe that everything turns upon their own behaviour so that whether joint co-operation or joint defection (the only two outcomes that are then considered to be possible) will occur is deemed to depend entirely on what decision they themselves choose to make. Consider a two-person PD where the agents make their decisions independently of each other. According to Elster, 'if they are sufficiently alike, each of them may reason in the following manner. "If I co-operate, there is a good chance that the other will co-operate too. Being like me, he will act like me. Let me, therefore, co-operate to bring it about that he does too"' (ibid. 197). Now, the smaller the group, the more other people are like oneself and the more plausibly (in a psychological rather than logical sense) one can infer that they will behave like oneself (ibid. 208). In the context of a small group, therefore, people's proclivity to think in magical terms—that is, to act upon the belief that by acting on the symptoms one can also change the cause—helps explain why they may start by co-operating, thereby providing an irrational basis for strategies of 'brave reciprocity'.