 # Introduction

How do social norms emerge, and how do people choose to conform or rebel against them? In this tutorial, we will predict the emergence of social norms in simulated societies. I will also introduce an algorithmic solution with an implementation in Python.
Consider a society where every player is facing a binary choice. Whatever the underlying problem is - wearing a facemask outdoors, using a fork in medieval Italy, rebelling against an oppressive government - it is reasonable to assume that the player gets a reinforcement benefit from conforming with the behavior of her neighbors. More masks encourage compliance, more rebels make the rebellion likelier to succeed, etc.
Each period, every player has a chance to revise the decision. This captures the idea that players adapt their behavior to their neighbors over time. They may also behave imperfectly, or, in an alternative interpretation, learn the optimal behavior by trying different things.
This tutorial largely follows the results and the exposition in  and  as well as the author's work on reinforcement learning. Illustrations and simulations are done with  .

## The Stag Hunt game

To make this intuition formal, suppose the players live on a ${N}_{1}×{N}_{2}$${N}_{1}×{N}_{2}$N_(1)xxN_(2)N_1 \times N_2 grid that wraps horizontally and vertically (technically turning this world into a torus). To make it non-trivial, let us assume that there are at least 6 players, i.e. it is at least a $2×3$$2×3$2xx32 \times 3 grid.
Every period each player (call this player $i$$i$ii) is randomly matched with one of their four neighbors in a cross pattern (above,below,left,right) and both have to pick $A$$A$A\textcolor{#FF934F}{A} or $B$$B$B\textcolor{#235FA4}{B}.
The payoffs from their choices are determined according to the following table (the first number is the payoff of the row player, the second is the payoff of the column player).
$A$$A$A\textcolor{#FF934F}{A} $B$$B$B\textcolor{#235FA4}{B}
$A$$A$A\textcolor{#FF934F}{A} $10,10$$10,10$10,1010,10 $0,7$$0,7$0,70,7
$B$$B$B\textcolor{#235FA4}{B} $7,0$$7,0$7,07,0 $7,7$$7,7$7,77,7
The players maximize the expected payoff - the average payoff from playing the chosen action against every one of the four neighbors. This particular payoff table is called a Stag Hunt in game theory. It captures a similar intuition as the famous Prisoner dilemma: a conflict between safety and cooperation.
In particular, notice that $A$$A$A\textcolor{#FF934F}{A} is payoff-dominant (it Pareto-dominates the other action, i.e. both players are strictly better off), while $B$$B$B\textcolor{#235FA4}{B} is risk-dominant. Risk-dominance, in this case, is simply the fact that playing $B$$B$B\textcolor{#235FA4}{B} always gives a safe payoff of $7$$7$77 no matter what the other players do, while playing $A$$A$A\textcolor{#FF934F}{A} can yield 10 but involves a risk of a zero payoff. This choice creates a dilemma for the player. Mapping back to the motivating scenarios, a successful rebellion $A$$A$A\textcolor{#FF934F}{A} yields high gains if it spreads through the society, but not rebelling $B$$B$B\textcolor{#235FA4}{B} is a safer bet.
A common approach to such "agent-based" problems with many agents is to run simulations. You can see the result of one such simulation below with colors corresponding to the two actions.
It is clear that at least in this one case $B$$B$B\textcolor{#235FA4}{B} emerges as a social norm. The purpose of this tutorial is to show that this problem can be solved theoretically without resorting to costly computational simulations. It is always the case that players eventually pick $B$$B$B\textcolor{#235FA4}{B} in the game above.
The rest of this tutorial is organized into sections, starting with a simpler model without noise. This model is then used to show that the process always converges. Then we introduce noise and solve the problem using the tree surgery approach. The approach is general, and the present text advocates these tools for agent-based models. However, for practical problems, an algorithmic approach is also of interest. The reader may be inclined to jump to the last section with the Python code.

# Simulation

## Stable behavior

Much of the social behavior can be captured by the concept of a Markov process. A Markov process means that all the information about what the players are going to do in the next period is contained entirely in what the society is doing now, the state. The state is the current behavior of every agent. The players use the current behavior of the neighbors to revise their own, not considering what happened two, three, or more periods into the past.
First, we consider the problem without noise. We look for stable states, the states that, once entered, will persist indefinitely. The norm will emerge, the players will no longer be switching and will converge to one action. We also need to check for cycles of actions that could persist indefinitely.
Let us call a rectangle of agents that coordinate on the same action (and are surrounded by players with a different action) an enclave.
Consider first $B$$B$B\textcolor{#235FA4}{B}-enclaves surrounded by $A$$A$A\textcolor{#FF934F}{A}-players. Then every $A$$A$A\textcolor{#FF934F}{A}-player has at most one $B$$B$B\textcolor{#235FA4}{B}-player neighbor, while every $B$$B$B\textcolor{#235FA4}{B}-player has at most two $A$$A$A\textcolor{#FF934F}{A}-player neighbors, i.e. no more than 50%. Then $A$$A$A\textcolor{#FF934F}{A}-player obtains at least $10×\frac{3}{4}$$10×\frac{3}{4}$10 xx(3)/(4)10 \times \frac{3}{4} and does not wish to switch since $10×\frac{3}{4}>7$$10×\frac{3}{4}>7$10 xx(3)/(4) > 710 \times \frac{3}{4} > 7. Similarly, $B$$B$B\textcolor{#235FA4}{B}-player obtains $7$$7$77 (again, action $B$$B$B\textcolor{#235FA4}{B} is "safe" and always yields the same payoff), and does not wish to switch either, because $7>10×\frac{2}{4}$$7>10×\frac{2}{4}$7 > 10 xx(2)/(4)7 > 10 \times \frac{2}{4}, the latter is the payoff of playing $A$$A$A\textcolor{#FF934F}{A} with two $A$$A$A\textcolor{#FF934F}{A}-player neighbors.
On the other hand, any such enclave of $A$$A$A\textcolor{#FF934F}{A}-players would break down. Indeed, take a player at the corner of such enclave. This player must be adjacent to two $B$$B$B\textcolor{#235FA4}{B}-players and two $A$$A$A\textcolor{#FF934F}{A}-players, therefore wishing to switch to $B$$B$B\textcolor{#235FA4}{B} for a payoff of $7>10×\frac{2}{4}$$7>10×\frac{2}{4}$7 > 10 xx(2)/(4)7 > 10 \times \frac{2}{4}. To sustain enclaves of $A$$A$A\textcolor{#FF934F}{A}-players one needs every player to have at least three like-minded neighbors who play $A$$A$A\textcolor{#FF934F}{A} as well. This yields a payoff of $10×\frac{3}{4}>7$$10×\frac{3}{4}>7$10 xx(3)/(4) > 710 \times \frac{3}{4} > 7. This configuration is only possible in a "strip" - a ring going vertically or horizontally around the whole world (remember that the world wraps on both sides). Both types of enclaves are illustrated below.
This state is stable and we can see then that many different configurations are stable.
Below is an embedded javascript browser implementation of the model. It also allows one to change the payoff of $B$$B$B\textcolor{#235FA4}{B} to show how the behavior depends on the risk-dominant action.
Click to show/hide the interactive model...
Click "setup randomly", then "run simulation". Control the speed with the slider. Use the buttons to manually change actions. Try setting the payoff to below/above/exactly 5 to see the change in behavior.
This evidence of multiple stable states complicates the problem, and two questions remain. First, could the process not converge and oscillate with players changing their behavior indefinitely in a cycle. Second, which of these stable states are more likely?
Let us start by answering the first question negatively.

# Theoretical approach

## Potential functions (no cycles)

To show that the process always converges, we need to show that we can (weakly) order the states of the process in some way. If from every state we eventually get to another state and can never go back, then the revision process has to eventually stop.
Economists and game theorists are used to thinking of orderings in terms of functions, similarly to how preferences are captured by utility functions. If we can find some value that decreases as the process evolves, we can order the states according to this value. One can think of this as an "energy" that leaves the system, and the evolution of the system stops once this energy reaches zero. The formal term is a potential function.
Let $A\left(x\right)$$A\left(x\right)$A(x)A(x) be the number of pairs of neighbors that both play $A$$A$A\textcolor{#FF934F}{A}. Let $B\left(x\right)$$B\left(x\right)$B(x)B(x) be the number of players that play $B$$B$B\textcolor{#235FA4}{B}.
For this problem such potential function is:
$\pi =10A\left(x\right)+7B\left(x\right)$$\pi =10A\left(x\right)+7B\left(x\right)$pi=10 A(x)+7B(x)\pi = 10 A(x) + 7 B(x)
Indeed, suppose a player (let us call this player $i$$i$ii again) with $n$$n$nn $A$$A$A\textcolor{#FF934F}{A}-player neighbors switches from A to B. Then the change in $\pi$$\pi$pi\pi is $-10n+7$$-10n+7$-10 n+7- 10n + 7. If she switches from B to A instead, the change is $10n-7$$10n-7$10 n-710n - 7. In both cases, this change is exactly the change in $i$$i$ii's own payoff. This means that whenever a player revises her behavior, the potential (the energy of the system) decreases. Since it is bounded by 0, this process has to end.
The fact that $\pi \left({x}^{\prime }\right)<\pi \left(x\right)$$\pi \left({x}^{\prime }\right)<\pi \left(x\right)$pi(x^(')) < pi(x)\pi(x') < \pi(x) whenever the process moves from $x$$x$xx to ${x}^{\prime }$${x}^{\prime }$x^(')x' tells us that the process always converges.
A usual warning is that $\pi$$\pi$pi\pi does not represent welfare. It is only a way of ordering the states to show that there are no cycles and $\pi$$\pi$pi\pi does not equal the sum of the payoffs. It is worth mentioning also that the tight condition for convergence is weaker. It only requires that a path to equilibrium exists from any cycle. This condition is called "weak acyclicity" and it is extensively studied as a sufficient condition for convergence of a wide range of dynamics  .

## Stochastic stability (introducing noise)

Let us now turn to the original question of which of these many stable states are more likely. Without noise, it is impossible to say, as it depends on the starting conditions. Indeed if we start already in some stable state, then by definition, this state will be repeated in perpetuity.
This situation changes once we add noise. There are many possible ways to add random behavior. The simplest one is to suppose that the optimal action is chosen with probability $1-\eta$$1-\eta$1-eta1-\eta and the other action is chosen with probability $\eta$$\eta$eta\eta. Thus $\eta$$\eta$eta\eta is the noise parameter, the probability of a "mistake" or an "experiment" by the player. Effectively, we assume that players try different things and eventually stop experimenting as $\eta$$\eta$eta\eta approaches $0$$0$00 over time. The behavior in this limit, once the players have decided what they wanted to do and a social norm has emerged, is what we would like to know.

## Costs and Trees

The tools of the trade were introduced in the 1980s-1990s and have been successfully applied to many interesting problems.
First, the 1-step cost of the process moving from state $x$$x$xx to state ${x}^{\mathrm{\prime }}$${x}^{\mathrm{\prime }}$x^(')x^{\prime} is defined as:
$c\left(x,{x}^{\mathrm{\prime }}\right):=\underset{\beta \to 0}{lim}-\beta \mathrm{log}{P}_{\beta }\left(x,{x}^{\mathrm{\prime }}\right),$$c\left(x,{x}^{\mathrm{\prime }}\right):=\underset{\beta \to 0}{lim} -\beta \mathrm{log}{P}_{\beta }\left(x,{x}^{\mathrm{\prime }}\right),$c(x,x^(')):=lim_(beta rarr0)-beta log P_(beta)(x,x^(')),c\left(x, x^{\prime}\right):=\lim_{\beta \rightarrow 0}-\beta \log P_{\beta}\left(x, x^{\prime}\right),
adopting the convention that $-\mathrm{log}0=\mathrm{\infty }$$-\mathrm{log}0=\mathrm{\infty }$-log 0=oo-\log 0=\infty. Here $\beta ={e}^{-\frac{1}{\eta }}$$\beta ={e}^{-\frac{1}{\eta }}$beta=e^(-(1)/(eta))\beta = e^{-\frac{1}{\eta}} is the convenient change of parameters (it is still some uniform constant probability of noise for all states). The cost of one player randomly taking a noisy action is thus simply
$c\left(x,{x}^{\mathrm{\prime }}\right)=\underset{\eta \to 0}{lim}-\eta \mathrm{log}{e}^{-\frac{1}{\eta }}=1.$$c\left(x,{x}^{\mathrm{\prime }}\right)=\underset{\eta \to 0}{lim} -\eta \mathrm{log}{e}^{-\frac{1}{\eta }}=1.$c(x,x^('))=lim_(eta rarr0)-eta log e^(-(1)/(eta))=1.c\left(x, x^{\prime}\right)=\lim_{\eta \rightarrow 0}-\eta \log e^{-\frac{1}{\eta}} = 1.
The cost is the exponential decay rate of the transition probability from $g$$g$gg to ${g}^{\mathrm{\prime }}$${g}^{\mathrm{\prime }}$g^(')g^{\prime}. Roughly, it says how fast this transition probability falls over time relative to others.
The value of $c$$c$cc can be seen as an encoding of three types of transitions:
• $\mathrm{\infty }$$\mathrm{\infty }$oo\infty - impossible transitions;
• $0$$0$00 - transitions that do not require noise and can happen with zero noise. For example, states with rectangular $A$$A$A\textcolor{#FF934F}{A}-enclaves are not stable, and there are zero-cost transitions from these states;
• Some $constant>0$$constant>0$constant > 0constant>0 - transitions that are only possible with random noise. The less probable is a transition, the higher its cost.
We can extend the definition of cost to paths and multi-step transitions. The overall cost of the process moving from $x$$x$xx to ${x}^{\prime }$${x}^{\prime }$x^(')x' is defined as:
$C\left(x,{x}^{\prime }\right)=\underset{x,...,{x}^{\prime }\in S\left(x,{x}^{\prime }\right)}{min}c\left(x,...,{x}^{\prime }\right),$$C\left(x,{x}^{\prime }\right)=\underset{x,...,{x}^{\prime }\in S\left(x,{x}^{\prime }\right)}{min} c\left(x,...,{x}^{\prime }\right),$C(x,x^('))=min_(x,...,x^(')in S(x,x^(')))c(x,...,x^(')),C(x,x') = \min_{x,...,x' \in S(x,x')} c(x,...,x'),
where $S\left(x,{x}^{\prime }\right)$$S\left(x,{x}^{\prime }\right)$S(x,x^('))S(x,x') is the set of all possible paths between the two states.
A spanning tree rooted at $\stackrel{^}{x}$$\stackrel{^}{x}$hat(x)\hat x is a directed graph over some set $X$$X$XX such that every $x\in X$$x\in X$x in Xx \in X other than $\stackrel{^}{x}$$\stackrel{^}{x}$hat(x)\hat x has exactly one outgoing edge, and the graph has no cycles.
Figure 1: Example of a (directed) spanning tree. Every node has exactly one outgoing edge except the root, node 0.
The correct term for a directed spanning tree is an arborescence, but social sciences tend to call them spanning trees.
Now consider the directed graph on the set of states $X$$X$XX with weights of the edges given by $c\left(\cdot \right)$$c\left(\cdot \right)$c(*)c(\cdot). We can now further extend the notion of cost to trees. The cost of a spanning tree is the sum of the costs of its edges.

## Tree Surgery

The reason for looking at costs and spanning trees is the following repeatedly rediscovered result   :
Theorem: A state can occur in the limit with non-vanishing probability if and only if there is a spanning tree rooted in this state whose cost is minimal.
Another important fact is that we do not need to consider all states. Indeed, a good understanding exercise is to count their number:
How many possible states (configurations of the field, not necessarily stable) are there?...
${2}^{{N}_{1}×{N}_{2}}$${2}^{{N}_{1}×{N}_{2}}$2^(N_(1)xxN_(2))2^{N_1\times N_2}
While there are many states, the set of the possible spanning trees is even bigger. Fortunately, we can focus only on the trees defined on the stable states (sometimes called bonsai trees), of which there is a much smaller number. The weights are then given by $C\left(\cdot \right)$$C\left(\cdot \right)$C(*)C(\cdot). We refer the reader to the proof of this fact in  and  , but the intuition should come from the fact that transitions from unstable states to the closest stable states have zero cost and do not affect the overall cost of the tree.
We will illustrate the method by applying the theory above to our problem. Let us denote the state where all players play $A$$A$A\textcolor{#FF934F}{A} by $\stackrel{\to }{A}$$\stackrel{\to }{A}$vec(A)\textcolor{#FF934F}{\vec A} and the state where all players play $B$$B$B\textcolor{#235FA4}{B} by $\stackrel{\to }{B}$$\stackrel{\to }{B}$vec(B)\textcolor{#235FA4}{\vec B}.
The most important observation is the following:
There is a path from any state to $\stackrel{\to }{B}$$\stackrel{\to }{B}$vec(B)\textcolor{#235FA4}{\vec B}, such that from every state, this path follows an edge with the lowest possible cost.
This statement is not a theorem but an applied result for our particular grid-world problem. Such statements are the essence of "tree surgery": making a graph-theoretic argument about the shape of minimal spanning trees to narrow down the possible roots that correspond to the social norms in the limit.
Let us sketch the proof of this fact below. Remember that we need only consider the stable states, so the edge is the minimum cost transition to another stable state, possibly involving multiple steps.
Indeed, take any state where some players are playing $B$$B$B\textcolor{#235FA4}{B} and some players are playing $A$$A$A\textcolor{#FF934F}{A}. Then there is at least one enclave of $B$$B$B\textcolor{#235FA4}{B}. That is, at least two adjacent players are playing $B$$B$B\textcolor{#235FA4}{B} on the border of this enclave. Then making one of their neighbors randomly play $B$$B$B\textcolor{#235FA4}{B} is enough to make the second neighbor follow with $B$$B$B\textcolor{#235FA4}{B} as well. Thus a single mistake (a minimum non-zero cost as desired) is enough to strictly increase the number of $B$$B$B\textcolor{#235FA4}{B}-players. This process eventually leads to $\stackrel{\to }{B}$$\stackrel{\to }{B}$vec(B)\textcolor{#235FA4}{\vec B}.
The only other case is $\stackrel{\to }{A}$$\stackrel{\to }{A}$vec(A)\textcolor{#FF934F}{\vec A}. In this state, a minimum cost transition to a different stable state is for two players adjacent diagonally to randomly play $B$$B$B\textcolor{#235FA4}{B}. One player randomly attempting $B$$B$B\textcolor{#235FA4}{B}, would immediately find herself surrounded and reconsider, going back to $\stackrel{\to }{A}$$\stackrel{\to }{A}$vec(A)\textcolor{#FF934F}{\vec A}. Thus at least two players are required. Two diagonally adjacent players lead to their neighbors switching, forming a stable square enclave. Thus we obtained a transition to a stable state with a minimum cost. All steps of this process are summarized below:
This concludes the proof of the result above. The rest of the argument is short. The state $\stackrel{\to }{B}$$\stackrel{\to }{B}$vec(B)\textcolor{#235FA4}{\vec B} is very difficult to leave as it is associated with the highest risk. In fact, it is the costliest transition in the game. One can easily see that any cross pattern of $B$$B$B\textcolor{#235FA4}{B}-players contagiously spreads this behavior until everyone plays $B$$B$B\textcolor{#235FA4}{B} starting from the $\stackrel{\to }{A}$$\stackrel{\to }{A}$vec(A)\textcolor{#FF934F}{\vec A}-players next to the center of the cross. The illustration below has two such crosses.
Therefore to leave the "basin of attraction" of $\stackrel{\to }{B}$$\stackrel{\to }{B}$vec(B)\textcolor{#235FA4}{\vec B}, that is to reach a stable state without returning back to $\stackrel{\to }{B}$$\stackrel{\to }{B}$vec(B)\textcolor{#235FA4}{\vec B}, one has to destabilize all the crosses. In other words, at the least, all the players at the centers of crosses have to randomly attempt to play $A$$A$A\textcolor{#FF934F}{A}. When everyone is playing $B$$B$B\textcolor{#235FA4}{B} in $\stackrel{\to }{B}$$\stackrel{\to }{B}$vec(B)\textcolor{#235FA4}{\vec B}, this involves at least everyone on the diagonal or at least $min\left({N}_{1},{N}_{2}\right)$$min\left({N}_{1},{N}_{2}\right)$min(N_(1),N_(2))\min(N_1,N_2) players.
Thus shifting from state $\stackrel{\to }{B}$$\stackrel{\to }{B}$vec(B)\textcolor{#235FA4}{\vec B} to any other stable state has the cost of at least $min\left({N}_{1},{N}_{2}\right)$$min\left({N}_{1},{N}_{2}\right)$min(N_(1),N_(2))\min(N_1,N_2). Since we assumed that we have at least 6 players, $min\left({N}_{1},{N}_{2}\right)\ge 3$$min\left({N}_{1},{N}_{2}\right)\ge 3$min(N_(1),N_(2)) >= 3\min(N_1,N_2) \ge 3.
On the other hand, shifting between any other stable states has the cost of at most $2$$2$22. For example, two players taking a noisy action in $\stackrel{\to }{A}$$\stackrel{\to }{A}$vec(A)\textcolor{#FF934F}{\vec A}.
Then in any minimal spanning tree, we can safely assume that there is a path to $\stackrel{\to }{B}$$\stackrel{\to }{B}$vec(B)\textcolor{#235FA4}{\vec B} from any other state, all edges are minimal, and $\stackrel{\to }{B}$$\stackrel{\to }{B}$vec(B)\textcolor{#235FA4}{\vec B} is always the root. If there were some other minimal spanning tree rooted in a different state, then we could construct another tree of a lower cost, which is a contradiction. First, add a path from the root using only minimum cost edges to reach $\stackrel{\to }{B}$$\stackrel{\to }{B}$vec(B)\textcolor{#235FA4}{\vec B} (we have just proven that this is possible). Now remove the previous edges from the states along this path and the edge from $\stackrel{\to }{B}$$\stackrel{\to }{B}$vec(B)\textcolor{#235FA4}{\vec B}. Since the edge leaving $\stackrel{\to }{B}$$\stackrel{\to }{B}$vec(B)\textcolor{#235FA4}{\vec B} has the cost of at least $min\left({N}_{1},{N}_{2}\right)\ge 3$$min\left({N}_{1},{N}_{2}\right)\ge 3$min(N_(1),N_(2)) >= 3\min(N_1,N_2) \ge 3, we obtain a tree where every edge has the cost of at most $2$$2$22. This tree, therefore, has a strictly lower cost, which contradicts the possibility of having a minimum cost spanning tree rooted in any state but $\stackrel{\to }{B}$$\stackrel{\to }{B}$vec(B)\textcolor{#235FA4}{\vec B}.
This argument solves the problem since the process always converges to the root, $\stackrel{\to }{B}$$\stackrel{\to }{B}$vec(B)\textcolor{#235FA4}{\vec B}, by the above theorem.

## Conclusion

We have used tree surgery to show that the inferior social norm will prevail in the long run. Even though $A$$A$A\textcolor{#FF934F}{A} is the better action, it requires teamwork and is unsustainable for these payoffs.
This result confirms what we had suspected from simulations, but the power of the theoretical analysis is that it tells us more. Suppose we change the payoffs to
$A$$A$A\textcolor{#FF934F}{A} $B$$B$B\textcolor{#235FA4}{B}
$A$$A$A\textcolor{#FF934F}{A} $10,10$$10,10$10,1010,10 $0,4$$0,4$0,40,4
$B$$B$B\textcolor{#235FA4}{B} $4,0$$4,0$4,04,0 $4,4$$4,4$4,44,4
Then by following the same theoretical analysis, we would discover that $\stackrel{\to }{A}$$\stackrel{\to }{A}$vec(A)\textcolor{#FF934F}{\vec A} is now the unique outcome. Indeed, the relevant condition is the half-dominance, which generalizes our analysis above. If one of the actions is better than the other when at least half of the neighbors play it, then the population will converge to playing this action. This condition is widely studied in the literature (see the survey in  ) and is robust to many learning rules. This is not something that simulations can consistently show without exploring every one of the countless possibilities.
The study of social dynamics with the mathematical apparatus of stochastic stability is an active field of research going through a renaissance  . The theoretical intuition shown here had also been generalized with radius-coradius theorems  . These theorems give predictions without the need for tree-surgery for many scenarios as well as the (often tight) bounds on the time it takes to converge.

# Algorithmic approach

The theorem above reduces the problems of predicting the emergence of norms to the minimum arborescence problem, the problem of finding a minimum directed spanning tree for a given graph. The root of this graph is what will (almost always) eventually happen in society. In other words, we can show that the pattern in the animation shown at the beginning of this tutorial is guaranteed to eventually occur in the limit if we find a corresponding graph-theoretic tree-surgery argument.
There are well-known fast algorithms that automate this problem. A famous approach is the Chu–Liu/Edmonds' algorithm  . Algorithms for the minimum arborescence problem can also be found in the Python NetworkX package.  However, for large problems such as the one here, the algorithmic approach is intractable even though the algorithms are polynomial time.
We will show a simple implementation in Python. Assume 9 players in a $3×3$$3×3$3xx33 \times 3 grid for simplicity.
First, we import the libraries.
import numpy as np
import matplotlib.pyplot as plt
import networkx as nx

We will need to convert decimal indices of states to their binary representation and back:
def binatodeci(binary):
return sum(val*(2**idx) for idx, val in enumerate(reversed(binary)))

We now construct a graph $G$$G$GG that connects all states through the possible transitions (one random player switching the action) and adds the costs (one or zero) as weights.
# Generate an empty digraph
G = nx.DiGraph()

# Add edges by going over all states
for state in range(2**9):
state_bin = np.array([int(i) for i in list('{0:09b}'.format(state))]).reshape(3,3)

# Nine possible transitions, one for each player
for x in range(3):
for y in range(3):

# The new state if the chosen player switches
new_state_bin = np.copy(state_bin)
new_state_bin[x,y]=1-new_state_bin[x,y]
new_state=binatodeci(new_state_bin.flatten().tolist())

# Number of neighbors with action  A
n = np.sum([state_bin[(x-1) % 3, (y-1) % 3],
state_bin[(x+1) % 3, (y-1) % 3],
state_bin[(x-1) % 3, (y+1) % 3],
state_bin[(x+1) % 3, (y+1) % 3]])

# Add an edge and a cost depending on
# whether the transition is free or requires noisy behavior
if state_bin[x,y] == 0:
if n*10 >= 4*4:
else:
else:
if n*10 <= 4*4:
else:

The actual code for solving the problem is just
root_node = [n for n,d in spanning_tree.in_degree() if d==0]
print(np.array([int(i) for i in list('{0:09b}'.format(root_node))]).reshape(3,3))

This code will output the final social norm that the players choose in the limit, assuming it is unique.

# Exercise

It is reasonable to generalize from a grid world to a society living on a general graph $M$$M$MM. Many situations in society are better captured by complex networks. An edge in $M$$M$MM indicates a possible interaction between the two agents in the same way the neighboring players interacted on a grid. Let us generate a random graph of connections between the agents:
M = nx.gnp_random_graph(9,0.5)
nx.draw(M, with_labels=True)

Different graph generators offer a different structure for the problem. A society living on a one-dimensional circle is a very common model. Concluding the tutorial, the proposed exercise is to generalize the code above to predict the behavior of a society living on the graph $M$$M$MM.
 Durlauf, Steven N., and H. Peyton Young, eds. Social dynamics. Vol. 4. Mit Press, 2004. ↩︎
 Edmonds, Jack. "Optimum branchings." Journal of Research of the national Bureau of Standards B 71.4 (1967): 233-240. ↩︎
 Ellison, Glenn. "Basins of attraction, long-run stochastic stability, and the speed of step-by-step evolution." The Review of Economic Studies 67.1 (2000): 17-45. ↩︎
 Freidlin, M. "On wave front propagation in periodic media." Stochastic analysis and applications, ed. M. Pinsky, Advances in Probability and related topics 7 (1984): 147-166. ↩︎
 Marden, Jason R., et al. "Payoff-based dynamics for multiplayer weakly acyclic games." SIAM Journal on Control and Optimization 48.1 (2009): 373-396. ↩︎
 Newton, Jonathan. "Evolutionary game theory: A renaissance." Games 9.2 (2018): 31. ↩︎
 Sandholm, William H. "Population games and deterministic evolutionary dynamics." Handbook of game theory with economic applications. Vol. 4. Elsevier, 2015. 703-778. ↩︎
 Weidenholzer, Simon. "Coordination games and local interactions: a survey of the game theoretic literature." Games 1.4 (2010): 551-585. ↩︎
 Wilensky, U. (1999). NetLogo. http://ccl.northwestern.edu/netlogo/. Center for Connected Learning and Computer-Based Modeling, Northwestern University, Evanston, IL. ↩︎
 Young, H. Peyton. "The evolution of conventions." Econometrica: Journal of the Econometric Society (1993): 57-84. ↩︎

1. The code is also available as an iPython notebook on github https://github.com/artdol/spanning_trees ↩︎

2. A complete editable and downloadable netlogo web implementation of this model is live at http://modelingcommons.org/browse/one_model/6958 ↩︎

3. This specification is technically a so-called "lenient" cost that helps ensure that the stationary distribution exists. We will gloss over the difference. Please refer to  for a detailed treatment. ↩︎

# Recommended for you

Srishti Saha
High Dimension Data Analysis - A tutorial and review for Dimensionality Reduction Techniques
High Dimension Data Analysis - A tutorial and review for Dimensionality Reduction Techniques
This article explains and provides a comparative study of a few techniques for dimensionality reduction. It dives into the mathematical explanation of several feature selection and feature transformation techniques, while also providing the algorithmic representation and implementation of some other techniques. Lastly, it also provides a very brief review of various other works done in this space.
29 points
0 issues 