The mathematics of kindness

Wim Hordijk

Charles Darwin's theory of evolution by natural selection is one of the most profound scientific theories to have ever been developed. However, there were several questions about evolution that Darwin himself could not answer. Not that he wasn't smart enough (in fact, his intuition often pointed in the right direction), but the answers to those questions required sophisticated mathematical insights that were not developed far enough, or even available yet, in Darwin's time.


Worker bees give up their chance to reproduce to serve the hive.

One such problem was the evolution of altruism. In biology, altruism is defined as an organism (or individual) performing an action which is at a cost to itself, but which benefits (directly or indirectly) another individual, often without the expectation of reciprocity or compensation. Altruistic behaviour seems abundant in nature: a mother bear protecting her cubs, possibly at the risk of injury; worker bees giving up their reproductive capacity entirely, effectively reducing their fitness to zero; a bird giving out warning signals to others, thereby revealing its own presence to an approaching predator; human beings going to war to defend their country, knowing very well they might die on the battlefield; and so on.

However, if evolution by natural selection is all about competition and survival of the fittest, how can altruistic behaviour (which, by definition, lowers the altruist's fitness and increases the receiver's fitness) ever evolve? As Darwin himself wrote: "Natural selection will never produce in a being anything injurious to itself, for natural selection acts solely by and for the good of each." (Origin of Species, 1859, Ch. 6.) Pondering the riddle of altruism, Darwin later suggested that if natural selection sometimes acted at a level higher than the individual, then altruistic behaviour, for the good of the community, could indeed evolve. This intuitive idea already reflected what is now referred to as group selection, a notion to which I will return below.

Hamilton's rule

As some of the above examples indicate, one particular situation in which altruistic behaviour is often observed is when it involves close family members, or kin. A mother bear cares for and protects her own cubs (but not others!), because they are closely related to her. In a bee colony, all worker bees are sisters born from the same mother (the queen bee). And even humans are generally more likely to perform "selfless acts of kindness" towards closely related family members than to complete strangers (although not always).

A bear will defend her own cubs, but not others.

The idea of kinship can be made mathematically more precise by calculating a coefficient of relationship, call it $r$, which is defined as the probability that two individuals share a common gene (technically this should be stated in terms of alleles, or gene values, but as in most other descriptions, I’ll use the term gene here, for simplicity). When you were conceived, you inherited (roughly) half of your genes from your mother, and the other half from your father. In general this is a (mostly) random process, with no particular preference for which genes are inherited from which parent. So, the coefficient of relationship between you and either one of your parents is $r=0.5.$

Now, if you have a sibling (brother or sister), they also inherited half of their genes from your mother and half from your father, but they may not all be the same genes as the ones you inherited. In fact, because the process of inheritance is random, you and your sibling (on average) share only half of the half of the genes that each of you inherited from your mother (and similarly for the half you inherited from your father). So, the coefficient of relationship between you and your sibling is

  \[ r=0.5^2+0.5^2 = 0.5 \]    

In a similar way, you can calculate your coefficient of relationship with other family members. For example, for you and a (first) cousin, it is $r = 0.125$ (I’ll leave the calculation as an exercise).
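One way to check such numbers is path counting: every chain of parent-child links that connects two relatives through a shared ancestor contributes $0.5$ raised to the number of links, and the contributions add up. The sketch below assumes no inbreeding (the shared ancestors are themselves unrelated); the function name `relatedness` and the lists of path lengths are illustrative choices, not part of the original article.

```python
def relatedness(path_lengths):
    """Sum 0.5**L over every chain of L parent-child links that
    connects two relatives through one shared ancestor.
    Assumes the shared ancestors are unrelated (no inbreeding)."""
    return sum(0.5 ** length for length in path_lengths)

# Parent and child: one direct link.
parent_child = relatedness([1])

# Full siblings: one two-link path via the mother, one via the father.
full_siblings = relatedness([2, 2])

# First cousins: one four-link path via each of the two shared grandparents.
first_cousins = relatedness([4, 4])

print(parent_child, full_siblings, first_cousins)
```

The sibling case reproduces the sum $0.5^2+0.5^2$ above, and the cousin case gives the $0.125$ quoted in the text.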

So how can this notion of genetic relatedness explain the evolution of altruism? It was the evolutionary biologist Bill Hamilton, born in Egypt to New Zealand parents who then settled in England, who formulated the answer in a precise mathematical way. Hamilton argued that if the benefit ($B$) of an altruistic act, devalued by the coefficient of relationship ($r$) between the two individuals involved, is greater than the cost ($C$), then (genes for) altruistic behaviour can evolve. In mathematical terms, if

  \[ rB>C, \]    

then altruism is worth it. This is now known as Hamilton’s rule.

To give a simple example, if you sacrifice your own life to save two or more siblings, then for every gene that is lost with your own death, at least one copy can be expected to be saved. After all, each of your siblings has a probability of $0.5$ of sharing a given gene with you. Mathematically, the coefficient of relationship is $r=0.5$ (between you and your siblings) and the cost is $C = 1$ (you). With a benefit of $B = 2$ (two siblings) you exactly break even, since $rB = C$; so, according to Hamilton's rule, as soon as the benefit exceeds that, you're OK (or rather, your genes are).
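Hamilton's rule is simple enough to try out directly. The following is a minimal sketch, with the hypothetical helper `hamilton_favours` and with benefits and costs counted in "lives", as in the example above; real benefits and costs are measured in expected offspring and are much harder to pin down.

```python
def hamilton_favours(r, benefit, cost):
    """Return True when Hamilton's rule r * B > C is satisfied."""
    return r * benefit > cost

SIBLING_R = 0.5    # coefficient of relationship between full siblings
COUSIN_R = 0.125   # coefficient of relationship between first cousins
OWN_LIFE = 1       # cost C of sacrificing yourself

# Number of relatives saved -> does the rule favour the sacrifice?
for label, r, saved in [
    ("2 siblings", SIBLING_R, 2),
    ("3 siblings", SIBLING_R, 3),
    ("8 cousins", COUSIN_R, 8),
    ("9 cousins", COUSIN_R, 9),
]:
    print(label, hamilton_favours(r, saved, OWN_LIFE))
```

Because the inequality is strict, saving exactly two siblings (or exactly eight cousins) is the break-even point, and the rule is only satisfied beyond it.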


A bird's warning call benefits all other birds around.

The founders of population genetics, the trio Ronald Fisher, J. B. S. Haldane, and Sewall Wright, apparently had been intuitively aware of this general idea. Fisher had already published a table calculating genetic distances between kin in 1918, and Wright formally introduced the coefficient of relationship in 1922. For some reason, though, they never related this to the problem of altruism. However, according to legend, Haldane had first expressed the logic behind Hamilton's rule when he announced that he was prepared to lay down his life for eight cousins or two brothers. But in the end it was Hamilton who generalised the idea and formalised it mathematically in 1964.

So, clearly, altruistic behaviour is associated with kinship. Or at least it is in many cases. However, as some of the above examples indicate, it need not always be. Warning signals from one bird are received by all birds that happen to be nearby, whether they are genetically related to the altruist or not. And people going to war to defend their country don't only fight to protect their own immediate family. Even though one could argue that in these cases there is a good chance that there will be a sufficiently large number of close kin among the receivers of the altruistic act to make it worthwhile, there seems to be something more general going on.

The Price equation

Enter George Price, an American physical chemist living in London on his savings, reading and writing on evolutionary biology. After applying for a grant and obtaining a research position in mathematical genetics at University College London, and reading Hamilton's 1964 papers on kin selection, Price derived an equation (in the early 1970s) that generalises Hamilton's rule, and provides a formal method for the hierarchical analysis of the effects of natural selection.


The Price Equation, as it is now known, consists of two terms. Let's start with considering only the first term, also known as the covariance equation (which, incidentally, was discovered independently, and unbeknownst to Price, a few years earlier by two other researchers, Robertson (1966) and Li (1967)). Suppose we have a quantitative trait, such as human height, or the propensity for altruistic behaviour. Let's write $z$ for this trait — so $z$ is a variable that takes a numerical value measuring the trait (e.g. height in centimetres). Given a population of people, each comes with his or her value for $z$ and we write $z_1$, $z_2$, $z_3$, etc, for all the different values that occur in the population, and $\bar{z}$ for the population average. Now each value of $z_ i$ (for $i=1,2,3,...$) comes with a fitness value $w_ i$ (e.g. being tall might come with a high fitness value because it enables you to reach more of the fruit on a tree than a small person). Let's again write $\bar{w}$ for the population average.


The covariance between $z_ i$ and $w_ i$ is defined as

  \[ Cov(w_ i,z_ i) = E(w_ i z_ i)-E(w_ i)E(z_ i), \]    

where $E(.)$ denotes the expected (or mean) value.

There is a statistical quantity called the covariance between the trait values and their respective fitness values. It’s a measure of how the $w_ i$ and $z_ i$ vary together and is denoted by $Cov(w_ i,z_ i)$ (see the box for a definition). Roughly speaking, a positive value of the covariance indicates that the $z_ i$ and $w_ i$ are related, with $w_ i$ increasing when $z_ i$ does and vice versa (with a large positive value indicating a strong relationship). A negative value indicates there’s an inverse relationship between the two, with $w_ i$ decreasing when $z_ i$ increases and vice versa (with a strongly negative value indicating a strong inverse relationship). Finally, a $0$ value of the covariance means that there is no linear relationship between the two.

The covariance equation states that the change in average trait value from one generation to the next (denoted by $\Delta \bar{z}$) is proportional to the covariance between the trait values and their respective fitness values:

  \[ \Delta \bar{z}=\frac{1}{\bar{w}}Cov(w_ i,z_ i). \]    

Here is a graphical illustration of the meaning of the covariance equation for selection.

If the covariance between trait values and fitness values is positive (so a higher trait value means higher fitness), the average trait value in the current population ($\bar{z}$, dashed blue line) will move upward in the next generation ($\bar{z}^\prime $, dashed red line) due to selection.
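The covariance equation can be checked on a toy population. In the sketch below the four trait values and fitness values are made up for illustration, fitness is read as the number of offspring, and every offspring is assumed to inherit its parent's trait value exactly. Under those assumptions the predicted change $\frac{1}{\bar{w}}Cov(w_ i,z_ i)$ should match the change obtained by simply averaging over the offspring.

```python
def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Population covariance E(xy) - E(x)E(y)."""
    return mean([x * y for x, y in zip(xs, ys)]) - mean(xs) * mean(ys)

# Hypothetical population: trait z (height in cm) and fitness w (offspring).
z = [160.0, 170.0, 180.0, 190.0]
w = [1.0, 2.0, 2.0, 3.0]

w_bar = mean(w)

# Change in the average trait predicted by the covariance equation.
delta_z_predicted = covariance(w, z) / w_bar

# Direct calculation: the next generation's average is the
# fitness-weighted average of the parents' trait values.
z_bar_next = sum(wi * zi for wi, zi in zip(w, z)) / sum(w)
delta_z_direct = z_bar_next - mean(z)

print(delta_z_predicted, delta_z_direct)
```

Here taller individuals leave more offspring, so the covariance is positive and the average height moves upward, as in the illustration.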

This may seem a trivial statement, but it actually represents a significant insight in at least two ways. First, it provides a formal, quantitative description of how selection works, which can be used to analyse any kind of selective process, not only in biology, but also in economics or learning, for example. And second, it generalises Hamilton's rule by showing that what really matters is statistical association (i.e., covariance), not just genetic relatedness (which is only one particular source of statistical association, although a very important one in biological evolution).


George Price (1922-1975).

Covariance is the proper way to think about the role of genetic relatedness in evolution. After all, selection acts directly on traits, and only indirectly on genes (through those traits the genes are responsible for). This crucial insight had been missed (or at least under-appreciated) by everyone before him: Darwin himself, the founding trio of population genetics Fisher, Haldane, and Wright, and the originator of kin selection Hamilton. George Price was the first to realise its importance, which subsequently led Hamilton to reformulate his theory of kin selection in terms of covariance. Indeed, it can be shown mathematically that Hamilton's rule is a specific instance (given appropriate assumptions) of the covariance equation.

In the examples above, this means that birds (or people) who interact in an altruistic way do not necessarily need to be genetically related. It could be sufficient for the altruistic individuals to belong to a clearly distinct group, such as all birds nesting in a particular patch of trees, or all people living in a particular country, which provides the required statistical association. This, then, brings us to the full Price equation and the notion of group selection.

Group selection

The full Price equation consists of the original covariance equation plus an additional term:

  \[ \Delta \bar{z}=\frac{1}{\bar{w}}Cov(w_ i,z_ i) + \frac{1}{\bar{w}}E(w_ i\Delta z_ i), \]    

where $w_ i\Delta z_ i$ measures the change in character values between ancestor and descendant, weighted by the fitness $w_ i$, and $E(w_ i\Delta z_ i)$ denotes the expected (or mean) value of $w_ i\Delta z_ i$.
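To see what the second term adds, the toy check can be repeated with offspring that differ from their parents. The sketch below uses made-up numbers: `dz` holds a hypothetical average difference $\Delta z_ i$ between each parent's trait value and that of its offspring, and fitness is again read as the number of offspring. The sum of the two terms should then equal the directly computed change in the average trait.

```python
def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Population covariance E(xy) - E(x)E(y)."""
    return mean([x * y for x, y in zip(xs, ys)]) - mean(xs) * mean(ys)

# Hypothetical population: trait z, fitness w, and the average change dz
# in trait value between each parent and its offspring.
z = [1.0, 2.0, 3.0, 4.0]
w = [1.0, 1.0, 2.0, 4.0]
dz = [0.5, 0.0, -0.5, -1.0]

w_bar = mean(w)

# First term: selection. Second term: change during transmission.
selection_term = covariance(w, z) / w_bar
transmission_term = mean([wi * di for wi, di in zip(w, dz)]) / w_bar
delta_z_predicted = selection_term + transmission_term

# Direct calculation: average the offspring trait values z + dz,
# counting each parent's offspring w times.
z_bar_next = sum(wi * (zi + di) for wi, zi, di in zip(w, z, dz)) / sum(w)
delta_z_direct = z_bar_next - mean(z)

print(selection_term, transmission_term, delta_z_predicted, delta_z_direct)
```

In this made-up case selection pushes the average up while transmission pulls it back down, and the two terms nearly cancel.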

One interpretation of the full equation is that it provides a natural and hierarchical decomposition of selection within and between groups. To see this, let’s assume the population consists of several groups (e.g. birds nesting in different parts of a forest) and write $z_ g$ for the average trait value within a group $g$. The corresponding average fitness within the group is $w_ g$. We can now rewrite the equation, replacing individuals by groups:

  \[ \Delta \bar{z}=\frac{1}{\bar{w}}Cov(w_ g,z_ g) + \frac{1}{\bar{w}}E(w_ g\mathbf{\Delta z_ g}), \]    

where $\bar{z}$ and $\bar{w}$ are the corresponding averages over all groups $g$. Now, note that the $\Delta z_ g$ in the expectation term (in bold in the equation above), which is the change in average trait value of a given group $g$, can itself be written as a full Price equation (in bold in the equation below) in terms of the individuals $i$ that make up this group $g$:

  \[ \Delta \bar{z}=\frac{1}{\bar{w}}Cov(w_ g,z_ g) + \frac{1}{\bar{w}}E\left(w_ g\mathbf{\left[\frac{1}{w_ g}Cov(w_{g,i},z_{g,i}) + \frac{1}{w_ g}E(w_{g,i}\Delta z_{g,i})\right]}\right). \]    

This simplifies to

  \[ \Delta \bar{z} = \frac{1}{\bar{w}}Cov(w_ g,z_ g)+\frac{1}{\bar{w}}E\left(Cov(w_{g,i},z_{g,i})+E(w_{g,i}\Delta z_{g,i})\right). \]    

This recursive expansion of the Price equation can be repeated for yet another level of subgroups, replacing $\Delta z_{g,i}$ in the expectation term by yet another full Price equation, and so on.

However, what is important here is that this recursive expansion provides additional insight into the possible evolution of altruism. Recall that, by definition, altruistic behaviour decreases an individual’s fitness, but increases its group’s fitness (relative to other groups). In other words, the covariance between an individual’s altruistic trait $z_{g,i}$ and that individual’s fitness $w_{g,i}$ is negative, $Cov(w_{g,i}, z_{g,i})<0$, while the covariance between a group’s average trait $z_{g}$ and its average fitness $w_{g}$ is positive, $Cov(w_{g}, z_{g})>0$. So, according to the Price equation, altruistic traits can only evolve in those situations where the positive between-group covariance $Cov(w_{g}, z_{g})$ is large enough to make up for the negative within-group covariance $Cov(w_{g,i}, z_{g,i})$. The Price equation thus provides a mathematical formulation (and verification) of Darwin’s original intuition that if natural selection acts at a level higher than the individual (in other words, group selection), then altruistic behaviour can evolve (within a group, for the good of the community).
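The tug of war between the two covariances can be made concrete with a small numerical sketch. Everything in it is an assumption made for illustration: two groups of equal size, a trait value of 1 for altruists and 0 for everyone else, a fitness rule in which each individual gains `GROUP_BENEFIT` times the share of altruists in its group while altruists pay `COST`, and perfect inheritance (so the $\Delta z_{g,i}$ term vanishes).

```python
def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Population covariance E(xy) - E(x)E(y)."""
    return mean([x * y for x, y in zip(xs, ys)]) - mean(xs) * mean(ys)

# Hypothetical fitness parameters.
BASE, GROUP_BENEFIT, COST = 1.0, 1.0, 0.2

def fitnesses(traits):
    """Everyone gains from the share of altruists; altruists pay COST."""
    share = mean(traits)
    return [BASE + GROUP_BENEFIT * share - COST * z for z in traits]

# Two equally sized groups: 1 = altruist, 0 = non-altruist.
groups_z = [[1, 1, 1, 0], [1, 0, 0, 0]]
groups_w = [fitnesses(zs) for zs in groups_z]

z_g = [mean(zs) for zs in groups_z]   # average trait per group
w_g = [mean(ws) for ws in groups_w]   # average fitness per group
w_bar = mean(w_g)                     # valid because group sizes are equal

between = covariance(w_g, z_g)        # between-group covariance
within = mean([covariance(ws, zs) for ws, zs in zip(groups_w, groups_z)])
delta_z_predicted = (between + within) / w_bar

# Direct check over the whole population, ignoring group structure.
all_z = [z for zs in groups_z for z in zs]
all_w = [w for ws in groups_w for w in ws]
delta_z_direct = sum(w * z for w, z in zip(all_w, all_z)) / sum(all_w) - mean(all_z)

print(between, within, delta_z_predicted, delta_z_direct)
```

With these particular numbers the between-group covariance is positive and slightly larger in size than the negative within-group covariance, so the share of altruists in the whole population rises even though altruists do worse than their own group mates. Raising `COST` or making the two groups more alike tips the balance the other way.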


Direct reciprocity can be observed in many animal species, including chimps. Grooming someone else is a cost to yourself (it takes time and energy, without getting any direct benefit), but it strengthens social bonds, and is expected to be "paid back".

Note that in such a group selection scenario, there is reciprocity at the group level, but not necessarily directly between the same two individuals. In other words, if I do you a favour, I do not necessarily need to expect you personally to return the favour to me, as long as I can expect it from the group as a whole. Altruism based on direct reciprocity, i.e., between the same two individuals, can also be explained using game theory (see this article to find out more). Incidentally, it was also Price who introduced this idea in evolutionary biology, together with John Maynard Smith, one of the leading evolutionary biologists during the second half of the 20th century (and a former student of Haldane). Alternatively, direct reciprocity can also be viewed in a group selection setting, where the group size is just two.

What had motivated Price to develop his equation was a quest to find true selfless kindness. However, what his mathematics seemed to tell him is that altruism, as a product of evolution, in the end still serves a selfish purpose, whether it is at the level of the gene, the individual, or the group as a whole. Even within groups, altruism only evolves when there is competition between groups. Perhaps disillusioned by what may have looked to him like a failure in his quest, George Price took his own life in January of 1975. At that time probably only a handful of evolutionary biologists (including Hamilton) truly understood the significance of his equation.

Further reading

The complete and highly intriguing story of the life and work of George Price is beautifully told in the book The Price of Altruism: George Price and the Search for the Origins of Kindness by Oren Harman, professor of the history of science at Bar Ilan University, Tel Aviv, Israel. The current article is largely based on that book, together with the more technical overview of George Price's contributions to evolutionary genetics by Steven A. Frank, professor of ecology and evolutionary biology, University of California, Irvine, CA, USA. I also thank my colleague Mike Steel, professor of mathematics and statistics at the University of Canterbury, Christchurch, New Zealand, for gifting me a copy of Harman's book, which inspired me to learn more about the Price Equation and "the mathematics of kindness".

About the author

Wim Hordijk

Wim Hordijk is a computer scientist currently on a fellowship at the Konrad Lorenz Institute in Klosterneuburg, Austria. He has worked on many research and computing projects all over the world, mostly focusing on questions related to evolution and the origin of life. More information about his research can be found on his website.



This whole theory is quite laughable if one considers the fact that an animal will help a species that is far from being related to it. (There are numerous examples of such behaviour on YouTube, which also makes one wonder about the prevalence of this behaviour that goes unrecorded.)


Apparently humans share 50% of their genome with bananas.
Shouldn't the quote be "to lay down his life for two bananas"? ;)
More seriously: The genome of different humans is extremely similar.
So shouldn't r be much higher than 0.5 for siblings? More like 99.5%?
Wouldn't this make altruism inevitable?
Anyway, great article!

While I am not an expert by any means, I understand that the percentages of shared DNA in heritage / genetic lineage are the percentages of specific inheritable areas of DNA that only represent about 1% of the total DNA in the human genome. I think we all share about 99%. So yes, your 99.5% is probably correct, but in genealogical DNA terms, only the variable portion is considered.
I also agree - great article.

'Relatedness' has little to do with how much DNA is actually shared, which is confusing. If a sexually reproducing organism has a new mutation, how likely is it that it will be passed down to a given child? 50%. Grandchild? 25%... and so on.

For Hamilton's rule at least, it's more like how much DNA is shared between two members of a population over and above the population average - which means you can actually get negative relatedness, and that relatedness is completely contextual! So perhaps you could have a high relatedness to 2 bananas - if only you, two bananas, and lots of bacteria lived in a population together.

The reason the two bananas idea is wrong is because a human can’t (successfully) mate with a banana. Hamilton’s r refers to that portion of the genome which is variable within a species, and does not apply e.g. to those humans who are enamored of bananas.


A good article.
Unselfish acts of kindness are not limited to members of the blood line. Indeed, they appear mostly between non-blood related individuals. Most common between man and wife. With regard to soldiers, there is no doubt that a very small minority may embrace such ideals. However, the great majority are either ordered (or else) or have been victims of mass propaganda (brain washing) to do it. There are also the psychopaths that don't need motivation to commit horrible acts against fellow human beings.


A very good article