The two-headed bacterium

I like to see categories as fish nets we use to capture ideas. We classify things into categories like individuals, nations or species, and of course it is all arbitrary and doesn’t correspond to anything in the real world. But categories still form useful chunks we can use to make sense of the world. Furthermore, here is a fun exercise: introduce arbitrary changes in the categories, and see what the world looks like through this new lens. As I will argue, there are plenty of things to be discovered this way. Use the standard fish nets, and you get a standard understanding of the world. Try to use slightly larger or smaller nets, and maybe you will discover things you had never noticed before.

Take the individual, for example. One bacterial cell contains exactly one genome and all the necessary equipment to replicate it. Using our human-derived intuition of what makes an individual, it makes sense to see bacteria as unicellular organisms, meaning that one cell = one individual. If you visit the wiki page on prokaryotes (the larger group that encompasses bacteria and archaea) the first thing you hear is that they are unicellular, as if it were the most important thing about them. However, bacteria are so weird, so different from us, that it makes little sense to describe them using the categories we invented while observing humans.

Let’s explore the strange and surprising processes that are uncovered when you change your definition of the individual to make it either wider, or narrower. First, I will start with a hot take: each bacterial lineage is one big multicellular individual. Then I will move on to the super-hot-magma-take: each bacterial cell is actually made of two distinct individuals fused together, facing in opposite directions.

Bacteria as multicellular organisms

First, let’s make our definition of the individual arbitrarily broader, and consider that the whole bacterial culture, descending from a single ancestral cell, is one individual. Is there anything interesting to see here? For starters, some behaviors of bacterial cells don’t really make sense as individuals. For example, bacterial cells regularly perform what could only be described as bacterial sacrifice.

The Kelly criterion in prokaryotes

Content warning: bacterial sacrifice

Antibiotics were already in the environment long before humans started using them, usually secreted by other micro-organisms who want to take your precious nutrients for themselves. Imagine being a bacterium growing peacefully – there is always a risk that some bastard fungus will put their filthy pterulone, sparassol or strobilurin in your soup. Fortunately, bacteria figured out a solution: enemies can’t stop you from growing if you are already not growing.

In its simplest form, this works because the antimicrobial compound needs to be actively incorporated in the growth machinery to cause trouble. Think of a grain of sand being caught in a clockwork mechanism and breaking everything – if the mechanism is stopped, the grain of sand doesn’t enter, and you can resume operation later once the grain of sand has been blown away. Obviously, the drawback is that the bacterium is no longer growing, which kind of defeats the whole point. This is why bacteria have invented what we humans know as the Kelly betting system.

Say a gambler bets on something with 2:1 odds, so if she wins the bet, she gains twice as much as what she invested. She knows she has a 60% chance of winning, so the most profitable strategy is of course to invest 100% of her money every time – this way, she maximizes the return of every winning bet! But obviously this is bad, because eventually she will lose a bet, and then have zero monies remaining. For bacteria, this is like having 100% of the cells growing as fast as they can. This maximizes the population growth rate, until the aforementioned bastard fungus secretes some pleuromutilin or whatever and then the entire population takes it up and goes extinct. To avoid this, our gambler should invest only a fraction of her money on each bet, so her funds still grow exponentially (albeit at a slower rate) but in case of loss she still has some funds to continue. For bacteria, this means always having a small fraction of the population that stops growing, as a backup. This is essentially the bacterial population betting on whether there will be antibiotics in the near future. From the perspective of an individual cell, both situations are bad – either you stop growing, while your friends quickly outnumber you by orders of magnitude and you practically disappear, or you are part of the growing fraction and eventually you die from antibiotic overdose. But if you look at the entire colony, you can see the two sub-populations as two essential parts of a single organism, that figured out some slick decision theory techniques long before the species of John L. Kelly even evolved a brain.
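For the curious, here is a minimal sketch of the gambler's arithmetic. It assumes the textbook form of the Kelly criterion, where a bet is won with probability p and pays b units of profit per unit staked (so b = 2 for the gambler above), and the function names are mine, purely for illustration. It compares the long-run growth rate for a few stake sizes, including the all-in strategy.

```python
import math

def kelly_fraction(p, b):
    """Stake fraction that maximizes long-run growth, for win probability p
    and net odds b (profit per unit staked when the bet is won)."""
    return p - (1 - p) / b

def log_growth(f, p, b):
    """Expected log-growth of the funds per bet when staking a fraction f."""
    if f >= 1:
        # Staking everything means a single loss wipes out the funds.
        return float("-inf")
    return p * math.log(1 + b * f) + (1 - p) * math.log(1 - f)

p, b = 0.6, 2.0  # the gambler from the text: 60% chance of winning, 2:1 odds
f_star = kelly_fraction(p, b)

for f in (0.1, f_star, 0.7, 1.0):
    print(f"stake {f:.2f} of funds -> growth per bet {log_growth(f, p, b):+.4f}")
```

With these numbers the best stake comes out at 40% of the funds. In the bacterial analogy, the remaining fraction plays the role of the cells that sit out and stop growing, although real populations keep a much smaller dormant fraction than this toy bet suggests.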

Eating the corpses of your siblings

Content warning: eating the corpses of your siblings.

Similarly, one puzzling feature of bacteria is that they sometimes commit apoptosis. This happens, for example, when food is scarce – some cells may spontaneously explode so that other cells can feed on their remains, increasing the chances that at least one of them will make it out alive when resources come back. If you see each cell as an individual, that is weird, and does not fit well with anything methodological individualism would predict. But if you see the whole colony as the individual, then it is just like your good old typical apoptosis – just like, in the fetal stage, your fingers were all connected by cells until some of them honorably committed seppuku so you get born with fingers instead of webbed paws.

(One fascinating thing with bacterial apoptosis is that every cell which ever activated these pathways is dead. Thus, if you look at a currently living bacterium, at no point in billions of years of evolution did this pathway ever activate in any of its ancestors. Not even by chance. The entire mechanism evolved and improved only by correlation with other cells, without ever activating in the lineages we can now see.)

Action potentials in biofilms

As a third exhibit of things bacteria do that definitely don’t look like unicellular behavior, there is the recent discovery that some bacteria, after organizing themselves as a biofilm, are able to communicate with each other using electrical waves. The way it works is remotely similar to the action potentials we see in neurons. At a resting state, cells are filled with potassium ions, which makes them electrically polarized. Whenever the polarization disappears, ion channels in the envelope open up, and the potassium ions all exit the cell into the extracellular environment. This, in turn, cancels out the polarization of neighboring cells. The result is this:

Video from Prindle et al., 2015, showing waves of potassium propagating in a colony of tens of thousands of cells.

Supposedly, this mechanism makes sure the outer bacteria will stop eating from time to time, so the nutrients can diffuse all the way to the center and prevent the interior cells from starving. If this does not make you scream “multicellular!”, I don’t know what will.

In short, rather than being just individual cells fighting against each other, bacteria have evolved hard-wired mechanisms that only make sense if you consider the dynamics of the whole colony. A microbiologist could spend her entire career building a perfect model of one bacterial cell, but she would still be far from understanding all facets of the organism. Oh, and if you are ready to hear a similar point about humans (that is, human communities are multi-body individuals), get your largest fish net and check out this review. I will continue with bacteria, because we have barely scratched the lipopolysaccharide of their weirdness.

Bacterial cells are two-faced pairs of individuals

Now, let’s see what happens with a much narrower definition for an individual. Even narrower than a single cell. Put down that extra-large “big game”-rated landing net and bring the tweezers.

Here is our new definition: an individual is what happens between a birth event and a death event. Now we need to find definitions of birth and death that apply to bacteria. Let’s say, a birth event is when a mother cell divides into two daughters (specifically, cytokinesis). A death event is when a cell is irreversibly broken, is torn apart or becomes too damaged to grow. We have a simple and precise definition, now we can look at bacteria and pick apart the individuals.

One generation goes as follows:

  • The cell extends and roughly doubles in length
  • The middle of the cell constricts and two new poles are constructed
  • The cell divides and you get two cells. Each of them has one old pole that was already there in the previous generation, and one shiny new pole:

Where is the individual here? Now you understand why I came up with that bizarre birth-death definition. First, let’s number the poles according to their age (in generations).

Blink very fast while on shrooms and you might see a Koch snowflake in the bottom sequence.

But what if bacteria age? It turns out that, yes, bacteria age. After a number of generations, old poles accumulate damage. Depending on the growth environment, they may still be fine, or grow slower, or explode in an effusion of bacteria blood. To reduce clutter, I’ll consider that poles have a lifespan of 3 generations, and then the cell is dead (in real life, they hold for much longer, but that wouldn’t be sketchable).

Coming back to our custom, “birth-to-death” definition of an individual, you can see that each cell is actually made of two of them – one on the left, one on the right.

Here they are very short-lived and die after three generations, but in real life these “half-bacteria” live for much longer, perhaps hundreds of generations if the conditions are not too bad. But the principle remains the same, there are just a lot more of these diagonal individuals.

Using your ancestors as trashcans

Content warning: yeah, that.

But wait, there is more. As I said, in nice conditions the poles can grow basically forever. Yet they still exhibit aging. And yes, this is all sane and coherent. This is where the titles of the papers become really spooky (Age structure landscapes emerge from the equilibrium between aging and rejuvenation in bacterial populations or Cell aging preserves cellular immortality in the presence of lethal levels of damage), showing how far we are from our typical construction of the individual.

To put it very briefly, take the sketches above where half of the cell is young and half of the cell keeps getting older. Old material accumulates in the old pole, so those cells keep growing slower and slower after each generation. Now add some mixing to it: every generation, the older pole gets a little bit of fresh material, and the younger pole gets a little bit of old material. Eventually the old pole reaches an equilibrium when the new material it inherits exactly compensates for the damage from aging. As there is the same thing, reversed, for the young pole, you end up with two attractors:

Slightly adapted from Proenca et al., 2018.
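To see why mixing produces two attractors, here is a toy sketch. It is not the model from Proenca et al., just a guess at the simplest thing with the same flavor: I assume each cell gains a fixed amount of damage per generation, and that at division the old-pole daughter inherits a fixed share (`old_share`, a made-up parameter) of the mother's damage while the new-pole daughter gets the rest.

```python
def follow_lineage(keep_old_pole, generations=60,
                   damage_per_generation=1.0, old_share=0.7):
    """Track damage along a lineage that always keeps the old (or new) pole.

    Toy assumptions: at each division the mother's damage is split so that
    the old-pole daughter inherits `old_share` of it and the new-pole
    daughter inherits the rest; each daughter then gains
    `damage_per_generation` while it grows.
    """
    share = old_share if keep_old_pole else 1.0 - old_share
    damage = 0.0
    for _ in range(generations):
        damage = share * damage + damage_per_generation
    return damage

old = follow_lineage(keep_old_pole=True)
new = follow_lineage(keep_old_pole=False)
print(f"old-pole lineage settles near {old:.3f}")
print(f"new-pole lineage settles near {new:.3f}")
```

Both lineages stop drifting after a few dozen generations, each at its own level: the old-pole lineage at a higher damage load, the new-pole lineage at a lower one. Neither ages forever and neither is perfectly clean, which is the equilibrium described above.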

What is the importance of this? There may be no importance at all, since the old cells are quickly outnumbered by young cells so they only represent a tiny fraction of the colony. However, there is also some evidence that all kinds of garbage, like misfolded proteins or aggregates, tend to accumulate in the old pole. Perhaps this ensures that at least some cells in the population will be in perfect shape, so in case of trouble, they have a good chance of having at least one survivor (a bit like North Korea preparing a team for the Math Olympiads).

But this, of course, brings us back to collective, multicellular behavior. Life is too complicated to fit in a single fish net.

Average North-Korean Mathematicians

Here are the top-fifteen countries ranked by how well their teams do at the International Math Olympiads:

When I first saw this ranking, I was surprised to see that North Koreans have such an impressive track record, especially when you factor in their relatively small population. One possible interpretation is that East Asians are just particularly good at mathematics, just like in the stereotypes, even when they live in one of the world’s worst dictatorships.

But I don’t believe that. In fact, I believe North Koreans are, on average, particularly bad at math. More than 40% of the population is undernourished. Many of the students involved in the IMOs grew up in the 1990s, during the March of Suffering, when hundreds of thousands of North Koreans died of famine. That is not exactly the best context to learn mathematics, not to mention the direct effect of nutrients on the brain. There do not seem to be a lot of famous North Korean mathematicians either. (There is actually a candidate from the North Korean IMO team who managed to escape during the 2016 Olympiads in Hong Kong. He is now living in South Korea. I hope he becomes a famous mathematician.) Thus, realistically, if all 18-year-olds from North Korea were to take a math test, they would probably score much worse than their South Korean neighbors. And yet, Best Korea reaches almost the same score with only half the source population. What is their secret?

This piece on the current state of mathematics in North Korea gives it away. “The entire nation suffered greatly during and after the March of Suffering, when the economy collapsed. Yet, North Korea maintained its educational system, focusing on the gifted and special schools such as the First High Schools to preserve the next generation. The limited resources were concentrated towards gifted students. Students were tested and selected at the end of elementary school.” In that second interpretation, the primary concern of the North Korean government is to produce a few very brilliant students every year, who will bring back medals from the Olympiads and make the country look good. The rest of the population’s skills at mathematics are less of a concern.

When we receive new information, we update our beliefs to keep them compatible with the new observations, doing an informal version of Bayesian updating. Before learning about the North Korean IMO team, my prior beliefs were something like “most of the country is starving and their education is mostly propaganda, there is no way they can be good at math”. After seeing the IMO results, I had to update. In the first interpretation, we update the mean – the average math skill is higher than I previously thought. In the second interpretation, we leave the mean untouched, but we make the upper tail of the distribution heavier. Most North Koreans are not particularly good at math, but a few of them are heavily nurtured for the sole purpose of winning medals at the IMO. As we will see later in this article, this problem has some pretty important consequences for how we understand society, and those who ignore it might take pretty bad policy decisions.

But first, let’s break it apart and see how it really works. There will be a few formulas, but nothing that can hurt you, I promise. Consider a probability distribution where the outcome x happens with probability p(x). For any integer n, the formula below gives what we call the nth moment of a distribution, centered on \mu.

\int_{\mathbb{R}}p(x)(x-\mu)^ndx

To put it simply, moments describe how things are distributed around a center. For example, if a planet is rotating around its center of mass, you can use moments to describe how its mass is distributed around it. But here I will only talk about their use in statistics, where each moment encodes one particular characteristic of a probability distribution. Let’s sketch some plots to see what it is all about.

First moment: replace n with 1 and μ with 0 in the previous formula. We get

\int_{\mathbb{R}}p(x)(x)dx

which is – surprise – the definition of the mean. Changing the first moment just shifts the distribution towards higher or lower values, while keeping the same shape.

Second moment: for n = 2, we get

\int_{\mathbb{R}}p(x)(x-\mu)^2dx

If we set μ to be (arbitrarily, for simplicity) equal to the mean, we obtain the definition of the variance! The second moment around the mean describes how values are spread away from the average, while the mean remains constant.

Third moment (n = 3): the third moment describes how skewed (asymmetric) the distribution is, while the mean and the variance remain constant.

Fourth moment (n = 4): this describes how leptokurtic or platykurtic your distribution is, while the mean, variance and skew remain constant. These words basically describe how long the tails of your distribution are, or “how extreme the extreme values are”.
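If you prefer code to integrals, the four moments above can be estimated from a sample by replacing the integral with an average. This is only a sketch: the function names are mine, and I divide the third and fourth moments by powers of the standard deviation (the usual standardization, which the formulas above don't include) so the numbers don't depend on the units of x.

```python
import random

def central_moment(values, n):
    """Sample estimate of the n-th moment around the mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** n for x in values) / len(values)

def summarize(values):
    """Return (mean, variance, skewness, kurtosis) of a list of numbers."""
    mean = sum(values) / len(values)
    variance = central_moment(values, 2)
    # Standardized third and fourth moments: unit-free measures of
    # asymmetry and of how heavy the tails are.
    skewness = central_moment(values, 3) / variance ** 1.5
    kurtosis = central_moment(values, 4) / variance ** 2
    return mean, variance, skewness, kurtosis

random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(100_000)]
mean, variance, skewness, kurtosis = summarize(sample)
print(f"mean={mean:.3f} variance={variance:.3f} "
      f"skewness={skewness:.3f} kurtosis={kurtosis:.3f}")
```

For a normal distribution the standardized fourth moment is close to 3. Distributions that score higher are leptokurtic, those that score lower are platykurtic.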

You could go on to higher n, each time bringing in more detail about what the distribution really looks like, until you end up with a perfect description of the distribution. By only mentioning the first few moments, you can describe a population with only a few numbers (rather than infinite), but it only gives a “simplified” version of the true distribution, as on the left graph below:

Say you want to describe the height of humans. As everybody knows, height follows a normal distribution, so you could just give the mean and standard deviation of human height, and get a fairly accurate description of the distribution. But there is always a wise-ass in the back of the room to point out that the normal distribution is defined over \mathbb{R}, so for a large enough population, some humans will have a negative height. The problem here is that we only gave information about the first two moments and neglected all the higher ones. As it turns out, humans are only viable within a certain range of height, below or above which people don’t survive. This erodes the tails of the distribution, effectively making it more platykurtic. (If I can get even one reader to use the word platykurtic in real life, I’ll consider this article a success.)

Let’s come back to the remarkable scores of North Koreans at the Math Olympiads. What these scores teach us is not that North Korean high-schoolers are really good at math, but that many of the high-schoolers who are really good at math are North Koreans. On the distribution plots, it would translate to something like this:

With North Koreans in purple and another country that does worse in the IMOs (say, France), in black. So you are looking at the tails and trying to infer something about the rest of the distribution. Recall the plots above. Which one could it be?

Answer: just by looking at the extreme values, you cannot possibly tell, because any of these plots would potentially match. In Bayesian terms, each moment of the distribution has its own prior, and when you encounter new information, you could in principle update any of them to match the new data. So how can we make sure we are not updating the wrong moment? When you have a large representative sample that reflects the entire distribution, this is easy. When you only have information about the “top 10” extreme values, it is impossible. This is unfortunate because the extreme values are precisely what gets all our attention – most of what we see in the media is about the most talented athletes, the most dishonest politicians, the craziest people, the most violent criminals, and so forth. Thus, when we hear new information about extreme cases, it’s important to be careful about which moment to update.
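Here is a small numerical illustration of the ambiguity, using made-up numbers. Starting from a baseline normal distribution, I either shift the mean or widen the spread, then look only at the probability of landing above a hypothetical "extreme" cutoff. The parameters are picked by hand so that both updates produce the same tail.

```python
from statistics import NormalDist

threshold = 3.0  # hypothetical cutoff for "extreme" performers

baseline = NormalDist(mu=0.0, sigma=1.0)
shifted_mean = NormalDist(mu=0.5, sigma=1.0)  # first moment updated
wider_spread = NormalDist(mu=0.0, sigma=1.2)  # second moment updated

def tail(dist, cutoff):
    """Probability of drawing a value above the cutoff."""
    return 1.0 - dist.cdf(cutoff)

for name, dist in [("baseline", baseline),
                   ("shifted mean", shifted_mean),
                   ("wider spread", wider_spread)]:
    print(f"{name:>12}: mean = {dist.mean:.1f}, "
          f"P(x > {threshold}) = {tail(dist, threshold):.5f}")
```

The last two populations are indistinguishable if all you see is how many people clear the cutoff, yet one has a higher average and the other does not.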

This problem also occurs in reverse – in the same way looking at the tails doesn’t tell you anything about the average, looking at the average doesn’t tell you anything about the tails. An example: in a typical year, more Americans die from falling than from viral infections. So one could argue that we should dedicate more resources to prevent falls than viral infections. Except the number of deaths from falls is fairly stable (you will never have a pandemic of people starting to slip in their bathtubs 100 times more than usual). On the other hand, virus transmission is a multiplicative process, so most outbreaks will be mostly harmless (remember how SARS-CoV-1 killed fewer than 1000 people, those were the days) but a few of them will be really bad. In other words, yearly deaths from falls have a higher mean than deaths from viruses, but since the latter are highly skewed and leptokurtic, they might deserve more attention. (For a detailed analysis of this, just ask Nassim Taleb.)

There are a lot of other interesting things to say about the moments of a probability distribution, like the deep connection between them and the partition function in statistical thermodynamics, or the fact that in my drawings the purple line always crosses the black line exactly n times. But these are for nerds, and it’s time to move on to the secret topic of this article. Let’s talk about SEX AND VIOLENCE.

This will not come as a surprise: most criminals are men. In the USA, men represent 93% of the prison population. Of course, discrimination in the justice system explains some part of the gap, but I doubt it accounts for the whole difference, which is roughly 13-fold (93% against 7%). Accordingly, it is a solid cultural stereotype that men use violence and women use communication. Everybody knows that. Nevertheless, having just read the previous paragraphs, you wonder: “are we really updating the right moment?”

A recent meta-analysis by Thöni et al. sheds some light on the question. Published in the journal Psychological Science, it synthesizes 23 studies (with >8000 participants) about gender differences in cooperation. In such studies, participants play cooperation games against each other. These games are essentially a multiplayer, continuous version of the Prisoner’s Dilemma – players can choose to be more or less cooperative, with possible strategies ranging from total selfishness to total selflessness.

So, in cooperation games, we expect women to cooperate more often than men, right? After all, women are socialized to be caring, supportive and empathetic, while men are taught to be selfish and dominant, aren’t they? To find out, Thöni et al. aligned all of these studies on a single cooperativeness scale, and compared the scores of men and women. Here are the averages, for three different game variants:

This is strange. On average, men and women are just equally cooperative. If society really allows men to behave selfishly, it should be visible somewhere in all these studies. I mean, where are all the criminals/rapists/politicians? It’s undeniable that most of them are men, right?

The problem with the graph above is that it only shows averages, so it misses the most important information – that men’s level of cooperation is much more variable than women’s. So if you zoom in on the people who were either very selfish or very cooperative, you find a large majority of men. If you zoom in on people who kind-of cooperated but were also kind-of selfish, you find predominantly women.
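To get a feel for how much a difference in spread alone can do, here is a sketch with two hypothetical groups of equal size. Both are normal with the same mean, and one is 20% more spread out. These parameters are invented for illustration and are not taken from Thöni et al. The code asks what share of each slice of the scale belongs to the more variable group.

```python
from statistics import NormalDist

# Hypothetical groups of equal size: same mean, different spreads.
narrow = NormalDist(mu=0.0, sigma=1.0)
wide = NormalDist(mu=0.0, sigma=1.2)

def share(wide_mass, narrow_mass):
    """Fraction of a slice of the scale that belongs to the wide group."""
    return wide_mass / (wide_mass + narrow_mass)

cut = 2.5  # arbitrary cutoff for "extreme" scores
low_tail = share(wide.cdf(-cut), narrow.cdf(-cut))
high_tail = share(1 - wide.cdf(cut), 1 - narrow.cdf(cut))
middle = share(wide.cdf(0.5) - wide.cdf(-0.5),
               narrow.cdf(0.5) - narrow.cdf(-0.5))

print(f"share of the wide group among very low scores:  {low_tail:.2f}")
print(f"share of the wide group among very high scores: {high_tail:.2f}")
print(f"share of the wide group among middling scores:  {middle:.2f}")
```

The more variable group is over-represented at both extremes and under-represented in the middle, even though the two averages are identical.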

As I’m sure you’ve noticed, the title of the Thöni et al. paper says “evolutionary perspective”. As far as I’m concerned, I’m fairly skeptical about evolutionary psychology, since it is one of the fields with the worst track record of reproducibility ever. To be fair, a good part of evpsych is just regular psychology where the researchers added a little bit of speculative evolutionary varnish to make it look more exciting. This aside, real evpsych is apparently not so bad. But that’s not the important part of the paper – what matters is that there is increasingly strong evidence that men are indeed more variable than women in behaviors like cooperation. Whether it is due to hormones, culture, discrimination or cultural evolution is up for debate and I don’t think the current data is remotely sufficient to answer this question.

(Side note: if you must read one paper on the topic, I recommend this German study where they measure the testosterone level of fans of a football team, then have them play Prisoner’s Dilemma against fans of a rival team. I wouldn’t draw any strong conclusion from this just yet, but it’s a fun read.)

The thing is, men are not only found to be more variable in cooperation, but in tons of other things. These include aggression, exam grades, PISA scores, all kinds of cognitive tests, personality, creativity, vocational interests and even some neuroanatomical features. In the last few years, support for the greater male variability hypothesis has accumulated, so much that it is no longer possible to claim to understand gender or masculinity without taking it into account.

Alas, that’s not how stereotyping works. Instead, we see news reports showing all these male criminals, and assume that our society turns men into violent and selfish creatures and call them toxic. (Here is Dworkin: “Men are distinguished from women by their commitment to do violence rather than to be victimized by it. Men are rewarded for learning the practice of violence in virtually any sphere of activity by money, admiration, recognition, respect, and the genuflection of others honoring their sacred and proven masculinity.” Remember – in the above study, the majority of “unconditional cooperators” were men.) Internet people make up a hashtag to ridicule those who complain about the generalization. We see all these male IMO medalists, and – depending on your favorite political tradition – either assume that men have an unfair advantage in maths, or that they are inherently better at it. The former worldview serves as a basis for public policy. The question of which moment to update rarely even comes up.

This makes me wonder whether this process of looking at the extremes then updating our beliefs about the mean is just the normal way we learn. If that is the case, how many other things are we missing?

Argumentative prison cells

Two persons are trapped in a prison cell. The warden gives them a controversial question they disagree about, and promises to set them free if they manage to reach an honest agreement on the answer. They can discuss and debate for as long as they need, and all the relevant empirical data are available. Importantly, they are not allowed to just pretend to agree: they must genuinely find common ground with each other for the door of the prison cell to open. Needless to say, both participants want to escape the room as soon as possible, so they will do their best to reach an honest agreement. (I know some of you would love to stay forever in a room with unlimited time and data – just pretend you want to leave the room for the sake of the thought experiment.)

In most cases, a handful of good arguments from each side may be enough to settle the case. Sometimes, they would disagree on the meaning of the question itself, in which case they would first spend some time arguing about terminology, before arguing about the content of the question. In more complicated cases, the subjects might turn to a meta-discussion about the best method to reach agreement and get out of the room. If they must debate about whether to rely on the Scientific Method or the double-crux or any other advanced epistemic jutsu, they have all the time in the world to do that. The question is, is it always possible to escape the Argumentative Escape Room? Given unlimited time, will any two persons necessarily reach an agreement on any possible question, or are there cases where the two persons will never agree, despite their best efforts?

Of course, it is easy to find trivial cases where this will not work. For sure, if one participant is a human and the other is a pigeon, agreement might be hard to reach (although, you can’t say the pigeon really disagrees either, right?). If one participant has Alzheimer’s and forgets everything you say after two minutes, it will be hard to change their mind on any somewhat complicated topic. But these are edge cases.

A more difficult question is whether some people just lack the fundamental intelligence to understand certain arguments, or if anybody can eventually understand anything given enough time. To take an extreme case, suppose one of the participants is a rudimentary AI with a very limited amount of memory space. Some arguments based on experimental data will never fit in that memory. It might be possible, in principle, to compress the data by carefully building layers of abstraction on top of each other, but there is a limit. Likewise, many mathematical proofs require logical disjunction, where you split the claim into a number of particular cases, and prove you are right for each case taken separately. If you are arguing with an AI who firmly disbelieves the 4-color theorem but lacks the hardware to survey the 1482 distinct cases, it is going to be very hard to truly convince it. Without knowing how the brain works, I am not sure how this would translate to humans debating “normal” controversial questions. Let’s say your argument involves some advanced quantum mechanics. Most people won’t understand it at first, but since you have all the time you want, you could just teach QM to the other participant until she gets your point and can agree/disagree with you. I have good hopes that most humans could eventually understand QM given enough time and patience. But it is not clear what the absolute limits of one particular human brain are, and whether these limits differ from person to person.

The problems I mentioned so far are merely “technical” difficulties. If we leave these aside, it seems reasonable to me that the two players will reach agreement on pretty much any factual statement or belief. If everything else fails, both parties can agree that they do not know the correct answer to the question, that more research is needed, that the question does not make sense, that the problem is undecidable. The real problem lies on the other branch of Hume’s fork. What happens if we ask the two participants to agree on moral values?

Is it okay to kill a cow for food? Is it okay to steal bread if your family is starving? Is it okay to kill a stolen cow for food if your family is starving? There is a Nature Versus Nurture kind of problem here. If values are entirely cultural, or come entirely from lived experience, then there is no reason to think that, after a sufficient time spent together, the two participants will never put their sacred values into perspective and find common ground about what is okay or not. On the other hand, if values are in part influenced by your brain’s mechanisms for emotion, empathy or instinct, like the structure of your amygdala or the sensitivity of your oxytocin receptors, then it’s entirely possible that two people will simply have different values, no matter how long they discuss it. We already know from classical twin studies that political opinions are in large part influenced by genetics. In developed countries, genetic factors are responsible for about half of the variance in attitudes towards egalitarianism, immigration and abortion. They might explain one third of the variance in patriotism, nationalism, and homophobia. One study suggested that an intra-nasal administration of oxytocin leads to increased ethnocentrism (but check out this skeptical paper for good measure). There is even a strange study where researchers could bias the reported political opinions of participants by stimulating parts of their brain with magnetic fields. (That’s right, scientists MANIPULATED people’s views on IMMIGRATION using MAGNETS. Please, never tell my grandmother about this study.) Thus, it is pretty clear that our opinions and values are not just the result of experience and reasoning, but also involve a lot of weird brain chemistry that we might not be able to change. Genetic differences are only one obvious factor of inescapable disagreement, but they are likely not the only one. For example, it is easy to imagine that some experiences will leave irreversible marks on one’s psyche (for an interesting illustration, look at the story of Gudrun Himmler). Can such barriers ever be overcome through discussion? I’m not sure.

But that is just a fun thought experiment with mildly philosophical implications about the existence of objective truth. Since unlimited time is quite uncommon in the real world, and since reaching honest agreement is rarely the only goal of people who argue with each other, does it ever matter in practice? I think this thought experiment is important, because it clarifies our underlying assumptions about how we collectively handle disagreement.

When one defends the marketplace of ideas, deliberative democracy and absolute free speech, it is implicitly assumed that, for all practical purposes, any disagreement can eventually be solved through discussion and explanation. If it turns out some people will simply never agree because their minds operate in fundamentally different ways, then the marketplace of ideas probably needs a patch. The scenario that Karl Popper describes in his “paradox of tolerance” is precisely such a situation: there are very intolerant people out there who simply can’t be reasoned with, so the best thing you can do is silence them. One essay from Scott Alexander describes two approaches to politics: mistake and conflict. Mistake theory is when you believe everybody wants to benefit the collective, and disagreements come from people being mistaken about the best way to achieve that. Conflict theory is when you believe that people are just advocating for their own personal advantage, and disagreements come from people serving different goals. At first sight, those who believe it is usually possible to escape the room might gravitate towards Mistake Theory, while those who think otherwise might be driven to Conflict Theory. However, things are more complicated.

In a recent study, Alexander Severson found that, when people are presented with evidence that political opinions have genetic influences, they typically become more tolerant of the other side. From the conclusion:

“We proudly weaponize bumper stickers and traffic in taunt-infused comment-thread witticisms in the war against the political other, all in part because we believe that the other side chooses to believe what they believe freely and unencumbered. […] In disavowing this belief and accepting that our own ideologies are partially the byproduct of biological and genetic processes over which we have no control, we may end up promoting a more tolerant and kinder civil society.”

Somehow, since the outgroup’s obviously wrong opinions are altered by their genes, it’s not entirely their fault if they disagree with you, so it becomes a forgivable offense. Alternatively, if differences in our opinions partially reflect differences in our bodies, then peace is only possible if we accept the coexistence of a plurality of opinions, and we may as well embrace it. Interestingly, in this study, about 20% of the participants ignored all the presented evidence, firmly rejecting the idea of any possible genetic influence on opinions. Perhaps the evidence that Severson showed them was not all that convincing, or perhaps the belief that genetics can influence beliefs is itself influenced by genetics, which, at least, would be fun to argue.

I’m curious about whether this question has already been treated by other people, in theory or – even better – experimentally. If you know of anything like that, please let me know.

The Hundred Coca-Colas

1.

Knowing nothing of the inextricable complexity of the human administration it was flying into, the fly entered through the vent of the workstation’s fan. It slipped into the depths of the circuit board, causing a single-bit error in the index of the Reference Legal Archive. The intern in charge of proof-reading felt that something was different, but could not pinpoint exactly what. The fully automated computer system had corrected any inconsistency in paragraph numbering. When the updated text of the law was sent to all executive forces, nobody noticed that an entire section had been erased.

In Terry Gilliam’s 1985 film Brazil, a fly gets jammed in the apparatus of a dystopian bureaucratic administration, creating an error which serves as a starting point for the entire story. As our legal systems become increasingly bureaucratic and complicated, it is a fun exercise to think about what could happen if a small modification was randomly introduced into the law, as a mutation in the genome of society. Certain mutations would have no effect, some would lead to the rapid collapse of civilization, and, who knows, some might even be beneficial.

But there is one simple mutation – the deletion of a single legal concept – that I believe has the potential to make our society much better in the long run. I am talking about trademarks, and I will explain why I think they should be abandoned. There has been a lot of debate about whether patents or copyright should be abolished, but even anti-patent and anti-copyright activists like the Pirate Party’s founder Rick Falkvinge or the GNU guru Richard Stallman think trademarks are a good thing. This is how far out the Overton Window we are going. Well, I don’t actually think that they should be just erased at once – I am aware that trademarks, by design or by accident, serve all kinds of roles in our current societies, so we couldn’t abolish them just like that, without carefully planning how these roles would be filled instead. But you already know the arguments in favor of the status quo. Rather, I am just going to present the radical idea of abolishing trademarks in a one-sided way, in the hope of making you question whether trademarks are as natural, necessary and optimal as they appear to people who are used to them.

2.

Thank you for coming to this emergency meeting. As you may know, we are facing a problem without precedent. Since this morning, a second Coca-Cola company has entered the market. The first batches are already reaching retail stores as I’m talking.
– A second Coca-Cola company? How so?
– Another Coca-Cola. The same as ours. Identical product, same packaging, same logo. It is just not produced by our company.
– Well, we sue them for trademark infringement, like we always do!
– This is where it gets complicated. Apparently the administration made a mistake when converting the official version of law to some obscure new technical standard. They said it was a computer bug or something, nobody knows. But the entire section about trademarks completely vanished from the law. At the moment, there is nothing we can do legally to protect our brand.
– You’re saying trademarks disappeared just like that? What the hell, don’t they have backups of the law somewhere?
– Of course they do, but you can’t just revert the law of the country to a previous version like that. That would be antidemocratic. As per the constitution, the state will only enforce the standard version of the law from the Reference Legal Archive, and any correction will have to be voted on. It might take weeks.

I know the fly scenario is highly implausible in real life, but take that as a thought experiment. Let’s suspend our disbelief and assume, for the sake of the story, that all laws related to trademarks suddenly disappeared. In other words, anybody can brand their product as they want, and counterfeits are basically legal. That does not mean one can write whatever they want on the packaging – required information like ingredients, contact info or quantity is still enforced as always – but the brand is no longer protected. Anybody can start manufacturing Coca-Cola and call it Coca-Cola.

3.

– The marketing department just got the results from panel testing. “The One and Only Coca-Cola” did pretty badly, only 20% of the panel picked it. “The Original Coca-Cola” works much better. People are confident that we are the original one if we write that on the label.
– But we are not the original Coca-Cola, are we?
– As far as the law is concerned, we are.
– Oh right. What about the holograms?
– Bigger is better. I mean, I don’t want this to escalate out of control, but it’s increasingly clear that people are just choosing whatever package carries the largest hologram. So we designed a new, 12 cm-wide hologram. The largest on the market. Not even “Best Coca-Cola” has such big holograms.
– Actually, they’re no longer called “Best Coca-Cola”. If I remember correctly, they changed their name to “The Original Coca-Cola” last week.

1970 anti-war poster, Berkeley university

This might go on for a while. Eventually, the original companies have to face the hard truth – their brands only existed as long as the State was willing to protect them. Without that protection, they are just one manufacturer among many others selling the same product under the same name.

But what if it is not the same product? One company might seize the opportunity to sacrifice quality and cut down the costs. To quote Rick Falkvinge: “Trademarks are basically good, as they primarily serve as consumer protection. If it says “Coca-Cola” on the can, I know that The Coca-Cola Company guarantees its quality.” I personally doubt this, and my doubts are supported by blind tests where participants taste food without knowing the brand [footnote 1: “Our conclusion is that brand image is the only explanation for the premium commanded by the supplier brands in the four food product markets. The consumer is paying a premium for the often intangible benefits inherent in a branded product. Only in washing-up liquid did the leading brand offer better intrinsically superior value for money.” – Davies et al., 2004.].

Moreover, it’s important to separate the effect of trademarks themselves from the effect of other regulations. As a case study, let’s look at counterfeit medicines. This is obviously a rampant problem, with about half of the pills sold online being fakes and many people dying because of it. But trademark infringement is not the root of the problem here. The factories that make counterfeit medication break the law in two different ways: first, they infringe a trademark, second, they deliver pills that do not contain the chemical mentioned on the label (or not in the right concentration). The danger of counterfeit medication comes from the latter, and has nothing to do with the trademark. Without trademarks, copycats could copy the name, the logo and the slogans, but they still couldn’t lie about the content or cGMP-compliance, which would still be enforced by law. The reputation of brands could be fully replaced by product certification, where an independent body delivers a label if the product meets a certain standard, as already exists for environmental impact, ethics, health, compliance with religious traditions and so forth. There are even certifications that certify certification bodies’ certification procedures. Or, you know, if everything else fails, you can just go for the cheapest product.

Of course, at this point, there are many objections that you can make about how the standards for product certification would work without trademarks. They definitely require some level of legal protection, otherwise anybody could just copy the name and logo of an existing certification but with more lenient criteria, and award it to themselves. But they shouldn’t be protected too much either, otherwise any company could have their own standard that says “manufactured in our factory at [address]” and we just re-invented trademarks. Hopefully, there is a middle ground somewhere, where labels are unique and meaningful, yet flexible enough so they can be fulfilled by any competitor entering the market. That is not going to be a clean and elegant solution, but trademarks were never clean and elegant either. If trademarks did not exist and I were arguing for introducing them, one could also come up with many loopholes and objections: what if your actual last name is McDonald and you want to start a fast-food chain? Should trademarks be transferable to other people and if so, how does that not defeat the purpose of trademarks? If not, what happens when Sir Coca-Cola, First of His Name passes away? What if I start a company called “Coca-CoIa”, where the 7th letter is a capital i instead of an L? Can I trademark an image, a sound, a smell, a taste? In practice, these issues are fixed using a ton of specific laws and jurisprudence, that legal experts must navigate to tell what is ok and what is not. Likewise, without trademarks, a new legal framework would be necessary for product certification to actually work. But why would we even get rid of trademarks?

4.

Something in the city was not the same. You would just walk to work, as you’d been doing every day for years, but you kept noticing things that you had never paid attention to before. A pigeon nest, a 19th century street lamp, a tree, a wrought iron balcony, the stamped pattern of a manhole. All these things had been here forever, but you could not see them, because the flashing advertisement billboards would catch all your attention.

The Eiffel Tower used as a billboard, 1925-1934. Wikimedia Commons

Without trademarks, there is no point in advertising your brand, since anyone else could just use the same brand and benefit from your advertisement. And this is fortunate, because advertising is the ultimate form of evil. I talked before about how the Chinese government buys “sponsored content” in western newspapers to print propaganda disguised as legitimate articles. In 2016, as the New York Times distanced themselves from the less-reputable “fake news” media, they realized painfully that their own website was displaying its own fake news in the form of advertisement – like announcing the death of a celebrity who was still alive. In their classic book Manufacturing Consent, Herman and Chomsky describe how newspapers that rely on advertisement are pressured into printing things that favor the advertiser. That’s not to mention the attention cost of constant interruption, the mass surveillance necessary for “behavioral” advertising, the waste produced by junk mail, or the perpetuation of harmful stereotypes by commercials (although causality is contested). Without trademark protection, most of this would spontaneously disappear, making the world a much better place.

Can we really live without advertisement? The best natural experiment comes from Brazil. In 2006, the city of São Paulo enacted a law called Cidade Limpa, prohibiting all outdoor billboard advertisement. In a survey more than 10 years later, the citizens had no regrets, and the majority of them wanted to keep the ban in place. Other cities have made similar (albeit milder) attempts. Of course, these legal bans might sound a tiny bit authoritarian, and one can wonder where the safeguard is between banning ads and censoring speech. In addition, these policies are not that radically effective – in São Paulo, advertisement started to appear again after a few years, in more convoluted forms, stealthily integrating itself into urban furniture. Abolishing trademarks, on the other hand, would circumvent these problems and cut brand advertisement from its roots. No ban has to be enforced – in fact, it’s not about enforcing a new law, but stopping enforcement of an old law. We remove a little piece of coercion from the state, the police no longer come when someone infringes a trademark, and the entire advertising industry becomes unprofitable. The most brilliant computer scientists in the world can go back to doing useful things, instead of building machine-learning models for consumer tracking and targeted marketing.

5.

“Help us bring the best content to you, for free”. The old advertisement-based media started a massive communication campaign to persuade citizens to vote trademarks back into the law. Yet, people just had a glimpse of an ad-free society, and many wondered whether they really missed the advertising giants so much.

Needless to say, all the big companies that rely on advertisement for funding would be in immediate danger. Some might try to defend the advertising industry by claiming it lets us obtain things for free. You get free search engines, free bus stops, free newspapers, what is there to complain about? This is a gargantuan scam. Let’s investigate. Internet companies like Twitter, Facebook or Google use advertisement as their primary source of revenue. This includes directly displaying ads to the consumer, as well as accumulating information about their users to sell it to third parties. In turn, this process manipulates consumers into buying products they wouldn’t otherwise. In effect, advertisement makes you pay a premium on everyday products, and that is where the money comes from. How much is that? In the third quarter of 2020, Facebook made a bit more than $10 billion from North America only. Divide this by the 255 million users that are active monthly, and you get about $40 per user per quarter, that is roughly $160 per year. And that’s the average for monthly users. If you go to Facebook daily, it will be much more. A similar calculation for Twitter gives about $20 per user per year worldwide (like for Facebook, it may be much more if you live in a rich country). Google doesn’t disclose how many users they have, but given their worldwide revenues exceeded $160 billion in 2019, even if all 7.8 billion humans on Earth used Google (which gives a lower bound) that would still be about $20 per person. Of course, it must be something like an order of magnitude higher if Google also provides your e-mails, document storage, maps, browser and so forth. Oh, and JCDecaux, the arch-evil Great Satan of public space advertising, made €3.9 billion in 2019. Now make a list of all the “free stuff” you get in your daily life (other free websites, applications, TV commercials, movie theater advertisements, sponsored content, …) and calculate the grand total. That’s an expensive free lunch.
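If you want to redo the division yourself, here is a minimal sketch. It only uses the rounded figures quoted above, the variable and function names are mine, and annualizing Facebook’s number assumes the other three quarters look like that one, which is probably optimistic for them and pessimistic for you.

```python
# Back-of-the-envelope check of the per-user figures quoted above.
# All inputs are the rounded numbers from the text, not official data.

FACEBOOK_QUARTERLY_REVENUE_USD = 10e9  # "a bit more than $10 billion", Q3 2020, North America
FACEBOOK_MONTHLY_USERS = 255e6         # monthly active users, same region
GOOGLE_YEARLY_REVENUE_USD = 160e9      # worldwide, 2019
WORLD_POPULATION = 7.8e9               # upper bound on the number of Google users


def revenue_per_user(revenue_usd: float, users: float) -> float:
    """Average revenue per user over the period covered by `revenue_usd`."""
    return revenue_usd / users


facebook_per_quarter = revenue_per_user(
    FACEBOOK_QUARTERLY_REVENUE_USD, FACEBOOK_MONTHLY_USERS
)
# Assumption: the other three quarters look like this one.
facebook_per_year = 4 * facebook_per_quarter
# Dividing by everyone on Earth can only underestimate the per-user figure.
google_per_year_floor = revenue_per_user(GOOGLE_YEARLY_REVENUE_USD, WORLD_POPULATION)

print(f"Facebook: ~${facebook_per_quarter:.0f} per user per quarter")
print(f"Facebook: ~${facebook_per_year:.0f} per user per year")
print(f"Google:   at least ~${google_per_year_floor:.0f} per person per year")
```

Swap in the figures for your own country or your favorite platform and the order of magnitude should not change much.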

Keep in mind this is only a fraction of the real cost of advertisement, since the companies who buy ads or data from Google et al are expecting a positive return on investment. The amount they give to advertising companies is only a lower bound to the premium they can trick consumers into paying. For example, Google claims that people who advertise with them get an average return on investment of 8-to-1. If that is true, what we previously estimated using Google’s revenues must be multiplied by eight to obtain the real cost for the consumer.

Even worse, competitors on a market are engaged in a Moloch-esque red queen race, where each company must spend more and more money on marketing just to stay in the game. Where do all these wasted resources come from, if not from the consumer’s pocket? Without advertisement, I’d speculate that companies would resort to the next best strategy instead, that is cutting prices. Hopefully, the large premium people pay for marketing would be subtracted from the price of day-to-day products.

Finally, for those who still think Internet ads are good because they support the creative class, remember that only a fraction of what you pay goes to the authors, and you would be better off with something like Patreon. As for server costs, a centralized service like YouTube might resort to paid subscription, in which case they would have to compete with decentralized, p2p-based alternatives like PeerTube which may turn out to be a lot cheaper. Also, when we talk about Internet Giants, we often forget that one of them never relied on ads in the first place – Wikipedia has run entirely on donations for two decades, and they did better than Google’s own attempt at making an encyclopedia.

6.

It was a passive revolution – no plutocrat was bereft, no king was beheaded, no parliament was burnt, no landowner was expropriated. Removing a tiny piece of legal coercion made the entire society less coercive.

In their modern form, trademarks are about 150 years old [footnote 2: Sumerian merchants were already marking stuff with their seals some 5000 years ago, but this worked in a pretty different way and I don’t think those merchant marks were protected by the State.]. This is just old enough so nobody remembers how things worked before trademarks, and we accept them as a part of nature that’s been here forever. 150 years old is also just young enough so the long-term effects of trademarks have not been thoroughly tested and selected for by cultural evolution. If you want to overthrow a 3000-year-old tradition, you should remember Chesterton’s fence and think carefully about why it’s there and why it remained in place for so long. But 150 years old? That could just be a temporary mistake.

Do you think this guy owns a trademark? Probably not. After all, he’s an actor posing for stock photos.

Omnipresent advertising is one of the things that did not go so well in our modern capitalist society. Another one is the emergence of a handful of aristocrats with an astronomical amount of financial power. These commercial empires are, to a large extent, built on the salience of their brands, itself built on advertisement, itself built on trademarks. Once we see trademarks not as something natural and necessary, but as a legal mistake of the 19th century, those empires appear to be built on very artificial foundations. If we removed them, the plutocrats would be forced to adapt, or lose their fortune. On the other side, the fall of brands would be a blessing for individual artisans and local shops. They did not rely on trademarks anyway, and they can use the now-cheap advertisement space to become known to local customers. Nevertheless, as soon as one of them grows big enough to try to advertise their brand, copycats would appear and make the brand useless. Like a rubber band, this would pull companies back to the human scale. Somehow, this echoes a point Guy Debord makes in La Société du Spectacle: “With the generalized separation of the worker and his products, every unitary view of accomplished activity and all direct personal communication among producers are lost.” A bottle of Coca-Cola is a calibrated, standard, almost abstract entity that contains no trace of the individuals who were involved in its production. While Debord sees this as an essential feature of capitalism, I would say that it’s rather a feature of brands, which act as an abstraction layer between the chain of production and the consumers.

Let’s speculate even further. Building a brand and making sure the public knows about it is a major obstacle for new companies. In post-brand capitalism, it may be much easier for newcomers to enter the market. Any company making products with good certifications for a low enough price could readily compete with the most established industrial trusts. Monopolies would be much harder to establish, and even if someone actually manages to reach a monopoly on something, they could not make a lot of additional profit out of it because some unknown player could just enter the market under the same name as soon as they increase their prices too much. In the long run, economic inequality might even erode a little bit. That’s not to say you can’t bereave the plutocrats in addition to abolishing trademarks, if you are into this kind of thing.

7.

I guess it is time for a reality check. First, there is the problem that brand abolition is not exactly the most viable political project. That’s because the people who benefit from advertisement are precisely the ones who are in the best position to define public opinion. It might not be easy to remove something that directly benefits journalists, news sites and search engines.

Second, the obvious: if the government actually decides to store the entire law on a single computer, and if a fly actually does crash into the motherboard and erase everything about trademarks, the world would not instantly become a post-brand utopia – there would most likely be a lot of turmoil and violence and chaos and everybody would be upset at me. If this happens, you are welcome to complain in the comments. That is, if you can find the real Telescopic Turnip among the hundred copycats.

Invisible privileges

1.

Let’s review the evidence.

African-Americans are twice as likely to be stopped by the police. Police officers speak less respectfully to them. They are more likely to use violence against them. Overall, African-American men get killed by the police 2.5 times as often as White men. Then, African-Americans face discrimination at every stage of the justice system. For an identical case and history, African-American defendants have 10% higher odds of being incarcerated. When they are, they receive 10% longer sentences for the same crime.

African-Americans also face discrimination when they are the victims. Criminals receive lighter sentences when their victim is black [footnote 1: However, it is not clear whether this is really due to discrimination or to other factors.]. In fatal traffic accidents, drivers receive a 53% shorter sentence if the person they killed happens to be black. When a black person goes missing, there is 3.1 times less media coverage than if the victim is white [footnote 2: This is called the Missing White Woman Syndrome. This study was conducted in 2016, before the Black Lives Matter movement gained popularity. It would be interesting to see how things have changed as a result.].

Institutional discrimination also appears in the education system. Teachers systematically give better grades to students from the white majority than to ethnic minorities, for identical work [footnote 3: There are two kinds of methodologies to address this question. The most common is to compare the grades obtained by a student when the teacher knows her identity, with grades obtained by the same student on blind examinations. The second one is to fabricate a fake essay and ask teachers to grade it, while changing only the name of the student, and see if they are graded differently.]. At school, African-American children receive harsher punishments for the same behavior as well as closer surveillance from teachers. And overall, in the US, African-Americans are 12% less likely to access higher education than white people.

Then, there is housing discrimination. When ethnic minorities apply to rent an apartment, their odds of receiving a positive response are 47% lower, everything else equal. Unsurprisingly, African-Americans are 4.5 times as likely to be homeless, and then 45% less likely to be sheltered.

In addition, ethnic minorities generally have poorer health than white people. Black people work more dangerous jobs, making them 33% more likely than white people to be injured at work. They are 16% more likely to die at their workplace. On average, the life of black people is 4.3 years shorter than white people’s.

Most of those results are from large studies, they are solid and have been replicated many times. Yet some people decide to completely ignore all the evidence, and still deny the existence of racist discrimination. How is it even possible? What is going on in the head of racism-deniers?

2.

Men are 2.5 times more likely than women to be stopped by the police. Police officers are more likely to arrest men and more lenient toward women. Overall, men get killed by the police 20 times more often than women. Then, men face discrimination at every stage of the justice system. Men are more likely to be considered guilty and receive harsher sentences than women for an identical case and history [footnote 4: These studies are called “mock juror trials”. They use a panel of jurors who are presented with a fictional case, where only the gender or ethnicity of the defendant is changed, and asked what the sentence should be. This way, everything is exactly identical except the gender of the defendant, so any difference can be attributed to discrimination. Some studies even staged fake audiences with comedians for extra realism.]. Men have 1.64 to 2.15 times higher odds of being incarcerated, depending on the study. When they are, men also receive 30% to 63% longer sentences for the same crime compared to women [footnote 5: These are observational studies, meaning they look at the outcomes of a large number of real-life cases, taking into account offense severity, previous offenses, whether the defendant has to take care of children, and other confounders.]. The sexist bias favoring women is much larger than the racial bias – that is, black women are treated better than white men. As you might expect, justice’s double-standard against men is especially marked for sexual offenses.

Men also face discrimination when they are the victims. Criminals receive lighter sentences when their victim is a man [footnote 6: With the same caveat as for racial discrimination.]. In fatal traffic accidents, drivers receive a 36% shorter sentence if the person they killed happens to be a man. When a man goes missing, there is 2.9 times less media coverage than if the victim is a woman.

Institutional discrimination also appears in the education system. Teachers systematically give better grades to girls than to boys, for identical work. This happens already in elementary school, continues in middle school, and again in high school, and again in college [footnote 7: Interestingly, these studies found that female teachers were on average more biased in favor of girls than male teachers.]. This favoritism for girls has measurable effects on boys’ progress and future career orientation. Parents also invest more time teaching girls than boys and spend 25% more money on girls’ education. At school, boys receive harsher punishments for the same behavior as well as closer surveillance from teachers. And overall, in the US, men are 16% less likely to access higher education than women. Here again, the gender gap is larger than the racial gap [footnote 8: Moreover, unlike for ethnic minorities, there is no affirmative action attempting to correct this disparity – even when women are more likely to access higher education, affirmative action is still in favor of women.].

Then, there is housing discrimination. When women apply to rent an apartment, their odds of receiving a positive response are 28% higher than men’s, everything else equal. Unsurprisingly, men are 1.5 times as likely to be homeless, and then 40% less likely to be sheltered. A study in France found that 90% of the people who die in the streets are men [footnote 9: It should be noted that the gender gap in homelessness is more marked in France than in the USA.].

In addition, men generally have poorer health than women. Men work more dangerous jobs, making them 40% more likely than women to be injured at work. They are 8 times more likely to die at their workplace. On average, the life of men is 5 years shorter than women’s [footnote 10: The gap in life expectancy is commonly attributed to biological factors, as a legitimizing myth. However, this study on monks and nuns (who do pretty much the exact same things throughout their lives) found that at most one year of the gap could be attributed to biological differences.]. In spite of this, there is much more scientific research and US national offices dedicated to women’s health. Medical research on women’s health receives considerably more funding than men’s health, even for conditions that affect men more often [footnote 11: See the tables from page 56. For lung cancer, in 2016 the NIH spent $180,000,000 for women-specific research, $318,000 (!) for men-specific research, and $136,000,000 for lung cancer in general. They also spent $1,916,000 for women’s suicides, and only $156,000 for men’s suicides, despite men dying from suicide about four times as often.].

Like for racism, most of those results are from large studies, they are solid and have been replicated many times. Yet some people decide to completely ignore all the evidence, and still deny the existence of discrimination privileging women. Just like racism, discrimination against men has been systematically made invisible.

3.

I am aware that many readers will hear about discrimination against men for the first time. Perhaps you’ve heard about discrimination from the police beforehand, but did you know about the grading discrimination? Did you know about the housing discrimination? If not, why didn’t anybody tell you about it?

One thing to consider is that people can’t really tell how much discrimination they face based on their subjective experience. In their classic 1997 book Social Dominance, social psychologists Jim Sidanius and Felicia Pratto report that (in 1997) many African-Americans had no clue about how much racism they faced. [Footnote 12: See page 106 of the book.] In the 1990s, 58% of African-Americans believed they had the same housing opportunities as white people. 46% thought they had the same chances at employment, and 63% thought they had the same chances in education – despite clear evidence to the contrary. [Footnote 13: Sidanius and Pratto dedicate the third part of their book to evidence of discrimination against black people. However, they completely disregard discrimination against men – to be fair, most of the evidence that I discussed here was published after the book Social Dominance came out, so you can’t blame the authors.] This is one of the universal patterns described in Social Dominance: unfair treatment against subordinate groups is overlooked, legitimized, and actively erased by the dominant status quo, until even the discriminated population believes it is not real. It is perfectly possible to face discrimination on a daily basis and be completely unaware of it.

In addition, there is growing evidence that people (academics, the media, people in general) care very little about the issues that affect men. Most people know about manspreading, but have never heard about the teacher grading gap. People think gender balance at work is important, but only in professions where women are underrepresented. Scientific studies that find a bias against women are cited far more often than studies that find a bias against men, even when the latter use larger samples. Remember the kidnapping study I mentioned above, which found that there is less media coverage when a man goes missing? This is the same process. Presumably, this attention disparity is the result of traditional gender roles, which (among many other things) say that men are not expected to complain, and will be shamed if they do so – but this is a complicated topic that deserves a future blog post of its own.

As a takeaway, there is a striking similarity between discrimination against ethnic minorities and discrimination against men. My point is not to say that minorities or men “have it harder”, nor is it that racism is exactly identical to sexism – the historical and social mechanisms are obviously entirely different. My point is that, currently, men and ethnic minorities experience a similar pattern of stereotyping and discrimination in their daily life. The strange polarization of the culture wars makes it even harder to notice: the political tribes who care about racism are sharply separated from the tribes who care about men’s issues. This is unfortunate, because both tribes share the common goal of eliminating discrimination – maybe their filter bubbles only show them one side of the problem? [Footnote 14: Of course, there are also traditionalists who just use men’s issues as an excuse to attack feminism, hoping to restore traditional gender roles. I personally believe, on the contrary, that traditional gender roles are the cause of discrimination, and that we need to step away from them.] It took decades for the majority of the population to realize that racist discrimination is real. For sexism against men, such a shift in collective consciousness has yet to happen.

If you spot any mistake or inaccuracy in this text or the supporting evidence, please let me know in the comments, so I can correct it.

Annex: what about hiring discrimination?

Hiring discrimination can be measured by sending fictional resumes to employers, only changing the ethnicity or gender of the applicant, and counting how many replies you get. As you might expect, equally-qualified ethnic minorities are far less likely to be hired. Regarding gender discrimination, the evidence is much more mixed. This makes it very easy to cherry-pick studies that show discrimination against women (if you read feminist sources) or against men (if you read MRA sources). This meta-analysis found moderate discrimination against men, but only in female-dominated jobs. This systematic review lists 11 studies looking at pure gender discrimination (man vs woman). Two of them found discrimination against women, four of them found discrimination against men, and the rest found no discrimination. A recent study which tracked recruiters’ behavior on online hiring markets found that women face a 6.7% penalty in male-dominated occupations, and that men face a 12.6% penalty in female-dominated occupations. Overall, gender discrimination in hiring is much less systematic than racial discrimination. This discrepancy is probably a remnant of the traditional gender division of labor, since men were traditionally assigned to salaried jobs. In any case, the common claim that it is harder for women to find employment appears to be wrong.

Changelog:

30-11-2020 – According to Leeth et al., 2005, the racial gaps in fatal and non-fatal workplace injuries are respectively 16% and 33%, not 20% as previously reported.

01-02-2021 – A few studies on the effect of victim gender/origins on sentencing found no evidence for discrimination after controlling for case details. Thanks Greg for pointing that out. I also moved hiring discrimination to an annex, and added the recent study by Hangartner et al.

Celebrities, numerosity and the Weber-Fechner law

This article uses the net worth of celebrities as a practical example. Net worth values were shamelessly taken from celebritynetworth.com as of August 2020. They may fluctuate and become obsolete within days, but this does not change the point of the article. Also, I will assume that you, the reader, have a net worth of $0 (trust me, it’s not going to matter).

I.

I recently had a discussion with my brother about Cristiano Ronaldo becoming the first billionaire footballer ever. We were both surprised, but for opposite reasons. He was surprised that no footballer had ever become a billionaire before, while I was surprised that it was even possible to reach one billion through football, even with associated income like advertising and clothing. I think this disagreement gives some insight into the way we process large numbers. There are essentially two ways for humans to mentally handle quantities: one is called numeracy and resorts to a set of symbols with rules that tell you how to work with them. The other one is called numerosity and is some kind of analogue scale we use to compare things without resorting to symbols. To demonstrate that numerosity is more sophisticated than it looks, let’s do a thought experiment.

Imagine you are in a large room with Jeff Bezos, the richest person in the world. There is a line painted on the floor, with numbers written on each end. One side is marked with a big 0, the other side is marked with « $190 billion ». Mmm, it looks like we are in a thought experiment where we have to stand on a line depending on our net worth, you think. As Jeff Bezos stands on the $190 billion mark, you reluctantly walk to the zero mark right next to the wall, where you belong.

You see Bezos smirking at you from the other side. Suddenly, the door opens, and a bunch of world-class football players enter the room. Intuitively, where do you think they will stand on the line?

This may come as a surprise, but compared to Jeff Bezos, the net worth of all these legendary footballers is not so different from yours (remember, you’re worth $0). Football players might be millionaires, but they are very unlikely to become billionaires, Cristiano Ronaldo being the exception. Thus, on a line from $0 to $190B, they are basically piled up right next to you. What about superstar singers?

Some singers become much richer than footballers, but they are still much closer to you than to Jeff Bezos. Let’s add a few famous billionaires. Like, people who are actually famous because they are billionaires.

Surprisingly, they are still very close to you in absolute value. Their wealth is still several orders of magnitude below Bezos. What happens if we look at big tech CEOs, like Elon Musk or Larry Page? Surely they belong to the same world as Bezos?

Now, this is indeed getting closer to Bezos. However, in absolute distance, they are still closer to you. Here is the punchline – the absolute wealth difference between Elon Musk and you is smaller than between Elon Musk and Jeff Bezos. This becomes obvious once you realize Bezos’s wealth is more than twice as much as Musk’s wealth.

II.

Why is this so counter-intuitive? This is because, unless we look carefully into the numbers, we are comparing all these large quantities using the numerosity scale, which is logarithmic. Musk has hundreds of thousands of times more money than you, and only about 3 times less money than Bezos. Since 3 is smaller than hundreds of thousands, you intuitively estimate that Musk is closer to Bezos than to you.
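To make the two kinds of distance concrete, here is a minimal Python sketch. Only Bezos's $190 billion comes from the text; Musk's and the reader's net worth are assumed values chosen to match the ratios quoted above, and the reader is given a nonzero net worth because the logarithm of 0 is undefined.

```python
import math

# Illustrative figures only. Bezos's $190B comes from the article; the other
# two values are assumptions picked to match "3 times less than Bezos" and
# "hundreds of thousands of times more than you".
bezos = 190e9
musk = bezos / 3          # assumed, roughly $63B
reader = musk / 300_000   # assumed, roughly $211k (nonzero so the log exists)


def linear_distance(a, b):
    """Absolute difference in dollars."""
    return abs(a - b)


def log_distance(a, b):
    """Difference in orders of magnitude (base-10 log of the ratio)."""
    return abs(math.log10(a / b))


print(f"linear: Musk-reader = {linear_distance(musk, reader):.3g}, "
      f"Musk-Bezos = {linear_distance(musk, bezos):.3g}")
print(f"log10:  Musk-reader = {log_distance(musk, reader):.2f}, "
      f"Musk-Bezos = {log_distance(musk, bezos):.2f}")
```

With these assumed numbers, the linear scale puts Musk closer to the reader, while the logarithmic scale puts him closer to Bezos.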

It makes sense: in the graphs above (which use linear scales), the dots for everybody under one billion are almost impossible to distinguish. If you wanted to display these people’s net worth in a readable way, you would need to use a log-scale. In the case of wealth, a log scale is especially appropriate since wealth accumulation is a multiplicative process: the more dollars you already have, the easier it is to acquire one extra dollar. In consequence, wealth can be well-approximated with a log-normal distribution, which is strongly skewed towards low values. Most values are lower than the average, but then you’ve got a few very high values that drive the mean up. A typical feature of this kind of distribution is that high values fall very far from each other. That’s why the richest human in the world (Bezos) beats the second richest (currently Bill Gates, not shown on the graphs) by a margin of several billion dollars.
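The link between multiplicative growth and a skewed distribution can be checked with a small simulation. This is only a sketch under assumed parameters: the function name `simulate_wealth`, the number of people and steps, and the range of growth factors are all hypothetical and not taken from any real data.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable


def simulate_wealth(n_people=10_000, n_steps=50):
    """Everyone starts with 1 unit; each step multiplies wealth by a random factor."""
    wealth = []
    for _ in range(n_people):
        w = 1.0
        for _ in range(n_steps):
            # Assumed growth factors, purely illustrative.
            w *= random.uniform(0.8, 1.25)
        wealth.append(w)
    return wealth


wealth = simulate_wealth()
mean = statistics.mean(wealth)
median = statistics.median(wealth)
below_mean = sum(w < mean for w in wealth) / len(wealth)

print(f"mean = {mean:.2f}, median = {median:.2f}")
print(f"share of people below the mean: {below_mean:.0%}")
```

Because the logarithm of each person's wealth is a sum of many independent terms, it ends up roughly normal, which is what makes the wealth itself roughly log-normal. In this toy run the mean should sit above the median, with well over half of the simulated people below the mean.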

But our perception of numbers as a log-scale is not restricted to the wealth of celebrities. In fact, it appears to be a universal pattern in numerical cognition, called the Weber-Fechner law. Originally, this law is about sensory input, for example light intensity or sound loudness. But it also applies to counting objects:

In this picture (reprinted from Wikipedia), it is much easier to see the difference between 10 and 20 dots, than between 110 and 120 dots. We seem to have a logarithmic scale hard-wired into our brains.
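The dot example can be put into numbers with a short Python sketch. The helper names `relative_gap` and `log_gap` are made up for this illustration; the idea is simply that both pairs differ by 10 dots, yet the difference relative to the starting quantity is very different.

```python
import math


def relative_gap(a, b):
    """Difference between two quantities, as a fraction of the smaller one."""
    return (b - a) / a


def log_gap(a, b):
    """Distance between two quantities on a natural-log scale."""
    return abs(math.log(b) - math.log(a))


for a, b in [(10, 20), (110, 120)]:
    print(f"{a} vs {b}: linear gap = {b - a}, "
          f"relative gap = {relative_gap(a, b):.0%}, "
          f"log gap = {log_gap(a, b):.3f}")
```

On a linear scale the two pairs are equally far apart, but on a log scale going from 10 to 20 is a much bigger step than going from 110 to 120.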

III.

What really puzzles me about the Weber-Fechner law is that we are performing a logarithmic transformation intuitively, without thinking about it. There is evidence that it is rather innate: pre-school children have been shown to use a logarithmic number line before they learn about digital symbols. After a few years of schooling, children tend to switch away from the logarithmic line to a more linear number cognition system, which can be difficult. Eventually, in high school, they have to learn logarithms again, in an abstract formal way. Logarithms are notoriously difficult to teach (I know plenty of well-educated people who still struggle with them). This is a shame, because all these high-schoolers have been using log scales since they were young, without even realizing it.

Trust your sample, not your sample of samples

The train is about to depart. Your ticket in your hand, you check your seat number, walk down the central aisle, find your seat and sit down next to another traveler. You look around to see what the other people in the wagon look like.

How many people were there in the wagon you just imagined? If you are like me, it was probably rather crowded, with few empty seats. However, according to these European data, the average occupancy rate of trains is only about 45%, so there should be more empty seats than occupied ones. What is going on?

The issue here is a simple statistical phenomenon: the sample of “all the trains you took in your life” is not quite representative of “all the trains”. The occupancy rate of trains varies all the time. Some trains will be much more crowded than average, some others will be almost empty. And – guess what – the more people there are in a train, the more likely you are to be one of them. A train packed with hundreds of customers will be observed by, well, hundreds of passengers, while the empty trains will not be observed at all. Thus, in your empirical sample, trains with n passengers will be over-represented n times compared to trains with only one passenger.

Here is a riddle: you want to estimate the average number of occupants in the trains that arrive at a station. To that end, you survey people leaving the station and ask how many people they saw in the same train. If you were to take the mean of your sample, the average occupancy would be over-estimated, for the reason stated above. How do you calculate the unbiased average occupancy? Assume every train had at least one occupant (this is necessary since empty trains are never observed, so their number could be virtually anything).

We have an observed distribution P_o(n) and we want to get back to the true distribution P_t(n). As we saw before:

P_o(n) = \frac{nP_t(n)}{\sum_{k}{kP_t(k)}}

Since \sum_{k}{P_t(k)} = 1, the true distribution is

P_t(n) = \frac{P_o(n)/n}{\sum_{k}{P_o(k)/k}}

And the mean occupancy of the trains, \langle n \rangle = \sum_{n}{nP_t(n)}, simplifies (because \sum_{n}{P_o(n)} = 1) to

\langle n \rangle = \frac{1}{\sum_{k}{\frac{P_o(k)}{k}}}

which turns out to be the harmonic mean of the observed sample.
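The result can be checked numerically. The following Python sketch uses made-up data (1,000 trains with between 1 and 200 passengers each, an assumption chosen only for illustration) and surveys every single passenger, so that a train with n passengers contributes n answers.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical data: 1,000 trains, each carrying between 1 and 200 passengers.
trains = [random.randint(1, 200) for _ in range(1000)]
true_mean = statistics.mean(trains)

# Survey every passenger: a train with n passengers contributes n answers of "n".
answers = [n for n in trains for _ in range(n)]

naive_estimate = statistics.mean(answers)               # biased upwards
corrected_estimate = statistics.harmonic_mean(answers)  # undoes the bias

print(f"true mean occupancy:       {true_mean:.2f}")
print(f"arithmetic mean of survey: {naive_estimate:.2f}")
print(f"harmonic mean of survey:   {corrected_estimate:.2f}")
```

When every passenger answers, the harmonic mean of the answers equals the true mean exactly, up to floating-point rounding. With a random subset of passengers it would only be an estimate.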

The harmonic mean is typically used to average rates. The textbook example is about calculating the average speed of something: if you write down the speed of a car once per kilometer, the average speed is the harmonic mean of your sample, not the arithmetic mean. This is because the car spends less time on the kilometers that it traveled through very fast, so you need to account for that by giving less weight to those kilometers. This is in fact closely related to the train occupancy riddle: in that case, the harmonic mean gives more weight to the trains with fewer people in them, to compensate for the sampling bias.
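Here is the car example as a tiny Python sketch, with two assumed speed readings (one kilometer at 60 km/h, then one kilometer at 30 km/h). The true average speed is the total distance divided by the total time, and it matches the harmonic mean rather than the arithmetic mean.

```python
import statistics

# Hypothetical readings: 1 km driven at 60 km/h, then 1 km at 30 km/h.
speeds_kmh = [60, 30]

# Each reading covers exactly 1 km, so the time spent on it is 1 / speed hours.
total_hours = sum(1 / v for v in speeds_kmh)
true_average = len(speeds_kmh) / total_hours  # total km / total hours

print(f"distance / time: {true_average:.1f} km/h")
print(f"harmonic mean:   {statistics.harmonic_mean(speeds_kmh):.1f} km/h")
print(f"arithmetic mean: {statistics.mean(speeds_kmh):.1f} km/h")
```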

I don’t know if this statistical bias has a name (if you know, tell me in the comments). It occurs in a lot of situations. A prominent one is the fact that your average Facebook friend has more Facebook friends than average.

Consider how your Facebook friends are sampled: obviously, only people with at least one friend will appear in your sample. So all those idle accounts with no friends at all are already excluded. People with 100 friends are 10 times more likely to appear in your list than people with 10 friends. This leads to a big inflation of the average number of friends your friends have. To put it in a different way, if you have an average number of friends, it’s *perfectly normal* that you have fewer friends than your friends. So there is no need to worry about it.
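The same effect shows up in a toy simulation. The sketch below builds a hypothetical network in which friendships are drawn uniformly at random (the function `random_network` and its parameters are assumptions, not a model of Facebook), then compares the average number of friends with the average number of friends that friends have.

```python
import random
import statistics

random.seed(2)  # fixed seed so the run is repeatable


def random_network(n_people=1000, n_links=5000):
    """Hypothetical network: friendships drawn uniformly at random."""
    friends = {p: set() for p in range(n_people)}
    links = 0
    while links < n_links:
        a, b = random.sample(range(n_people), 2)  # two distinct people
        if b not in friends[a]:
            friends[a].add(b)
            friends[b].add(a)
            links += 1
    return friends


friends = random_network()
degrees = {p: len(fs) for p, fs in friends.items()}

mean_friends = statistics.mean(list(degrees.values()))

# For every (person, friend) pair, record how many friends the friend has.
friend_counts_of_friends = [degrees[f] for fs in friends.values() for f in fs]
mean_friends_of_friends = statistics.mean(friend_counts_of_friends)

print(f"average number of friends:               {mean_friends:.2f}")
print(f"average number of friends of a friend:   {mean_friends_of_friends:.2f}")
```

Even in this evenly mixed toy network the second number should come out larger than the first. In networks where a few people have a very large number of friends, the gap would be expected to be much wider.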