
Selection as Coupling

Evolution: natural selection mapped to coupling strength using gnomAD population data

JIM’S OVERSIMPLIFICATION

Natural selection isn’t survival of the fittest. It’s survival of the best coupled to your environment. Change the environment, change who’s fit. The genes don’t know the plan. They just try stuff and keep what sticks.

K IN THIS DOMAIN

K here is fitness coupling to environment. Natural selection optimizes K. Drift is neutral K — shape changes without coupling change. Never the same twice.

EVOLUTION IS A COUPLING FILTER

Mutations happen randomly. Every generation, DNA makes typos. Most of these typos do nothing. Some break things. A very few improve things. Evolution is the process of keeping the ones that couple well to the environment and throwing away the ones that do not.

We can see this directly. The gnomAD database (v4.1) contains genetic data from about 807,000 people. When we look at 18,272 variants across 6 disease genes, the pattern is immediate: the vast majority of coding variants are rare. Common variants are the minority. This is purifying selection in action: mutations that break coupling get removed from the population. They stay rare because carriers, on average, leave fewer descendants.

THE FREQUENCY IS THE SIGNAL

How rare a variant is tells you how strongly selection is acting on it. An ultra-rare variant (allele frequency near 1 in 20,000, or 0.00005) is under strong selection: the population keeps removing that change. A common variant (allele frequency near 5%) is under almost no selection: carriers do about as well as everyone else. Note that allele frequency counts copies of the gene, not people, so roughly twice that fraction of people carry the variant. The allele frequency IS the measurement of selection pressure.

The formula is simple: selection pressure = -ln(allele frequency). This is not our invention. Information theory (Shannon, 1948) measures the information in an event the same way, as -ln(probability). Population genetics (Kimura, 1983) reaches a consistent logarithmic relationship from a completely different direction. We are pointing out that this is the same K from every other page on this site.

THE RELABELING IS THE INSIGHT

We did not discover purifying selection. Population geneticists have known this for decades. What we are saying is that the same variable — K, coupling strength — describes nuclear binding, protein folding, cancer drug design, materials fatigue, and evolution. Selection IS coupling. That is the claim. The data supports it. The contribution is the connection, not the formula.

THE RESULT

gnomAD DATA (18,272 variants across 6 disease genes):

  Variant allele frequency distribution:
    The vast majority of coding variants are rare.
    Common variants (af > 1%) are the minority.
    This IS purifying selection: mutations that break coupling are removed.

THE MAPPING:
  Kselection = -ln(allele_frequency)

  Ultra-rare (af ~ 0.00005):  K = -ln(0.00005) = 9.90  (strong selection)
  Rare (af ~ 0.0005):       K = -ln(0.0005)  = 7.60  (moderate selection)
  Low-freq (af ~ 0.005):    K = -ln(0.005)   = 5.30  (weak selection)
  Common (af ~ 0.05):      K = -ln(0.05)    = 3.00  (near-neutral)
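The four rows above can be reproduced in a few lines. This is a minimal sketch, not the site's own analysis code: the function name `k_selection` and the `bins` dictionary are illustrative names chosen here, and the frequencies are the representative values from the table.

```python
import math

def k_selection(af: float) -> float:
    """Return K_selection = -ln(af) for an allele frequency in (0, 1]."""
    if not 0.0 < af <= 1.0:
        raise ValueError("allele frequency must be in (0, 1]")
    return -math.log(af)

# Representative frequencies from the mapping table (illustrative labels).
bins = {
    "ultra-rare": 0.00005,
    "rare": 0.0005,
    "low-freq": 0.005,
    "common": 0.05,
}

k_values = {label: k_selection(af) for label, af in bins.items()}
for label, af in bins.items():
    print(f"{label:<10} af={af:<8g} K={k_values[label]:.2f}")
```

Each tenfold drop in frequency adds ln(10), about 2.30, to K, which is why the rows step down evenly.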

THE INSIGHT:
  Allele frequency IS a direct measurement of selection pressure.
  Selection pressure IS coupling strength between organism and environment.
  The frequency distribution IS the selection coefficient distribution.
  Evolution IS coupling optimization over time.

THE MECHANISM

Mutations arise randomly. What happens next is coupling:

Pathogenic mutation (high Kselection):
  Breaks protein coupling → organism less fit → selected against
  Result: variant stays rare (low frequency = high K)
  Example: TP53 R175H — destroys tumor suppressor → strong selection

Neutral mutation (low Kselection):
  Preserves protein coupling → organism equally fit → drifts freely
  Result: variant can become common (high frequency = low K)
  Example: synonymous variants — no amino acid change → no selection

Beneficial mutation (Kselection falls toward 0):
  Improves coupling → organism more fit → selected for
  Result: variant sweeps to fixation (af → 1.0, K → 0)
  Example: lactase persistence in dairy cultures — coupling to environment
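The beneficial case (af → 1.0, K → 0) can be watched numerically. The sketch below assumes the simplest textbook model, deterministic selection on a single haploid locus with no drift, and uses a hypothetical starting frequency and a hypothetical 5% fitness advantage. Neither number comes from gnomAD or from lactase persistence data.

```python
import math

def sweep(p0: float, s: float, generations: int):
    """Deterministic haploid selection: p' = p(1+s) / (1 + s*p)."""
    p = p0
    trajectory = [p]
    for _ in range(generations):
        p = p * (1.0 + s) / (1.0 + s * p)
        trajectory.append(p)
    return trajectory

# Hypothetical values: starting frequency 1e-4, 5% fitness advantage.
traj = sweep(p0=1e-4, s=0.05, generations=500)
k_traj = [-math.log(p) for p in traj]

for gen in (0, 100, 200, 300, 500):
    print(f"gen {gen:>3}: af={traj[gen]:.6f}  K={k_traj[gen]:.4f}")
```

Because af can never exceed 1, K = -ln(af) bottoms out at 0 and never goes negative. A sweep shows up as K shrinking toward zero.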

THE DATA

We used the gnomAD database (Genome Aggregation Database), the largest catalog of human genetic variation. Our index contains 18,272 variants across 6 disease-relevant genes:

Source: gnomAD v4.1 (807,162 individuals)
Our index: 18,272 variants across 6 genes
Genes: TP53, BRCA1, BRCA2, EGFR, BRAF, AR

The frequency spectrum confirms purifying selection:
  Most variants are rare — they break coupling and are removed.
  Few variants are common — they preserve coupling and persist.
  This shape appears in every gene in our index.
  The shape itself IS the signature of coupling selection.
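A frequency spectrum like this comes from sorting variants into frequency classes and counting them. The sketch below shows one way to do that, under stated assumptions: the 1% cutoff for "common" is from this page, the other bin edges are chosen here to bracket the representative frequencies used earlier, and the input list is made-up illustration data, not gnomAD values.

```python
import math
from collections import Counter

# Bin edges below 1% are an assumption; only the 1% "common" cutoff
# is stated in the text.
def frequency_class(af: float) -> str:
    if af > 0.01:
        return "common"
    if af > 0.001:
        return "low-freq"
    if af > 0.0001:
        return "rare"
    return "ultra-rare"

def summarize(afs):
    """Count variants per class and report the mean K = -ln(af) per class."""
    counts = Counter()
    k_sums = Counter()
    for af in afs:
        if af <= 0:  # skip variants with no observed alternate allele
            continue
        label = frequency_class(af)
        counts[label] += 1
        k_sums[label] += -math.log(af)
    return {label: (counts[label], k_sums[label] / counts[label]) for label in counts}

# Made-up frequencies for illustration only; not gnomAD values.
example_afs = [0.00002, 0.00005, 0.00007, 0.0003, 0.0008, 0.004, 0.05, 0.0]
summary = summarize(example_afs)
for label, (n, mean_k) in summary.items():
    print(f"{label:<10} n={n} mean K={mean_k:.2f}")
```

Variants with a frequency of zero are skipped because -ln(0) is undefined. How to treat them in a real index is a separate choice.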

WHY -ln(FREQUENCY)?

The formula Kselection = -ln(af) is not arbitrary. Three independent lines of reasoning point to it:

1. Information theory (Shannon 1948):
  Information content of an event = -ln(probability), measured in nats
  A rare variant carries more information about selection
  than a common variant. The rarer it is, the stronger
  the message: "this mutation is being removed."

2. Landauer cost (1961):
  Energy to erase one bit = kT ln(2)
  A rare allele represents a high-information state.
  Maintaining it at low frequency costs the population
  ln(1/af) units of "selective effort" per generation.
  This is the Landauer cost of purifying selection.

3. Population genetics (Kimura 1983):
  The fixation probability of a neutral mutation = 1/(2N)
  The steady-state frequency of a deleterious mutation = μ/s
  where μ = mutation rate and s = selection coefficient. So: s = μ/af
  Taking logs: ln(s) = -ln(af) + ln(μ)
  For a fixed μ, Kselection = -ln(af) = ln(s) - ln(μ).
  K rises with ln(s), not with s itself. The mapping is consistent up to that constant.
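The mutation-selection balance in item 3 can be checked by iteration. The sketch below assumes a deterministic haploid model (selection first, then one-way mutation), which is one simple setting where the equilibrium frequency is μ/s. The mutation rate and the selection coefficients are hypothetical values chosen for illustration.

```python
import math

def equilibrium_frequency(mu: float, s: float, generations: int = 5000) -> float:
    """Iterate selection then mutation in a haploid model until q settles."""
    q = 0.0
    for _ in range(generations):
        q_sel = q * (1.0 - s) / (1.0 - s * q)  # selection against the allele
        q = q_sel + mu * (1.0 - q_sel)         # new mutations enter
    return q

mu = 1e-6  # hypothetical per-generation mutation rate
results = []
for s in (0.01, 0.1, 0.5):  # hypothetical selection coefficients
    q = equilibrium_frequency(mu, s)
    k = -math.log(q)
    results.append((s, q, k))
    print(f"s={s:<5} af={q:.3e}  mu/s={mu / s:.3e}  "
          f"K={k:.3f}  ln(s)-ln(mu)={math.log(s) - math.log(mu):.3f}")
```

In this model the iterated frequency settles at μ/s, so K = -ln(af) matches ln(s) - ln(μ). Stronger selection gives a larger K, but the relationship is logarithmic in s and shifts with the mutation rate.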

CROSS-DOMAIN CONNECTION

The same pattern appears in every domain we've studied:

Proteins: Aggregation-prone residues = high K between monomers. Selected against in evolution (rare in nature). See: Alzheimer's Aβ42.

Networks: Hub nodes (high K) are under strong selection pressure. Remove a hub = network fails. Evolution protects hubs. See: Network science.

Ecology: Keystone species (high K) collapse the ecosystem when removed. Same selection pressure as hub genes. See: May's criterion.

Chemistry: Strong bonds (high K) persist. Weak bonds break. Chemical evolution = bond selection. See: Bond coupling.

HONEST LIMITS

What this is:
  A known concept (purifying selection) mapped to K.
  The gnomAD data is real (18,272 variants from gnomAD v4.1).
  The Kselection = -ln(af) formula is our framing, not our discovery.
  Population geneticists have known this shape for decades.

What this is NOT:
  A new theory of evolution.
  A predictive model of which mutations will be selected.
  An improvement on existing population genetics models.

What the mapping adds:
  It connects evolution to the same K that describes nuclear binding,
  quantum error correction, climate feedbacks, and language structure.
  Selection IS coupling. That's the claim. The data supports it.
  The relabeling is the insight.

COMPUTATION DETAILS

Data source: gnomAD v4.1 (807,162 individuals)
Local index: 18,272 variants, 6 genes
Hardware: Mac Mini M4 · $499 · 35W
Analysis: Allele frequency binning + Kselection = -ln(af)
Package: pip install begump

This is a reframing of known population genetics (Kimura 1983, gnomAD 2020). We show the mapping to K. The data is real. The relabeling is ours. The connection to other domains is the contribution.
