Some amino acids are bridges holding two parts of a protein together. Mutate the bridge and the protein falls apart. But some mutations do not break anything: they flip switches. Knowing whether a mutation is breaking a wall or flipping a switch is the whole game. We built a detector that figures out which ruler to use before scoring.
You have a genetic mutation. You want to know: does it matter? The scanner answers by asking two things. First: how structurally important is this position? This is Fiedler damage: how much the protein's connectivity (the Fiedler value, or algebraic connectivity, of its network) collapses when you pull one node. Second: has evolution preserved this position? If every species from fish to humans has the same amino acid here, it is probably important.
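A minimal sketch of the Fiedler damage idea, not the scanner's actual code. It assumes nodes are residues and edges are residue contacts, and it reports damage as the relative drop in the Fiedler value after deleting one node. The normalization, the helper names `contact_graph`, `fiedler_value`, and `fiedler_damage`, and the seven-node toy graph are all illustrative assumptions.

```python
import numpy as np

def contact_graph(edges, n):
    """Build a symmetric 0/1 adjacency matrix from a list of (i, j) contacts."""
    adj = np.zeros((n, n))
    for i, j in edges:
        adj[i, j] = adj[j, i] = 1.0
    return adj

def fiedler_value(adj):
    """Second-smallest eigenvalue of the graph Laplacian L = D - A."""
    adj = np.asarray(adj, dtype=float)
    laplacian = np.diag(adj.sum(axis=1)) - adj
    eigenvalues = np.linalg.eigvalsh(laplacian)  # returned in ascending order
    return float(eigenvalues[1])

def fiedler_damage(adj, node):
    """Relative drop in algebraic connectivity after deleting one node.

    Assumes the intact graph is connected, so its Fiedler value is > 0.
    """
    adj = np.asarray(adj, dtype=float)
    before = fiedler_value(adj)
    keep = [i for i in range(adj.shape[0]) if i != node]
    after = fiedler_value(adj[np.ix_(keep, keep)])
    return (before - after) / before

# Toy graph (hypothetical): two triangles joined only through node 3.
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (5, 6), (4, 6)]
adj = contact_graph(edges, 7)

bridge_damage = fiedler_damage(adj, 3)  # removing the bridge disconnects the graph
corner_damage = fiedler_damage(adj, 0)  # removing a triangle corner does not
```

In this toy graph the bridge node takes the Fiedler value to zero when removed, so its damage is essentially 1, while a triangle corner leaves the graph connected and scores lower.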
But there is a deeper problem. Loss-of-function mutations follow one rule (structural damage predicts disease). Gain-of-function mutations follow the opposite rule (disease mutations hit control sites, not load-bearing walls). If you do not know which regime you are in, your scores are meaningless. The regime detector solves this: 8 out of 10 correct, zero training data.
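To make the two regimes concrete, here is a small illustration on synthetic, made-up numbers. It checks the sign of the correlation between a damage score and disease labels in a LoF-like and a GoF-like gene. This is a labeled diagnostic, so it is not the zero-training detector itself; the function name `damage_label_correlation` and all values are assumptions for illustration.

```python
import numpy as np

def damage_label_correlation(damage, is_pathogenic):
    """Pearson correlation between damage scores and 0/1 disease labels."""
    damage = np.asarray(damage, dtype=float)
    labels = np.asarray(is_pathogenic, dtype=float)
    return float(np.corrcoef(damage, labels)[0, 1])

# Synthetic LoF-like gene: pathogenic variants sit at high-damage positions.
lof_damage = [0.9, 0.8, 0.7, 0.2, 0.1, 0.3]
lof_labels = [1, 1, 1, 0, 0, 0]

# Synthetic GoF-like gene: pathogenic variants sit at low-damage control sites.
gof_damage = [0.1, 0.2, 0.15, 0.8, 0.7, 0.9]
gof_labels = [1, 1, 1, 0, 0, 0]

lof_corr = damage_label_correlation(lof_damage, lof_labels)  # positive
gof_corr = damage_label_correlation(gof_damage, gof_labels)  # negative
```

A scorer that assumes "more damage means more disease" ranks the second gene backwards, which is why the regime has to be settled before any score is read.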
The scanner reaches 0.74 AUC across 6 disease proteins under leave-one-gene-out evaluation. That matches SIFT (2001). It does not match AlphaMissense (0.94 AUC, trained on 100M sequences). The value is not raw accuracy: every score is traceable to a physical mechanism. Fiedler network damage alone achieves 0.82 AUC. One number. No training. Pure graph theory.
Previous claims (94.7%, 83.4%) were inflated by gene-level confounders and in-sample weight optimization, discovered and corrected in Session 23. The current 0.74 is validated under strict leave-one-gene-out with no learned weights.
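As a rough sketch of what leave-one-gene-out evaluation with no learned weights looks like, the snippet below groups variants by gene, holds each gene out in turn, and computes AUC on the held-out gene with a fixed score. The rank-based `auc` helper, the toy variants, and the choice to average per-gene AUCs are assumptions; the source does not say how its 0.74 is aggregated.

```python
from collections import defaultdict

def auc(scores, labels):
    """Probability that a random pathogenic variant outscores a random benign one."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def leave_one_gene_out_auc(variants):
    """variants: iterable of (gene, score, label). Returns {gene: held-out AUC}."""
    by_gene = defaultdict(list)
    for gene, score, label in variants:
        by_gene[gene].append((score, label))
    results = {}
    for held_out, rows in by_gene.items():
        # Only the other genes could be used for fitting. With fixed,
        # unlearned weights there is nothing to fit, so no information
        # from the held-out gene can leak into its own score.
        scores = [s for s, _ in rows]
        labels = [y for _, y in rows]
        results[held_out] = auc(scores, labels)
    return results

# Hypothetical toy data: (gene, score, 1 = pathogenic / 0 = benign).
toy_variants = [
    ("A", 0.9, 1), ("A", 0.8, 1), ("A", 0.3, 0), ("A", 0.1, 0),
    ("B", 0.6, 1), ("B", 0.2, 1), ("B", 0.5, 0), ("B", 0.4, 0),
]
per_gene = leave_one_gene_out_auc(toy_variants)
mean_auc = sum(per_gene.values()) / len(per_gene)
```

Reporting per-gene numbers alongside the mean makes it harder for one easy gene to carry the headline figure, which is the kind of gene-level confounding described above.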
| Variant | Disease | Score | Mechanism |
|---|---|---|---|
| KRAS G12D | Lung/pancreatic cancer | 0.545 | GTPase P-loop disruption |
| p53 R175H | Cancer (#1 hotspot) | 0.254 | Metal site + charge loss |
| HBB E6V | Sickle cell | 0.570 | Surface hydrophobic patch |
| SOD1 A4V | ALS | 0.180 | Buried packing change |
Benign variants correctly identified:
| Tool | AUC | Basis / training data |
|---|---|---|
| SIFT (2001) | 0.69–0.74 | Conservation |
| PolyPhen-2 (2010) | 0.75–0.81 | Conservation + structure |
| CADD v1.6 (2019) | 0.82–0.87 | Genome-wide meta-predictor |
| REVEL (2016) | 0.90–0.94 | ClinGen-calibrated ensemble |
| AlphaMissense (2023) | 0.94–0.96 | AlphaFold + 100M sequences |
| GUMP (2026) | 0.74 | Fiedler damage + MSA + physics |
The two-ruler problem: a universal scorer that treats all genes the same will score GoF genes backwards. The detector identifies which ruler to use before scoring.
| Gene | Detected | Expected | Confidence | K-Dmg Corr | n |
|---|---|---|---|---|---|
| TP53 | LoF | LoF | 0.48 | +0.056 | 1,331 |
| ATM | LoF | LoF | 1.00 | +0.165 | 32 |
| FBN1 | LoF | LoF | 1.00 | +0.011 | 34 |
| EGFR | GoF | GoF | 1.00 | -0.392 | 23 |
| AR | GoF | GoF | 0.81 | +0.053 | 57 |
| PIK3CA | GoF | GoF | 1.00 | -0.913 | 6 |
| BRAF | GoF | GoF | 0.29 | +0.046 | 8 |
Jim McCandless, beGump LLC. All computation on Mac Mini M4, 16GB, 35W. No cloud. Test variants, ortholog data, gnomAD index, and validation script included with the package.