K isn’t one number. It’s a landscape. Measure coupling at different timescales and you see different things. Short-range coupling tells you how urgent something is. Long-range coupling tells you what it means. The timescale of the measurement IS the dimension being measured. We first saw this in bird calls and confirmed it in human speech. The dead zone between the peaks is where the system has no natural unit — the gap between scales.
Every previous page on this site treats K as a single number — autocorrelation at lag 1. This page is about what happens when you stop doing that. K becomes K(lag): a function, not a scalar. Compute autocorrelation at lag 1, lag 2, lag 3, all the way out. Then ask: at which lags does K best separate categories? The answer is not “all of them.” It is a structured landscape with peaks, dead zones, and crossovers. The structure itself is informative.
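Concretely, a minimal Python sketch of the upgrade (the function name `k_lag` and the normalization choice are illustrative, not the site's actual code):

```python
import numpy as np

def k_lag(x, max_lag=50):
    """K as a function of lag: normalized autocorrelation of a 1-D signal
    at lags 1..max_lag. The old scalar K is just k_lag(x)[0]."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    var = np.dot(x, x)  # lag-0 autocorrelation, used as the normalizer
    return np.array([np.dot(x[:-lag], x[lag:]) / var
                     for lag in range(1, max_lag + 1)])
```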
In human speech (180 RAVDESS recordings, 8 emotions), discriminability between emotional categories is not uniform across lags. It has a two-humped structure:
• First hump (~200ms, lags 6–10): separates arousal. Angry vs. calm. Fear vs. neutral. The short-range structure of the waveform carries urgency.
• Dead zone (~400–600ms): discriminability drops. This is the timescale boundary — too long for arousal, too short for valence. The system has no natural unit here.
• Second hump (~800ms, lags 20–30): separates valence. Happy vs. sad. The long-range structure carries meaning.
The two-humped structure is CONFIRMED. The mechanism labels — “arousal” for the first hump, “valence” for the second — are HYPOTHESIS. The structure exists regardless of what we call it.
In the bird coupling data (79 recordings, 6 species), the K-lag spectrum has a different shape: crossovers between call types rather than humps. Territory songs and alarm calls trade places in the discriminability ranking depending on the lag. The bird spectrum is real data, not a replication of the speech pattern — it shows that the landscape shape is domain-specific while the insight (K is a function of lag, not a scalar) is general.
The dead zone between the humps is a real feature, not noise. It sits at the timescale boundary where short-range coupling gives way to long-range coupling. In speech, this boundary is around 400–600ms — roughly the duration of a syllable. Below it, you’re measuring phonetic texture. Above it, you’re measuring prosodic contour. The dead zone is where neither dominates.
This has an instrument design implication: a sensor system that measures coupling at a single timescale is leaving information on the table. Three-scale measurement — short, medium, long — captures the full landscape.
• Fast layer (~200ms): reads arousal. Urgency. Attack. How hard you’re hitting.
• Mid layer (~500ms): reads the boundary. Transition detection. When are you shifting between modes?
• Slow layer (~800ms+): reads valence. Intention. What kind of thing are you playing?
This is derived from the measured discriminability structure, not a claim about neural architecture. The data says three timescales carry different information. An instrument should listen at all three.
Bird Coupling → 79 recordings, 6 species. The original K analysis that led here.
Framework → K, R, E, T. The four numbers. This page upgrades the first one.
Biofeedback → Phone reads body oscillators. K-lag informs the sensor design.
The Groove → Flow = sustained prediction error. Timescale matters.
The timescale of the measurement
IS the dimension being measured.
Same signal. Different zoom. Different truth.
Compute autocorrelation of the amplitude envelope at each lag from 1 to N. At each lag, run Kruskal-Wallis H test across categories (call types for birds, emotions for speech). The H statistic at each lag measures how well K(lag) separates the categories. Plot H vs. lag: the shape of that curve is the K-lag spectrum.
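A sketch of that pipeline, assuming the per-recording K(lag) curves have already been computed (e.g. with `k_lag` above); `scipy.stats.kruskal` is the standard Kruskal-Wallis implementation, and the array shapes and names are assumptions:

```python
import numpy as np
from scipy.stats import kruskal

def h_spectrum(k_curves, labels):
    """Kruskal-Wallis H at each lag.

    k_curves: (n_recordings, n_lags) array, each row one recording's K(lag).
    labels:   n_recordings category labels (emotions or call types).
    Returns (H, p), each of length n_lags.
    """
    k_curves, labels = np.asarray(k_curves), np.asarray(labels)
    groups = [k_curves[labels == c] for c in np.unique(labels)]
    stats = [kruskal(*(g[:, lag] for g in groups))
             for lag in range(k_curves.shape[1])]
    H, p = map(np.array, zip(*stats))
    return H, p
```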
• Source: RAVDESS (Ryerson Audio-Visual Database of Emotional Speech and Song)
• Recordings: 180 speech recordings
• Categories: 8 emotions (neutral, calm, happy, sad, angry, fearful, disgust, surprised)
• Method: Autocorrelation of amplitude envelope at lags 1–50, Kruskal-Wallis H at each lag
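A plain-RMS reading of the envelope step, assuming a frame length; the actual envelope follower may differ (`librosa.feature.rms` with matching frame and hop settings would be a drop-in equivalent):

```python
import numpy as np

def amplitude_envelope(y, frame=2048, hop=256):
    """RMS amplitude envelope: one value per `hop` input samples."""
    n_frames = 1 + max(0, (len(y) - frame) // hop)
    return np.array([
        np.sqrt(np.mean(y[i * hop : i * hop + frame] ** 2))
        for i in range(n_frames)
    ])
```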
The Kruskal-Wallis H statistic plotted against lag shows two distinct peaks in speech:
• First hump: lags 6–10 (~200ms at 22050 Hz, hop 256). Peak H separates high-arousal from low-arousal emotions.
• Dead zone: lags 12–18 (~400–600ms). H drops. Neither short-range nor long-range structure dominates.
• Second hump: lags 20–30 (~800ms). Peak H separates valence categories (happy/sad).
The structure is robust: it appears regardless of whether you use Kruskal-Wallis, Mann-Whitney, or ANOVA. The two humps are a property of the data, not the statistical test.
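That check is a small swap in the `h_spectrum` sketch. Note Mann-Whitney is a two-sample test, so it applies to pairwise contrasts (angry vs. calm) rather than all 8 emotions at once; `groups` and `n_lags` are as defined above:

```python
import numpy as np
from scipy.stats import f_oneway, mannwhitneyu

# ANOVA: same per-lag groups, F statistic instead of H.
F, p_f = map(np.array, zip(*(f_oneway(*(g[:, lag] for g in groups))
                             for lag in range(n_lags))))

# Mann-Whitney: one pair of categories at a time, e.g. groups[0] vs. groups[1].
U, p_u = map(np.array, zip(*(mannwhitneyu(groups[0][:, lag], groups[1][:, lag])
                             for lag in range(n_lags))))
```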
The same method applied to bird calls (79 recordings across 6 species, grouped into 4 call types) produces a different landscape. Instead of two humps, the bird spectrum shows crossovers — lags where different call types trade rank order in autocorrelation. Territory songs have the highest K at short lags but alarm calls overtake them at longer lags.
• Short lags (1–5): Song > Alarm > Call > Dawn
• Medium lags (6–15): Crossover zone. Rankings become unstable.
• Long lags (16+): Different ordering emerges.
The crossover structure in birds is real data but does not survive Bonferroni correction at any individual lag. The pattern is suggestive, not statistically confirmed.
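For concreteness, "does not survive Bonferroni" means each per-lag p-value must clear α divided by the number of lags tested. A minimal check, reusing `h_spectrum` and assuming the same 50-lag grid (input names are hypothetical):

```python
import numpy as np

alpha, n_lags = 0.05, 50
H, p = h_spectrum(bird_k_curves, bird_call_types)   # hypothetical input names
surviving = np.flatnonzero(p < alpha / n_lags) + 1  # lags significant after correction
print("lags surviving Bonferroni:", surviving)      # empty for the bird data
```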
Treating K as a scalar (lag 1 only) collapses a rich landscape into a single number. What you lose:
• Scale separation. Different categories that look similar at lag 1 may separate cleanly at lag 20. The scalar K hides this.
• The dead zone. The timescale boundary between regimes is itself informative. It tells you where one kind of structure ends and another begins.
• Crossovers. Two signals can have the same K at lag 1 but opposite K at lag 15. The scalar misses the reversal.
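The reversal is easy to construct synthetically. A sketch (synthetic AR processes, not the bird data): a smooth AR(1) drift and a damped AR(2) oscillation with a period near 30 samples are tuned to share the same lag-1 autocorrelation while disagreeing in sign at lag 15:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_ar(coeffs, n=100_000):
    """x_t = coeffs[0]*x_{t-1} + coeffs[1]*x_{t-2} + ... + white noise."""
    p = len(coeffs)
    x = np.zeros(n + p)
    for t in range(p, n + p):
        x[t] = np.dot(coeffs, x[t - p:t][::-1]) + rng.standard_normal()
    return x[p:]

def autocorr(x, lag):
    x = x - x.mean()
    return np.dot(x[:-lag], x[lag:]) / np.dot(x, x)

a = simulate_ar([0.977])            # smooth drift
b = simulate_ar([1.858, -0.9025])   # damped oscillation, period ~30 samples

for lag in (1, 15):
    print(f"lag {lag:2d}:  K_a = {autocorr(a, lag):+.2f}   K_b = {autocorr(b, lag):+.2f}")
# lag 1: both ~ +0.98. lag 15: K_a stays positive, K_b goes negative.
```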
The two-humped structure in speech suggests a three-layer sensing architecture for the instrument:
| Layer | Timescale | What it reads | Source |
|---|---|---|---|
| Fast | ~200ms | Arousal / urgency / attack energy | First hump |
| Mid | ~500ms | Boundary / transition / mode shift | Dead zone |
| Slow | ~800ms+ | Valence / intention / phrase contour | Second hump |
This is a design implication derived from the measured discriminability structure, not a claim about how brains process sound.
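One way the table could map onto code, reusing `k_lag` from the top of the page; the lag bands come from the measured humps, and the layer names carry the hypothesized labels:

```python
import numpy as np

# Lag bands from the measured speech spectrum: hump, dead zone, hump.
LAYERS = {
    "fast (arousal?)": range(6, 11),    # first hump
    "mid (boundary)":  range(12, 19),   # dead zone
    "slow (valence?)": range(20, 31),   # second hump
}

def three_scale_k(envelope):
    """Collapse K(lag) into one reading per layer: mean K over each lag band."""
    k = k_lag(envelope, max_lag=30)
    return {name: float(np.mean([k[lag - 1] for lag in band]))
            for name, band in LAYERS.items()}
```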
× “Universal across 5 domains.” The original claim tested K-lag in 5 domains. Three of the 5 were synthetic (generated data, not measured). Circular reasoning. KILLED.
× E7 connection to coupling timescales. Attempted to link the lag structure to E7 lattice dimensions. No empirical support. KILLED.
∼ Lag 5 as THE peak. Cherry-picked from a secondary analysis. The first hump is actually at lags 6–10. Lag 5 is not the canonical peak. WEAKENED.
∼ Consonance as a specific number (91%). The magnitude is threshold-dependent (depends on what tolerance you pick for “close to a just ratio”). The invariance across call types survives. The specific percentage does not. WEAKENED.
∼ “Birds choose K, humans choose E/R” as simple binary. No individual bird lag survives Bonferroni correction. The crossover pattern is suggestive but the clean binary is too simple. WEAKENED to hypothesis.
✓ K is K(lag), not a scalar. The core methodological upgrade. Measuring at multiple lags reveals structure that lag-1 hides. CONFIRMED.
✓ Two-humped discriminability in speech. 180 RAVDESS recordings, 8 emotions. Two peaks separated by a dead zone. CONFIRMED.
✓ Short-range separates arousal, long-range separates valence. CONFIRMED as structure. HYPOTHESIS for mechanism labels.
✓ The dead zone is real. Discriminability drops between the humps. The timescale boundary is a feature, not noise. CONFIRMED.
✓ Bird K-lag spectrum with crossovers. Real data showing call types trading rank at different lags. CONFIRMED as pattern, not statistically significant at individual lags.
✓ Three-scale architecture. Derived from the data as an instrument design implication. CONFIRMED as derivation.
• Speech data is acted, not spontaneous. RAVDESS actors perform scripted sentences with target emotions. Real emotional speech may show different K-lag structure.
• 180 recordings is modest. Enough to see the two-humped structure clearly but not enough for fine-grained per-emotion analysis.
• Bird lag-by-lag results do not survive Bonferroni correction. The crossover pattern is visible in the data but no individual lag achieves significance after correcting for multiple comparisons.
• Arousal/valence labels are interpretive. The two humps exist. Calling the first one “arousal” and the second “valence” is our interpretation, not a demonstrated mechanism.
• Lag-to-millisecond conversion is approximate. Depends on sample rate and hop size. The timescales given are estimates, not precise measurements.
• Three-scale architecture is a design implication, not a tested system. We derived it from the data. It has not been built and validated as an instrument.
• 3 of 5 original “universal” domains were synthetic. Only bird calls and speech are real measured data. The other three were generated and therefore circular.
• Spontaneous speech. Does the two-humped structure appear in natural conversation, not just acted emotions?
• Music. What does the K-lag spectrum look like for musical performances? Does it have more than two humps?
• Cross-cultural speech. Is the two-humped structure invariant across languages, or does prosodic structure shift the hump locations?
• Larger bird dataset. With thousands of recordings, do the crossover patterns survive Bonferroni? Is the landscape shape species-specific?
• Instrument validation. Build the three-scale sensor. Does it actually separate musical gestures better than single-scale K?