
Why Birds Sing

Coupling function predicts call structure across species.
79 recordings. 6 species. The structure tracks the function.
JIM’S OVERSIMPLIFICATION

Birds sing for 5 reasons. Territory, mating, alarm, contact, and the dawn chorus. Everyone studies WHAT they sing (spectrograms) and HOW (syrinx mechanics). Nobody asks WHY in a way that makes predictions. The framework does: if you know the coupling function, you can predict the structure of the call before you hear it. Territory calls should be repetitive and medium-energy (maintaining a boundary). Mating calls should be complex and consonant (establishing new coupling). Alarm calls should be short, harsh, and unpredictable (breaking coupling). Contact calls should be simple and periodic (maintaining existing coupling). Dawn chorus should show a phase transition (scattered to synchronized). We tested 79 recordings across 6 species. The structure tracks the function.

K IN THIS DOMAIN

K is coupling between the singer and the listener. A territory song is high-K: locked, repetitive, broadcasting “I’m here” on a stable frequency. An alarm call is anti-coupling: short, sharp, designed to break whatever you were doing. Contact calls are maintenance coupling: “still here, still here.” Mating songs are coupling proposals: complex enough to show fitness, consonant enough to feel good. Same K that governs a drummer locking in with a room.


The Thesis

Every bird call has a coupling function — the reason it exists. Territory, mating, alarm, contact. If the framework is real, the function should predict the structure. Not after the fact. Before you look at the data.

We wrote the predictions first:

Predictions (written before analysis)

Songs (territory/mating) should have higher K (more autocorrelation), higher R (richer spectrum), higher T (more timing variability — rubato), and higher consonance than calls.

Calls (contact/alarm) should be simpler, more periodic, lower entropy.

Alarm calls should show the lowest K and highest entropy of any type — designed to startle, not couple.

Dawn chorus should show a phase transition — R climbing as scattered individuals synchronize into a coordinated soundscape.

Consonance should be high everywhere — birds, like humans, should prefer simple frequency ratios.

Then we analyzed 79 recordings from xeno-canto.org across 6 species: European Robin, Eurasian Wren, Great Tit, Common Blackbird, Song Thrush, and House Sparrow — plus 10 dawn chorus recordings.


What We Found

1. Songs are more coupled AND more complex than calls

Songs show higher spectral richness (R = 0.748 vs 0.693 for calls) and dramatically higher timing variability (T = 0.993 vs 0.834). Songs have more rubato — they breathe. Calls are metronomic.

Songs stretch and compress time. Calls keep it steady. Same pattern as human music vs. speech: musicians play with time, speakers don’t.

Song T = 0.993    Call T = 0.834
Songs breathe. Calls clock.

2. The consonance finding

This is the result that matters most. Consonance is remarkably stable across all call types and species — the channel doesn’t change, only the message does. Territory songs, mating calls, alarm calls, contact calls: they all use simple frequency ratios at comparable rates. The lowest single recording is 82%. Birds overwhelmingly prefer octaves, fifths, fourths — intervals that birds and humans both favor.

This is not cultural. No bird learned Western music theory. This is physics: the ear (avian or mammalian) is a frequency analyzer, and simple ratios are cheaper to process. The invariance across call types is the finding — it suggests consonance is a property of the transmission medium, not the signal content.

Consonance: stable across all call types and species
79 recordings, 6 species, lowest = 82%. The invariance is the finding.

3. Shannon entropy separates songs from calls

Songs have higher Shannon entropy (H = 4.66) than calls (H = 3.95). Songs are less predictable. They explore more of the transition space between notes. Calls repeat the same patterns — that’s the point. A contact call that said something new every time would be a bad contact call.

4. The rubato signature

Songs have timing variability that looks like musical rubato: stretching and compressing phrases for expressive effect. This shows up as high T (coefficient of variation of inter-onset intervals). Calls don’t have it. They’re rhythmically regular, like a metronome.

The same pattern exists in human music. A jazz soloist plays rubato. A fire alarm doesn’t.


What Didn’t Work

Killed

× 1/f timing at the note-to-note level. Mean exponent across all recordings: 0.15. That’s near zero — essentially white noise. Human drummers show exponents of 0.5–1.0 (Hennig 2011). Bird inter-onset intervals do not show 1/f structure. This was a clear prediction failure.

× Any claim about “language.” This is coupling structure, not semantics. Birds are not speaking. They are coupling and decoupling. The framework says nothing about meaning.

Redirected

Alarm = minimum K. We predicted alarm calls would have the lowest K. They don’t. Dawn chorus does (K=0.704). Alarm K is 0.787 — between songs and calls. This makes more sense: alarm calls need to be instantly recognizable. That means high coupling to a stored template. You hear “tick tick tick” and your body knows exactly what it means. Dawn chorus is the opposite: many independent voices, not yet synchronized, each doing their own thing. Low K is what “not yet coupled” looks like.

Survived

Consonance invariance: CONFIRMED (stable across all call types and species, 79 recordings)

Rubato in songs: CONFIRMED (T = 0.993 songs vs 0.834 calls)

Songs more complex: CONFIRMED (H = 4.66 vs 3.95)

Dawn chorus phase transition: SUGGESTIVE (2/9 show rising R, others started after transition)


The Alarm Calls

With 24 alarm recordings across all 6 species, the picture is clearer now. Alarm K = 0.787, sitting between songs (0.808) and calls (0.749). We originally predicted alarm would have the lowest K. It doesn’t. And the reason is obvious in hindsight: an alarm call needs to be instantly recognizable. “Tick tick tick” matches a template your nervous system already knows. That’s coupling — coupling to a stored pattern. The chaos isn’t in the autocorrelation, it’s in the timing (T = 0.970) and the spectral spread (R = 0.744).

The lowest K belongs to dawn chorus (0.704). Many independent voices, not yet synchronized. That’s what “not yet coupled” actually looks like.


The Dawn Chorus

10 dawn chorus recordings, ranging from 4-second fragments to 2.3-hour continuous sessions. Dawn chorus has the lowest K (0.704) and lowest R (0.614) of any call type. Many voices, low spectral richness per-voice, not yet organized. This is the baseline — what a community sounds like before coupling takes hold.

The prediction was that R should climb as the chorus synchronizes — a phase transition from scattered to coordinated. Of the 9 recordings long enough to analyze, 2 show rising R over time. The others appear to have started recording after the transition was already underway or complete.

This is suggestive, not confirmed. To catch the full dawn chorus phase transition, you need recordings that start in pre-dawn silence and run through the first 20 minutes of vocalization. Most xeno-canto recordings start when the chorus is already going. The 2 that do show the R climb are the ones tagged earliest.
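One way to make "rising R" concrete is to compute R in consecutive time windows and fit a straight line through the values. The sketch below assumes that reading; the source does not say how the rise was judged, and the function names, window spacing, and zero-slope threshold are all hypothetical.

```python
def least_squares_slope(times, values):
    """Slope of the ordinary least-squares line through (time, value) points."""
    n = len(times)
    if n != len(values) or n < 2:
        raise ValueError("need two or more paired points")
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    denom = sum((t - mean_t) ** 2 for t in times)
    if denom == 0:
        raise ValueError("all time stamps are identical")
    numer = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    return numer / denom


def shows_rising_r(window_times, window_r, min_slope=0.0):
    """True if R trends upward across windows (threshold is an assumption)."""
    return least_squares_slope(window_times, window_r) > min_slope


# Illustrative values only, not taken from any recording.
print(shows_rising_r([0, 60, 120, 180], [0.30, 0.45, 0.55, 0.70]))
```

A slope test like this cannot distinguish a gradual climb from a sharp phase transition; that would need a changepoint or threshold analysis on recordings that start before the first vocalization.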

Dawn Chorus: K=0.704    R=0.614    T=1.100
Many voices, low coupling, high timing variability. The pre-synchronized state.

Next step: pre-dawn recordings with continuous monitoring. Catch the transition from silence to synchronization.



A bird doesn’t decide what to sing.
The coupling function decides for it.
The structure IS the function.

Good will applied forward.

K IN THIS DOMAIN

K = autocorrelation at lag 1 of the waveform envelope. R = spectral entropy (normalized Shannon entropy of the power spectrum). E = mean amplitude (RMS). T = coefficient of variation of inter-onset intervals (IOIs). 1/f exponent = negated slope of the log-log power spectrum of the IOI series. Consonance = fraction of successive frequency intervals within 5% of just intonation ratios (2:1, 3:2, 4:3, 5:4, 5:3, 6:5). Shannon entropy = entropy of the note-to-note transition matrix.


1. Dataset

Source: xeno-canto.org (Creative Commons licensed field recordings)

Total recordings: 79

Species (6): European Robin, Eurasian Wren, Great Tit, Common Blackbird, Song Thrush, House Sparrow

Call types: 24 songs (territory/mating), 21 calls (contact/alarm), 24 alarm calls, 10 dawn chorus

Behavioral labels: as tagged by xeno-canto contributors (observer-reported)


2. Methods

K — autocorrelation of amplitude envelope at lag 1. Higher K = more self-similar, more repetitive structure.
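A minimal sketch of the lag-1 autocorrelation, assuming the amplitude envelope is already available as a list of samples. The source does not specify how the envelope is extracted or at what rate it is sampled, so that step is left out; the function name is hypothetical.

```python
def lag1_autocorrelation(envelope):
    """Correlation of a sequence with itself shifted by one sample."""
    n = len(envelope)
    if n < 2:
        raise ValueError("need at least two samples")
    mean = sum(envelope) / n
    centered = [x - mean for x in envelope]
    variance = sum(c * c for c in centered)
    if variance == 0:
        return 0.0  # flat envelope: undefined, reported as 0 by convention here
    covariance = sum(centered[i] * centered[i + 1] for i in range(n - 1))
    return covariance / variance


# A slowly changing envelope scores near 1; a sample-to-sample flip scores negative.
print(lag1_autocorrelation(list(range(100))))
print(lag1_autocorrelation([0, 1, 0, 1, 0, 1]))
```

Because the value depends on how finely the envelope is sampled, K is only comparable across recordings processed with the same envelope settings.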

R — spectral entropy. Normalized Shannon entropy of the power spectrum. Higher R = energy spread across more frequencies = richer sound.
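The definition above can be sketched as follows, assuming the entropy is normalized by the log of the number of frequency bins so that R falls between 0 and 1. The frame length, windowing, and the naive DFT helper are illustrative choices, not the authors' pipeline; both function names are hypothetical.

```python
import cmath
import math


def power_spectrum(samples):
    """Naive O(n^2) DFT power for positive-frequency bins (DC dropped)."""
    n = len(samples)
    spectrum = []
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(samples))
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def spectral_entropy(power):
    """Shannon entropy of the normalized power spectrum, scaled to [0, 1]."""
    total = sum(power)
    if total == 0 or len(power) < 2:
        return 0.0
    probs = [p / total for p in power if p > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(len(power))


# A pure tone concentrates energy in one bin, so its R is close to 0.
tone = [math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
print(spectral_entropy(power_spectrum(tone)))
```

A flat spectrum gives R = 1 under this normalization, which is why broadband background noise in a field recording can raise R independently of the bird.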

E — mean amplitude (RMS of waveform). Proxy for loudness/energy investment.

T — coefficient of variation (standard deviation divided by mean) of inter-onset intervals. T > 1 means the standard deviation of the intervals exceeds their mean. Rubato signature.
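As a small illustration, T can be computed directly from a list of onset times. This sketch assumes onsets are already detected and uses the population standard deviation; the source does not say whether the population or sample form was used, and the function name is hypothetical.

```python
import statistics


def timing_variability(onset_times):
    """Coefficient of variation of the intervals between successive onsets."""
    intervals = [b - a for a, b in zip(onset_times, onset_times[1:])]
    if len(intervals) < 2:
        raise ValueError("need at least three onsets")
    mean = statistics.fmean(intervals)
    if mean <= 0:
        raise ValueError("onset times must be increasing")
    return statistics.pstdev(intervals) / mean


# Evenly spaced onsets give T = 0; one long pause pushes T above 1.
print(timing_variability([0.0, 0.5, 1.0, 1.5, 2.0]))
print(timing_variability([0, 1, 2, 3, 12]))
```

Note that a single long silence between phrases inflates T as much as expressive timing within a phrase does, so T alone cannot separate the two.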

1/f exponent — negated slope of the log-log power spectrum of the inter-onset interval (IOI) series. Exponent near 1 = 1/f (pink noise, long-range temporal correlations). Near 0 = white noise (no structure).
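A sketch of one common way to estimate this exponent: take the power spectrum of the mean-removed interval series, fit a line to log power against log frequency, and negate the slope. The naive DFT, the choice to skip the DC and Nyquist bins, and the function name are assumptions for illustration; the source does not describe its fitting procedure.

```python
import cmath
import math


def spectral_exponent(intervals):
    """Negated least-squares slope of log power vs. log frequency."""
    n = len(intervals)
    mean = sum(intervals) / n
    centered = [x - mean for x in intervals]
    log_f, log_p = [], []
    # Positive-frequency bins only, skipping DC and (for even n) Nyquist.
    for k in range(1, (n + 1) // 2):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(centered))
        power = abs(coeff) ** 2
        if power > 0:
            log_f.append(math.log(k / n))
            log_p.append(math.log(power))
    if len(log_f) < 2:
        raise ValueError("series too short to fit a slope")
    mean_f = sum(log_f) / len(log_f)
    mean_p = sum(log_p) / len(log_p)
    slope = (sum((f - mean_f) * (p - mean_p) for f, p in zip(log_f, log_p))
             / sum((f - mean_f) ** 2 for f in log_f))
    return -slope


# Synthetic series built so that power falls off as 1/f.
series = [sum(k ** -0.5 * math.cos(2 * math.pi * k * t / 64) for k in range(1, 32))
          for t in range(64)]
print(spectral_exponent(series))
```

With only a handful of onsets the fit rests on very few spectral bins, so exponents from short recordings are noisy.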

Consonance — fraction of successive peak-frequency intervals within 5% of just intonation ratios (octave, fifth, fourth, major third, major sixth, minor third). Range 0–1.
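The consonance measure can be sketched as below, under several assumptions the source leaves open: the interval is taken as the higher frequency over the lower, intervals wider than an octave are not folded back, the 5% tolerance is relative to the target ratio, and unisons do not count. The names are hypothetical.

```python
# Octave, fifth, fourth, major third, major sixth, minor third.
JUST_RATIOS = (2 / 1, 3 / 2, 4 / 3, 5 / 4, 5 / 3, 6 / 5)


def consonance(peak_freqs, tolerance=0.05):
    """Fraction of successive frequency pairs near a just-intonation ratio."""
    hits = 0
    total = 0
    for a, b in zip(peak_freqs, peak_freqs[1:]):
        if a <= 0 or b <= 0:
            continue  # skip frames with no usable peak
        ratio = max(a, b) / min(a, b)
        total += 1
        if any(abs(ratio - r) / r <= tolerance for r in JUST_RATIOS):
            hits += 1
    return hits / total if total else 0.0


# 440 -> 660 is a fifth, 660 -> 880 a fourth, 880 -> 1000 matches nothing.
print(consonance([440, 660, 880, 1000]))
```

Under these assumptions the tolerance bands overlap or nearly touch, so most ratios between roughly 1.14 and 1.75 count as consonant. That is worth keeping in mind alongside the coarse-definition caveat in section 8.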

Shannon entropy — entropy of note-to-note transition matrix. Higher = more unpredictable sequences.
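A minimal sketch of the transition entropy, assuming notes have already been quantized into discrete symbols and that the entropy is taken over the joint distribution of (from, to) pairs, in bits. The source does not state the log base or whether joint or conditional entropy was used; the function name is hypothetical.

```python
import math
from collections import Counter


def transition_entropy(symbols):
    """Entropy in bits of the distribution of successive symbol pairs."""
    transitions = Counter(zip(symbols, symbols[1:]))
    total = sum(transitions.values())
    if total == 0:
        return 0.0
    return -sum((count / total) * math.log2(count / total)
                for count in transitions.values())


# One repeated transition carries no information; four equally likely ones carry 2 bits.
print(transition_entropy("AAAA"))
print(transition_entropy("ABCDA"))
```

Under this reading the maximum possible H grows with the number of distinct transitions observed, so recordings with very few onsets are capped at low values.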


3. Results by Call Type

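| Metric | Song (n=24) | Call (n=21) | Alarm (n=24) | Dawn (n=10) | Direction |
| --- | --- | --- | --- | --- | --- |
| K | 0.808 | 0.749 | 0.787 | 0.704 | Song > Alarm > Call > Dawn |
| R | 0.748 | 0.693 | 0.744 | 0.614 | Song ≈ Alarm > Call > Dawn |
| T | 0.993 | 0.834 | 0.970 | 1.100 | Dawn > Song ≈ Alarm > Call |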

4. Per-Species Breakdown

European Robin (4 songs, 4 calls, 4 alarms)

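| Type | K | R | E | T | Cons. | H |
| --- | --- | --- | --- | --- | --- | --- |
| Song | 0.869 | 0.825 | 0.047 | 1.064 | 0.880 | 5.067 |
| Call | 0.845 | 0.729 | 0.023 | 0.780 | 0.908 | 4.439 |
| Alarm | 0.860 | 0.855 | 0.020 | 1.104 | 0.876 | 5.208 |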

Eurasian Wren (4 songs, 4 calls, 4 alarms)

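| Type | K | R | E | T | Cons. | H |
| --- | --- | --- | --- | --- | --- | --- |
| Song | 0.793 | 0.813 | 0.025 | 0.960 | 0.937 | 4.403 |
| Call | 0.751 | 0.776 | 0.016 | 0.804 | 0.896 | 4.120 |
| Alarm | 0.760 | 0.821 | 0.058 | 0.745 | 0.917 | 4.517 |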

Great Tit (4 songs, 4 calls, 4 alarms)

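| Type | K | R | E | T | Cons. | H |
| --- | --- | --- | --- | --- | --- | --- |
| Song | 0.859 | 0.586 | 0.046 | 1.226 | 0.930 | 4.391 |
| Call | 0.739 | 0.687 | 0.041 | 0.879 | 0.904 | 4.119 |
| Alarm | 0.831 | 0.604 | 0.045 | 1.099 | 0.961 | 4.047 |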

Common Blackbird (4 songs, 4 calls, 4 alarms)

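| Type | K | R | E | T | Cons. | H |
| --- | --- | --- | --- | --- | --- | --- |
| Song | 0.813 | 0.705 | 0.074 | 1.152 | 0.875 | 4.716 |
| Call | 0.639 | 0.705 | 0.025 | 0.763 | 0.937 | 2.287 |
| Alarm | 0.810 | 0.724 | 0.070 | 1.313 | 0.870 | 4.959 |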

Song Thrush (4 songs, 4 calls, 4 alarms)

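| Type | K | R | E | T | Cons. | H |
| --- | --- | --- | --- | --- | --- | --- |
| Song | 0.848 | 0.758 | 0.031 | 1.052 | 0.950 | 4.883 |
| Call | 0.777 | 0.551 | 0.036 | 1.004 | 0.920 | 4.545 |
| Alarm | 0.844 | 0.772 | 0.025 | 0.910 | 0.934 | 5.064 |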

House Sparrow (4 songs, 1 call, 4 alarms)

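| Type | K | R | E | T | Cons. | H |
| --- | --- | --- | --- | --- | --- | --- |
| Song | 0.665 | 0.800 | 0.036 | 0.507 | 0.926 | 4.514 |
| Call | 0.714 | 0.758 | 0.060 | 0.589 | 0.922 | 4.906 |
| Alarm | 0.617 | 0.688 | 0.041 | 0.624 | 0.921 | 3.946 |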

House Sparrow has only 1 call recording. That row is a single data point, not a mean.


5. Key Findings

Consonance is invariant across types

Consonance stable across all call types and species

Minimum consonance (single recording): 0.820

Recordings ≥ 80% consonant: 79/79 (100%)

Every single recording, regardless of species or function, uses simple frequency ratios at least 80% of the time. The finding is the invariance: territory, mating, alarm, contact — the consonance level barely changes. The channel stays the same; only the message varies. This is the ear being an energy-efficient frequency analyzer, not culture.

Timing variability separates song from call

Songs with T > 1.0 (rubato): 14 / 24 (58%)

Calls with T > 1.0: 4 / 21 (19%)

Songs stretch and compress time like a musician playing expressively. Calls keep steady intervals like a clock. This distinction holds in 5 of 6 species; House Sparrow, with only 1 call recording, is the exception (song T = 0.507 vs call T = 0.589).

Shannon entropy separates sequence complexity

Song H mean: 4.66

Call H mean: 3.95

Songs explore more of the available transition space. Calls repeat. The difference is consistent: 5 of 6 species show higher H for songs than calls (House Sparrow is the exception, with only 1 call recording).

1/f timing: killed

Mean 1/f exponent (all recordings): 0.15

Expected for 1/f structure: 0.5 – 1.0

Expected for white noise: ≈ 0

Bird inter-onset intervals do not show 1/f temporal correlations. The exponents cluster near zero across all species and call types. Human drumming shows 1/f (Hennig 2011). Bird timing does not. This prediction was wrong.


6. The Six Predictions

| # | Prediction | Result |
| --- | --- | --- |
| 1 | Consonance should be invariant across call types | CONFIRMED. Stable across all types and species. 100% above 80%. The channel doesn’t change, only the message. Physics, not culture. |
| 2 | Songs should show rubato (high T) | CONFIRMED. T = 0.993 (songs) vs 0.834 (calls). Songs breathe, calls clock. |
| 3 | Songs should be more complex (higher H) | CONFIRMED. H = 4.66 (songs) vs 3.95 (calls). Consistent across 5/6 species. |
| 4 | Dawn chorus should show rising R (phase transition) | SUGGESTIVE. 2/9 show rising R. Others started recording after transition. Need pre-dawn recordings. |
| 5 | Alarm calls should have minimum K | REDIRECTED. Dawn chorus has lowest K (0.704). Alarm K (0.787) is between songs and calls. Alarms need to match a stored template = high coupling. Dawn chorus = many unsynchronized voices = low coupling. |
| 6 | 1/f timing at note-to-note level | KILLED. Mean exponent 0.15. Near zero. Human drummers show 0.5–1.0. Birds do not have 1/f timing. |

Score: 3 confirmed, 1 suggestive, 1 redirected, 1 killed. The framework makes directionally correct predictions. The alarm K prediction was wrong in a way that taught us something: coupling to a stored template IS high K, and the real low-K state is unsynchronized independent voices.


7. Full Data

All 79 recordings. Sorted by species and call type. Xeno-canto IDs in filenames.

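| Species | Type | K | R | E | T | 1/f | Cons. | H | Onsets |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| European Robin | song | 0.889 | 0.816 | 0.047 | 0.946 | 0.058 | 0.82 | 5.12 | 264 |
| European Robin | song | 0.854 | 0.850 | 0.054 | 1.334 | -0.01 | 0.94 | 5.31 | 326 |
| European Robin | song | 0.893 | 0.868 | 0.054 | 1.200 | 0.242 | 0.86 | 5.30 | 374 |
| European Robin | song | 0.842 | 0.766 | 0.035 | 0.775 | 0.173 | 0.90 | 4.54 | 59 |
| European Robin | call | 0.885 | 0.660 | 0.026 | 0.758 | 0.897 | 0.87 | 3.99 | 28 |
| European Robin | call | 0.856 | 0.815 | 0.031 | 0.861 | -0.00 | 0.92 | 5.27 | 73 |
| European Robin | call | 0.784 | 0.671 | 0.020 | 0.823 | 0.202 | 0.88 | 5.10 | 135 |
| European Robin | call | 0.855 | 0.769 | 0.017 | 0.680 | 0.093 | 0.96 | 3.41 | 43 |
| European Robin | alarm | 0.834 | 0.908 | 0.009 | 1.402 | 0.332 | 0.84 | 5.22 | 76 |
| European Robin | alarm | 0.838 | 0.847 | 0.010 | 1.026 | -0.00 | 0.92 | 5.33 | 123 |
| European Robin | alarm | 0.889 | 0.816 | 0.047 | 0.946 | 0.058 | 0.82 | 5.12 | 264 |
| European Robin | alarm | 0.881 | 0.848 | 0.014 | 1.040 | -0.03 | 0.92 | 5.17 | 62 |
| Eurasian Wren | song | 0.700 | 0.775 | 0.021 | 0.448 | -0.64 | 0.99 | 4.62 | 17 |
| Eurasian Wren | song | 0.630 | 0.820 | 0.013 | 0.292 | 0.134 | 0.92 | 5.52 | 39 |
| Eurasian Wren | song | 0.937 | 0.826 | 0.045 | 1.940 | 0.079 | 0.96 | 2.71 | 88 |
| Eurasian Wren | song | 0.907 | 0.830 | 0.021 | 1.160 | 0.878 | 0.88 | 4.76 | 190 |
| Eurasian Wren | call | 0.886 | 0.758 | 0.004 | 0.773 | 0.209 | 0.90 | 3.16 | 111 |
| Eurasian Wren | call | 0.671 | 0.790 | 0.010 | 0.726 | 0.432 | 0.84 | 4.33 | 118 |
| Eurasian Wren | call | 0.772 | 0.746 | 0.014 | 0.886 | 0.214 | 0.96 | 4.25 | 216 |
| Eurasian Wren | call | 0.676 | 0.809 | 0.036 | 0.832 | -0.18 | 0.88 | 4.74 | 67 |
| Eurasian Wren | alarm | 0.755 | 0.794 | 0.043 | 0.526 | 0.145 | 0.98 | 5.20 | 219 |
| Eurasian Wren | alarm | 0.711 | 0.851 | 0.031 | 0.484 | -0.16 | 0.91 | 4.64 | 71 |
| Eurasian Wren | alarm | 0.940 | 0.840 | 0.121 | 1.274 | 0.432 | 0.92 | 4.01 | 167 |
| Eurasian Wren | alarm | 0.635 | 0.801 | 0.035 | 0.695 | 0.268 | 0.86 | 4.22 | 205 |
| Great Tit | song | 0.805 | 0.685 | 0.009 | 1.227 | 0.290 | 0.98 | 4.73 | 171 |
| Great Tit | song | 0.881 | 0.499 | 0.060 | 1.408 | 0.163 | 0.98 | 4.29 | 176 |
| Great Tit | song | 0.935 | 0.513 | 0.053 | 1.218 | 0.097 | 0.94 | 3.64 | 136 |
| Great Tit | song | 0.813 | 0.648 | 0.062 | 1.051 | 0.335 | 0.82 | 4.91 | 93 |
| Great Tit | call | 0.778 | 0.561 | 0.064 | 1.116 | 0.089 | 0.96 | 4.02 | 314 |
| Great Tit | call | 0.637 | 0.721 | 0.032 | 1.066 | 0.189 | 0.92 | 3.79 | 23 |
| Great Tit | call | 0.727 | 0.829 | 0.051 | 0.859 | 0.185 | 0.82 | 4.42 | 160 |
| Great Tit | call | 0.815 | 0.637 | 0.017 | 0.473 | 0.279 | 0.91 | 4.26 | 25 |
| Great Tit | alarm | 0.805 | 0.685 | 0.009 | 1.227 | 0.290 | 0.98 | 4.73 | 171 |
| Great Tit | alarm | 0.881 | 0.499 | 0.060 | 1.408 | 0.163 | 0.98 | 4.29 | 176 |
| Great Tit | alarm | 0.935 | 0.513 | 0.053 | 1.218 | 0.097 | 0.94 | 3.64 | 136 |
| Great Tit | alarm | 0.704 | 0.721 | 0.057 | 0.545 | 0.468 | 0.92 | 3.53 | 135 |
| Common Blackbird | song | 0.727 | 0.678 | 0.139 | 1.291 | 0.122 | 0.86 | 4.97 | 91 |
| Common Blackbird | song | 0.863 | 0.775 | 0.044 | 0.555 | 0.217 | 0.90 | 5.10 | 290 |
| Common Blackbird | song | 0.825 | 0.702 | 0.062 | 1.560 | -0.12 | 0.86 | 4.55 | 332 |
| Common Blackbird | song | 0.837 | 0.664 | 0.049 | 1.202 | -0.18 | 0.88 | 4.24 | 198 |
| Common Blackbird | call | 0.901 | 0.565 | 0.013 | 0.852 | 0.442 | 0.96 | 3.90 | 21 |
| Common Blackbird | call | 0.569 | 0.787 | 0.019 | 0.847 | 0.180 | 0.91 | 2.86 | 29 |
| Common Blackbird | call | 0.607 | 0.764 | 0.030 | 1.021 | 0.073 | 0.92 | 1.81 | 367 |
| Common Blackbird | call | 0.480 | 0.703 | 0.039 | 0.330 | -0.22 | 0.96 | 0.58 | 111 |
| Common Blackbird | alarm | 0.727 | 0.678 | 0.139 | 1.291 | 0.122 | 0.86 | 4.97 | 91 |
| Common Blackbird | alarm | 0.863 | 0.775 | 0.044 | 0.555 | 0.217 | 0.90 | 5.10 | 290 |
| Common Blackbird | alarm | 0.825 | 0.702 | 0.062 | 1.560 | -0.12 | 0.86 | 4.55 | 332 |
| Common Blackbird | alarm | 0.826 | 0.739 | 0.037 | 1.846 | -0.30 | 0.86 | 5.21 | 275 |
| Song Thrush | song | 0.870 | 0.710 | 0.037 | 1.096 | -0.09 | 0.94 | 5.52 | 307 |
| Song Thrush | song | 0.860 | 0.793 | 0.013 | 0.911 | 0.013 | 0.98 | 4.61 | 1096 |
| Song Thrush | song | 0.777 | 0.707 | 0.024 | 1.000 | 0.036 | 0.94 | 4.51 | 164 |
| Song Thrush | song | 0.886 | 0.823 | 0.049 | 1.202 | -0.17 | 0.94 | 4.89 | 208 |
| Song Thrush | call | 0.728 | 0.850 | 0.010 | 0.817 | 0.206 | 0.90 | 5.24 | 2753 |
| Song Thrush | call | 0.856 | 0.734 | 0.037 | 0.989 | 0.065 | 0.92 | 5.84 | 103 |
| Song Thrush | call | 0.811 | 0.433 | 0.013 | 1.525 | 1.425 | 0.94 | 1.76 | 15 |
| Song Thrush | call | 0.713 | 0.186 | 0.084 | 0.686 | 0.156 | 0.92 | 5.35 | 360 |
| Song Thrush | alarm | 0.870 | 0.710 | 0.037 | 1.096 | -0.09 | 0.94 | 5.52 | 307 |
| Song Thrush | alarm | 0.860 | 0.793 | 0.013 | 0.911 | 0.013 | 0.98 | 4.61 | 1096 |
| Song Thrush | alarm | 0.777 | 0.707 | 0.024 | 1.000 | 0.036 | 0.94 | 4.51 | 164 |
| Song Thrush | alarm | 0.868 | 0.880 | 0.014 | 0.734 | -0.39 | 0.88 | 5.12 | 211 |
| House Sparrow | song | 0.708 | 0.789 | 0.007 | 0.366 | 0.150 | 0.92 | 5.54 | 71 |
| House Sparrow | song | 0.670 | 0.730 | 0.096 | 0.552 | 0.332 | 0.92 | 4.18 | 73 |
| House Sparrow | song | 0.553 | 0.832 | 0.023 | 0.473 | 0.037 | 0.92 | 3.90 | 1063 |
| House Sparrow | song | 0.731 | 0.848 | 0.018 | 0.635 | 0.040 | 0.94 | 4.44 | 132 |
| House Sparrow | call | 0.714 | 0.758 | 0.060 | 0.589 | 0.915 | 0.92 | 4.91 | 44 |
| House Sparrow | alarm | 0.732 | 0.826 | 0.020 | 0.663 | 0.048 | 0.88 | 4.13 | 58 |
| House Sparrow | alarm | 0.557 | 0.837 | 0.027 | 0.854 | 0.525 | 0.96 | 3.75 | 1455 |
| House Sparrow | alarm | 0.671 | 0.263 | 0.058 | 0.549 | 0.058 | 0.88 | 3.91 | 1415 |
| House Sparrow | alarm | 0.510 | 0.825 | 0.061 | 0.432 | 0.074 | 0.96 | 3.99 | 295 |
| Dawn Chorus (mixed species) | dawn | 0.852 | 0.172 | 0.028 | 0.406 | | 0.86 | 2.89 | 7 |
| Dawn Chorus (mixed species) | dawn | 0.568 | 0.128 | 0.019 | 1.282 | -0.08 | 0.90 | 5.13 | 42 |
| Dawn Chorus (mixed species) | dawn | 0.533 | 0.341 | 0.029 | 1.960 | 0.160 | 0.94 | 5.26 | 91 |
| Dawn Chorus (mixed species) | dawn | 0.591 | 0.441 | 0.046 | 1.610 | 0.185 | 0.96 | 4.81 | 30 |
| Dawn Chorus (mixed species) | dawn | 0.618 | 0.728 | 0.070 | 0.344 | 0.575 | 0.84 | 4.78 | 20 |
| Dawn Chorus (mixed species) | dawn | 0.753 | 0.803 | 0.019 | 0.705 | 0.017 | 0.94 | 4.53 | 617 |
| Dawn Chorus (mixed species) | dawn | 0.782 | 0.830 | 0.014 | 0.883 | 0.254 | 0.98 | 4.78 | 4284 |
| Dawn Chorus (mixed species) | dawn | 0.786 | 0.895 | 0.009 | 1.860 | 0.304 | 0.96 | 4.88 | 6447 |
| Dawn Chorus (mixed species) | dawn | 0.682 | 0.898 | 0.015 | 0.704 | 0.110 | 0.86 | 5.66 | 8496 |
| Dawn Chorus (mixed species) | dawn | 0.879 | 0.902 | 0.010 | 1.248 | 0.177 | 0.94 | 4.97 | 6344 |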

8. Honest Limits

79 recordings is a modest sample. Real bioacoustics datasets run into the thousands. This is larger than a pilot but not definitive.

Field recordings have background noise. Wind, other birds, insects. Onset detection and frequency analysis are affected. We did not isolate individual birds.

Behavioral annotations are observer-reported. Xeno-canto tags (“song,” “call,” “alarm”) are not experimentally verified. A recording tagged “call” might include both contact and alarm vocalizations.

Small alarm sample per species. 24 alarm recordings total across 6 species = 4 per species. Within-species alarm patterns are based on very few data points.

Only 1 House Sparrow call recording. Per-species comparisons for this species are unreliable.

1/f timing was not confirmed. The exponents are near zero. Bird inter-onset timing does not show the long-range correlations found in human drumming.

Dawn chorus phase transition only suggestive. 2/9 recordings show rising R. Most started too late to catch the transition. Need pre-dawn recordings.

No statistical significance tests. With n=24 songs and n=21 calls, parametric tests are feasible but were not run. The differences reported are raw differences in group means, with no confidence intervals and no correction for multiple comparisons.
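As one possible way to close this gap, the song-versus-call difference in T could be checked with a permutation test, which makes no distributional assumptions. The sketch below uses the per-recording T values transcribed from the Full Data table in section 7; the function name, resample count, and seed are hypothetical choices, and this is not an analysis the authors ran.

```python
import random
import statistics

# T values per recording, transcribed from the section 7 table.
SONG_T = [0.946, 1.334, 1.200, 0.775, 0.448, 0.292, 1.940, 1.160,
          1.227, 1.408, 1.218, 1.051, 1.291, 0.555, 1.560, 1.202,
          1.096, 0.911, 1.000, 1.202, 0.366, 0.552, 0.473, 0.635]
CALL_T = [0.758, 0.861, 0.823, 0.680, 0.773, 0.726, 0.886, 0.832,
          1.116, 1.066, 0.859, 0.473, 0.852, 0.847, 1.021, 0.330,
          0.817, 0.989, 1.525, 0.686, 0.589]


def permutation_p_value(a, b, n_resamples=10_000, seed=0):
    """Two-sided permutation test on the difference in group means."""
    rng = random.Random(seed)
    observed = statistics.fmean(a) - statistics.fmean(b)
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = (statistics.fmean(pooled[:len(a)])
                - statistics.fmean(pooled[len(a):]))
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (extreme + 1) / (n_resamples + 1)


observed, p_value = permutation_p_value(SONG_T, CALL_T)
print(f"mean difference = {observed:.3f}, permutation p = {p_value:.4f}")
```

This treats recordings as independent, which ignores the species grouping; a fuller analysis would permute within species or fit a mixed model, and would correct for the number of metrics compared.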

Consonance definition is coarse. 5% tolerance around just ratios captures the trend but does not account for harmonic series structure in bird calls.


9. What Was Killed / What Survives

Killed

× 1/f timing at the note-to-note level. Exponents cluster near 0. Human drummers show 0.5–1.0. Birds do not. The inter-onset series is closer to white noise than pink noise. This does not mean bird timing is random — it means the long-range temporal correlations that characterize human groove are not present at this scale.

× Any claim about bird “language.” We measured coupling structure, not semantics. A territory song is not a sentence. An alarm call is not a word. The framework describes energy allocation and temporal structure, not meaning.

Redirected

Alarm = minimum K. We predicted alarm calls would have the lowest K. With 24 alarm recordings across 6 species, alarm K = 0.787 — between songs (0.808) and calls (0.749). The lowest K belongs to dawn chorus (0.704). Alarms are repetitive (tick-tick-tick) because they need to match a stored template. Dawn chorus is many independent voices not yet coupled. The prediction was wrong because it conflated “startling” with “uncoupled.” Startling IS coupling — to a danger template.

Survives

Consonance invariance. Stable across all call types and species in 79 recordings. 100% above 80%. Birds prefer simple frequency ratios. The invariance across call types is the finding — the channel stays the same regardless of the message.

Rubato in songs, metronomic in calls. T = 0.993 (song) vs 0.834 (call). 58% of songs show T > 1.0. Only 19% of calls do. Songs play with time.

Shannon entropy separates types. H = 4.66 (song) vs 3.95 (call). Songs are less predictable. Calls repeat.

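Structure tracks function across 6 species. On average, songs are more complex, richer, and more temporally variable than calls. The complexity (H) and timing (T) differences hold in 5 of 6 species; House Sparrow, with a single call recording, is the exception for both. Spectral richness (R) is less consistent: songs exceed calls in 4 species, tie in Common Blackbird, and fall below calls in Great Tit.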

Suggestive

Dawn chorus phase transition. 2/9 dawn chorus recordings show rising R over time. The others started recording after the transition was already underway. Need pre-dawn recordings to test fully.


10. Open Questions

• Pre-dawn recordings. Start before first vocalization, run through 20 minutes. Does R climb monotonically? Does K increase as voices synchronize?

• Can the coupling-function grouping outperform species grouping in a formal clustering analysis?

• Does consonance vary with habitat (forest vs. open field vs. urban)? If the ear is Landauer-constrained, consonance should be invariant.

• Do species with more complex songs (Song Thrush, Blackbird) show higher K-diversity across their repertoire?

• Alarm template matching. If alarm K is high because it couples to a stored template, does playback of alarm calls in novel contexts still trigger the same response? (Testing the “coupling to template” interpretation.)
