Seshat × CAMS — Cross-Dataset Validation

1. Dataset Overview

Seshat Global History Databank

35 NGAs · 1,494 NGA-century rows · 5,500 years covered

Equinox 2022 GitHub release. SPC = 1st principal component of 8 coded complexity variables (polity population, territory, capital population, hierarchy levels, government, infrastructure, writing, money). MilTech = sum of 6 military technology categories. Century resolution, −3600 to 1900 CE.

Source: Turchin et al., Seshat Global History Databank. seshatdatabank.info

CAMS v2.3 Corpus

38 Societies · 39,351 Node-year records · 45 Series

Annual or 5-year resolution. Λ(t) is the mean cross-node bond strength: the mean over all 28 node pairs of B_ij(t), where B_ij(t) = √(max(V_i+8,0)·max(V_j+8,0)) / 32. Covers CE 5–2026, with most modern datasets starting 1750+.

Extended overlap achieved: New CAMS scoring for Latium Vetus (CE 460–2010, 25-year resolution) gives 14 century-aligned comparison points against Seshat's Latium NGA (CE 500–1800). This is the first Seshat × CAMS comparison with near-Granger-ready sample size. The broader NGA explorer shows where further CAMS reconstruction could extend the analysis to additional NGAs.

2. Latium Vetus — Extended Cross-Validation: CE 460–1800

New CAMS scoring for Latium Vetus covers 62 time-steps at 25-year resolution (CE 460–2010), produced using the CAMS v2.3 rubric. Seshat's Latium NGA is coded at century resolution up to CE 1800, so the two series overlap over CE 500–1800. This yields N = 14 century-level matched observations, the largest Seshat × CAMS overlap yet assembled. The two datasets share no inputs: Seshat uses expert-coded historical variables, whereas CAMS uses a structured LLM complexity rubric.
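The century alignment can be sketched as follows. This is only an illustration: the binning rule (average every 25-year observation whose year falls in the century starting at that mark) is an assumption, since the text does not state how the century averages are formed, and the function name `century_means` and the toy values are hypothetical, not CAMS data.

```python
from collections import defaultdict

def century_means(series, first=500, last=1800):
    """Average (year, value) observations into century bins.

    Assumed rule: an observation in year y belongs to the century
    mark floor(y / 100) * 100; bins outside [first, last] are dropped.
    """
    bins = defaultdict(list)
    for year, value in series:
        century = (year // 100) * 100
        if first <= century <= last:
            bins[century].append(value)
    return {c: sum(v) / len(v) for c, v in sorted(bins.items())}

# Toy values on the 25-year grid (NOT taken from the CAMS corpus).
toy = [(485, 0.9), (510, 0.2), (535, 0.1), (560, 0.3), (585, 0.4), (610, 0.5)]
means = century_means(toy)
print(sorted(means))  # [500, 600]
```

The CE 485 observation falls in the 400s bin and is dropped, so only the centuries that also exist in the Seshat series survive.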

CAMS Λ(t) — Latium Vetus, CE 460–2010 (25-year resolution, 62 observations)

Key events visible in the trajectory: Justinianic collapse (CE 535 nadir, Λ = 0.136), Carolingian rise (CE 785–810 local peak, Λ = 0.507), Black Death trough (CE 1360, Λ = 0.225), Renaissance peak (CE 1460–1510, Λ ≈ 0.598), Sack of Rome shock (CE 1535, Λ = 0.388), post-WWII boom (CE 1960, Λ = 0.586).

Dual-Axis: CAMS Λ(t) vs Seshat SPC — Latium, CE 500–1800 (N = 14)

Century-Level Data Table

Century CE	CAMS Λ(t) avg	Seshat SPC	Seshat MilTech

Correlation (N = 14)

Findings

Pearson r = −0.309, Spearman ρ = 0.000 (N = 14; p ≈ 0.28 for the Pearson estimate): neither a linear nor a rank association between the two measures is detectable over the CE 500–1800 period, so they behave as effectively orthogonal.
Seshat SPC for Latium is remarkably flat (range: 6.39–7.09) across 1,300 years — it tracks slow structural accumulation at century resolution, not coordination quality dynamics.
CAMS Λ(t) varies by a factor of about 4.4 (0.136 to 0.598) over the same period, detecting coordination collapses and recoveries that the structural index does not register.
The Black Death (CE 1360) drives CAMS Λ down to 0.225, well below its Carolingian and Renaissance peaks though still above the Justinianic nadir (0.136); Seshat SPC barely moves (6.55 → 6.52 → 6.52). CAMS is tracking the functional disruption; Seshat is tracking institutional persistence.
This orthogonality is a positive finding, not a failure: the two frameworks are measuring different dimensions of civilisational state — structural endowment (Seshat SPC) vs coordination quality (CAMS Λ). A society can hold high structural complexity while coordination quality collapses, and vice versa.

Phase I context (Rome CE 0–400): The earlier 5-point Rome comparison (Pearson r = 0.78, N = 5) captured a different regime: the terminal decline of a single peak state, in which structural complexity and coordination deteriorated together. The CE 500–1800 Latium Vetus comparison spans 1,300 years and several regime changes (Justinianic collapse, Carolingian recovery, Black Death, Renaissance, Sack of Rome), a richer test environment in which structural scale and coordination quality decouple.

Statistical note: At N = 14 we can compute a meaningful correlation estimate (t = −1.13, df = 12). Granger causality requires N ≥ 15 post-lag — this dataset is one observation short. The 25-year resolution CAMS data (62 points) is Granger-ready for a CAMS-internal analysis; the Seshat century alignment constrains the cross-dataset test to N = 14.
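The t statistic and p-value quoted here follow from r and N alone. The sketch below reproduces them with the standard formula t = r·√(N−2)/√(1−r²) and a numerical integral of the Student t density. The function names are hypothetical and the Simpson integration merely stands in for a statistics library call.

```python
import math

def correlation_t(r, n):
    """t statistic and degrees of freedom for a Pearson r from n pairs."""
    df = n - 2
    return r * math.sqrt(df) / math.sqrt(1 - r * r), df

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    h = abs(t) / steps
    total = t_pdf(0.0, df) + t_pdf(abs(t), df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_mass = total * h / 3          # P(0 <= T <= |t|)
    return 1 - 2 * central_mass

t, df = correlation_t(-0.309, 14)
p = two_sided_p(t, df)
print(round(t, 2), df, round(p, 2))  # -1.13 12 0.28
```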

3. Seshat NGA Explorer — All 35 Natural Geographic Areas

[Interactive explorer panels: NGA selector, metric selector, SPC trajectory, NGA metadata, and an all-NGA view of SPC at maximum (sparkline context).]

4. Proposed Seshat NGA → CAMS Society Mapping

The table below lists proposed correspondences between Seshat NGAs and CAMS societies, ordered by temporal overlap quality. "Overlap" = the intersection of Seshat and CAMS time windows.

Seshat NGA	Best CAMS Match	Seshat Range	CAMS Range	Overlap (N centuries)	Granger-ready?
Latium	Latium Vetus (CE 460–2010)	−3600 → 1800	CE 460–2010	CE 500–1800 (N=14)	Near — N=14 (1 short)
Paris Basin	France (1785–2024)	−3200 → 1700	1785–2024	None (85y gap)	No — no overlap
Kansai	Japan (1850–2025)	−600 → 1800	1850–2025	None (50y gap)	No — no overlap
Middle Yellow River Valley	China (1900–2026)	−2000 → 1900	1900–2026	CE 1900 (N≈1)	No — N<15
Susiana / S. Mesopotamia	Iraq / Iran	−4000 → 1900	1900–2024	CE 1900 (N≈1)	No — N<15
Upper Egypt	No CAMS analog yet	−3600 → 1700	—	—	No match
Deccan / Middle Ganga	India (if extended)	−1500 → 1800	—	—	No match
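The overlap column can be reproduced from the two date ranges in each row. The sketch below assumes that a matched observation is any century mark (a multiple of 100) lying inside both windows and applies the N ≥ 15 threshold quoted elsewhere in this report. It does not model lags, and the helper names are hypothetical.

```python
MIN_N = 15  # Granger threshold quoted in the text; lag handling is not modelled

def century_marks(start, end):
    """Century marks (multiples of 100) inside the closed interval [start, end]."""
    first = -(-start // 100) * 100        # round start up to a multiple of 100
    return list(range(first, end + 1, 100))

def overlap(seshat, cams):
    """Return (matched century marks, gap in years) for two (start, end) windows."""
    lo, hi = max(seshat[0], cams[0]), min(seshat[1], cams[1])
    marks = century_marks(lo, hi) if lo <= hi else []
    return marks, max(lo - hi, 0)

# Ranges copied from the table above: (Seshat window, CAMS window).
pairs = {
    "Latium": ((-3600, 1800), (460, 2010)),
    "Paris Basin": ((-3200, 1700), (1785, 2024)),
    "Kansai": ((-600, 1800), (1850, 2025)),
    "Middle Yellow River Valley": ((-2000, 1900), (1900, 2026)),
}
for name, (seshat, cams) in pairs.items():
    marks, gap = overlap(seshat, cams)
    print(name, len(marks), gap, len(marks) >= MIN_N)
```

Under these assumptions Latium yields 14 marks (CE 500 to 1800), Paris Basin and Kansai yield none with gaps of 85 and 50 years, and Middle Yellow River Valley yields the single mark CE 1900, matching the table.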

Path to Granger: Two routes exist. (1) Extend CAMS backward — reconstruct Latium/Rome at century resolution from BCE to 476 CE using archival sources and the CAMS v2.3 rubric. This would yield N ≈ 40 usable Seshat-aligned centuries. (2) Extend Seshat forward — adapt the Seshat coding rubric to 20th-century polities in the CAMS corpus. Either approach would enable the full bidirectional Granger test (Λ → SPC and SPC → Λ) using granger_stationary_safe() from the validated tool suite.

5. Dataset Gap Visualisation

Temporal Coverage: Seshat NGAs vs CAMS Societies

Each bar shows the time span covered. Seshat NGAs are brown; CAMS societies are blue. The only substantial overlaps are in the Latium/Rome rows: the CE 0–450 window and, with the new Latium Vetus scoring, CE 500–1800. This chart is the core motivation for the proposed backward-reconstruction programme.

6. Methodology

SPC — Seshat Social Complexity Index

SPC is the 1st principal component (PC1) of 8 coded complexity variables from the ImpSCDat sheet (imputed). The 8 components are:

PolPop — polity population
PolTerr — territory
CapPop — capital population
levels — administrative hierarchy levels
government — government specialisation
infrastr — infrastructure
writing — writing/information
money — monetary system

Higher SPC = more structurally complex polity. Range ≈ 2.3–8.3 in this corpus.
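To make "1st principal component" concrete, here is a minimal sketch of how a PC1 score can be obtained from a table of coded variables. It is a generic illustration, not the Seshat pipeline: it assumes the columns are standardised before the decomposition, uses plain power iteration rather than a linear algebra library, and runs on three toy columns instead of the eight listed above. The function name and data are hypothetical.

```python
import math

def first_principal_component(rows, iterations=500):
    """Return (loadings, scores) for PC1 of the column-standardised data."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    z = [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]

    # Correlation matrix of the standardised columns.
    corr = [[sum(z[k][a] * z[k][b] for k in range(n)) / (n - 1)
             for b in range(p)] for a in range(p)]

    # Power iteration converges to the dominant eigenvector (PC1 loadings).
    # The uniform start assumes PC1 is not orthogonal to it, which holds
    # when the variables are positively correlated.
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(corr[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]

    scores = [sum(z[k][j] * v[j] for j in range(p)) for k in range(n)]
    return v, scores

# Toy polities (rows) by toy variables (columns); not Seshat data.
toy = [[1.0, 2.0, 1.0], [2.0, 3.0, 3.0], [3.0, 5.0, 4.0],
       [4.0, 6.0, 6.0], [5.0, 8.0, 7.0]]
loadings, scores = first_principal_component(toy)
print([round(x, 2) for x in scores])
```

Because the toy columns all rise together, every loading is positive and the scores rank the rows from least to most complex, which is the sense in which a higher SPC means a more structurally complex polity.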

Λ(t) — CAMS Cross-Layer Coherence

V_i(t) = C_i + K_i + A_i/2 − S_i
B_ij(t) = √(max(V_i+8,0)·max(V_j+8,0)) / 32
Λ(t) = mean of B_ij(t) over all 28 node pairs

C = Coherence, K = Capacity, A = Abstraction, S = Stress. 8 nodes × 4 metrics; 8 nodes give 8·7/2 = 28 unordered pairs. Bond strength B_ij is the geometric mean of the two nodes' shifted values (V + 8, floored at zero), divided by 32. Λ(t) is the mean coupling across all node pairs.
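The definitions above translate directly into code. The sketch below is a minimal reading of the three formulas, with hypothetical function names and a toy input. It is not the CAMS tool suite, and the admissible ranges of C, K, A and S are not assumed.

```python
import math
from itertools import combinations

SHIFT = 8    # offset added to each node value before flooring at zero
SCALE = 32   # normalising divisor in the bond-strength formula

def node_value(c, k, a, s):
    """V_i = C_i + K_i + A_i/2 - S_i for one node."""
    return c + k + a / 2 - s

def bond_strength(v_i, v_j):
    """B_ij = sqrt(max(V_i+8, 0) * max(V_j+8, 0)) / 32."""
    return math.sqrt(max(v_i + SHIFT, 0) * max(v_j + SHIFT, 0)) / SCALE

def coherence(nodes):
    """Lambda = mean bond strength over all unordered node pairs."""
    values = [node_value(*node) for node in nodes]
    pairs = list(combinations(values, 2))
    return sum(bond_strength(vi, vj) for vi, vj in pairs) / len(pairs)

# Toy input: 8 identical nodes with (C, K, A, S) = (4, 4, 2, 1), so V = 8.
toy_nodes = [(4, 4, 2, 1)] * 8
print(len(list(combinations(range(8), 2))))  # 28
print(coherence(toy_nodes))                  # 0.5
```

The floor at zero means a node with V ≤ −8 contributes a bond strength of 0 to every pair it belongs to, so a single severely stressed node lowers Λ(t) through all seven of its bonds.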

Range in Latium Vetus corpus: 0.136 (CE 535, Justinianic nadir) to 0.598 (CE 1460–1510, Renaissance peak). Rome CE 0–400 corpus ranged 1.31–2.88 under a different scoring rubric.

How the two metrics relate

Both SPC and Λ(t) are designed to capture civilisational complexity, but from different theoretical angles. SPC captures structural endowment — what institutional and material resources a polity possesses. Λ(t) captures coordination quality — how well those resources are being integrated across functional layers.

The finding that Λ(t) declines faster than SPC during Rome's CE 100–400 period is theoretically meaningful: structural endowment persists (armies, roads, laws remain nominally intact) while cross-layer coordination degrades (fiscal stress, elite conflict, provincial fragmentation). This is consistent with the CAMS prediction that coordination failure precedes structural collapse.

The Granger question — whether Λ(t) leads SPC decline, or vice versa — remains open pending sufficient overlap data.