Experimental Seshat Equinox 2022 CAMS v2.3 Cross-Dataset

Seshat × CAMS — Cross-Dataset Validation

Comparing Seshat's Social Complexity Principal Component (SPC) against CAMS cross-layer coherence Λ(t) for overlapping societies. This page tests whether two independently constructed civilisational complexity measures agree on the shape of historical trajectories.

1. Dataset Overview
Seshat: 35 NGAs · 1,494 NGA-century rows · 5,500 years covered

Equinox 2022 GitHub release. SPC = 1st principal component of 8 coded complexity variables (polity population, territory, capital population, hierarchy levels, government, infrastructure, writing/information, money; listed in full under Methodology). MilTech = sum of 6 military technology categories. Century resolution, −3600 to 1900 CE.

Source: Turchin et al., Seshat Global History Databank. seshatdatabank.info

CAMS: 38 societies · 39,351 node-year records · 45 series

Annual or 5-year resolution. Λ(t) = mean cross-node bond strength, i.e. the mean over all 28 node pairs of Bij(t), where Bij(t) = √(max(Vi+8,0)·max(Vj+8,0)) / 32. Covers CE 5 – 2026, with most modern datasets starting in 1750 or later.

Dataset gap: Seshat ends at 1900 CE (most NGAs at 1700–1800), while CAMS modern datasets begin in 1750 or later. The only society with multi-century temporal overlap is Rome / Latium (CE 0–450); China and Iraq / Iran meet Seshat only at the single 1900 CE row (see the mapping table in section 4). The Latium/Rome comparison below is therefore the primary cross-validation case; the broader NGA explorer shows where future CAMS reconstruction could extend the analysis.

2. Latium × Rome: Cross-Validation (CE 0–450)

Seshat codes Latium as its NGA for the Roman civilizational phase. CAMS has an independent 5-year-resolution dataset for Rome (CE 10–450) scored by Claude Sonnet 4.5 using the CAMS v2.3 rubric. Both series are century-averaged for alignment. This comparison is fully independent: Seshat uses coded historical variables; CAMS uses an LLM-based complexity rubric — no shared inputs.
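The century-averaging step can be sketched as follows. This is a minimal illustration, assuming each CAMS observation is assigned to the century that contains it (so CE 10–99 maps to the CE 0 bin); the function name and the binning rule are assumptions, not taken from the CAMS tooling.

```python
from collections import defaultdict


def century_average(series):
    """Average (year, value) pairs into century bins.

    Assumption: a year belongs to the century that contains it,
    labelled by the century start (e.g. 10..99 -> 0, 400..450 -> 400).
    """
    bins = defaultdict(list)
    for year, value in series:
        bins[(year // 100) * 100].append(value)
    # Return bins in chronological order with their mean value.
    return {century: sum(vals) / len(vals) for century, vals in sorted(bins.items())}


# A 5-year-resolution series spanning CE 10-450 collapses to five bins.
placeholder = [(year, 1.0) for year in range(10, 451, 5)]  # dummy values
print(list(century_average(placeholder)))
```

Under this binning rule a CE 10–450 series yields the bins CE 0, 100, 200, 300 and 400, which matches the N = 5 quoted for the Latium/Rome comparison.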

Century (CE) | CAMS Λ(t) | Seshat SPC | Seshat MilTech
  • Both series peak at CE 100 — the height of the Flavian/Trajanic Roman Empire. Seshat SPC = 7.79 (maximum); CAMS Λ = 2.88 (maximum).
  • Both decline monotonically from CE 100 to CE 400. CAMS Λ drops 55% (2.88 → 1.31); Seshat SPC drops only 1.4% (7.79 → 7.68) in the same window. CAMS registers internal coherence loss much faster than the structural complexity index.
  • Pearson r = 0.78, Spearman ρ = 0.70 (N=5; not statistically significant — see caveat).
  • The gradient divergence is the substantive finding, although it rests on only five observations: structural complexity (Seshat SPC: population, territory, institutions, infrastructure) persists longer than cross-layer coordination (CAMS Λ). This is consistent with the "hollow state" phenomenon in late Roman historiography.
Statistical caveat: N = 5 century-averaged observations. The Pearson and Spearman coefficients are directionally informative but carry almost no statistical power at this sample size. Granger causality requires N ≥ 15 after lag removal. A proper test would require either a CAMS reconstruction of Rome at century resolution back into the BCE period or a forward extension of Seshat-style coding into the 19th–20th centuries.
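To make the small-sample caveat concrete, the sketch below computes Pearson and Spearman coefficients in plain Python and adds an exact permutation p-value. The helper names are illustrative, ties are not handled, and the numbers in the usage lines are placeholders rather than the Latium/Rome rows.

```python
from itertools import permutations
from math import sqrt


def pearson(x, y):
    """Pearson correlation (assumes neither input is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """Ranks 1..n; this sketch assumes there are no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    for rank, idx in enumerate(order, start=1):
        out[idx] = float(rank)
    return out


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def exact_permutation_p(x, y, stat=pearson):
    """Two-sided p-value from enumerating every ordering of y."""
    observed = abs(stat(x, y))
    perms = list(permutations(y))
    hits = sum(1 for p in perms if abs(stat(x, list(p))) >= observed - 1e-12)
    return hits / len(perms)


# Placeholder series, NOT the values from the comparison table.
a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [2.0, 1.0, 4.0, 3.0, 5.0]
print(spearman(a, b), exact_permutation_p(a, b, stat=spearman))
```

With N = 5 there are only 120 orderings, so even a perfect rank correlation cannot achieve a two-sided permutation p-value below 2/120 ≈ 0.017. That is why the coefficients reported above are best read as directional only.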
3. Seshat NGA Explorer — All 35 Natural Geographic Areas
4. Proposed Seshat NGA → CAMS Society Mapping

The table below lists proposed correspondences between Seshat NGAs and CAMS societies, ordered by temporal overlap quality. "Overlap" = the intersection of Seshat and CAMS time windows.

Seshat NGA | Best CAMS Match | Seshat Range | CAMS Range | Overlap (N centuries) | Granger-ready?
Latium | Rome (recalculated) | −3600 → 1800 | CE 10–450 | CE 0–400 (N=5) | No (N<15)
Paris Basin | France (1785–2024) | −3200 → 1700 | 1785–2024 | None (85y gap) | No (no overlap)
Kansai | Japan (1850–2025) | −600 → 1800 | 1850–2025 | None (50y gap) | No (no overlap)
Middle Yellow River Valley | China (1900–2026) | −2000 → 1900 | 1900–2026 | CE 1900 (N≈1) | No (N<15)
Susiana / S. Mesopotamia | Iraq / Iran | −4000 → 1900 | 1900–2024 | CE 1900 (N≈1) | No (N<15)
Upper Egypt | No CAMS analog yet | −3600 → 1700 | No match
Deccan / Middle Ganga | India (if extended) | −1500 → 1800 | No match
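The overlap and gap figures in the table follow from intersecting the two time windows. A minimal sketch, assuming each window is an inclusive (start, end) pair of years with BCE as negative numbers; the function name and return shape are illustrative:

```python
def window_overlap(seshat, cams):
    """Intersect two inclusive (start_year, end_year) windows.

    Returns the shared window, or None plus the size of the gap in years.
    """
    start = max(seshat[0], cams[0])
    end = min(seshat[1], cams[1])
    if start <= end:
        return {"overlap": (start, end), "gap_years": 0}
    return {"overlap": None, "gap_years": start - end}


# Windows taken from the mapping table.
print(window_overlap((-3600, 1800), (10, 450)))     # Latium vs Rome
print(window_overlap((-3200, 1700), (1785, 2024)))  # Paris Basin vs France
print(window_overlap((-2000, 1900), (1900, 2026)))  # Middle Yellow River Valley vs China
```

Counting how many century rows fall inside a shared window additionally depends on how CAMS years are binned into centuries, which this sketch deliberately leaves out.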
Path to Granger: Two routes exist. (1) Extend CAMS backward — reconstruct Latium/Rome at century resolution from BCE to 476 CE using archival sources and the CAMS v2.3 rubric. This would yield N ≈ 40 usable Seshat-aligned centuries. (2) Extend Seshat forward — adapt the Seshat coding rubric to 20th-century polities in the CAMS corpus. Either approach would enable the full bidirectional Granger test (Λ → SPC and SPC → Λ) using granger_stationary_safe() from the validated tool suite.
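A quick readiness check shows why the current overlap falls short and the proposed reconstruction would not. This is a sketch only: the threshold of 15 comes from the statistical caveat above, while the lag orders are hypothetical choices and the helpers are not part of granger_stationary_safe().

```python
MIN_USABLE_N = 15  # threshold quoted in the statistical caveat


def usable_observations(n_obs, max_lag):
    """Rows left once the first max_lag rows are consumed as lagged predictors."""
    return max(n_obs - max_lag, 0)


def granger_ready(n_obs, max_lag, min_n=MIN_USABLE_N):
    """True if enough observations remain after lag removal."""
    return usable_observations(n_obs, max_lag) >= min_n


print(granger_ready(5, max_lag=1))   # current Latium/Rome overlap
print(granger_ready(40, max_lag=2))  # proposed backward reconstruction
```

If the series also had to be differenced to reach stationarity, each round of differencing would cost one further observation.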
5. Dataset Gap Visualisation

Each bar shows the time span covered. Seshat NGAs are brown; CAMS societies are blue. The only vertical overlap is in the CE 0–450 window (Latium/Rome). This chart is the core motivation for the proposed backward-reconstruction programme.

6. Methodology

SPC is the 1st principal component (PC1) of 8 coded complexity variables from the ImpSCDat sheet (imputed). The 8 components are:

  • PolPop — polity population
  • PolTerr — territory
  • CapPop — capital population
  • levels — administrative hierarchy levels
  • government — government specialisation
  • infrastr — infrastructure
  • writing — writing/information
  • money — monetary system
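For readers unfamiliar with how a first principal component is obtained, the sketch below standardises a small table and extracts PC1 by power iteration. It is an illustration of the general technique, not the Seshat pipeline: the function names, the z-score standardisation and the sign convention are all assumptions, and the input is a toy table.

```python
from math import sqrt


def standardise(rows):
    """Z-score each column (assumes no column is constant)."""
    n, k = len(rows), len(rows[0])
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    sds = [sqrt(sum((v - m) ** 2 for v in c) / (n - 1)) for c, m in zip(cols, means)]
    return [[(r[j] - means[j]) / sds[j] for j in range(k)] for r in rows]


def first_pc(rows, iters=500):
    """Return (loadings, scores) of PC1 via power iteration."""
    z = standardise(rows)
    n, k = len(z), len(z[0])
    # Correlation matrix of the standardised variables.
    cov = [[sum(r[a] * r[b] for r in z) / (n - 1) for b in range(k)] for a in range(k)]
    w = [1.0 / (j + 1) for j in range(k)]  # arbitrary non-uniform start vector
    for _ in range(iters):
        nxt = [sum(cov[a][b] * w[b] for b in range(k)) for a in range(k)]
        norm = sqrt(sum(v * v for v in nxt))
        w = [v / norm for v in nxt]
    if sum(w) < 0:  # sign convention (assumption): loadings sum to a positive value
        w = [-v for v in w]
    scores = [sum(r[j] * w[j] for j in range(k)) for r in z]
    return w, scores


# Toy table with two perfectly correlated variables (hypothetical values).
loadings, scores = first_pc([[1, 2], [2, 4], [3, 6], [4, 8]])
print(loadings, scores)
```

Raw PC1 scores computed this way are centred on zero, so the SPC values quoted on this page presumably include a rescaling that the sketch does not attempt to reproduce.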

Higher SPC = more structurally complex polity. Range ≈ 2.3–8.3 in this corpus.

Vi(t) = Ci + Ki + Ai/2 − Si
Bij(t) = √(max(Vi+8,0)·max(Vj+8,0)) / 32
Λ(t) = mean over all 28 node pairs of Bij(t)

C = Coherence, K = Capacity, A = Abstraction, S = Stress. 8 nodes × 4 metrics. Bond strength Bij is the geometric mean of the two nodes' shifted, zero-clipped values, max(Vi+8, 0) and max(Vj+8, 0), divided by 32. Λ(t) = mean coupling across all node pairs.
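The three formulas translate directly into code. The sketch below implements them as written; the function names are illustrative and the node scores in the usage lines are hypothetical, chosen only to make the arithmetic easy to follow.

```python
from itertools import combinations
from math import sqrt


def node_value(c, k, a, s):
    """Vi = Ci + Ki + Ai/2 - Si for one node."""
    return c + k + a / 2 - s


def bond_strength(vi, vj):
    """Bij = sqrt(max(Vi+8, 0) * max(Vj+8, 0)) / 32."""
    return sqrt(max(vi + 8, 0) * max(vj + 8, 0)) / 32


def coherence(nodes):
    """Lambda = mean of Bij over all unordered node pairs.

    nodes: list of (C, K, A, S) tuples; 8 nodes give C(8, 2) = 28 pairs.
    """
    values = [node_value(*n) for n in nodes]
    pairs = list(combinations(values, 2))
    return sum(bond_strength(vi, vj) for vi, vj in pairs) / len(pairs)


# Hypothetical scores: eight identical nodes with V = 8 + 8 + 16/2 - 0 = 24.
nodes = [(8, 8, 16, 0)] * 8
print(coherence(nodes))
```

Clipping at zero means that any node with Vi ≤ −8 contributes a bond strength of zero to every pair it belongs to, so a single collapsed node lowers Λ(t) across seven of the 28 pairs.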

Range in Rome corpus: 0.73 (CE 500, terminal decline) to 2.97 (CE 100, peak).

Both SPC and Λ(t) are designed to capture civilisational complexity, but from different theoretical angles. SPC captures structural endowment — what institutional and material resources a polity possesses. Λ(t) captures coordination quality — how well those resources are being integrated across functional layers.

The finding that Λ(t) declines faster than SPC during Rome's CE 100–400 period is theoretically meaningful: structural endowment persists (armies, roads, laws remain nominally intact) while cross-layer coordination degrades (fiscal stress, elite conflict, provincial fragmentation). This is precisely the CAMS prediction: coordination failure precedes structural collapse.

The Granger question — whether Λ(t) leads SPC decline, or vice versa — remains open pending sufficient overlap data.