21CQ2 conducts mechanistic interpretability research that reveals how transformer models encode content and framing at different layers of processing. We build products that use these findings to detect framing drift in regulatory, financial, and information environments.
A literary translation paradigm establishes that content representations peak at ~50% of network depth. Subspace activation patching demonstrates that a single direction in 8,192-dimensional space captures the content-specific causal effect. Content directions are pair-specific and nearly orthogonal, with a mean pairwise cosine similarity of +0.041.
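As a rough illustration of the orthogonality statistic, the sketch below computes a mean pairwise cosine similarity over a set of direction vectors. The function name and the synthetic random vectors are assumptions for illustration only; they are not the content directions or the code from the paper.

```python
import numpy as np

def mean_pairwise_cosine(directions: np.ndarray) -> float:
    """Mean cosine similarity over all distinct pairs of row vectors."""
    # Normalise each row to unit length so dot products are cosines.
    unit = directions / np.linalg.norm(directions, axis=1, keepdims=True)
    sims = unit @ unit.T
    # Keep only the strict upper triangle: each pair once, no self-pairs.
    upper = np.triu_indices(len(unit), k=1)
    return float(sims[upper].mean())

# Synthetic stand-ins (assumption): random vectors in 8,192 dimensions.
rng = np.random.default_rng(0)
synthetic_directions = rng.standard_normal((10, 8192))
print(mean_pairwise_cosine(synthetic_directions))
```

A value near zero means the directions share almost no common component; a value near one means they largely coincide.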
Framing similarity peaks at layer 25 (~31% of depth) while content peaks at layer 40 (~50%). A null baseline using different topics in the same registers produces the opposite pattern, confirming the signal is topic-dependent framing rather than register similarity. Applied to live RSS feeds, a two-layer extraction pipeline identifies a documented disinformation amplification network with zero false positives across mainstream partisan outlets.
CEO earnings language diverges from the same company’s filed risk disclosures at a structurally consistent rate across seven boom/bust cycles spanning 1999–2026. Drift widens before collapses and narrows when candid management replaces promotional management. Ground-truth validation using actual Nvidia and Agnico Eagle filings confirms the methodology on unmodified public documents.
Cross-lingual comparison replicates the three-phase trajectory from Paper I. Initial results across French, Greek, Russian, and Japanese show divergence gaps exceeding +0.36 between engines that capture semantic intent and those that translate literally.
Compares public language against filed disclosures at the sentence level. Identifies which specific sentences drive framing divergence and classifies the drift type: insertion, reframing, or substitution.
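One way to picture sentence-level comparison is sketched below: each public sentence is matched to its closest filed sentence, and sentences are ranked by how far they sit from that match. This is a minimal sketch under stated assumptions, not the product's implementation. The function name is hypothetical, the input vectors stand in for whatever per-sentence representations are extracted from the model, and classifying the drift type is not covered.

```python
import numpy as np

def rank_sentences_by_drift(public_vecs: np.ndarray, filed_vecs: np.ndarray):
    """Rank public sentences by distance from their closest filed sentence.

    Returns (public_index, best_filed_index, drift) tuples, highest drift first,
    where drift = 1 - cosine similarity to the best-matching filed sentence.
    """
    pub = public_vecs / np.linalg.norm(public_vecs, axis=1, keepdims=True)
    fil = filed_vecs / np.linalg.norm(filed_vecs, axis=1, keepdims=True)
    sims = pub @ fil.T                    # shape: (n_public, n_filed)
    best_match = sims.argmax(axis=1)      # closest filed sentence per public one
    drift = 1.0 - sims.max(axis=1)
    order = np.argsort(-drift)            # most divergent sentences first
    return [(int(i), int(best_match[i]), float(drift[i])) for i in order]

# Toy vectors (assumption): three public sentences, one filed sentence.
public = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
filed = np.array([[1.0, 0.0]])
ranking = rank_sentences_by_drift(public, filed)
```

The sentences at the top of the ranking are the candidates for closer review.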
Deployable on air-gapped infrastructure. The open-weight model downloads once; after that, proprietary documents never leave the room.
Pharmaceutical promotional material compliance. Sentence-level drift detection between marketing copy and approved prescribing information. Validated against 20 FDA OPDP enforcement actions with 100% sensitivity and specificity. Live scans produce findings including data vintage mismatches and mechanism-of-action reframing.
Insurance compliance analysis. Detects framing drift between marketing materials and policy language across commercial lines including BOP, professional liability, cyber, and workers’ compensation. Validated on 24 cases with 100% sensitivity and specificity.
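For readers unfamiliar with the validation metrics quoted for these products, the sketch below shows how sensitivity and specificity are computed from labelled cases. The function name and the example labels are hypothetical; the actual validation sets and their positive/negative split are not reproduced here.

```python
def sensitivity_specificity(labels, predictions):
    """Compute (sensitivity, specificity) from parallel boolean sequences.

    labels[i] is True when case i truly contains a violation;
    predictions[i] is True when the detector flagged case i.
    """
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y and p)          # violations caught
    fn = sum(1 for y, p in pairs if y and not p)      # violations missed
    tn = sum(1 for y, p in pairs if not y and not p)  # clean cases passed
    fp = sum(1 for y, p in pairs if not y and p)      # clean cases flagged
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical labels: two violations, two clean cases, one violation missed.
sens, spec = sensitivity_specificity(
    [True, True, False, False],
    [True, False, False, False],
)
```

Sensitivity is the share of true violations that were flagged; specificity is the share of clean cases that were correctly left unflagged.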
Information operation detection via two-layer residual stream extraction. Content-layer topic matching paired with framing-layer comparison identifies narrative amplification networks. Zero false positives across mainstream partisan outlets in live monitoring.
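The two-layer idea can be sketched as a two-stage filter: first keep only articles whose content-layer vector matches the reference topic, then compare framing-layer vectors among those matches. The layer indices follow the peaks reported in the research summary (content at layer 40, framing at layer 25). Everything else is an assumption for illustration: the function names, both thresholds, the decision rule, and the use of plain vectors in place of real residual stream activations.

```python
import numpy as np

CONTENT_LAYER = 40   # content peak reported in the research summary
FRAMING_LAYER = 25   # framing peak reported in the research summary

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def flag_shared_framing(reference, candidates, topic_min=0.8, framing_min=0.9):
    """Return names of candidates on the same topic AND with near-identical framing.

    reference: dict mapping layer index -> vector for the reference article.
    candidates: dict mapping name -> dict of layer index -> vector.
    topic_min and framing_min are hypothetical thresholds.
    """
    flagged = []
    for name, acts in candidates.items():
        # Stage 1: content-layer topic match.
        if cosine(reference[CONTENT_LAYER], acts[CONTENT_LAYER]) < topic_min:
            continue
        # Stage 2: framing-layer comparison among topic matches only.
        if cosine(reference[FRAMING_LAYER], acts[FRAMING_LAYER]) >= framing_min:
            flagged.append(name)
    return flagged

# Toy two-dimensional stand-ins for layer activations (assumption).
ref = {40: np.array([1.0, 0.0]), 25: np.array([0.0, 1.0])}
feeds = {
    "same_topic_same_framing": {40: np.array([1.0, 0.0]), 25: np.array([0.0, 1.0])},
    "same_topic_other_framing": {40: np.array([1.0, 0.0]), 25: np.array([1.0, 0.0])},
    "other_topic": {40: np.array([0.0, 1.0]), 25: np.array([0.0, 1.0])},
}
flagged = flag_shared_framing(ref, feeds)
```

Gating on topic first keeps the framing comparison from firing on articles that merely share a register while covering unrelated subjects.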