Measuring what language models know about how language works.

We use mechanistic interpretability findings to detect framing drift in information environments.

Paper I — 2026

Causally Functional Content Representations in Transformer Residual Streams

A literary translation paradigm establishes that content representations peak at ~50% of network depth. Subspace activation patching demonstrates that a single direction in 8,192-dimensional space captures the content-specific causal effect. Content directions are pair-specific and nearly orthogonal, with a mean pairwise cosine similarity of +0.041.
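As a rough illustration of the two quantities named above, the sketch below patches one activation along a single direction and computes the mean pairwise cosine similarity of a set of directions. It is a minimal sketch, not the paper's code: the function names are hypothetical, and short toy vectors stand in for 8,192-dimensional residual stream activations.

```python
import math
from itertools import combinations


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def unit(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def patch_along_direction(target, source, direction):
    """Replace the target's component along `direction` with the source's.

    Everything orthogonal to the direction is left untouched, which is
    what makes this a one-dimensional (single-direction) patch.
    """
    d = unit(direction)
    shift = dot(source, d) - dot(target, d)
    return [t + shift * di for t, di in zip(target, d)]


def mean_pairwise_cosine(directions):
    """Average cosine similarity over all distinct pairs of directions."""
    units = [unit(v) for v in directions]
    sims = [dot(a, b) for a, b in combinations(units, 2)]
    return sum(sims) / len(sims)


# Toy usage: patch along the first axis only (hypothetical values).
patched = patch_along_direction([1.0, 2.0, 3.0], [5.0, 0.0, 0.0], [1.0, 0.0, 0.0])
```

A mean pairwise cosine near zero, such as the +0.041 reported above, is what one would expect if each pair's content direction points somewhere largely unrelated to the others.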

Paper II — 2026

Depth-Dependent Dissociation of Content and Framing in Transformer Residual Streams

Framing similarity peaks at layer 25 (~31% of depth) while content peaks at layer 40 (~50%). A null baseline using different topics in the same registers produces the opposite pattern, confirming the signal is topic-dependent framing rather than register similarity. Validated across pharmaceutical, financial, and insurance domains.
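The depth percentages above are consistent with an 80-layer network (25/80 ≈ 31%, 40/80 = 50%). The layer count is inferred from those figures rather than stated here, so treat it as an assumption. A minimal sketch of locating a peak in a per-layer similarity profile and contrasting it with a null baseline, using hypothetical helper names and made-up toy profiles:

```python
def peak_layer(profile):
    """Index of the layer with the highest similarity score."""
    return max(range(len(profile)), key=profile.__getitem__)


def relative_depth(layer, n_layers):
    """Layer index expressed as a fraction of total depth."""
    return layer / n_layers


def contrast(matched, null):
    """Per-layer difference between the matched-topic and null profiles."""
    return [m - n for m, n in zip(matched, null)]


N_LAYERS = 80  # assumption inferred from the percentages quoted above

framing_depth = relative_depth(25, N_LAYERS)  # 0.3125
content_depth = relative_depth(40, N_LAYERS)  # 0.5

# Toy four-layer profiles, invented purely to exercise the helpers.
toy_matched = [0.10, 0.40, 0.90, 0.30]
toy_null = [0.10, 0.35, 0.20, 0.30]
toy_peak = peak_layer(contrast(toy_matched, toy_null))
```

Whether the reported layer numbers are zero- or one-indexed is not stated, so the fractions are approximate either way.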

In Progress

Residual Stream Trajectories as a Machine Translation Quality Metric

Cross-lingual comparison replicates the three-phase trajectory from Paper I. Initial results across French, Greek, Russian, and Japanese show divergence gaps exceeding +0.36 between engines that capture semantic intent and those that translate literally.

Courbis Whittle

Sentence-Level Framing Drift Detection

Pinpoints material divergence from approved disclosures, classifies it, and measures its severity against a cross-domain baseline.
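One way to picture sentence-level scoring is sketched below, under loud assumptions: each sentence is already represented by a vector, drift is taken as one minus cosine similarity to the closest approved sentence, and severity is a z-score against a set of baseline drift values. None of these choices are confirmed as the product's method, and all names and numbers are hypothetical.

```python
import math
from statistics import mean, stdev


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den


def sentence_drift(candidate_vecs, approved_vecs):
    """For each candidate sentence, find the closest approved sentence.

    Returns a list of (approved_index, drift) pairs, where drift is
    1 - cosine similarity to that closest approved sentence.
    """
    results = []
    for vec in candidate_vecs:
        sims = [cosine(vec, ref) for ref in approved_vecs]
        best = max(range(len(sims)), key=sims.__getitem__)
        results.append((best, 1.0 - sims[best]))
    return results


def severity(drift, baseline_drifts):
    """Assumed severity measure: z-score of drift against a baseline sample."""
    return (drift - mean(baseline_drifts)) / stdev(baseline_drifts)


# Toy vectors standing in for sentence representations.
approved = [[1.0, 0.0], [0.0, 1.0]]
candidates = [[1.0, 0.0], [0.2, 1.0]]
scores = sentence_drift(candidates, approved)
```

In this toy case the first candidate matches its approved sentence exactly (zero drift), while the second aligns with the second approved sentence with a small positive drift.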

Open-weight models. Deployable on local, air-gapped infrastructure.

Contact for pilot — info@21cq2.com

Courbis CLINICAL

Pharmaceutical promotional material compliance. Sentence-level drift detection between marketing copy and approved prescribing information. Validated against FDA OPDP enforcement actions. Live scans produce findings including data vintage mismatches and mechanism-of-action reframing.

Pharma MLR — Validated

Courbis UNDERWRITE

Insurance compliance analysis. Detects framing drift between marketing materials and policy language across commercial lines including BOP, professional liability, cyber, and workers’ compensation.

Insurance Compliance — Validated

Courbis INTEL

Information operation detection via two-layer residual stream extraction. Content-layer topic matching paired with framing-layer comparison identifies narrative amplification networks. Zero false positives across mainstream partisan outlets in live monitoring.
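The two-layer idea can be sketched as a two-stage filter: first keep document pairs whose content-layer vectors indicate the same topic, then compare their framing-layer vectors. This is a minimal sketch under assumptions; the thresholds, the tuple layout, and the reading that shared framing on a shared topic is the signal of interest are all illustrative and not taken from the product.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den


def flag_pairs(docs, content_min=0.8, framing_min=0.8):
    """Return name pairs that match on topic and also share framing.

    `docs` is a list of (name, content_vec, framing_vec) tuples.
    Both thresholds are hypothetical placeholders.
    """
    flagged = []
    for (n1, c1, f1), (n2, c2, f2) in combinations(docs, 2):
        same_topic = cosine(c1, c2) >= content_min
        same_framing = cosine(f1, f2) >= framing_min
        if same_topic and same_framing:
            flagged.append((n1, n2))
    return flagged


# Toy documents: a and b share topic and framing, c shares only the
# topic, and d is on a different topic altogether.
docs = [
    ("a", [1.0, 0.0], [1.0, 0.0]),
    ("b", [0.9, 0.1], [0.95, 0.05]),
    ("c", [1.0, 0.05], [0.0, 1.0]),
    ("d", [0.0, 1.0], [1.0, 0.0]),
]
pairs = flag_pairs(docs)
```

Gating on topic first means two outlets with similar framing are only paired when they are covering the same topic, so similar framing on unrelated topics is never flagged.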

Information Integrity — Validated

21CQ2 LLC
Sparta, New Jersey

Contact

info@21cq2.com

Research

github.com/21CQ2