Since 1969, a recurring argument in cognitive science, AI, and philosophy of language has held that distributed statistical computation is insufficient, in principle, for genuinely structured, context-sensitive linguistic competence. The specific formulations differ: Minsky and Papert argued from combinatorial limits; Fodor insisted on a language of thought; Harnad demanded grounding in perception; Bender and Koller, most recently, argued that form alone cannot yield meaning. But beneath these varied framings lies a shared conclusion: something more than distributional, context-conditioned computation is necessary for structured linguistic competence.
Meanwhile, transformer language models have become demonstrably competent language users — passing professional examinations, producing text that native speakers routinely judge as coherent and context-responsive, and exhibiting systematic linguistic competence across scales and domains. This is no longer seriously disputed. What remains contested is what governs this competence and whether it warrants semantic characterization.
Making the question tractable
In a paper currently under review at the Journal of Logic, Language and Information, I introduce a framework called appearance of meaning (AoM), a controlled competence profile that isolates three properties measurable under contextual variation:

- context-sensitive disambiguation;
- selective sensitivity to meaning-altering edits over matched shams;
- discourse-level constraint tracking beyond length-matched controls.
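To make the second of these properties concrete, here is a minimal sketch of a sham-controlled margin measurement in the spirit of AoM. It is illustrative only: the prompts, the edit pair, the continuation, and the margin definition are hypothetical stand-ins chosen for exposition, not the paper's templates or protocol.

```python
# Minimal sketch of a sham-controlled preference-margin measurement.
# Illustrative only: prompts, the edit pair, and the margin definition
# are hypothetical stand-ins, not the paper's templates or protocol.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def continuation_logprob(prompt: str, continuation: str) -> float:
    """Total log-probability the model assigns to `continuation` after `prompt`."""
    prompt_ids = tok(prompt, return_tensors="pt").input_ids
    cont_ids = tok(continuation, return_tensors="pt").input_ids
    input_ids = torch.cat([prompt_ids, cont_ids], dim=1)
    with torch.no_grad():
        logits = model(input_ids).logits
    # Positions prompt_len-1 .. end-1 are the ones that predict the continuation tokens.
    log_probs = logits[:, prompt_ids.shape[1] - 1 : -1].log_softmax(dim=-1)
    return log_probs.gather(2, cont_ids.unsqueeze(-1)).sum().item()

base    = "She deposited the check at the bank. The bank"
meaning = "She sat down on the river bank. The bank"         # meaning-altering edit
sham    = "She deposited the cheque at the bank. The bank"   # matched sham edit

target = " approved the loan"
m_edit = continuation_logprob(base, target) - continuation_logprob(meaning, target)
m_sham = continuation_logprob(base, target) - continuation_logprob(sham, target)
# Selective sensitivity predicts |m_edit| to be large while |m_sham| stays near zero.
print(f"meaning-edit margin: {m_edit:+.3f}   sham margin: {m_sham:+.3f}")
```

The sham edit changes surface form while preserving meaning, so selective sensitivity predicts a large meaning-edit margin alongside a near-zero sham margin.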
AoM does not define the boundary of transformer competence. It is deliberately small, templated, and designed for causal intervention. The point is to make one specific mechanistic question answerable: with weights held fixed, which internal variables causally control meaning-like preference patterns?
The answer, established via sham-controlled activation patching across GPT-2 and Qwen2.5 checkpoints, is that contextualized token-in-context states at characteristic depths exert donor-directed causal control over AoM-relevant preference margins. Depth profiles are structured and reproducible. This is not the trivial architectural observation that transformers compute contextual embeddings. It is a selective empirical finding: under the reported protocol, specific states at specific depths control the competence profile, sham no-op controls remain near zero, and nearby matched positional controls are weaker.
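For readers unfamiliar with the method, the sketch below shows the shape of such an intervention: cache a token-in-context state from a donor prompt, overwrite the matching state in a recipient prompt, and compare against a sham run that re-inserts the recipient's own state. The layer index, prompts, and logit readout are illustrative choices of mine, not the protocol reported in the paper.

```python
# Schematic of donor-directed activation patching with a sham control.
# Layer index, prompts, and readout are illustrative, not the paper's protocol.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()
LAYER, POS = 6, -1  # patch the final token's hidden state at one mid depth

def run(prompt: str, patch_state: torch.Tensor | None = None):
    """Forward pass; optionally overwrite the hidden state at (LAYER, POS)."""
    cache = {}
    def hook(module, inputs, output):
        hidden = output[0]
        cache["state"] = hidden[:, POS].clone()   # token-in-context state
        if patch_state is not None:
            hidden[:, POS] = patch_state          # donor-directed intervention
        return (hidden,) + output[1:]
    handle = model.transformer.h[LAYER].register_forward_hook(hook)
    with torch.no_grad():
        logits = model(**tok(prompt, return_tensors="pt")).logits
    handle.remove()
    return logits[0, -1], cache["state"]

donor     = "She sat down on the river bank. The bank"
recipient = "She deposited the check at the bank. The bank"

_, donor_state         = run(donor)                       # cache the donor's state
clean, recipient_state = run(recipient)                   # clean baseline
patched, _             = run(recipient, donor_state)      # real patch
sham, _                = run(recipient, recipient_state)  # sham: re-insert own state

tid = tok(" approved").input_ids[0]  # a continuation favored by the clean context
print("clean:", clean[tid].item(), " patched:", patched[tid].item(), " sham:", sham[tid].item())
```

If the patched logit moves toward the donor context's preference while the sham run matches the clean baseline, the targeted state is doing genuine causal work; that is the qualitative pattern behind donor-directed control with near-zero sham baselines.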
The sufficiency claim
The key point is this. These are text-only, next-token-predictive transformers trained on linguistic data. They have no grounding channel, no embodiment, no explicit symbolic rule system, and no reference mechanism. Yet they realize a controlled competence profile exhibiting systematic context-sensitive behavior. And causal intervention identifies the internal variables that govern it.
Contextualized token-in-context states — computed by a purely distributional architecture without grounding, embodiment, symbolic rules, or explicit reference mechanisms — are sufficient to causally govern a competence profile exhibiting systematic context-sensitive disambiguation, selective sensitivity to meaning-altering interventions over matched shams, and discourse-level constraint tracking beyond length-matched controls.
The empirical contribution is not merely that the competence profile is realized without those further properties. It is that activation patching identifies specific contextualized token-in-context states that causally govern that profile under intervention. That is the mechanistic result: we know not only that the competence is present, but which internal variables control it.
The philosophical consequence is separate and logically clean. Let M be the proposition that the tested competence profile is realized, and let X be any proposed extra requirement: grounding, embodiment, symbolic rules, explicit reference.
- The experiments establish M ∧ ¬X: the tested competence profile is realized without X.
- If X were necessary for M, then M → X would hold.
- M ∧ ¬X is a counterexample to M → X. Therefore ¬(M → X).
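The inference is elementary propositional logic, but since the argument's force rests on it, here is the one-line check in Lean, with M and X as arbitrary propositions:

```lean
-- From M and ¬X it follows that M → X is false: if M → X held,
-- applying it to the realized M would yield X, contradicting ¬X.
example (M X : Prop) (hM : M) (hnX : ¬X) : ¬(M → X) :=
  fun h => hnX (h hM)
```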
The mechanistic result gives this logical point its force. Without it, M ∧ ¬X would be a bare observation — the competence exists, the extra ingredient is absent. With the causal-governance finding, we know what does govern the profile: specific contextual states at specific depths, identified under controlled intervention. That is what makes the counterexample substantive rather than merely observational.
The upshot: grounding, embodiment, symbolic rules, and explicit reference may still contribute to or enrich linguistic competence — but they are not required for the competence established here.
What this changes
This is a controlled empirical counterexample. It does not settle whether transformer competence constitutes meaning proper. It does not show that grounding, reference, or normativity are irrelevant to richer semantic capacities. What it does show is that distributed statistical computation can, in principle, underwrite a controlled profile of structured, context-sensitive linguistic competence of the kind isolated here.
An advocate of stronger requirements may still defend them. But that advocate must now concede that grounding, reference, and symbolic structure are not necessary for the controlled competence profile tested here — and must specify what additional competence signatures would distinguish meaning-with-X from meaning-like-competence-without-X in a way that yields detectable empirical consequences. The burden of argument has shifted.
Objections worth handling
*"The AoM profile is narrow, templated, and hand-built."* Yes. Deliberately so. The goal is not to win by definition. It is to isolate a competence profile precise enough for causal intervention. A necessity claim fails if even one genuine counterexample exists for the target competence it is supposed to cover.
*"Causal control over preference margins is not meaning."* Correct. That is exactly why the claim is restricted. I am not inferring "meaning" from causal control. I am inferring that a meaning-like competence profile is causally governed by internal states in a purely distributional architecture. That is enough to rebut necessity claims about what such architectures cannot do.
*"The training data was produced by grounded, embodied speakers, so grounding enters indirectly."* Also correct. But that does not rescue the stronger necessity thesis. The relevant point is architectural: the model has no online grounding channel, no embodiment, no explicit symbolic rule system, and no explicit reference mechanism. Yet the competence profile is realized, and its governing variables can be causally identified. For the restricted competence profile isolated here, the stronger necessity claim fails.
*"This establishes only the controlled profile, not linguistic competence in general."* Mostly true. The broader extrapolation remains a hypothesis, not a result. The controlled case, however, is now established, and parsimony suggests the same explanation extends further; the final section takes that up.
What comes next
Parsimony favors extending the same mechanistic explanation more broadly. The same architecture, the same forward pass, and the same class of internal variables plausibly support both AoM performance and the wider range of linguistic abilities that transformers clearly display in practice. But that extension remains an empirical hypothesis rather than a result established here. The paper reports the controlled case. The broader generalization is next.