How it's built
The Foundation
Open source code. Peer-reviewed research. No magic, just mechanics.
Open Source Core
The core retrieval-augmented generation (RAG) foundation is open source. You can inspect the system, understand its behavior, and extend it to fit your requirements. No proprietary lock-in, no hidden costs.
Core capabilities:
- Access Control — Multi-tenant authentication, user management, session handling
- Document Ingestion — Structured processing and indexing for retrieval
- Query & Retrieval — Context-aware answers with traceable sources
- System Observability — Logging, traceability, and operational metrics
Designed for auditability, long-term maintainability, and infrastructure control.
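The ingest-retrieve-answer flow described above can be sketched in a few lines. This is a hypothetical illustration only: the class and method names (MiniIndex, ingest, retrieve, answer) and the toy term-overlap scoring are assumptions for clarity, not the actual open-source API, which would use proper retrieval (embeddings or BM25) and a language model to generate the final answer.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of ingest -> retrieve -> answer with traceable
# sources; names and structure are assumptions, not the real API.

@dataclass
class Document:
    doc_id: str
    text: str

@dataclass
class Answer:
    text: str
    sources: list = field(default_factory=list)  # traceable source IDs

class MiniIndex:
    def __init__(self):
        self.docs = []

    def ingest(self, doc: Document):
        self.docs.append(doc)

    def retrieve(self, query: str, k: int = 3):
        # Toy relevance: shared lowercase terms. A real system would
        # use embeddings or BM25 here.
        terms = set(query.lower().split())
        scored = [(len(terms & set(d.text.lower().split())), d)
                  for d in self.docs]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [d for score, d in scored[:k] if score > 0]

    def answer(self, query: str) -> Answer:
        hits = self.retrieve(query)
        context = " ".join(d.text for d in hits)
        # A real system would pass `context` to a language model here.
        return Answer(text=context, sources=[d.doc_id for d in hits])

index = MiniIndex()
index.ingest(Document("d1", "Access control uses multi-tenant authentication."))
index.ingest(Document("d2", "Observability covers logging and metrics."))
result = index.answer("How does access control work?")
print(result.sources)
```

The point of the sketch is the shape of the contract: every answer carries the IDs of the documents it was built from, which is what makes retrieval auditable.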
The Appearance of Meaning
Context-Dependence and Semantic Competence in Transformer Architectures
Peer-reviewed philosophical framework.
We operationalize "appearance of meaning" (AoM) in transformer language models as a measurable competence cluster: context-sensitive disambiguation, controlled minimal-pair sensitivity, and discourse-level coherence. We propose a Context-Primacy Thesis (CPT): meaning-relevant behavior is causally governed by token-in-context relational states rather than static lexical carriers.
Key Results (GPT-2 & Qwen2.5):
Supporting: 91–96% disambiguation accuracy (cue-vulnerable; not the main load-bearing result in the paper).
CPT Causal Signature by Layer
Layer-resolved CPT signature under targeted interventions. The sensitivity profile is model-dependent; the critical result is consistent separation from sham/placebo across layers.
Chart: Qwen2.5 (0.5B, 1.5B, 3B). Full AoM + CPT results additionally include GPT-2 (124M). Sham patching is near-zero. SDH target-specificity stress test runs across 8 checkpoints.
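The targeted-versus-sham contrast can be illustrated schematically on a toy network. Everything below is an assumption for exposition (a random feed-forward "model" in place of a transformer, made-up variable names); it is not the paper's code, only the logic of the control: a sham patch re-inserts the run's own activation and should change nothing, while a targeted patch imports an activation from a counterfactual context and should shift the output.

```python
import numpy as np

# Schematic toy illustration of targeted vs. sham activation patching.
# The "model" and all names are assumptions, not the paper's setup.

rng = np.random.default_rng(0)
N_LAYERS, DIM = 4, 8
weights = [rng.normal(size=(DIM, DIM)) for _ in range(N_LAYERS)]

def forward(x, patch_layer=None, patch_state=None):
    """Run the toy network, optionally overwriting one layer's activation."""
    h = x
    states = []
    for i, W in enumerate(weights):
        h = np.tanh(W @ h)
        if i == patch_layer and patch_state is not None:
            h = patch_state  # intervention point
        states.append(h.copy())
    return h, states

x_clean = rng.normal(size=DIM)
x_counter = rng.normal(size=DIM)  # stand-in for a counterfactual context

_, clean_states = forward(x_clean)
_, counter_states = forward(x_counter)

layer = 2
# Targeted patch: import the counterfactual activation at `layer`.
out_patched, _ = forward(x_clean, layer, counter_states[layer])
# Sham patch: re-insert the clean run's own activation (no-op control).
out_sham, _ = forward(x_clean, layer, clean_states[layer])
out_clean, _ = forward(x_clean)

print(np.allclose(out_sham, out_clean))         # sham effect is zero
print(np.linalg.norm(out_patched - out_clean))  # targeted effect is not
```

The near-zero sham baseline is what licenses the causal reading: any separation between the two curves is attributable to the imported context, not to the act of intervening.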
Why this matters for products:
This research establishes that modern language models are genuinely context-sensitive rather than simple pattern replay systems. For products, this means context can be treated as a first-class control surface—something that can be tested, monitored, and constrained instead of assumed.
Detailed methodology and limitations are discussed in the full paper.
Available to qualified readers during peer review.
Language Games and Sedimented Semantics
Temporal Dimensions of Context-Primacy in LLM Agents (t-CPT)
We develop t-CPT: the hypothesis that instruction-following in LLM agents stabilizes through repeated interaction patterns ("procedural sediments"), yet decays with temporal distance and interference. We operationalize this as measurable drift curves under controlled multi-turn stress tests.
Pilot signal under controlled multi-turn stress tests
Metric: threshold disclosure rate — how often internal numeric policy cutoffs are revealed as conversation length increases.
Temporal Drift Curve
Threshold disclosure rate as a function of conversation length. Measurements are obtained via dedicated diagnostic probes, not production usage.
Full methodology available on request.
Methodology: Diagnostic Probes
We run dedicated diagnostic prompts that stress the system under long-context conditions and score outputs against defined disclosure constraints. This works with any model (open-weight or API) because we score outputs, not internal states.
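Because scoring operates on output text alone, the core of such a probe is small. The sketch below is a minimal illustration under loudly stated assumptions: the cutoff value, the function names (discloses, disclosure_rate), and the example transcripts are all fabricated for exposition; a real harness would run many probes per conversation length against a live model.

```python
import re

# Hypothetical sketch of output-only disclosure scoring. The cutoff
# value and the transcripts below are made up for illustration.

SECRET_CUTOFF = "0.75"  # assumed internal numeric policy threshold

def discloses(output: str, cutoff: str = SECRET_CUTOFF) -> bool:
    """Score one model output: does it reveal the cutoff verbatim?"""
    return re.search(re.escape(cutoff), output) is not None

def disclosure_rate(transcripts):
    """Fraction of probe outputs that leak the cutoff.

    Works with any model, open-weight or API, because only the
    output text is scored, never internal states.
    """
    hits = sum(discloses(t) for t in transcripts)
    return hits / len(transcripts)

# Simulated outputs at two conversation lengths (fabricated examples).
short_runs = ["I can't share internal thresholds.",
              "That limit is confidential."]
long_runs = ["As noted, the threshold is 0.75.",
             "Policy: approve above 0.75.",
             "I can't share internal thresholds."]

print(disclosure_rate(short_runs))  # 0.0
print(disclosure_rate(long_runs))
```

Plotting disclosure_rate against conversation length yields the drift curve described above; rising values with length are the failure signature the stress tests are designed to surface.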
Why this matters: Instruction-following reliability can degrade as interactions grow longer or more complex. Our stress tests surface these failure modes early, so deployment decisions are informed by observed behavior—not assumptions.
These tests inform deployment readiness and release decisions in production environments.
Supported by
HessenIdeen · HessianAI · Goethe Unibator · Frankfurt School · Microsoft Founders Hub