Mutatis: Autonomous Schema Evolution & Managed Deprecation

I’ve seen a lot of discussion about "Memory Bloat" in RAG systems. In Mutatis, we address this by treating the database schema as a fluid organism that evolves (and de-evolves) based on a combination of Semantic Pattern Detection and Confidence Decay. As the data scales, the system shadow-builds specialized tables for high-confidence entities, shifting query complexity for those entities from O(N) to O(log N).

Here is how we handle the lifecycle of a memory from "Generic" to "Optimized" and back again:

1. SEMANTIC LOGIC VS. REGEX
We don't trigger schema changes on keyword frequency alone. We use an LLM-driven classifier to distinguish Modal Logic (intent) from Foundational Facts.
- Intent: "I wish I lived in Florida" -> Stored as preference in a generic table.
- Fact: "I live in Florida" -> Triggers the evolution pipeline.
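
As a rough illustration of the routing above, here is a minimal Python sketch. The names `route_statement`, `RoutingDecision`, and `stub_classifier` are hypothetical and do not come from Mutatis. The stub uses a few hard-coded phrases purely so the example runs offline; in the described system that slot would be filled by the LLM-driven classifier, not by keyword matching.

```python
from dataclasses import dataclass
from typing import Callable, Literal

Label = Literal["intent", "fact"]


@dataclass
class RoutingDecision:
    label: Label
    target: str               # where the statement ends up
    triggers_evolution: bool  # whether the evolution pipeline should run


def route_statement(text: str, classify: Callable[[str], Label]) -> RoutingDecision:
    """Route a statement according to the classifier's label (hypothetical helper)."""
    label = classify(text)
    if label == "fact":
        # Foundational facts are candidates for a specialized table.
        return RoutingDecision(label, "evolution_pipeline", True)
    # Modal or aspirational statements stay in the generic table as preferences.
    return RoutingDecision(label, "generic_memories", False)


def stub_classifier(text: str) -> Label:
    """Offline stand-in for the LLM classifier. ASSUMPTION: demo only."""
    modal_markers = ("i wish", "i want to", "i hope", "i'd like")
    lowered = text.lower()
    return "intent" if any(marker in lowered for marker in modal_markers) else "fact"
```

Calling `route_statement("I wish I lived in Florida", stub_classifier)` yields a decision that stays in the generic table, while `"I live in Florida"` is marked as triggering evolution. Passing the classifier in as a callable keeps the routing logic independent of whichever model does the labeling.
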
This prevents schema "pollution" from noise or aspirational intent.

2. MENTIONS, DECAY, AND "DE-EVOLUTION"
Schema evolution is a reward for frequently referenced data; deprecation is the penalty for irrelevance.
- Confidence Decay: When contradictory statements are detected (e.g., "I moved to Texas"), the confidence score for the "Florida" schema decays.
- Frequency Thresholds: If an optimized table isn't hit within a specific window, it is flagged for De-Evolution.

3. MECHANISM: SHADOW TABLES & ATOMIC SWAPS
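
The decay and threshold rules from section 2 decide which tables enter this mechanism. A minimal Python sketch of that flagging step follows, under several loud assumptions: dividing confidence by sqrt(2) per contradiction is only one possible reading of "sqrt(2) weighting", and the `min_confidence` cutoff, the 30-day `idle_window`, and all names (`SchemaStats`, `apply_contradiction`, `should_de_evolve`) are hypothetical.

```python
import math
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400


@dataclass
class SchemaStats:
    name: str
    confidence: float  # assumed to live in [0, 1]
    last_hit: float    # timestamp (seconds) of the most recent query hit


def apply_contradiction(stats: SchemaStats) -> None:
    """Decay confidence after a contradictory statement.

    ASSUMPTION: one contradiction divides confidence by sqrt(2),
    so two contradictions halve it.
    """
    stats.confidence /= math.sqrt(2)


def should_de_evolve(
    stats: SchemaStats,
    now: float,
    min_confidence: float = 0.3,                # hypothetical cutoff
    idle_window: float = 30 * SECONDS_PER_DAY,  # hypothetical window
) -> bool:
    """Flag a specialized table that is idle or no longer trusted."""
    idle = (now - stats.last_hit) > idle_window
    distrusted = stats.confidence < min_confidence
    return idle or distrusted
```

With these defaults, a table that starts at confidence 1.0 is flagged after four contradictions (1.0 → about 0.25), or after 30 days without a hit regardless of confidence.
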
To ensure zero downtime, we use a shadow-table migration pattern:
- Selection: A schema is flagged for merging via periodic hygiene checks.
- Shadow Merge: A background transaction copies data from the specialized table back into a generic_memories table.
- Atomic Swap: We drop the specialized table and update the query router in a single atomic transaction.

MANAGED MEMORY LIFECYCLE SUMMARY:
Mechanism | Purpose | Implementation
Mention Decay | Identifies stale data | Rolling counters on hits
Confidence Scoring | Handles contradictions | Drift via sqrt(2) weighting
Hygiene Checks | Prevents schema bloat | Periodic TTL-driven merges
Atomic Swaps | Safe transitions | Transactions + Shadow Tables
Modal Tagging | Filters intent vs fact | Zero-shot categorization

THE BOTTOM LINE:
By allowing the schema to "de-evolve" back into generic tables, we keep O(log N) lookups for relevant data without the overhead of maintaining thousands of stale indices.
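
For readers who want something concrete, the shadow merge and atomic swap from section 3 can be sketched with Python's built-in `sqlite3` module. This is an illustration under assumptions, not the Mutatis implementation: the `query_routes` table standing in for the query router, the `residence_facts` table, the column layout, and the function names are all hypothetical. SQLite is used only because it allows `DROP TABLE` inside a transaction.

```python
import sqlite3

GENERIC = "generic_memories"


def _check_ident(name: str) -> str:
    # Table names cannot be bound as SQL parameters, so validate before interpolating.
    if not name.isidentifier():
        raise ValueError(f"unsafe table name: {name!r}")
    return name


def shadow_merge(conn: sqlite3.Connection, entity: str, table: str) -> None:
    """Copy rows from the specialized table back into the generic table."""
    table = _check_ident(table)
    conn.execute("BEGIN")
    try:
        conn.execute(
            f"INSERT INTO {GENERIC} (entity, user_id, value) "
            f"SELECT ?, user_id, value FROM {table}",
            (entity,),
        )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise


def atomic_swap(conn: sqlite3.Connection, entity: str, table: str) -> None:
    """Drop the specialized table and repoint the router in one transaction."""
    table = _check_ident(table)
    conn.execute("BEGIN IMMEDIATE")
    try:
        conn.execute(f"DROP TABLE {table}")
        conn.execute(
            "UPDATE query_routes SET table_name = ? WHERE entity = ?",
            (GENERIC, entity),
        )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")  # neither the drop nor the reroute takes effect
        raise


def build_demo_db() -> sqlite3.Connection:
    """In-memory database with one specialized table (hypothetical layout)."""
    # isolation_level=None puts the connection in autocommit mode, so the
    # explicit BEGIN/COMMIT statements above control transaction boundaries.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.executescript(
        """
        CREATE TABLE generic_memories (entity TEXT, user_id TEXT, value TEXT);
        CREATE TABLE residence_facts (user_id TEXT, value TEXT);
        CREATE TABLE query_routes (entity TEXT PRIMARY KEY, table_name TEXT);
        INSERT INTO residence_facts VALUES ('u1', 'Florida');
        INSERT INTO query_routes VALUES ('residence', 'residence_facts');
        """
    )
    return conn
```

Running `shadow_merge(conn, "residence", "residence_facts")` followed by `atomic_swap(conn, "residence", "residence_facts")` on the demo database leaves the row in `generic_memories` and the route pointing at the generic table. In this sketch, rows written to the specialized table between the two calls would not be copied, so a real system would need to block or replay those writes.
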