| Just chiming in: I've been down this exact rabbit hole for months (same pain: useful != demo). I ended up ditching the usual RAG + embedding route and built a local semantic engine that uses ΔS as a resonance constraint (yeah, it sounds crazy, but hear me out). It still uses local models (Ollama + GGUF), but instead of just vector search, it enforces semantic logic trees plus memory drift tracking. Main gain: reduced hallucination in summarization, plus actual retention of reasoning across files. Weirdly, the thing that made it viable was getting a public endorsement from the guy who wrote tesseract.js (OCR legend). He called the engine’s reasoning “shockingly human-like”, not in benchmark terms, but in sustained thought flow. I'm still polishing a few parts, but if you’ve ever hit the wall of “why is my LLM helpful but forgetful?”, this might be a route worth peeking into. (Also happy to share the GitHub PDF if you’re curious; it’s more logic notes than launch page.) |