Open Source AI Memory System
Memories crystallize into position based on what they are. Six indexed dimensions. Zero API calls. Fully offline.
SimHash + FTS5 + sentence embeddings. Single SQLite file. Open source.
Drag to rotate the crystal
How It Works
Feed a memory into the crystal. HA5H computes a 64-bit fingerprint, extracts entities, compresses into an inclusion, and indexes across six facets. 3.8ms. No LLM call. No API key. Your data stays local.
Query the crystal and it lights up everything relevant. FTS5 keyword search and SimHash band lookup run in parallel, then hybrid scoring ranks results. 9ms at 5,000 memories. Every result is the original, unmodified text.
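The two parallel legs can be sketched in a few lines. Everything below is illustrative: the table layout, band count, and score weights are assumptions, and the fingerprints are toy constants standing in for real SimHash values. It only assumes an FTS5-enabled SQLite build (the default in modern CPython).

```python
import sqlite3

# Toy schema for the two retrieval legs; HA5H's real tables differ.
db = sqlite3.connect(":memory:")
db.execute("CREATE VIRTUAL TABLE mem USING fts5(text)")   # needs FTS5-enabled SQLite
db.execute("CREATE TABLE bands (mem_id INTEGER, band INTEGER, key INTEGER)")

memories = [
    # (id, original text, toy 64-bit fingerprint)
    (1, "auth provider switched to Clerk", 0x0123456789ABCDEF),
    (2, "benchmark script lives in benchmarks/", 0xFEDCBA9876543210),
]
for mid, text, fp in memories:
    db.execute("INSERT INTO mem(rowid, text) VALUES (?, ?)", (mid, text))
    for band in range(4):                                  # 4 bands x 16 bits
        db.execute("INSERT INTO bands VALUES (?, ?, ?)",
                   (mid, band, (fp >> (band * 16)) & 0xFFFF))

def recall(query, query_fp, k=10):
    scores = {}
    # leg 1: FTS5 keyword search (bm25 is more negative for better matches)
    for rowid, rank in db.execute(
            "SELECT rowid, bm25(mem) FROM mem WHERE mem MATCH ?", (query,)):
        scores[rowid] = scores.get(rowid, 0.0) - rank
    # leg 2: SimHash band lookup pulls near-duplicate fingerprints cheaply
    for band in range(4):
        key = (query_fp >> (band * 16)) & 0xFFFF
        for (mid,) in db.execute(
                "SELECT mem_id FROM bands WHERE band = ? AND key = ?", (band, key)):
            scores[mid] = scores.get(mid, 0.0) + 1.0       # bonus per matching band
    # hybrid scoring: both legs vote into one ranked list
    return sorted(scores, key=scores.get, reverse=True)[:k]

# a query fingerprint that differs from memory 1 in only the top band
top = recall("auth provider", 0x0123456789ABCDEF ^ (1 << 63))
```

Because the two legs run against independent indexes, either one can surface a result the other misses; the band lookup in particular catches near-duplicates that share no query keyword.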
One call generates ~139 tokens of startup context. Identity line plus your top critical memories. Paste into any agent's system prompt. The crystal remembers so the agent doesn't have to.
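The assembly step can be sketched as follows. This is a hypothetical rendering, not ha5h_wake_up's actual output format, and it approximates token cost as one token per word to stay near the budget.

```python
# Hypothetical sketch of startup-context assembly: identity line first,
# then highest-importance memories until the token budget runs out.
def wake_up(identity, memories, budget=139):
    lines, used = [identity], len(identity.split())
    for stars, text in sorted(memories, key=lambda m: -m[0]):
        cost = len(text.split())          # rough ~1 token per word heuristic
        if used + cost > budget:
            break
        lines.append(f"[{stars}*] {text}")
        used += cost
    return "\n".join(lines)

context = wake_up(
    "You are the project agent for acme-api.",   # illustrative identity line
    [(5, "Auth provider is Clerk."), (3, "Benchmarks live in benchmarks/.")],
)
```

The resulting string is plain text, so it can be pasted into any agent's system prompt unchanged.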
The beauty of a crystal isn't the count of its faces. It's whether light passes through cleanly.
Every memory you store gets a fingerprint: a 64-bit SimHash computed from the text itself. Similar memories produce similar fingerprints automatically. No one decides where anything goes. No taxonomy. No filing.
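A minimal SimHash sketch shows why similar texts land near each other: each token votes on every bit, so texts sharing most tokens agree on most bits. The tokenizer and hash here are stand-ins; HA5H's actual weighting may differ.

```python
import hashlib

def simhash64(text: str) -> int:
    """64-bit SimHash: each token votes +1/-1 per bit; majority wins."""
    counts = [0] * 64
    for tok in text.lower().split():
        h = int.from_bytes(hashlib.blake2b(tok.encode(), digest_size=8).digest(), "big")
        for bit in range(64):
            counts[bit] += 1 if (h >> bit) & 1 else -1
    return sum(1 << bit for bit in range(64) if counts[bit] > 0)

def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")

a = simhash64("auth provider switched to Clerk last sprint")
b = simhash64("auth provider switched to Clerk last week")    # near-duplicate
c = simhash64("the benchmark script lives in benchmarks/")    # unrelated
```

Texts that share most tokens produce fingerprints a few bits apart; unrelated texts differ in roughly half their bits. That distance is the whole filing system.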
The crystal has six faces you can look through. Rotate it: see memories by content similarity. Rotate again: see them by when they were true. Again: by who was involved, by importance, by origin, by meaning. Same crystal, six views. Each facet is independently indexed, so any query finds its answer through whichever face catches the light first.
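One plausible single-table layout makes the "independently indexed" claim concrete: one index per facet, any of which can answer a query alone. This schema is a sketch, not HA5H's actual DDL.

```python
import sqlite3

# Hypothetical single-file layout: one row per memory, one index per facet.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE memory (
    id          INTEGER PRIMARY KEY,
    text        TEXT NOT NULL,            -- original, unmodified text
    fingerprint INTEGER NOT NULL,         -- facet 1: 64-bit SimHash
    valid_from  TEXT, valid_until TEXT,   -- facet 2: temporal validity window
    entities    TEXT,                     -- facet 3: entity graph (JSON)
    importance  INTEGER DEFAULT 3,        -- facet 4: 1-5 stars
    origin      TEXT                      -- facet 5: session / project / trigger
);
CREATE INDEX idx_fp   ON memory(fingerprint);
CREATE INDEX idx_time ON memory(valid_from, valid_until);
CREATE INDEX idx_star ON memory(importance);
CREATE INDEX idx_orig ON memory(origin);
-- facet 6 (semantic embeddings) would live in a separate vector table
""")

db.execute(
    "INSERT INTO memory (text, fingerprint, importance, origin) VALUES (?, ?, ?, ?)",
    ("Auth provider is Clerk", 0x1234, 5, "session"),
)
```

A query by importance, by origin, or by validity window each hits its own index, which is what lets any facet "catch the light first".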
As memories accumulate, the crystal grows. Similar memories cluster in the lattice. Contradictions are detected and the older fact is retired. Growth rings mark temporal epochs: sprint boundaries, project phases, context compression events. The structure emerges from the data, never imposed on it.
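The retirement rule can be sketched as: when a new fact about the same subject contradicts an open one, close the older fact's validity window instead of deleting it. The subject key and record shape here are hypothetical stand-ins for HA5H's actual contradiction detector.

```python
from datetime import datetime, timezone

def store_fact(history, subject, text):
    """Append a new fact; retire (not delete) any open fact it contradicts."""
    now = datetime.now(timezone.utc).isoformat()
    for fact in history:
        if (fact["subject"] == subject
                and fact["valid_until"] is None
                and fact["text"] != text):
            fact["valid_until"] = now   # older fact retired, history preserved
    history.append({"subject": subject, "text": text,
                    "valid_from": now, "valid_until": None})

history = []
store_fact(history, "auth_provider", "Auth provider is Auth0")
store_fact(history, "auth_provider", "Auth provider is Clerk")
```

Nothing is lost: the retired fact stays queryable through the temporal facet, which is what makes "when it was true" answerable later.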
The 5 in HA5H is the five-fold quasicrystalline symmetry, the "impossible" structure Dan Shechtman discovered in 1982. The first five facets use SimHash fingerprinting and keyword indexing. The sixth facet adds semantic embeddings: 384-dimension sentence vectors that catch what keywords can't. "Auth provider" finds "Clerk" because the model understands they mean the same thing. Install with pip install ha5h[embeddings] or leave it off. Five facets still work on their own.
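The graceful degradation can be sketched as a guarded import. The model name below is an assumption (all-MiniLM-L6-v2 is a common 384-dimensional choice), and the broad except also covers an installed-but-offline model load; HA5H's actual loader may differ.

```python
# Optional sixth facet: use sentence embeddings when the extra is present,
# otherwise fall back to the five SimHash/keyword facets.
try:
    from sentence_transformers import SentenceTransformer  # pip install ha5h[embeddings]
    _model = SentenceTransformer("all-MiniLM-L6-v2")       # assumed 384-dim model
except Exception:                                          # missing extra, no network, etc.
    _model = None

def semantic_vector(text: str):
    """Return a 384-dim embedding, or None when embeddings are unavailable."""
    if _model is None:
        return None
    return _model.encode(text).tolist()
```

Callers treat None as "skip the semantic leg", so every other facet keeps working unchanged.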
The Architecture
Every memory exists in six-dimensional facet space. Each facet is independently queryable. Click any card to see the implementation.
Semantic fingerprint of what was said
When it was true (validity windows)
Entity graph connecting memories
Importance weight (1–5 stars)
Origin: session, project, trigger
Meaning-level similarity via embeddings
pip install ha5h[embeddings]. Degrades gracefully when not installed.
LongMemEval Benchmark
500 questions. Zero API calls. Zero LLM. Pure SimHash + FTS5 retrieval from a single SQLite file. Benchmark script included in the repo.
By Question Type (R@10)
End-to-End QA (GPT-4o judge)
Overall: 52% → 58.4% · Full analysis
Tested on LongMemEval-S (500 questions, ~115K tokens of conversation history per question). Session-level retrieval: did the correct evidence session appear in the top 10 results? End-to-end QA: Claude generates answers from retrieved memories, GPT-4o judges correctness. Reproducible: python benchmarks/bench_longmemeval.py
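The session-level metric above reduces to a few lines: a question scores a hit when its evidence session id appears among the top 10 retrieved. Field names below are illustrative, not LongMemEval's actual format.

```python
def recall_at_10(retrieved, evidence):
    """Fraction of questions whose evidence session is in the top-10 results.

    retrieved: question id -> ranked list of session ids
    evidence:  question id -> correct session id
    """
    hits = sum(1 for qid, sess in evidence.items()
               if sess in retrieved.get(qid, [])[:10])
    return hits / len(evidence)

score = recall_at_10(
    {"q1": ["s3", "s7"], "q2": ["s1"]},   # q1 retrieves the evidence, q2 misses
    {"q1": "s7", "q2": "s9"},
)
```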
Install
Claude Code
ha5h_crystallize: Store a new memory
ha5h_recall: Search across all 6 facets
ha5h_invalidate: Mark memory as no longer valid
ha5h_lattice_walk: Traverse memory connections
ha5h_wake_up: Generate startup context
ha5h_stats: Crystal statistics