FAQ

Questions, answered.

Parity with RAG is our bar, and we measure it head-to-head on the same model, questions, and retriever. On single-document QA, cartridge retrieval has matched and beaten RAG in our tests. Cheaper-but-worse isn't the product — we only ship a config once it's at parity on your corpus.

Only the affected shards retrain — minutes, not a full rebuild. Incremental sync detects changes from your connected sources.

Open-weight LLMs you control (e.g. Qwen3), in your own cloud. Cartridges inject trainable KV into the frozen model — no per-token lock-in to a frontier vendor.

Inside your own cloud account, with no public ingress. Your documents never train a shared model and never leave your perimeter.

Fine-tuning bakes facts into weights — hard to update, prone to hallucination, no sources. Engram Smart CAG keeps knowledge in a swappable, grounded cartridge that updates per-shard.

A one-time training step (self-study + distillation) fanned out across cheap GPUs — and an amortized encoder is bringing that cost down further. After that, serving is flat and cheap, and the one-time cost typically repays within the first few thousand queries versus a frontier-RAG stack.

Still have questions?

We'll run a demo on your own corpus and answer them with your numbers.

Book a demo