01 The problem
Standing up retrieval-augmented generation usually means running infrastructure: a vector database, an embedding service, glue code to keep them in sync. That's fine for a platform team, but it's heavy when what you actually want is to hand someone a small searchable knowledge base and have it behave identically on their machine, in CI, and on a server with no network.
The constraints I set were portability, reproducibility, and offline-first operation. A capsule built today should give the same answers tomorrow, on another OS, without phoning home. And it should be one artifact you can copy, check into a release, or email.
02 What I built
RagCap packages everything (source documents, chunks, and their
embeddings) into a single SQLite file. It's CLI-first:
build a capsule from a folder or a recipe,
inspect its stats, search it,
ask it a question, diff two capsules,
serve it over HTTP, or export to
Parquet / FAISS / HNSW. The same engine ships as
RagCap.Core and RagCap.Export NuGet
libraries for embedding in .NET apps.
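To make the single-file idea concrete, here is a minimal sketch of how sources, chunks, and embeddings could share one SQLite database. The schema, table names, and helper functions are hypothetical and are not RagCap's actual layout. Vectors are packed as little-endian float32 so the bytes read back identically on any OS.

```python
import sqlite3
import struct

# Hypothetical schema for illustration only; not RagCap's real capsule layout.
SCHEMA = """
CREATE TABLE sources (id INTEGER PRIMARY KEY, path TEXT NOT NULL);
CREATE TABLE chunks (
    id INTEGER PRIMARY KEY,
    source_id INTEGER NOT NULL REFERENCES sources(id),
    text TEXT NOT NULL,
    embedding BLOB NOT NULL
);
"""

def create_capsule(path):
    """Create a new capsule file (or an in-memory one) with the sketch schema."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn

def add_chunk(conn, source_path, text, vector):
    """Store one source, one chunk, and its embedding in the same file."""
    cur = conn.execute("INSERT INTO sources (path) VALUES (?)", (source_path,))
    # Explicit little-endian float32 keeps the blob byte-identical across platforms.
    blob = struct.pack(f"<{len(vector)}f", *vector)
    conn.execute(
        "INSERT INTO chunks (source_id, text, embedding) VALUES (?, ?, ?)",
        (cur.lastrowid, text, blob),
    )
    conn.commit()

def load_chunks(conn):
    """Return (text, vector) pairs in insertion order."""
    rows = conn.execute("SELECT text, embedding FROM chunks ORDER BY id").fetchall()
    return [
        (text, list(struct.unpack(f"<{len(blob) // 4}f", blob)))
        for text, blob in rows
    ]
```

Passing a file path instead of `":memory:"` to `create_capsule` yields one file that holds everything, which is the property that makes a capsule copyable and versionable.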
It supports three embedding providers: a bundled local ONNX
model (all-MiniLM-L6-v2), OpenAI, and Azure OpenAI.
Search runs hybrid by default (BM25 + vector), with optional
sqlite-vec indexing and MMR re-ranking for
relevance-vs-diversity control. Configuration resolves in a
predictable order: CLI flags override environment variables,
which override the global config file.
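The precedence order above can be sketched as a lookup across three layers. This is an illustration of the idea, not RagCap's code: the function name, the setting name `provider`, and the use of plain dictionaries for each layer are all assumptions.

```python
def resolve_setting(name, cli_flags, env_vars, config_file):
    """Return the first non-None value, checking the highest-priority layer first.

    Each layer is a plain dict here (an assumption for the sketch):
    CLI flags, then environment variables, then the global config file.
    """
    for layer in (cli_flags, env_vars, config_file):
        value = layer.get(name)
        if value is not None:
            return value
    return None
```

For example, with `provider` set to `"azure"` in the environment and `"local"` in the config file, the call returns `"azure"` unless a CLI flag supplies another value.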
03 Key decisions & tradeoffs
-
One SQLite file as the unit of distribution
Sources, chunks, and vectors live together in a single capsule. Copy it, version it, run it anywhere SQLite runs.
Tradeoff Not built for billion-vector web scale. It's a deliberate exchange of raw scale for portability and reproducibility.
-
Local provider as the default
The ONNX model ships inside the tool, so build and ask work with no API key and no network, giving you privacy and offline use out of the box.
Tradeoff A larger package and lower-quality embeddings than a frontier API. Flags always let you switch to OpenAI or Azure when quality matters more than isolation.
-
CLI-first, library-second
Every capability is a scriptable subcommand before it's an API surface, which makes RagCap natural to drop into CI and automation.
Tradeoff More care spent on argument design and cross-platform quoting (Windows .dll vs .so/.dylib) than a library-only tool would need.
-
Hybrid search by default, tunable on demand
BM25 plus vector retrieval ships as the default; MMR re-ranking, candidate pools, and score modes are there when you need to tune precision against diversity.
Tradeoff More knobs to document, but sensible defaults mean the common path stays one command.
-
Dual license from day one
AGPL-3.0 for open use, with a separate commercial license for teams that can't adopt AGPL terms.
Tradeoff More upfront licensing work, in exchange for being genuinely open while leaving a sustainable commercial path.
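The MMR re-ranking mentioned under hybrid search trades relevance against diversity. Below is a minimal sketch of the standard greedy MMR selection over a candidate pool. It assumes each candidate already carries a combined relevance score (for instance from BM25 plus vector similarity); the function names, the `lam` parameter, and the tuple layout are illustrative and do not describe RagCap's internals.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def mmr(candidates, k, lam=0.5):
    """Greedy maximal marginal relevance.

    candidates: list of (chunk_id, relevance, vector) tuples.
    lam=1.0 ranks purely by relevance; lower values penalize chunks that
    are similar to ones already selected.
    Returns up to k chunk ids in selection order.
    """
    remaining = list(candidates)
    selected = []
    while remaining and len(selected) < k:
        def mmr_score(item):
            _, relevance, vector = item
            # Redundancy is the highest similarity to anything already picked.
            redundancy = max((cosine(vector, s[2]) for s in selected), default=0.0)
            return lam * relevance - (1.0 - lam) * redundancy
        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return [chunk_id for chunk_id, _, _ in selected]
```

With two near-duplicate chunks at the top of the pool, `lam=1.0` returns both, while `lam=0.5` swaps the second for a less similar chunk. That is the precision-versus-diversity dial described above.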
04 Outcome
RagCap is open source on GitHub and installs as a .NET global
tool (dotnet tool install -g RagCap.CLI.Tool) or as
prebuilt, checksum-verified binaries for Windows, Linux, and
macOS (Intel and Apple Silicon) for environments without the .NET
SDK. The result is a retrieval pipeline that travels as a single
artifact and answers the same way everywhere.