Disk-first vector infrastructure

A serious vector database for teams that do not want to warehouse every embedding in RAM.

LithicDB is a Rust-built vector database centered on a hybrid graph plus quantized payload design: memory stays focused on routing, filtering, and handles, while normalized vectors live on disk. The result is a practical single-node retrieval engine with online writes, deletes, metadata filtering, and benchmarkable ANN behavior.

Positioning: LithicDB is not a clone of a managed vector service. It is built around a different operating assumption: disk is the primary payload tier, memory is the routing tier, and durability plus maintainability matter as much as raw recall curves.
Memory profile: Low-RAM

Payload vectors stay on disk. RAM is spent on centroids, cluster graph edges, metadata postings, numeric indexes, and document handles (a sketch of this split follows these summary cards).

Write model: Online

Insert and delete without rebuilding the entire index. Recovery comes from WAL replay and periodic state snapshots.

Search path: ANN + rerank

Route through the centroid graph, scan candidate clusters with quantized cosine, then exact-rerank from `f32` storage.

Product posture: Single-node alpha

Good for demos, local deployment, product validation, and as a base to harden into a differentiated storage product.
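
To make the memory split concrete, here is a minimal sketch of how the routing tier in RAM and the payload tier on disk could be separated. The type and field names are illustrative assumptions, not LithicDB's actual definitions.

    use std::collections::HashMap;

    /// Everything in this struct is meant to stay in RAM; names are assumptions for the sketch.
    struct RoutingState {
        /// One centroid per cluster, used to route queries (small relative to the payload).
        centroids: Vec<Vec<f32>>,
        /// Adjacency lists over centroids: the compact routing graph.
        graph_edges: Vec<Vec<u32>>,
        /// Exact-match metadata postings: (field, value) -> doc ids.
        metadata_postings: HashMap<(String, String), Vec<u64>>,
        /// Numeric indexes for range filters: field -> sorted (value, doc id) pairs.
        numeric_index: HashMap<String, Vec<(f64, u64)>>,
        /// Document handles: external id -> location of the payload on disk.
        handles: HashMap<u64, DocHandle>,
    }

    /// Where a document's vectors live on disk; the payload itself stays out of RAM.
    struct DocHandle {
        cluster: u32,
        /// Byte offsets into the quantized (`vectors.q8`) and exact (`vectors.f32`) payload files.
        q8_offset: u64,
        f32_offset: u64,
    }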

Evidence block

Verified benchmark runner

The benchmark runner now completes cleanly after the generation-storage fix. This latest verification run is intentionally small and reproducible, demonstrating end-to-end ANN-versus-brute-force measurement from the repo itself.

  • ANN search: 1.36 ms
  • Brute-force search: 0.83 ms
  • Recall@5: 1.00

Verification shape: 1k vectors, 32 dimensions, 5 queries, top-5 retrieval. At this tiny scale brute-force cosine is expected to outrun ANN; the run verifies that the pipeline measures correctly end to end rather than demonstrating a speed advantage. The repo also includes the larger benchmark driver for 100k-vector scale tests covering recall, latency, and memory comparison.
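
For reference, the Recall@5 figure is the overlap between ANN results and brute-force results computed over the same collection state. A minimal sketch of that measurement, with names that stand in for the repo's actual benchmark code rather than reproducing it:

    /// Fraction of brute-force top-k ids that the ANN top-k also returned, averaged over queries.
    /// The helper shape is hypothetical, not the repo's benchmark API.
    fn recall_at_k(ann_results: &[Vec<u64>], exact_results: &[Vec<u64>], k: usize) -> f64 {
        let mut hits = 0usize;
        let mut total = 0usize;
        for (ann, exact) in ann_results.iter().zip(exact_results) {
            for id in exact.iter().take(k) {
                total += 1;
                if ann.iter().take(k).any(|a| a == id) {
                    hits += 1;
                }
            }
        }
        hits as f64 / total as f64
    }

    fn main() {
        // Toy example: one query where ANN missed one of the five true neighbors.
        let ann = vec![vec![1u64, 2, 3, 4, 9]];
        let exact = vec![vec![1u64, 2, 3, 4, 5]];
        println!("recall@5 = {:.2}", recall_at_k(&ann, &exact, 5)); // prints 0.80
    }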

Why this matters

Not just API surface. An actual storage thesis.

LithicDB is differentiated by where it spends memory and how it searches. Most systems can expose CRUD plus search. The harder question is what architectural bet they make. LithicDB bets on disk-first payload storage with a compact routing layer and an exact rerank pass.

  • disk-first payload
  • quantized scan
  • graph routing
  • exact rerank
  • metadata filtering

Why teams would choose LithicDB

Most vector databases implicitly optimize for the fully in-memory case. LithicDB starts from a different constraint: many retrieval systems want predictable single-node cost, can tolerate approximate search plus reranking, and care more about operational simplicity than absolute leaderboard recall.

  • It lowers memory pressure by keeping the full payload on disk.
  • It still supports approximate nearest neighbor search with a real routing structure.
  • It supports online inserts, deletes, filtering, and recovery rather than being a static benchmark toy.
  • It is benchmarkable against brute-force cosine from the exact same collection state.

Product thesis

Who it is for: teams building RAG, semantic search, and recommendation systems that want a credible single-node vector engine they can own and extend.

Why they would choose it: lower memory footprint, transparent architecture, and an operational model that is easier to reason about than a large distributed system.

Tradeoffs: LithicDB accepts that it will not beat mature all-memory HNSW systems on every recall/latency frontier. Its advantage is cost shape, control, and extensibility.

How the engine is laid out

The design intentionally separates routing from payload so the database can search intelligently without paying the all-memory tax.

  • Routing layer: a compact in-memory graph over cluster centroids. Search starts here to reduce the amount of disk payload each query has to touch.
  • Approx layer: candidate vectors are read from `vectors.q8` and scored with quantized cosine to prune aggressively at lower bandwidth and CPU cost.
  • Exact layer: final candidates are reranked from normalized `vectors.f32` storage to recover precision before returning results.
  • Durability layer: WAL frames, snapshots, manifest-published generations, compaction, and background maintenance make the engine restart-safe and extensible.
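
As a rough illustration of the durability layer, a write-ahead log of roughly this shape is enough to make inserts and deletes replayable after a restart. The record layout below is an assumption for the sketch, not LithicDB's actual frame or snapshot format.

    use std::collections::HashMap;

    /// Illustrative WAL records; real frames would also carry checksums and framing bytes.
    enum WalRecord {
        Insert { id: u64, vector: Vec<f32>, metadata: HashMap<String, String> },
        Delete { id: u64 },
    }

    /// Recovery sketch: start from the last snapshot's state, then replay every
    /// WAL record written after that snapshot, in order.
    fn replay(
        state: &mut HashMap<u64, (Vec<f32>, HashMap<String, String>)>,
        log: Vec<WalRecord>,
    ) {
        for record in log {
            match record {
                WalRecord::Insert { id, vector, metadata } => {
                    state.insert(id, (vector, metadata));
                }
                WalRecord::Delete { id } => {
                    state.remove(&id);
                }
            }
        }
    }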

How it compares

  • Why not Pinecone? Use LithicDB when you want a local, inspectable engine you can modify instead of a managed service boundary.
  • Why not Qdrant or Weaviate? Use LithicDB when the product angle is specifically disk-first payload economics and a smaller codebase you can shape around your workload.
  • Why not Milvus? Use LithicDB when you do not want the operational weight of a broader distributed system for early-stage or single-node workloads.
  • Where does it win? Clarity, ownership, low-memory posture, and a credible product kernel for a differentiated vector service.
  • Where does it lose today? Distributed operations, mature query language, and years of tuning held by incumbent engines.

What makes it different

  • Disk-first by design, not as a fallback path after memory is exhausted.
  • Hybrid centroid graph plus quantized block scan instead of trying to mirror standard HNSW posture.
  • Online inserts and deletes with WAL-backed collection state.
  • Exact metadata filtering plus numeric range support (see the filter sketch after this list).
  • Brute-force cosine and ANN comparison live in the same repo for honest evaluation.
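
The filtering bullet above can be made concrete with a small sketch of exact plus numeric-range filter evaluation; the `Filter` type and the string/numeric field split below are illustrative, not the engine's actual API.

    use std::collections::HashMap;

    /// Illustrative filter forms: exact string match and inclusive numeric range.
    enum Filter {
        Eq { field: String, value: String },
        Range { field: String, min: f64, max: f64 },
    }

    /// Candidate document metadata, split into string and numeric fields for the sketch.
    struct DocMeta {
        strings: HashMap<String, String>,
        numbers: HashMap<String, f64>,
    }

    /// A candidate passes only if every filter clause matches.
    fn matches(meta: &DocMeta, filters: &[Filter]) -> bool {
        filters.iter().all(|f| match f {
            Filter::Eq { field, value } => meta.strings.get(field) == Some(value),
            Filter::Range { field, min, max } => meta
                .numbers
                .get(field)
                .map_or(false, |v| v >= min && v <= max),
        })
    }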

Roadmap

LithicDB is already a credible single-node alpha. The next product milestones are about making it operationally sharper and commercially clearer.

  • Beta kernel: segment-based persistence, incremental compaction, and stronger benchmark tooling.
  • Search quality: better quantization, smarter routing heuristics, and tuning surfaces for recall/latency tradeoffs.
  • Operational hardening: richer diagnostics, backup discipline, and safer admin controls.
  • Commercial layer: hosted packaging, auth, quotas, and multi-tenant controls around the single-node core.

Near-term build plan

  • Now: single-node alpha with disk-first ANN, WAL, compaction, filters, benchmarks, and release assets.
  • Next: incremental segments, stronger benchmark suites, and higher-quality ANN tuning.
  • Later: hosted control plane, replication story, and enterprise-grade operational safety.

Search path in one pass

  • Normalize the incoming query vector.
  • Traverse the centroid graph from one or more entry points.
  • Select promising clusters using best-first expansion.
  • Apply exact and numeric metadata filters to candidate docs.
  • Score the reduced set against `q8` compressed payloads.
  • Rerank the best candidates from normalized `f32` disk vectors.
query -> normalize -> graph route over centroids -> filter candidate docs -> q8 approximate cosine -> top-N shortlist -> exact cosine rerank -> results
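
Put together, one query pass can be sketched roughly as below. The function names, the i8 quantization scheme, and the shortlist factor are assumptions made for readability; in the engine the `q8` and `f32` payloads are read from disk rather than handed in as in-memory slices.

    /// Sketch of a single query pass; names and data layout are illustrative only.
    fn normalize(v: &mut [f32]) {
        let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
        if norm > 0.0 {
            v.iter_mut().for_each(|x| *x /= norm);
        }
    }

    /// Approximate cosine of two normalized vectors stored as i8 with a shared scale.
    fn q8_cosine(a: &[i8], b: &[i8], scale: f32) -> f32 {
        let dot: i32 = a.iter().zip(b).map(|(x, y)| *x as i32 * *y as i32).sum();
        dot as f32 * scale * scale
    }

    /// Exact cosine of already-normalized f32 vectors is just the dot product.
    fn exact_cosine(a: &[f32], b: &[f32]) -> f32 {
        a.iter().zip(b).map(|(x, y)| x * y).sum()
    }

    /// A candidate as the routing layer might surface it (payload shown in memory for brevity).
    struct Candidate {
        id: u64,
        q8: Vec<i8>,
        exact: Vec<f32>,
    }

    fn search(
        mut query: Vec<f32>,
        route: impl Fn(&[f32]) -> Vec<Candidate>, // centroid-graph routing + cluster scan
        passes_filters: impl Fn(u64) -> bool,     // exact + numeric metadata filters
        q8_scale: f32,
        k: usize,
    ) -> Vec<(u64, f32)> {
        // 1. Normalize the query and build its quantized form.
        normalize(&mut query);
        let query_q8: Vec<i8> = query
            .iter()
            .map(|x| (x / q8_scale).round().clamp(-127.0, 127.0) as i8)
            .collect();

        // 2. Route over the centroid graph, then drop candidates that fail the filters.
        let candidates: Vec<Candidate> = route(&query)
            .into_iter()
            .filter(|c| passes_filters(c.id))
            .collect();

        // 3. Approximate pass: score with quantized cosine and keep an oversampled shortlist.
        let mut scored: Vec<(f32, Candidate)> = candidates
            .into_iter()
            .map(|c| (q8_cosine(&query_q8, &c.q8, q8_scale), c))
            .collect();
        scored.sort_by(|a, b| b.0.total_cmp(&a.0));
        scored.truncate(4 * k); // the oversampling factor is an arbitrary choice here

        // 4. Exact pass: rerank the shortlist against f32 storage and return the top k.
        let mut reranked: Vec<(u64, f32)> = scored
            .iter()
            .map(|(_, c)| (c.id, exact_cosine(&query, &c.exact)))
            .collect();
        reranked.sort_by(|a, b| b.1.total_cmp(&a.1));
        reranked.truncate(k);
        reranked
    }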

API surface

LithicDB exposes a simple REST API intended for direct product integration and easy local testing.

  • `POST /collections` to create a collection with a fixed dimension.
  • `POST /collections/:name/vectors` to insert vectors with metadata.
  • `DELETE /collections/:name/vectors/:id` to delete by external id.
  • `POST /collections/:name/search` to run ANN search with optional filters.
  • `GET /collections/:name/vectors/:id` to fetch original payload by id.
  • `GET /healthz` for process health.
    # Start the server
    cargo run --release -- \
      --data-dir ./data \
      --bind 127.0.0.1:8080

    # Create a collection
    curl -X POST http://127.0.0.1:8080/collections \
      -H 'content-type: application/json' \
      -d '{ "name":"docs", "dimension":128, "max_cluster_size":256, "graph_degree":8 }'

    # Run the 100k-vector benchmark driver
    cargo run --release --bin benchmark -- \
      --data-dir ./data/bench \
      --vectors 100000 \
      --dimension 128 \
      --queries 200 \
      --k 10
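
As a usage sketch, a search call could look like the following. It assumes the server started above is running locally; the request body fields (`vector`, `k`, `filter`) are guesses at the search schema, which this page does not spell out, and a real query vector must match the collection's dimension (128 in the example above).

    use std::io::{Read, Write};
    use std::net::TcpStream;

    fn main() -> std::io::Result<()> {
        // Hypothetical body: field names are assumptions, not the documented schema.
        let body = r#"{"vector":[0.1,0.2,0.3],"k":5,"filter":{"lang":"en"}}"#;
        let request = format!(
            "POST /collections/docs/search HTTP/1.1\r\n\
             Host: 127.0.0.1:8080\r\n\
             Content-Type: application/json\r\n\
             Content-Length: {}\r\n\
             Connection: close\r\n\
             \r\n\
             {}",
            body.len(),
            body
        );

        let mut stream = TcpStream::connect("127.0.0.1:8080")?;
        stream.write_all(request.as_bytes())?;

        let mut response = String::new();
        stream.read_to_string(&mut response)?;
        println!("{response}");
        Ok(())
    }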

What you can do with it today

  • Run a local retrieval backend for prototype or internal search workloads.
  • Benchmark ANN quality versus brute-force cosine on synthetic data.
  • Use the repo as a product kernel for a disk-first vector service.
  • Extend the engine toward segment compaction, stronger quantization, and replication.