Show HN: I built a deterministic vector DB kernel in Rust using fixed-point math
4 hours ago by varshith17
While building a vector database in Python (valori: https://pypi.org/project/valori/), I ran into a bug that took longer than I want to admit.

The same embeddings, indexed on macOS and Windows, would sometimes produce slightly different nearest-neighbor results. Nothing dramatic, but enough to break tests and make snapshots non-reproducible.

At first I assumed it was my code. Then I realized the problem wasn’t my logic at all. It was floating-point math.

Different CPUs, different compilers, FMA instructions, and subtle differences in IEEE 754 behavior meant that “the same” computation wasn’t actually the same everywhere.
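
A minimal sketch of one such effect, not taken from valori: floating-point addition is not associative, so anything that changes the order of a sum (vectorization, a different library, a different loop shape) can change the low bits of the result. The function names here are hypothetical, and the integer version is included only for contrast.

```rust
/// Sums the values left to right: ((0 + a) + b) + c ...
fn sum_forward(values: &[f64]) -> f64 {
    values.iter().fold(0.0, |acc, &v| acc + v)
}

/// Sums the same values right to left.
fn sum_backward(values: &[f64]) -> f64 {
    values.iter().rev().fold(0.0, |acc, &v| acc + v)
}

/// Integer sums, left to right. Integer addition is associative,
/// so the order cannot change the result (as long as nothing overflows).
fn int_sum_forward(values: &[i64]) -> i64 {
    values.iter().fold(0, |acc, &v| acc + v)
}

/// Integer sums, right to left.
fn int_sum_backward(values: &[i64]) -> i64 {
    values.iter().rev().fold(0, |acc, &v| acc + v)
}
```

With the inputs 0.1, 0.2, 0.3 the two float sums differ in the last bit, while the integer sums agree for any ordering. Whether reordering is what actually bit the macOS and Windows builds is an assumption; it is just the easiest member of this family to reproduce.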

That sent me down a rabbit hole.

The idea:

Instead of trying to fight floating-point nondeterminism, I asked a simpler question:

What if the core of a vector system never used floats at all?

So I built a small kernel in Rust that uses fixed-point arithmetic (Q16.16) for all vector math. No floats in the core. No randomness. No architecture-dependent behavior.
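
For readers who have not used Q16.16 before, here is a rough sketch of what such arithmetic can look like. It is not the kernel's actual code: the type name Fixed, the rounding choices, and the i128 accumulator are all assumptions made for illustration. The float conversion uses std and is meant to sit at the boundary, outside the integer-only core.

```rust
use core::ops::Mul;

/// Hypothetical Q16.16 value: 16 integer bits and 16 fractional bits in an i32.
/// The name `Fixed` is illustrative, not the kernel's API.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct Fixed(i32);

impl Fixed {
    /// Number of fractional bits.
    pub const FRAC_BITS: u32 = 16;

    /// Quantize a float at the system boundary. Scaling by 2^16 is exact in
    /// f64, `round` picks the nearest integer, and Rust's float-to-int `as`
    /// cast saturates out-of-range values and maps NaN to 0.
    pub fn from_f32(x: f32) -> Fixed {
        Fixed((f64::from(x) * 65536.0).round() as i32)
    }

    /// Raw bit pattern, suitable for hashing or serialization.
    pub fn to_bits(self) -> i32 {
        self.0
    }
}

impl Mul for Fixed {
    type Output = Fixed;

    /// Widen to i64 (a Q32.32 product), shift back down (rounds toward
    /// negative infinity), and wrap if the result does not fit in 32 bits.
    fn mul(self, rhs: Fixed) -> Fixed {
        let wide = i64::from(self.0) * i64::from(rhs.0);
        Fixed((wide >> Self::FRAC_BITS) as i32)
    }
}

/// Dot product as a raw Q32.32 value. Accumulating in i128 avoids overflow
/// for any realistic vector length, and integer addition gives the same sum
/// in any order. Assumes both slices have the same length.
pub fn dot(a: &[Fixed], b: &[Fixed]) -> i128 {
    a.iter()
        .zip(b)
        .map(|(x, y)| i128::from(x.0) * i128::from(y.0))
        .sum()
}
```

The trade-off is range and precision: Q16.16 covers roughly -32768 to 32768 in steps of 1/65536, so embeddings need to be scaled or normalized to fit before they are quantized.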

Same inputs → same state → same snapshot → same search results. Always.

What the kernel does

Fixed-point vectors (Q16.16)

Deterministic insert, search, and replay

Snapshot + restore with bit-identical output

No RNG anywhere

No dependency on CPU floating-point behavior
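
To make the list above concrete, here is one way deterministic search and bit-identical snapshots could be sketched. This is a guess at the shape, not the kernel's implementation: Record, search, and snapshot are hypothetical names, the search is brute force, and the byte layout is invented for the example. It uses Vec, so a no_std build would need the alloc crate.

```rust
/// Hypothetical stored record: a caller-assigned id plus Q16.16 components
/// kept as raw i32 bits.
pub struct Record {
    pub id: u64,
    pub vector: Vec<i32>,
}

/// Squared Euclidean distance on raw Q16.16 values. The difference of two
/// i32 values can need 33 bits, so the square is computed in i128.
/// Assumes both slices have the same length.
fn dist_sq(a: &[i32], b: &[i32]) -> i128 {
    a.iter()
        .zip(b)
        .map(|(&x, &y)| {
            let d = i128::from(x) - i128::from(y);
            d * d
        })
        .sum()
}

/// Brute-force k-nearest search. Sorting by (distance, id) is a total order
/// when ids are unique, so ties always resolve the same way.
pub fn search(records: &[Record], query: &[i32], k: usize) -> Vec<u64> {
    let mut scored: Vec<(i128, u64)> = records
        .iter()
        .map(|r| (dist_sq(&r.vector, query), r.id))
        .collect();
    scored.sort_unstable();
    scored.into_iter().take(k).map(|(_, id)| id).collect()
}

/// Serialize records in insertion order with explicit little-endian
/// encoding, so the bytes do not depend on the host's endianness.
pub fn snapshot(records: &[Record]) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&(records.len() as u64).to_le_bytes());
    for r in records {
        out.extend_from_slice(&r.id.to_le_bytes());
        out.extend_from_slice(&(r.vector.len() as u64).to_le_bytes());
        for &c in &r.vector {
            out.extend_from_slice(&c.to_le_bytes());
        }
    }
    out
}
```

The two details doing the work are the explicit tie-break on id and the fixed byte order. Without them, equal-distance neighbors or a big-endian host could produce different output from the same inputs.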

The kernel is no_std-friendly and is meant to be embedded under different layers (node server, Python client, edge devices).

Indexes and embeddings live outside the kernel. The kernel is just a deterministic memory engine.

Why this matters

Most vector databases optimize for throughput and scale. That’s fine.

But there’s a whole class of problems where reproducibility, auditability, and replayability matter more:

Edge devices (robots, drones)

Offline or intermittently connected systems

Long-running agents that need stable memory

Systems where “almost the same” isn’t good enough

This project is my attempt to explore that design space.

The code is early and opinionated, but the determinism is real.

Happy to hear feedback, skepticism, or pointers to similar work I should study.
