Hi HN, I built a bit-for-bit deterministic analytics engine. It runs classical ML pipelines (normalization → canonical transform → deterministic K-Means) with zero nondeterminism:
• no floating-point divergence
• no randomness
• no environment drift
• no timestamp or locale sensitivity
• Docker-pinned numeric behavior
• reproducible across machines, OSes, and hardware

The OSS drop includes:
• deterministic ingest + normalization
• deterministic K-Means (Iris + Wine)
• golden-reference hashes
• cross-machine reproducibility tests
• 3-machine ingest demo video (direct download: https://github.com/bryanziehl/prima-veritas/releases/downloa...)
• MIT license + full docs + architecture diagrams

If you work in ML, science, infra, or compliance, you already know how painful nondeterministic pipelines are.
This project is a first “Hello World” toward a broader deterministic verification kernel.

Feedback, critique, and reproducibility tests are welcome, especially on different machine architectures.
Happy to answer anything live.
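
For anyone wondering what "deterministic K-Means" can look like in practice, here is a minimal sketch. It is not the project's code. It assumes three things the post does not spell out: centroids are seeded from the first k points instead of random picks, ties go to the lowest cluster index, and centroids are rounded to a fixed number of decimals after every update. The function name and the `decimals` parameter are hypothetical.

```python
import math


def sq_dist(p, q):
    # math.fsum returns a correctly rounded sum, so the result does not
    # depend on the order in which the terms are accumulated.
    return math.fsum((a - b) ** 2 for a, b in zip(p, q))


def deterministic_kmeans(points, k, iters=100, decimals=6):
    """Hypothetical sketch: K-Means with no random choices anywhere."""
    # Assumption: seed with the first k points rather than random samples.
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(iters):
        # min() returns the first minimum, so ties go to the lowest index.
        labels = [
            min(range(k), key=lambda c, p=p: sq_dist(p, centroids[c]))
            for p in points
        ]
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if not members:
                # Keep the old centroid when a cluster ends up empty.
                new_centroids.append(centroids[c])
                continue
            # Assumption: round every coordinate to a fixed precision.
            new_centroids.append(tuple(
                round(math.fsum(col) / len(members), decimals)
                for col in zip(*members)
            ))
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
    print(deterministic_kmeans(pts, 2))
```

Running it twice on the same input in the same order returns the same centroids and labels. Matching results across machines would additionally depend on the runtime and numeric environment being pinned, which is what the Docker bullet above is about.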
=== Prima Veritas OSS — Hash Check (iris) ===
normalized → MATCH
  Expected: EF28EA082C882A3F9379A57E05C929D76E98899E151A6746B07D8D899644372F
  Actual:   EF28EA082C882A3F9379A57E05C929D76E98899E151A6746B07D8D899644372F
kmeans → MATCH
  Expected: DA96D0505BCB1A5A2B826CEB1AA7C34073CB88CB29AE1236006FA4B0F0D74C46
  Actual:   DA96D0505BCB1A5A2B826CEB1AA7C34073CB88CB29AE1236006FA4B0F0D74C46
Hashcheck PASSED — outputs match golden hashes.
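
A golden-hash check like the one above could be sketched roughly as follows. This is an illustration under stated assumptions, not the tool's actual implementation. It assumes the output is serialized as fixed-precision decimal text and hashed with SHA-256; the 64 hex digits above are consistent with SHA-256, but the post does not name the algorithm. All function names and the `decimals` parameter are hypothetical.

```python
import hashlib


def fmt(x, decimals=6):
    # Fixed-precision text; fold "-0.000000" into "0.000000" so a sign bit
    # on zero cannot change the hash.
    s = f"{x:.{decimals}f}"
    if s.startswith("-") and float(s) == 0.0:
        s = s[1:]
    return s


def canonical_bytes(rows, decimals=6):
    # Assumption: one comma-separated row per line, "\n" endings, ASCII only.
    lines = [",".join(fmt(x, decimals) for x in row) for row in rows]
    return ("\n".join(lines) + "\n").encode("ascii")


def golden_hash(rows, decimals=6):
    # Uppercase hex digest, matching the style of the output above.
    return hashlib.sha256(canonical_bytes(rows, decimals)).hexdigest().upper()


def hash_check(rows, expected, decimals=6):
    # Returns (matched, actual_hash); the comparison ignores hex case.
    actual = golden_hash(rows, decimals)
    return actual == expected.upper(), actual


if __name__ == "__main__":
    rows = [[5.1, 3.5], [4.9, 3.0]]
    expected = golden_hash(rows)  # would normally be read from a golden file
    matched, actual = hash_check(rows, expected)
    print("MATCH" if matched else "MISMATCH", actual)
```

Hashing a canonical text form rather than raw float bytes is a design choice in this sketch: it makes the hash insensitive to in-memory layout, at the cost of fixing a precision up front.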
---------
Next step is probably benchmarking this against sklearn: an accuracy comparison, plus the performance hit from all the rounding operations. Anyone here working in maritime auditing, medical data, or other regulated fields: would you actually use something like this? I'm trying to figure out whether crypto-verifiable analytics solves a real problem or is just a cool technical exercise.