Core problem: for browser-local inference loops, TVM/ORT-style stacks can be hard to inspect and iterate on.
Thesis:
- Keep development and runtime build-free (JS + WGSL only)
- Make execution choices explicit, benchmarkable, traceable, and profilable
Solution:
- Explicit model/kernel-path control
- Browser + CLI contract parity (Node must run WebGPU, or use a headless browser with WebGPU)
- Reproducible phase benchmarks vs Transformers.js (v4)
- Artifacts: https://github.com/clocksmith/doppler/tree/main/benchmarks/v...
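To make the parity point concrete, here is a rough sketch of how a shared entry point might decide where to run. It is not taken from the repo: the runner labels and the `hasWebGPU` / `chooseRunner` helpers are hypothetical, and the only real API it relies on is `navigator.gpu`, the WebGPU entry point.

```javascript
// Hypothetical runner labels, for illustration only (not from the project).
const RUNNERS = Object.freeze({
  NATIVE: "native-webgpu",
  HEADLESS: "headless-browser",
});

// True when the given global object exposes navigator.gpu.
// Passing the environment in keeps the check testable outside a browser.
function hasWebGPU(env = globalThis) {
  return Boolean(env && env.navigator && env.navigator.gpu);
}

// Run in-process when WebGPU is available, otherwise fall back to
// driving a headless browser that has WebGPU.
function chooseRunner(env = globalThis) {
  return hasWebGPU(env) ? RUNNERS.NATIVE : RUNNERS.HEADLESS;
}

console.log(chooseRunner({ navigator: { gpu: {} } })); // native-webgpu
console.log(chooseRunner({}));                          // headless-browser
```

Whichever runner is chosen, the idea is that both paths accept the same inputs and report the same outputs, so a benchmark run from the CLI is comparable to one run in the browser.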
Request: if this is flawed or pointless, please tell me exactly where or why.
If I'm flawed or pointless, well... technical feedback only please =)