| HN Mirror



	by sbszllr 156 days ago
	Interestingly enough, private inference is possible in theory, e.g. via oblivious inference protocols, but it is prohibitively slow in practice. You can also run a model inside a trusted execution environment (TEE), but again, that is too slow.


ramoz 156 days ago

Modern TEEs are actually performant enough for industry needs these days: over 400,000x faster than zero-knowledge proofs, with only nominal overhead relative to most raw inference workloads.


sbszllr 156 days ago

I agree that it is performant enough for many applications; I work in the field. But it isn't performant enough to run large-scale LLM inference with reasonable latency, especially when we compare the throughput of single-tenant inference inside a TEE with that of batched non-private inference.


ramoz 156 days ago

We just served DeepSeek R1 on this bad boy in CC+TEE (with an integrated signing layer we developed for vLLM).

https://pasteboard.co/k1hjwT7pWI6x.png

Reach out if you're interested in collaborating.
