Hacker News new | ask | show | jobs
by tome 3231 days ago
What operations can I do homomorphically with this library? The page says

"With the cloud-keyset, the library can evaluate a net-list of binary gates homomorphically at a rate of about 50 gates per second per core, without decrypting its input. It suffices to provide the sequence of gates, as well as ciphertexts of the input bits. And the library computes ciphertexts of the output bits."

but what does "evaluating a net list of binary gates" come to in practice? What operations could I expect to be able to perform?

4 comments

"The library supports the homomorphic evaluation of the 10 binary gates (And, Or, Xor, Nand, Nor, etc…), as well as the negation and the Mux gate."

So you'd program it by designing a digital circuit using AND, OR, and NOT gates, somewhat similar to how you would make a circuit with physical components.

You have millions, maybe billions of these gates in your CPU, each capable of doing millions of calculations for each tick of your homomorphic gate, so the "fast" in the title should be taken with a grain of salt. Your homomorphic circuit could have a noticeable delay adding two 64-bit numbers.

https://en.wikipedia.org/wiki/Logic_gate#Symbols

https://tfhe.github.io/tfhe/tuto-cloud.html

It's fast relative to the previous state of the art in homomorphic encryption. But the path to practical applications is always paved with incremental improvements.
Why would we expect to need a full modern CPU? An 8080 had thousands, and that's still general purpose. Something built for a specific task might well be substantially smaller. Still slow, but possibly realistic for some tasks.
It's a long way from even an 8080. The 8080 had a couple thousand gates and ran at 2+MHz. At 50 gates per second, this would run an 8080 at maybe 20 millihertz.
Yeah, I didn't mean to imply that it'll happily emulate an 8080 at useful speeds. I wanted to illustrate the orders of magnitude between "minimal something useful" and "a modern cpu", and an 8080 is a reasonably familiar point on that spectrum.

Anything you can do in 10 or 20 cycles of an 8080, you can now do to secret data in a few hours. And you can probably do much, much faster than that if you design a special purpose circuit. It's still sloooow by the computational standards we're used to. But it might be fast enough that there are useful applications.

Makes sense. I am definitely excited to see something that even vaguely approaches practicality, even if it's a long way from megaflop territory.
I wonder then how implementable this is in an FPGA for accelerating the whole thing? This could be a killer use of the FPGA instances on AWS and similar cloud services.
Well, no, other way around more likely - the most practical way of building programs for these systems would probably be through VHDL/Verilog with a specialized toolchain.

(edit: although actually FPGAs could be useful simply because they can be very good at running FFTs (as are GPUs), not because of the gate-programmable nature of them)

HDLs have no special advantage in evaluating boolean operations in simulation. Most simulators are significantly slower than native code unless they use compiler infrastructure used by mainstream languages and even then they still have the penalty of maintaining the simulation environment which serves no purpose in this application.
Is is not simply that the best way we have of describing complex systems made out of logic gates is with these languages?
They don't describe logic gates directly. They are used to formulate boolean expressions which are mapped into gates or an equivalent structure by synthesis software. Outside of special circumstances, no one is manually crafting individual gates when writing with an HDL. that is left to the back end tooling. If the target isn't hardware then normal programming languages are just as good. There is no magic speedup when HDLs are simulated and you just pay the penalty of keeping track of simulated time and sequencing of events that don't matter if you just want to compute an answer.
> What operations can I do homomorphically with this library?

Basically anything. If you can generate a netlist of gates of your HE CPU, you can write a program to compute it (perhaps slowly, though).

I haven't studied this much, but to speculate, it sounds like each gate is a boolean expression (AND, OR, XOR, NOT). So this lets you compute boolean functions on bits. I don't see a way to compute a loop, though, other than perhaps to unroll it and send lots of instructions.

Also, 50ms per bit operation, though apparently an improvement, sounds extremely inefficient compared to normal programming.

In general it's much easier to reason about Boolean circuits because they always have an answer, unlike arbitrary programs that may never halt, for example, so you see a lot of theoretic results focusing on Boolean circuits (sometimes with limitations on the depth) instead.

Early on it was thought that considering circuits instead of regular programs could result in "polynomial with advice" algorithms for NP-complete problems, but results such as the Karp-Lipton theorem have shown this to be unlikely.

The current version of the library achieves 13ms per binary operation on a core-i7 instead of 50ms in the original paper.

The improvements are described in the new paper: https://eprint.iacr.org/2017/430. Hopefully, the packing techniques will be also present in the next version of TFHE.

You can compute a loop by essentially building a CPU out of gates. But you're right that this would be impractically slow even with this latest improvement in performance.
It's not clear to me that this is still "impractically slow" for all purposes. An 8080 is a few thousand gates. What can you do on an 8080 in 100 cycles? You can now do that in about an hour on secret data by naively simulating the chip. That actually sounds like it might make some things practical that weren't before...
It might. But I can't think of any.
Nor I, TBH, but I've not been thinking about it very hard or very long. It just surprised me that we're no longer talking about months or years, where things are more obviously impractical for just about everything.
You can compute anything, given enough time of course.

The CPU running on your device boils down to a netlist describing a bunch of gates and their interconnections. This is somewhat of an oversimplification, of course.