| HN Mirror


by blintz 1137 days ago

Concrete is really impressive and permissively licensed. The ML library has an FHE version of (a subset of) scikit-learn, which I honestly didn't expect to see for another 5+ years. Like, look at this example:

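    # X_train, X_test, y_train are assumed to come from an earlier train/test split
    from concrete.ml.sklearn import LogisticRegression

    # Now we train in the clear and quantize the weights
    model = LogisticRegression(n_bits=8)
    model.fit(X_train, y_train)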

    # We can simulate the predictions in the clear
    y_pred_clear = model.predict(X_test)

    # We then compile on a representative set 
    model.compile(X_train)

    # Finally we run the inference on encrypted inputs !
    y_pred_fhe = model.predict(X_test, fhe="execute")

    print("In clear  :", y_pred_clear)
    print("In FHE    :", y_pred_fhe)
    print(f"Similarity: {int((y_pred_fhe == y_pred_clear).mean()*100)}%")

There's still some way to go on performance, but the ergonomics of using FHE are already pretty good!
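
For readers wondering what the `n_bits=8` in the example refers to: FHE circuits of this kind operate on integers, so float weights have to be mapped to a small integer range first. The following is a rough sketch of plain uniform quantization, assuming a simple min/max scheme; the helper names `quantize` and `dequantize` are hypothetical, and Concrete ML's actual quantizer may well differ.

```python
def quantize(values, n_bits=8):
    """Map floats to integers in [0, 2**n_bits - 1] (hypothetical helper)."""
    lo, hi = min(values), max(values)
    levels = 2**n_bits - 1
    # Guard against a zero range when all values are equal.
    scale = (hi - lo) / levels if hi > lo else 1.0
    return [round((v - lo) / scale) for v in values], scale, lo


def dequantize(q_values, scale, lo):
    """Approximate inverse of quantize()."""
    return [q * scale + lo for q in q_values]


weights = [-1.25, 0.0, 0.4, 2.75]
q_weights, scale, offset = quantize(weights, n_bits=8)
approx = dequantize(q_weights, scale, offset)

print("quantized:", q_weights)
print("max error:", max(abs(a - w) for a, w in zip(approx, weights)))
```

Fewer bits mean smaller integers for the encrypted circuit to handle, at the cost of a larger rounding error, which is why the example compares clear and FHE predictions at the end.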

3 comments

binoua 1135 days ago

Thank you! The Python version is quite clear as well. Still from the README:

    from concrete import fhe

    def add(x, y):
        return x + y

    compiler = fhe.Compiler(add, {"x": "encrypted", "y": "encrypted"})
    inputset = [(2, 3), (0, 0), (1, 6), (7, 7), (7, 1), (3, 2), (6, 1), (1, 7), (4, 5), (5, 4)]

    print(f"Compiling...")
    circuit = compiler.compile(inputset)

    print(f"Generating keys...")
    circuit.keygen()

    examples = [(3, 4), (1, 2), (7, 7), (0, 0)]
    for example in examples:
        encrypted_example = circuit.encrypt(*example)
        encrypted_result = circuit.run(encrypted_example)
        result = circuit.decrypt(encrypted_result)
        print(f"Evaluation of {' + '.join(map(str, example))} homomorphically = {result}")

That one is more for non-ML computations.


sigmoid10 1137 days ago

Isn't this basically just unnecessary overhead if you can do a forward pass with encrypted weights? Like, what are you still protecting with encryption at that point?


eyegor 1137 days ago

You're protecting your inputs and outputs. If you have a model that's designed to run on sensitive data but you don't have the compute power to run it locally, what do you do? Putting the model on a cloud provider means their system would see your sensitive data, which may be unacceptable for contractual or legal reasons. This lets you send the inputs encrypted and receive the outputs encrypted, then decrypt the outputs in your weak but trusted environment.
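
A minimal sketch of that flow, not Concrete's API: it swaps in a toy Paillier scheme, which is only additively homomorphic (enough for a linear score with cleartext integer weights), whereas Concrete is built on a different, fully homomorphic scheme. The primes are far too small for real security, and all function names (`keygen`, `encrypt`, `decrypt`, `server_linear_score`) are hypothetical.

```python
import math
import secrets

# Two well-known Mersenne primes. Toy-sized: NOT secure.
P, Q = 2**31 - 1, 2**61 - 1


def keygen(p=P, q=Q):
    """Return (public modulus n, private key) for Paillier with g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # inverse of lam mod n (Python 3.8+)
    return n, (lam, mu)


def encrypt(n, m):
    """Client side: encrypt a (possibly negative) integer m."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random blinding factor
        if math.gcd(r, n) == 1:
            break
    return ((1 + (m % n) * n) * pow(r, n, n2)) % n2


def decrypt(n, priv, c):
    """Client side: decrypt and map the result back to a signed integer."""
    lam, mu = priv
    m = ((pow(c, lam, n * n) - 1) // n * mu) % n
    return m - n if m > n // 2 else m


def server_linear_score(n, enc_x, weights, bias):
    """Server side: computes Enc(w . x + b) without ever seeing x."""
    n2 = n * n
    acc = (1 + (bias % n) * n) % n2  # trivial encryption of the bias
    for c, w in zip(enc_x, weights):
        # Multiplying ciphertexts adds plaintexts; exponentiation scales them.
        acc = (acc * pow(c, w % n, n2)) % n2
    return acc


# Weak but trusted client: generates keys and encrypts its sensitive input.
n, priv = keygen()
x = [10, 4, -1]
enc_x = [encrypt(n, v) for v in x]

# Untrusted server: holds the model in the clear, sees only ciphertexts.
weights, bias = [3, -2, 5], 7
enc_score = server_linear_score(n, enc_x, weights, bias)

# Client decrypts the result locally.
score = decrypt(n, priv, enc_score)
print(score)  # 3*10 - 2*4 + 5*(-1) + 7 = 24
```

The server only ever handles `enc_x` and `enc_score`; the private key never leaves the client. A fully homomorphic scheme extends the same idea to non-linear operations, which is where most of the extra cost comes from.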


Nullabillity 1137 days ago

That sounds like fairy tale engineering. Even for the IoT-with-a-potato-MCU use case, you'd be much better off offloading that computation to a trustable device (such as the user's desktop computer or home gateway) instead of shipping it off to a cloud environment and paying the (absolutely massive) FHE tax.


fd0r 1136 days ago

If you are offloading the computation only because of compute limitations, it would indeed probably make more sense, at least at the moment, to offload it to a trusted device.

But there is always the case where the server side with the model does not want to disclose the model itself, while the client does not want to disclose its data either (as in many healthcare applications, for example, or in the recent OpenAI/Samsung incident). In this case the FHE tax might be a decent price to pay.

If you want to read more on the topic, there is a blog post about the cost of running an LLM in FHE: https://www.zama.ai/post/chatgpt-privacy-with-homomorphic-en...

The main improvements in terms of speed will come from dedicated hardware accelerators but some models (those that run on tabular data for example) already have acceptable runtimes.


bczm 1137 days ago

Sure, sometimes, if you have a trusted device, great. However, in other use cases, there will be no device that is trusted by both the user _and_ the model owner, and FHE will help here. We have to remember how valuable the models are.


Nullabillity 1136 days ago

Too bad for the model owner, then.


fd0r 1136 days ago

In the example above the parameters are in the clear and only inputs and outputs are encrypted!

That being said you could probably do the reverse and encrypt the parameters of the model and not the inputs/outputs if you are deploying the model directly to the client.


fire 1137 days ago

Not OP, but I think I'm too dumb on this topic to understand what you mean. Could you explain further? (To me it sounds like you're suggesting using encrypted weights while they're suggesting using encrypted inputs, which to me solve two different use cases.)


dmos62 1137 days ago

Aren't you running the second prediction on unencrypted data, contrary to what's said in the comment?


bczm 1137 days ago

Actually, the input is encrypted inside the ‘predict’ function here. There are also functions to encrypt, run, and decrypt separately.
