by 2ndorderthought 53 days ago
I could probably do this, but why on earth would I want to immediately put myself on a list as a dangerous person? The main problem is that even if they somehow closed every point of failure in gpt5.5, which they can't, you can distill a new model from gpt5.5 or any other model and get anything you want, probably in under 4B parameters. A lot of this is theater so they don't get sued as easily when it inevitably happens.

How can you distill a model from a closed-weights model like this? I've never heard of model reverse engineering.
Distillation doesn't have to use weights. Think of it as a fine-tune. The basic form is: you ask a large model lots of questions and train the small model on the results. Even better if you ask it to explain its rationale. There are tons of schemes for it; do some searching around. One I remember: for each prompt, ask the small model to answer, have a big model review and critique the answer, then train on the results.
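To make the basic form concrete, here is a toy sketch in plain Python. Everything in it is a stand-in: `teacher`, `collect_dataset`, `train_student` and `student` are hypothetical names, the "large model" is just a function we can call but not inspect, and the "small model" is a two-parameter logistic regression. Real distillation works on text outputs from language models, so treat this only as an illustration of the query, collect, train loop, not as how anyone actually does it.

```python
import math
import random

def teacher(x: float) -> int:
    """Stand-in for a closed model: callable, but its internals are hidden."""
    return 1 if 3.0 * x - 1.5 > 0 else 0

def collect_dataset(n: int, seed: int = 0) -> list[tuple[float, int]]:
    """Ask the teacher n 'questions' and record its answers."""
    rng = random.Random(seed)
    prompts = [rng.uniform(0.0, 1.0) for _ in range(n)]
    return [(x, teacher(x)) for x in prompts]

def train_student(data, epochs: int = 200, lr: float = 0.5) -> tuple[float, float]:
    """Fit a tiny logistic regression using only the teacher's outputs."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in data:
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))  # student's prediction
            w -= lr * (p - y) * x                      # gradient step on log loss
            b -= lr * (p - y)
    return w, b

def student(params: tuple[float, float], x: float) -> int:
    """The distilled model: predicts what the teacher would answer."""
    w, b = params
    return 1 if w * x + b > 0 else 0

# Train on teacher outputs only, then measure agreement on fresh prompts.
data = collect_dataset(500)
params = train_student(data)
held_out = collect_dataset(200, seed=1)
agreement = sum(student(params, x) == y for x, y in held_out) / len(held_out)
print(f"student agrees with teacher on {agreement:.0%} of held-out prompts")
```

The student never sees the teacher's weights or its rule, only input/output pairs, which is the whole point. The critique scheme would add one more step to the loop: the student answers first, the teacher reviews that answer, and the review becomes the training signal.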

I won't go into how that applies to this article specifically. But you can even use distillation-as-a-service tools. I believe they support this to some extent, though probably not for ChatGPT.

I think a year or so ago there was some sort of scandal about other companies doing this to ChatGPT, as well as individuals dumping their entire training sets. There are lots of ways, hypothetically of course, that things like this could be done, and they likely are being done right now.

By making millions of queries to frontier models from a lot of accounts, collecting the results as a dataset, and fine-tuning your model on it. Chinese companies have been caught doing it on an industrial scale several times now.