by gpugreg 36 days ago
> Uncensoring a model also doesn't necessarily improve generic use cases.

While the following is not a generic use case, I have a funny anecdote about how censorship is holding back flagship models.

I was asking an uncensored version of Qwen3.6 how a CLI option of llama.cpp worked, and to my horror and amazement, it rudely went and decompiled the binary to figure it out. It felt like the computer equivalent of asking a vet why my dog looks sick and then watching the vet cut it open to check. Flagship models usually do not do that without some convincing, but it sure is effective.

We will need much better sandboxes when less restricted models become more common. I can already see them hammering out 0-days when they are prompted to do some task that usually requires root.

3 comments

> Flagship models usually do not do that without some convincing

Just a data point, but I’ve been having Claude do this regularly.

Gemini Flash-Lite has been a decent reverse-engineering sidekick since 2.5 as well.
I think I was using GitHub Copilot when I had the experience that led me to this statement. I guess the experience of using LLMs can differ quite a bit depending on model version and harness.
Same. I was having it debug a routine Python issue and it broke out Pympler and LLDB, and added a signal handler to dump stack traces.
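
For anyone curious what a stack-dumping signal handler might look like, here is a minimal sketch. It is not the code the model actually produced; the helper names `format_all_stacks` and `install_stack_dump_handler` and the choice of `SIGUSR1` are assumptions made for illustration, and it only works on Unix-like systems.

```python
import signal
import sys
import threading
import traceback


def format_all_stacks():
    """Return a text dump of the current stack of every live thread."""
    # Map thread ids to names so the dump is easier to read.
    names = {t.ident: t.name for t in threading.enumerate()}
    lines = []
    # sys._current_frames() is a CPython helper returning {thread_id: top frame}.
    for ident, frame in sys._current_frames().items():
        lines.append(f"Thread {names.get(ident, '?')} ({ident}):")
        lines.extend(s.rstrip("\n") for s in traceback.format_stack(frame))
    return "\n".join(lines)


def install_stack_dump_handler(signum=signal.SIGUSR1, stream=sys.stderr):
    """Dump all thread stacks to `stream` whenever `signum` arrives.

    Hypothetical helper: SIGUSR1 is an assumed choice and is Unix-only.
    Must be called from the main thread, as required by signal.signal().
    """
    def handler(received_signum, frame):
        print(format_all_stacks(), file=stream, flush=True)

    signal.signal(signum, handler)
```

After calling `install_stack_dump_handler()` in a stuck process, running `kill -USR1 <pid>` from a shell should print every thread's stack to stderr. The standard library's `faulthandler.register(signal.SIGUSR1)` does much the same thing with less code.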
What's funny is that if it had looked up the source code on GitHub, it would've figured it out faster.
What tool did it use to decompile it?