HN Mirror


by gpugreg 36 days ago

> Uncensoring a model also doesn't necessarily improve generic use cases.

While the following is not a generic use case, I have a funny anecdote about how censorship is holding back flagship models.

I was asking an uncensored version of Qwen3.6 how a CLI option of llama.cpp worked, and to my horror and amazement, it rudely went and decompiled the binary to figure it out. It felt like the computer equivalent of asking a vet why my dog looks sick, only for the vet to cut it open to check. Flagship models usually do not do that without some convincing, but it sure is effective.

We will need much better sandboxes when less restricted models become more common. I can already see them hammering out 0-days when they are prompted to do some task that usually requires root.

3 comments

faitswulff 36 days ago

> Flagship models usually do not do that without some convincing

Just a data point, but I’ve been having Claude do this regularly.

bpavuk 36 days ago

Gemini Flash-Lite has been a decent reverse-engineering sidekick since 2.5 as well.

gpugreg 36 days ago

I think I was using GitHub Copilot when I had the experience that led me to this statement. I guess the experience of using LLMs can be quite different depending on the model version and harness.

brookst 36 days ago

Same. I was having it debug a routine Python issue and it broke out Pympler and LLDB, and added a signal handler to dump stack traces.
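For anyone curious what a signal handler that dumps stack traces looks like, here is a minimal sketch using Python's standard `faulthandler` module. It is not what the model in this anecdote actually wrote. The helper name `install_stack_dump_handler` and the choice of `SIGUSR1` are assumptions made for illustration, and `faulthandler.register` is not available on Windows.

```python
import faulthandler
import signal
import sys


def install_stack_dump_handler(signum=signal.SIGUSR1, file=sys.stderr):
    """Dump the tracebacks of all threads to `file` when `signum` arrives.

    Hypothetical helper for illustration only. `file` must expose a real
    file descriptor via fileno() and must stay open for as long as the
    handler is registered.
    """
    # faulthandler installs a C-level handler, so the dump is written even
    # when the interpreter is stuck in a long-running call.
    faulthandler.register(signum, file=file, all_threads=True, chain=False)
```

Calling the helper once at startup and then running `kill -USR1 <pid>` from another shell should write each thread's stack to stderr without stopping the process.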

novok 36 days ago

What's funny is that if it had looked up the source code on GitHub, it would've figured it out faster.

NooneAtAll3 36 days ago

What tool did it use to decompile it?
