Hacker News
by NitpickLawyer 34 days ago
> This project supports steering with single-vector activation directions; [...] This is also useful for cybersecurity researchers who want to reduce a model's willingness to provide dual-use or offensive security guidance.

Wink wink, nudge nudge.

I have a feeling most cybersec researchers would only be interested in negative values of "reduce" :D