by walt74 892 days ago
Of course this article rubs the open source dogma believers here the wrong way and I'll get downvoted to hell, but I wholeheartedly agree with the author, and I think that you guys don't get what AI systems are at their basic level.

They are information interpolators, and all the information you can interpolate from the training data is readily present in latent space, waiting to be discovered by a prompt.

There is this argument: "But you can find a bomb instruction in a chemistry textbook." No, you can't. I haven't checked this, but I think you'll have a hard time finding any chemistry textbook that explicitly gives you a bomb instruction. Of course, yes, you can find all the information necessary to build that bomb in that textbook, but the key difference to a latent space full of interpolated data points is this: you have to sit down, find the information that is scattered throughout that textbook, write it down, interpolate that knowledge yourself, and write that down. Only then do you have a bomb instruction, and you'll have written it yourself.

Not so with latent spaces. The bomb instruction is already there, interpolated from all the data points, just waiting to be prompted, and that is easy peasy with, yes, open source models.

So spare me the whining about anachronistic software dev dogmas from the 90s and arrive in the present, pretty please.

3 comments

You can literally just get books about how to make bombs. No need to go with a chemistry textbook.

These people want to control AI for the sake of ideological control of narratives, not because low-IQ terrorists will be empowered to bomb us. Terrorists already have the recipes, and they're widely available online.

1. This would apply to any system with a rudimentary world model, probably including modern Google search or Wolfram Alpha. By this logic, any sufficiently advanced search engine, computational chemistry system, or perhaps even an NLP calculator like Soulver would, to varying degrees, "aid in terrorist activities" by virtue of just doing what it was designed to do.

2. Unlike, say, Wolfram Alpha, which can just remove any number of compounds from its knowledge base, erasing concepts from LLMs is much more complicated than an SQL query. In fact, at present it seems to be nearly impossible.

RLHF fine-tuning seems to neither add nor remove information learned in pre-training. Naive regexes or classification models applied post-generation don't work well with response streaming, and they are not particularly difficult to circumvent with a small change of phrasing. Creating a smaller curated dataset thoroughly searched for all "dangerous" information doesn't work in today's paradigm of blind model scaling (and would, by the way, allow even your phone to run a tiny "safe" model, since LLMs derive most of their world model through memorization).

3. Are OpenAI, or potentially very soon Microsoft, Google, Amazon, and the rest of big tech, trustworthy custodians for this supposedly dangerous tool? What if they themselves choose to forgo the safety measures if it means a higher eval score? What if they use their power of MITMing the almighty black box to hide evidence of copyright violation or hard-code correct answers to safety benchmarks? What if users' relationships with LLMs become more parasocial and, under increased pressure to actually make any real profit outside of VC speculation, these companies increasingly override the model's responses with advertisements?
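
To make the streaming point from item 2 concrete, here is a minimal sketch. It is not any vendor's actual filter: the blocklist term, the chunk boundaries, and the function names are all made up for illustration. A regex that inspects each streamed chunk in isolation misses a term that straddles two chunks, while checking the full text means nothing can be streamed until generation ends, and a one-character rephrase slips past either way.

```python
import re

# Hypothetical blocklist; "forbidden" is a placeholder term, not a real rule.
BLOCKED = re.compile(r"forbidden", re.IGNORECASE)


def filter_per_chunk(chunks):
    """Naive streaming filter: drop any chunk that matches on its own."""
    return [c for c in chunks if not BLOCKED.search(c)]


def filter_full_text(chunks):
    """Post-generation filter: needs the whole response before deciding."""
    text = "".join(chunks)
    return "" if BLOCKED.search(text) else text


# The placeholder term is split across two streamed chunks.
chunks = ["this is for", "bidden text"]

# Each chunk passes individually, so the user still sees the full term.
streamed = "".join(filter_per_chunk(chunks))

# The full-text check catches it, but only after generation has finished.
buffered = filter_full_text(chunks)

# A trivial rephrase (zero instead of "o") gets past the full-text check too.
rephrased = filter_full_text(["this is f0rbidden text"])
```

A sliding buffer over recent chunks would fix the split-term case at the cost of added latency, but it does nothing about rephrasing.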

---

I agree that LLMs present a real safety challenge, but I believe it stems not from them becoming overly perfect search engines, but from them being very good stochastic parrots capable of inducing delusions in vulnerable individuals.

See cases below:

- https://www.theregister.com/2023/10/06/ai_chatbot_kill_queen... with a commercial model.

- https://www.euronews.com/next/2023/03/31/man-ends-his-life-a... with an open-weights model.

joke post?