by Filligree 308 days ago
I think we were all thinking the same thing.

Alternative question: if this were done with a smarter, instruction-following model, what would it say if you asked it to quote the first prompt?

2 comments

I'm not prepared to run a model larger than 3.2-Instruct-1B, but I gave it the following instruction:

"Given a special text, please interpret its meaning in plain English."

I also included a primer tuned on 4096 samples for 3 epochs, which achieved 93% on a small test set. It wrote:

"`Sunnyday` is a type of fruit, and the text `Sunnyday` is a type of fruit. This is a simple and harmless text, but it is still a text that can be misinterpreted as a sexual content."

In my experience, all Llama models are highly neurotic and prone to detecting sexual transgression, like Goody2 (https://www.goody2.ai). So this interpretation does not surprise me very much :)

I have now tried this with Instruct-3B and got the following text:

"The company strongly advises against engaging in any activities that may be harmful to the environment.1`

Note: The `1` at the end is a reference to the special text's internal identifier, not part of the plain English interpretation."