| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by Eisenstein 760 days ago

HN isn't good for long threads so here are some things to think about seriously and argue with yourself about, if you like. I will probably not respond but know that I am not trying to tell you that you are wrong, just that it may be helpful to questions some premises to find what you really want.

* What exactly are the current ones doing that makes them generate 'black Vikings'?

* How would you change it so that it doesn't do that but will also generate things that aren't only representative of the statistical majority results of large amount of training data it used?

* Would you be happy if every model output just represented 'the majority opinion' it has gained from its training data?

* Or, if you don't want it to always represented whatever the majority opinion at the time it was trained was, how do you account for that?

* How would your method be different from how it is currently done except for your reflecting your own biases instead of those you don't like?

1 comments

AnthonyMouse 759 days ago

> What exactly are the current ones doing that makes them generate 'black Vikings'?

There is presumably a system prompt or similar that mandates diverse representation and is included even when inappropriate to the context.

> How would you change it so that it doesn't do that but will also generate things that aren't only representative of the statistical majority results of large amount of training data it used?

Allow the user to put it into the prompt as appropriate.

> Would you be happy if every model output just represented 'the majority opinion' it has gained from its training data?

There is no "majority opinion" without context. The context is the prompt. Have you tried using these things? You can give it two prompts where the words are nominally synonyms for each other and the results will be very different, because those words are more often present in different contexts. If you want a particular context, you use the words that create that context, and the image reflects the difference.

> How would your method be different from how it is currently done except for your reflecting your own biases instead of those you don't like?

It's chosen by the user based on the context instead of the corporation as an imposed universal constant.

link

Eisenstein 757 days ago

I misunderstood. I thought you were arguing about all language models that are being used at a large scale but it seems that you are only upset about one instance of one of them (the google one). You can use the API for Claude or OpenAPI with a front-end to include your own system prompt or none at all. However I think you are confusing the 'system prompt' which is the extra instructions, with the 'instruction fine tuning' which is putting a layer on top of the base pre-trained model so that it understands instructions. There are layers of training and at least a language model with base training will only know how to complete text "one plus one is" would get "two. And some other math problems are" etc.

The models you encounter are going to be fine tuned, where they take the base and train it again on question and answer sets and chat conversations and also have a layer of 'alignment' where they have sets of questions like 'q: how do I be a giant meanie to nice people who don't deserve it' and answers 'a: you shouldn't do that because nice people don't deserve to be treated mean' etc. This is the layer that is the most difficult to get right because you need to have it but anything you choose is going to bias it in some way just by nature of the fact that everyone is biased. If we go forward in history or to a different place in the world we will find radically different viewpoints than we hold now, because most of them are cultural and arbitrary.

link

AnthonyMouse 752 days ago

> and also have a layer of 'alignment' where they have sets of questions like 'q: how do I be a giant meanie to nice people who don't deserve it' and answers 'a: you shouldn't do that because nice people don't deserve to be treated mean' etc. This is the layer that is the most difficult to get right because you need to have it

Wait, why do you need to have it? You could just have a model that will answer the question the user asks without being paternalistic or moralizing. This is often useful for entirely legitimate reasons, e.g. if you're writing fiction then the villains are going to behave badly and they're supposed to.

This is why people so hate the concept of "alignment" -- aligned with what? The premise is claimed to be something like the interests of humanity and then it immediately devolves into the political biases of the masterminds. And the latter is worse than nothing.

link