| HN Mirror


by jstummbillig 771 days ago

A llm is biased by design, the "open models" are no different here. OpenAI will, like any other model designer, pick and choose whatever data they want in their model and strike deals to that end.

The only question is how far this can be viewed as ads. Here I would find a strong backlash slightly ironic, since a lot of people have called the non-consensual incorporation of openly available data problematic; this is an obvious alternative option, that lures with the added benefit of deep integration over simply paying. A "true partnership", at face value. Smart.

If however this actually qualifies as ads (as in: unfair prioritisation that has nothing to do with the quality of the data and simply people paying money for priority placement) there is transparency laws in most jurisdictions for that already and I don't see why OpenAI would not honor them, like any other corp does.

2 comments

nyokodo 771 days ago

> A llm is biased by design

Everything is biased. The problem is when that bias is hidden and likely to be material to your use case. These leaked deals definitely qualify as both hidden and likely to be material to most use cases whereas more random human biases or biases inherent in accessible data may not.

> non-consensual incorporation of openly available data problematic; this is an obvious alternative option

A problematic alternative to an alleged injustice just moves the problem, it’s not a true resolution.

> there is transparency laws in most jurisdictions for that already and I don't see why OpenAI would not honour them

Hostile compliance is unfortunately a reality so this ought to give little comfort.


jstummbillig 771 days ago

> These leaked deals definitely qualify as both hidden and likely to be material to most use cases whereas more random human biases or biases inherent in accessible data may not.

a) Yes, leaked information definitely qualifies as hidden, that is, prior to the most likely illegal leak (which we apparently do not find objectionable, because, hey, it's the good type of breach of contract?)

b) Anyone who strikes deals understands there is a situation where things are being discussed that would probably not be okay to implement in that way. Hence, the pre-sign discussion phase of the deal. Somewhat like one could have some weird ideas about a piece of code that will not be implemented. Ah-HA!-ing everything that was at some point on the table is a bit silly.

> A problematic alternative to an alleged injustice just moves the problem, it’s not a true resolution.

The one characteristic I found that sets the people that are good to work with apart is understanding the need for a better solution, over those who (correctly but inconsequentially) declare everything to be problematic and think that to be some kind of interesting insight. It's not. Everything is really bad.

Offer something slightly less bad, and we are on our way.

> Hostile compliance is unfortunately a reality so this ought to give little comfort.

Yes, people will break the law. They are found out, eventually, or the law is found out to be bad and will be improved. No, not in 100% of the cases. But doubting this general concept that our societies rely upon whenever it serves an argument is so very lame.


Havoc 771 days ago

> A llm is biased by design

I don’t think some bias being inherent in models is in any way comparable to a pay-to-play marketing angle.


jstummbillig 771 days ago

I reject the framing.

We can't have it both ways. If we want model makers to license content, they will pick and choose a) the licensing model and b) their partners in a way that they think makes a superior model. This will always be an exclusive process.


swader999 771 days ago

I think we need to separate licensing and promotion. They have wildly different outcomes. Licensing is cool, it's part of the recipe. Promoting something above its legitimate weight is akin to collusion or buying up Amazon reviews without earning them.


warkdarrior 771 days ago

That just pushes up the cost of licensing.


swader999 771 days ago

Not if the pie grows bigger.


PeterisP 771 days ago

We don't want it both ways - if that's the price we'd have to pay, at least I definitely don't want model makers to license content.


xkcd-sucks 771 days ago

It's a question of axioms. LLMs are by definition "biased" in their weights; training is biasing. Now the stated goal of biasing these models is towards "truth", but we all know that's really biasing towards "looking like the training set" (tl;dr, no not verbatim). And who's to say the advertising industry-blessed training material is not the highest standard of truth? :)


nyokodo 771 days ago

> And who's to say the advertising industry-blessed training material is not the highest standard of truth? :)

Anyone who understands what perverse incentives are, that’s who. Or are you just playing the relativism card?
