


	by voz_ 1100 days ago
	If an LLM labels it, does that have the same value? Isn’t it just fancy regurgitation of knowns?

4 comments

natch 1100 days ago

Even humans disagree about labels. Especially humans willing to do this work.

And with the topical depth that, say, ChatGPT4 has, I would think these labels have more value, although, just as with humans, some validation and verification steps are required.
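To make the "validation and verification" step concrete, here is a minimal sketch of one way to do it: spot-check a sample of LLM labels against human labels and compute raw agreement plus Cohen's kappa. The function name `agreement` and the sample labels are made up for illustration, and nothing in this thread says which metric anyone actually uses.

```python
from collections import Counter


def agreement(llm_labels, human_labels):
    """Return (raw agreement, Cohen's kappa) for two equal-length label lists."""
    if len(llm_labels) != len(human_labels) or not llm_labels:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(llm_labels)

    # Observed agreement: fraction of items where both annotators match.
    observed = sum(a == b for a, b in zip(llm_labels, human_labels)) / n

    # Expected chance agreement from each annotator's label frequencies.
    llm_counts = Counter(llm_labels)
    human_counts = Counter(human_labels)
    expected = sum(llm_counts[c] * human_counts[c] for c in llm_counts) / (n * n)

    # Kappa is undefined when chance agreement is 1 (both sides used one
    # identical label everywhere); returning 1.0 here is an assumption.
    if expected == 1.0:
        return observed, 1.0
    return observed, (observed - expected) / (1 - expected)


# Hypothetical spot-check sample, not real data.
llm = ["pos", "neg", "pos", "pos", "neg", "neg"]
human = ["pos", "neg", "neg", "pos", "neg", "neg"]
accuracy, kappa = agreement(llm, human)
print(f"agreement={accuracy:.3f} kappa={kappa:.3f}")
```

Kappa is worth reporting next to raw agreement because it discounts matches that would happen by chance when one label dominates. Where to set the acceptance threshold is a judgment call for the task at hand.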

nihit-desai 1100 days ago

Good question. One follow-up question: value for whom? If it is to train the LLM that is doing the labeling, then I agree. If it is to train a smaller downstream model (e.g. finetune a pretrained BERT model), then the value is as good as labels coming from any human annotator, and is only a function of label quality.

voz_ 1100 days ago

Why retrain that smaller model from scratch tho? Just do a little transfer learning, or get creative and see if you can prune down to a smaller model algorithmically instead of doing the whole label and train rigamarole from scratch on what is effectively regurgitation.

I’m not sold this has directional value.

nihit-desai 1100 days ago

Hmm, I'm not suggesting training a smaller model from scratch. In most cases you'd want to finetune a pretrained model (aka transfer learning) for your specific use case/problem domain.

The need for labeled data for any kind of training is a constant though :)

scotty79 1100 days ago

It has some value. If you let the AI label the data and feed it back to it, you are reaffirming its guesses. If you have independently verified that the guesses are as correct as human ones, you are teaching the AI to be more sure about the correct thing.

mycall 1100 days ago

LLMs have emergent abilities [0], which could provide additional value to any output or label.

[0] https://www.jasonwei.net/blog/emergence

voz_ 1100 days ago

Not sold they do.
