by zwaps 993 days ago
Having literally done it in an enterprise setting (and participated in experiments for some of the largest companies in the world in their respective fields), I have to say: your lack of nuance and abundance of arrogance do not come across very well.

It is important to distinguish between something being impossible, infeasible and not well understood. Fine-tuning "for effect" is mostly the latter.

You say "current fine-tuning techniques can only contribute to knowledge indirectly" and then in the next post row back to "except in toy examples" because the former is - literally - not correct.

This is HN. We are not advising clients on how "to get their data into their AI best". We can discuss here the actual technical detail of a thing. An intellectually honest discussion begins with saying: "From a scientific standpoint, and even from a practical standpoint, we are not sure yet, however..."

1 comment

"advising clients" is such an odd way of describing "making a complex topic approachable"

But you're correct, this is HN: so much pontificating without producing a single counterfactual implies you should speak for yourself and not the collective.

They said "LLM", but given the context it's an RLHF LLM, and presumably they want a generalized way to add factual information in a way that doesn't cripple the model's general performance (yes, I am being so arrogant as to draw obvious conclusions to give them a useful answer)

No paper on the subject has achieved this. The ones that come close (and by close I mean very far) fall back to BERT-sized models, which I already addressed below. So please petition your "enterprise" to share their secrets.

(wrong crowd to get any gravitas out of the word enterprise btw, we understand it means "constrained use case with minimal external validation")