Hacker News
by objektif 732 days ago
The topic under discussion is so complex that even researchers at the companies mentioned do not understand it. This is like saying let's learn how combustion inside airplane engines works to get a better understanding of what LLMs can do.

Is it not better to focus your limited time on things that you can understand?

1 comment

I disagree here: setting up a large-scale pretraining run is super complex if you have to manage your own distributed computing platform, but looking at what the training data looks like and how it is fed into an LLM is not that complex. If you are developing a product based on or with LLMs, it's worth spending a few hours to understand it at the big-picture level. I mean, look at how many people are confused about why LLMs a) hallucinate facts, b) sometimes copy text passages verbatim, and c) probably shouldn't be used as scientific calculators, etc. All of that would be much clearer if you knew how they are trained.
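
To give a rough sense of what "how the data is fed in" means, here is a minimal sketch of turning raw text into next-token training examples. It is a toy illustration, not any lab's actual pipeline: the function name `next_token_pairs`, the whitespace tokenizer, and the tiny context length are all assumptions made for readability (real models use subword tokenizers and contexts of thousands of tokens).

```python
def next_token_pairs(text, context_len=4):
    """Slide a fixed-size window over the text and pair each window
    with the token that immediately follows it."""
    # Toy tokenizer (assumption): split on whitespace.
    tokens = text.split()
    pairs = []
    for i in range(len(tokens) - context_len):
        context = tokens[i:i + context_len]   # what the model sees
        target = tokens[i + context_len]      # what it is trained to predict
        pairs.append((context, target))
    return pairs


pairs = next_token_pairs("the cat sat on the mat and slept", context_len=4)
for context, target in pairs:
    print(context, "->", target)
```

The only training signal here is "predict the token that comes next in the text". Nothing in this objective checks facts or performs arithmetic, which is one way to see why plausible-but-wrong output, verbatim repetition of frequently seen passages, and unreliable calculations can all show up.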