by gdiamos 249 days ago
llm.finetune(data) is a leaky abstraction

Read Andrej’s blog that I linked earlier in the thread if you want to understand why.

1 comment

If it works it works? :shrug:
The problem is that it doesn’t always work and when it does fail it fails silently.

Debugging requires knowing some small detail about your data distribution or how you did gradient clipping, and those details take time and painstakingly detailed experiments to uncover.

> The problem is that it doesn’t always work and when it does fail it fails silently.

Right, but why does that mean you need more employees? You need to figure out how to surface failures, rather than just adding more meat to the problem.
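
To make "surface failures" concrete, here is a rough sketch of the kind of check I mean. It assumes your training loop can hand you the loss and the pre-clipping gradient norm at each step. `TrainingMonitor` and its thresholds are hypothetical names made up for illustration, not part of any library, and the defaults are guesses rather than recommendations.

```python
import math
from dataclasses import dataclass, field


@dataclass
class TrainingMonitor:
    """Hypothetical helper: turns quiet training problems into loud errors."""

    max_grad_norm: float = 1.0      # assumed clipping threshold
    max_clip_fraction: float = 0.5  # assumed: alarm if more than half of recent steps clip
    window: int = 100               # number of recent steps to look at
    clipped: list = field(default_factory=list)

    def check(self, step: int, loss: float, grad_norm: float) -> None:
        # A NaN or inf loss often goes unnoticed until the run is over.
        if not math.isfinite(loss):
            raise RuntimeError(f"step {step}: non-finite loss {loss}")
        if not math.isfinite(grad_norm):
            raise RuntimeError(f"step {step}: non-finite grad norm {grad_norm}")

        # Record whether clipping would have fired on this step.
        self.clipped.append(grad_norm > self.max_grad_norm)
        recent = self.clipped[-self.window:]

        # If clipping fires on most recent steps, the threshold is
        # effectively rescaling every update, which is worth knowing about.
        if len(recent) == self.window:
            fraction = sum(recent) / self.window
            if fraction > self.max_clip_fraction:
                raise RuntimeError(
                    f"step {step}: clipping fired on {fraction:.0%} "
                    f"of the last {self.window} steps"
                )
```

You would call `monitor.check(step, loss, grad_norm)` once per optimizer step. It won't catch data distribution problems, but it does turn two of the quiet failure modes into exceptions instead of a model that is just mysteriously worse.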