Hacker News
by itsiadam 1122 days ago
Both approaches are valid, but I would hope they are using a separate model to validate responses rather than crippling the base model(s). In OpenAI's case we don't know for sure, but it seems to be a combination of both, resulting in lower-quality responses overall.

I imagine LLaMA was fed highly vetted training data, as opposed to being "fixed" afterwards.