by exe34 774 days ago
Could I ask a dumb question: what does it look like when a model isn't aligned with its intended output? Does the text look off-center?

Alignment really just means how closely the model's outputs match human preferences or some other criteria.

At a glance, this looks like a model pretrained to perform prompt engineering. It should automatically use chain-of-thought in its responses in order to improve its programming abilities and, therefore, be better aligned with users' expectations.

It also has reflection: they include code to execute the model's output and return the result to the model as feedback.
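
To make the reflection idea concrete, here's a rough sketch of what that kind of loop could look like. This is not the project's actual code, just my guess at the general shape. The names `run_code`, `reflect`, and `fake_model` are all made up for illustration, and `fake_model` is a hard-coded stand-in for a real model call.

```python
import subprocess
import sys
from typing import Callable, Tuple


def run_code(code: str, timeout: float = 5.0) -> Tuple[bool, str]:
    """Run a Python snippet in a subprocess; return (succeeded, combined output)."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False, "timed out"
    return proc.returncode == 0, proc.stdout + proc.stderr


def reflect(generate: Callable[[str], str], task: str, max_rounds: int = 3) -> str:
    """Ask for code, execute it, and feed failures back to the model.

    Returns the first attempt that runs cleanly, or the last attempt
    if every round fails.
    """
    prompt = task
    code = ""
    for _ in range(max_rounds):
        code = generate(prompt)
        ok, output = run_code(code)
        if ok:
            break
        # Assumed feedback format: the task, the failed code, and what happened.
        prompt = (
            f"{task}\n\nPrevious attempt:\n{code}\n\n"
            f"Execution output:\n{output}\nFix the code."
        )
    return code


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a model: broken first try, fixed after feedback."""
    if "Execution output" in prompt:
        return "print(1 + 1)"
    return "print(1 +)"  # deliberate syntax error


final_code = reflect(fake_model, "Print the result of 1 + 1.")
```

With the stub above, the first attempt fails with a syntax error, the error text goes back into the prompt, and the second attempt runs cleanly. A real setup would want to sandbox the execution step rather than run model output directly.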

Ah nice, thanks for explaining!