Hacker News
by cyanydeez 41 days ago
The training method and design is the emergent property. Disagreement stops token generation. There isn't multi-round training that follows reasonable disagreements.