| HN Mirror


	by adamquek 945 days ago
	Because of reinforcement learning. The reward model assigns more value to output that sounds formal and professional, and penalises output that is more casual or incomplete. You can fine-tune the model to change this behaviour somewhat, but ultimately there will be an AI flavour that you can't get rid of, because of the way the LLM is trained.
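
	A toy sketch of that effect, not how any real reward model works: `toy_reward`, the marker word lists, and the two candidate replies are all made up for illustration. The only non-invented piece is the reweighting rule, which is the textbook optimum of KL-regularised reward maximisation: new probability proportional to base probability times exp(reward / beta).

```python
import math

# Hypothetical marker lists standing in for whatever a learned reward model picks up on.
FORMAL_MARKERS = {"furthermore", "however", "therefore", "additionally", "overall"}
CASUAL_MARKERS = {"lol", "yeah", "kinda", "gonna", "idk"}


def toy_reward(text: str) -> float:
    """Score text: +1 per formal marker, -1 per casual marker, -1 if it looks unfinished."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    score = 0.0
    score += sum(1.0 for w in words if w in FORMAL_MARKERS)
    score -= sum(1.0 for w in words if w in CASUAL_MARKERS)
    if not text.rstrip().endswith((".", "!", "?")):
        score -= 1.0  # treat a missing end mark as "incomplete"
    return score


def reweight(candidates, base_probs, beta=1.0):
    """Tilt base probabilities by exp(reward / beta) and renormalise.

    A small beta lets the reward dominate; a large beta keeps the result
    close to the base distribution.
    """
    weights = [p * math.exp(toy_reward(c) / beta) for c, p in zip(candidates, base_probs)]
    total = sum(weights)
    return [w / total for w in weights]


candidates = [
    "yeah idk, kinda depends",
    "However, the answer depends on context. Therefore, details matter.",
]
base_probs = [0.7, 0.3]  # assumed: the base model slightly prefers the casual reply

tuned = reweight(candidates, base_probs, beta=1.0)
gentle = reweight(candidates, base_probs, beta=100.0)
print("beta=1:  ", tuned)
print("beta=100:", gentle)
```

	With a small `beta` the formal reply takes almost all of the probability mass even though the base model preferred the casual one. A large `beta` keeps the distribution near the base, which is roughly the "you can change it somewhat" part.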