Hacker News
by devttyeu 30 days ago
FWIW, if you trained an LLM in an RL sandbox that required it to have goals, the resulting LLM probably would "have goals".