Hacker News
by hgoel 13 days ago
Agreed. Anthropic puts a lot of effort into alignment and still ends up with a model that can do things that are clearly out of alignment. The biggest takeaway from that should be that alignment is very tricky.