
It's not necessarily model alignment, I guess, is more what I'm getting at.

It may be more of a product alignment thing, where the fix may be making the context clearer, since it was violating an implicit agreement in order to satisfy the explicit instructions it received. So the fix may mostly come down to giving it much better context.

But then also, to the extent that the fix does NOT involve better context, it seems like it hits the zone where alignment issues are really capability/intelligence issues. Which doesn't make them not-alignment, but it does make "alignment" not give off quite the right vibe, since the issue is that it's too dumb / has no common sense / can't make good judgments (general issues the models have across the board).