Hacker News
by jmalicki 55 days ago
Its output quite literally is not independent, as the "thinking tokens" are attended to by the attention mechanism.
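
To make that concrete, here is a toy sketch (not any real model's code) of single-head causal self-attention in plain Python. The identity Q/K/V projections, the two-dimensional vectors, and the names `causal_attention`, `thinking_a`, and `thinking_b` are all assumptions made up for illustration. It only shows that swapping an earlier "thinking" token changes the output at a later position, even when the prompt and the final position's own input are identical.

```python
import math

def causal_attention(xs):
    """Toy single-head causal self-attention with identity Q/K/V projections.

    xs is a list of token vectors. Returns one output vector per position,
    where position i attends only to positions 0..i.
    """
    dim = len(xs[0])
    outputs = []
    for i, query in enumerate(xs):
        # Scaled dot-product scores against every visible token (0..i).
        scores = [sum(q * k for q, k in zip(query, xs[j])) / math.sqrt(dim)
                  for j in range(i + 1)]
        # Numerically stable softmax over the visible positions.
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Weighted sum of the visible value vectors.
        outputs.append([sum(w * xs[j][d] for j, w in enumerate(weights))
                        for d in range(dim)])
    return outputs

# Hypothetical toy sequence: prompt token, one "thinking" token, answer position.
prompt = [1.0, 0.0]
answer = [0.5, 0.5]
thinking_a = [0.0, 1.0]
thinking_b = [1.0, 1.0]

seq_a = causal_attention([prompt, thinking_a, answer])
seq_b = causal_attention([prompt, thinking_b, answer])
out_a, out_b = seq_a[-1], seq_b[-1]

# Same prompt, same answer-position input, different thinking token.
print(out_a)
print(out_b)
print("answer position differs:", out_a != out_b)
```

The first position's output is the same in both runs, since the causal mask hides later tokens from it. Only the positions after the thinking token see the change.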