by
stingraycharles
73 days ago
And I wonder how redacting them reduces latency, as it sure as hell doesn’t make the responses any faster and bandwidth isn’t the issue here.
sothatsit
73 days ago
They provide thinking summaries, so I assume they have to call Haiku or some other model to summarise the thinking blocks.
stingraycharles
73 days ago
Isn’t that asynchronous? Wouldn’t it make more sense to disable the thinking summaries in those cases rather than hide the thinking altogether?