Hacker News
by
pton_xd
85 days ago
"in this paper we primarily evaluate the LLM itself without external tool calls."
Maybe this is a factor?
simianwords
85 days ago
No tools were used.
chromacity
85 days ago
IIRC, web chat often uses tools / code without surfacing this information in any obvious way.