Hacker News
by amoshaviv, 81 days ago
I wanted to make sure "thinking" and "planning" features were not being tested in this comparison, but I definitely tested "simply phrased" tasks as well:
https://www.flowtester.ai/shared/ce1c8ef9-f387-48be-93f0-938...