Hacker News
by port3000 163 days ago
The 'flash' / no- or low-thinking versions of those models are crazy fast. We often receive the full response (not just the first token) in under 1 second via the API.
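
For anyone who wants to check the first-token vs. full-response distinction on their own setup, here is a rough sketch. It assumes the response arrives as an iterator of text chunks; `measure_latency` and `fake_stream` are hypothetical names, and `fake_stream` only simulates a streaming API with short sleeps. This is not how the numbers above were measured.

```python
import time
from typing import Callable, Iterable, Iterator, Tuple


def measure_latency(
    chunks: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[float, float, str]:
    """Return (seconds to first chunk, seconds to full response, full text)."""
    start = clock()
    first = None
    parts = []
    for chunk in chunks:
        if first is None:
            # Time until the first chunk arrives ("first token" latency).
            first = clock() - start
        parts.append(chunk)
    # Time until the stream is exhausted ("full response" latency).
    total = clock() - start
    if first is None:
        # Empty stream: there was no first chunk, so report the total.
        first = total
    return first, total, "".join(parts)


def fake_stream() -> Iterator[str]:
    """Hypothetical stand-in for a streaming API response."""
    for piece in ("Hello", ", ", "world"):
        time.sleep(0.01)  # simulated per-chunk delay
        yield piece


if __name__ == "__main__":
    ttft, total, text = measure_latency(fake_stream())
    print(f"first token: {ttft:.3f}s, full response: {total:.3f}s, text={text!r}")
```

To try it against a real API, replace `fake_stream()` with whatever chunk iterator your client library returns. The timer starts before the first chunk is requested, so both numbers include the wait for the model to start responding.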