by wesleyyue 842 days ago

More early impressions on performance: besides the endpoint erroring out at a higher rate than OpenAI's, time-to-first-token (TTFT) is also much slower :(

p50: 2.14s p95: 3.02s

And these aren't super long prompts either. For comparison, GPT-4 TTFT:

p50: 0.63s p95: 1.47s
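
For anyone who wants to collect comparable numbers, here is a rough sketch of one way to measure time-to-first-token and summarize it as p50/p95. It is not the harness behind the figures above. `fake_stream` is a hypothetical stand-in for a streaming completion response, and the helper names are made up for illustration; swap in whatever iterator your client returns when streaming.

```python
import time
from statistics import quantiles


def time_to_first_token(stream):
    """Return seconds elapsed until the stream yields its first item."""
    start = time.perf_counter()
    for _ in stream:
        return time.perf_counter() - start
    raise ValueError("stream produced no tokens")


def p50_p95(samples):
    """Return (p50, p95) for a list of latency samples in seconds."""
    # n=100 gives 99 cut points; index 49 is the median, index 94 is p95.
    cuts = quantiles(samples, n=100, method="inclusive")
    return cuts[49], cuts[94]


def fake_stream(delay_s):
    """Hypothetical stand-in for a streaming response (assumption)."""
    time.sleep(delay_s)  # simulated wait before the first token arrives
    yield "first"
    yield "second"


if __name__ == "__main__":
    samples = [time_to_first_token(fake_stream(0.01)) for _ in range(20)]
    p50, p95 = p50_p95(samples)
    print(f"p50: {p50:.2f}s p95: {p95:.2f}s")
```

The timer starts before the first item is requested, so with a real client you would want to start it right before sending the request, not after the response object comes back.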