by xpe
312 days ago
How many people are discussing this after one person ran one prompt, collected one data point per model, and wrote a comment? What is being measured here? One way to break down end-to-end time is: t_total = t_network + t_queue + t_batch_wait + t_inference + t_service_overhead
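To make the point concrete, here is a minimal Python sketch of that decomposition. The distributions and their parameters are made up for illustration (they are assumptions, not measurements of any real service), and the function names are hypothetical. It only shows how much a single t_total sample can differ from the median when the non-inference terms vary.

```python
import random
import statistics

# Component names taken from the breakdown above.
COMPONENTS = ("network", "queue", "batch_wait", "inference", "service_overhead")


def total_latency(parts):
    """t_total = t_network + t_queue + t_batch_wait + t_inference + t_service_overhead"""
    return sum(parts[name] for name in COMPONENTS)


def simulate_request(rng):
    """Draw one request's components in seconds.

    ASSUMPTION: every distribution here is invented for illustration only.
    """
    return {
        "network": rng.uniform(0.02, 0.20),
        "queue": rng.expovariate(1 / 0.5),   # mean 0.5 s, long tail
        "batch_wait": rng.uniform(0.0, 0.1),
        "inference": rng.gauss(2.0, 0.1),    # the part people think they measured
        "service_overhead": rng.uniform(0.01, 0.05),
    }


def summarize(n, seed=0):
    """Simulate n requests and report the spread of t_total."""
    rng = random.Random(seed)
    totals = [total_latency(simulate_request(rng)) for _ in range(n)]
    return {
        "n": n,
        "min": min(totals),
        "median": statistics.median(totals),
        "max": max(totals),
    }


if __name__ == "__main__":
    print("single sample:", summarize(1))
    print("1000 samples: ", summarize(1000))
```

With n=1 the min, median, and max are all the same number, so there is no way to tell whether it reflects inference speed or a queue spike. Comparing models would need many samples per model, and ideally the components timed separately.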