by brrrrrm
610 days ago
Fake it. Add some latency before the first token, then "stream" at the rate you received the tokens, even though the entire response (or some sizable chunk of it) has already been generated. That gives you the buffer you need to seem fast while also staying safe.
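A rough sketch of that idea in Python, in case it helps. Everything named here is made up for illustration: `buffered_stream`, the `is_safe` check, and the `(arrival_seconds, token)` input format are assumptions, and I'm assuming "staying safe" means running some check over the buffered text before showing any of it.

```python
import time
from typing import Callable, Iterator, List, Tuple


def buffered_stream(
    timed_tokens: List[Tuple[float, str]],
    is_safe: Callable[[str], bool],
    initial_delay: float,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[str]:
    """Replay already-generated tokens at their original pace.

    timed_tokens: (seconds since request start, token) pairs, recorded
                  while the full response was being generated.
    is_safe:      hypothetical check run on the complete buffered text.
    initial_delay: extra latency added before the first token is shown.
    """
    full_text = "".join(token for _, token in timed_tokens)
    if not is_safe(full_text):
        return  # emit nothing if the buffered text fails the check

    sleep(initial_delay)  # the added first-token latency
    previous_arrival = 0.0
    for arrival, token in timed_tokens:
        # Reproduce the gap that was observed between tokens.
        sleep(max(0.0, arrival - previous_arrival))
        previous_arrival = arrival
        yield token


if __name__ == "__main__":
    recorded = [(0.10, "Hello"), (0.30, ", "), (0.35, "world")]
    for tok in buffered_stream(recorded, lambda text: "bad" not in text, 0.5):
        print(tok, end="", flush=True)
    print()
```

The `sleep` parameter is only there so the pacing can be swapped out in tests. For the "sizable chunk" variant you would run the same loop per chunk instead of on the whole response.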