by longbeachbass
355 days ago
Thanks for this! Learnt a lot. Curious how we ensure that the same model instance gets requests from the same client/user, since conversations are stateful and the model needs context from previous turns of the conversation. Is this happening at the load balancer layer?