by riedel 748 days ago
What does stateful mean here? I always wonder how loading per-user state is done. It seems that one can call `llama_state_set_data`; does this load balancer create a central store for such states? What is the overhead of transferring state?

Currently, it is a single instance in memory, so it doesn't transfer state. HA is on the roadmap; only then will it need some kind of distributed state store.

The agents installed alongside llama.cpp report their local state to the load balancer. That means they can be added and removed dynamically, with no central configuration needed.
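
To make the reporting idea concrete, here is a rough Python sketch of an in-memory registry that agents report into. It is not the project's actual code: the `Registry` class, the `slots_idle` / `slots_processing` fields, and the 5-second TTL are all assumptions made up for illustration. The point is only that membership comes from the reports themselves, so an agent that stops reporting simply drops out.

```python
import time
from dataclasses import dataclass, field


@dataclass
class AgentStatus:
    # Hypothetical fields; what a real agent reports may differ.
    slots_idle: int
    slots_processing: int
    last_seen: float


@dataclass
class Registry:
    # Single in-memory instance: nothing is persisted or shared.
    ttl_seconds: float = 5.0  # assumed staleness threshold
    agents: dict = field(default_factory=dict)

    def report(self, agent_id, slots_idle, slots_processing, now=None):
        """Called whenever an agent reports; unknown agents are registered implicitly."""
        now = time.monotonic() if now is None else now
        self.agents[agent_id] = AgentStatus(slots_idle, slots_processing, now)

    def prune(self, now=None):
        """Forget agents that have not reported within the TTL."""
        now = time.monotonic() if now is None else now
        stale = [a for a, s in self.agents.items()
                 if now - s.last_seen > self.ttl_seconds]
        for agent_id in stale:
            del self.agents[agent_id]

    def pick(self, now=None):
        """Return the live agent with the most idle slots, or None if none has any."""
        self.prune(now)
        candidates = {a: s for a, s in self.agents.items() if s.slots_idle > 0}
        if not candidates:
            return None
        return max(candidates, key=lambda a: candidates[a].slots_idle)
```

Usage would look something like `reg = Registry(); reg.report("agent-1", slots_idle=4, slots_processing=0); reg.pick()`. Because everything lives in one process, nothing is transferred between nodes; a highly available setup would need to replace the `agents` dict with some shared store.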