by crashedsnow
2629 days ago
Memory is a difficult one, especially in garbage-collected languages, which have a habit of filling up the heap even when it's not in use, so it's not always obvious how much memory is actually being used without language- or runtime-specific signals. The mitigation in Cloud Run is both concurrency, and that you're only billed while a request is active. (Disclosure: Google Cloud PM)
I'm not sure what you mean here. Cloud Run uses Docker, which runs regular processes in a cgroup, so it's sufficient to check the cgroup memory usage, right? Yes, Java can always use large heaps, but we're running Python and C++, where a process's memory usage directly reflects what the program allocates (even PyPy with its GC has this property).
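To make "check the cgroup memory usage" concrete, here is a rough sketch of reading it from inside a container. It assumes the standard Linux cgroup mount at /sys/fs/cgroup and tries the cgroup v2 file (memory.current) before the v1 file (memory/memory.usage_in_bytes); the function name is my own, and I haven't verified which layout Cloud Run exposes.

```python
from pathlib import Path

# Files relative to the cgroup mount point:
# cgroup v2 exposes memory.current, cgroup v1 exposes memory/memory.usage_in_bytes.
CANDIDATES = ("memory.current", "memory/memory.usage_in_bytes")


def cgroup_memory_usage_bytes(root="/sys/fs/cgroup"):
    """Return this cgroup's memory usage in bytes, or None if no file is readable."""
    for rel in CANDIDATES:
        path = Path(root) / rel
        try:
            return int(path.read_text().strip())
        except (OSError, ValueError):
            continue  # missing or unparsable file: try the next layout
    return None


if __name__ == "__main__":
    usage = cgroup_memory_usage_bytes()
    print("unavailable" if usage is None else f"{usage / 2**20:.1f} MiB")
```

Note that this counter includes page cache, so it can overstate what the program itself has allocated.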
> The mitigation in Cloud Run is both concurrency, and that you're only billed while a request is active.
When there are memory peaks, larger deployments without container-level concurrency look better. In my example, 16GB of RAM allows running only 8 containers if each one needs room for a 2GB task to complete, but on average 90% of that memory is wasted. On a single 16GB server I can run 48 tasks with 40% wasted and still have a high chance of the 2GB tasks finishing. Yes, in this scenario I must handle tasks killed due to OOM, but the difference in throughput is so large that it's worth it.
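The arithmetic behind those numbers, as a quick sketch. The 0.2GB average per task is an assumption: it is what "90% wasted" of a 2GB container implies, and it also reproduces the 40% figure for 48 tasks.

```python
TOTAL_GB = 16.0      # memory budget in both setups
PEAK_TASK_GB = 2.0   # each container is sized for the worst-case task
AVG_TASK_GB = 0.2    # assumption: implied by "90% wasted" of a 2GB container


def wasted_fraction(concurrent_tasks, avg_task_gb=AVG_TASK_GB, total_gb=TOTAL_GB):
    """Fraction of total memory left unused, on average, at a given concurrency."""
    return 1.0 - concurrent_tasks * avg_task_gb / total_gb


# One task per container, each container reserving the 2GB peak.
containers = int(TOTAL_GB // PEAK_TASK_GB)

print(containers, round(wasted_fraction(containers), 2))  # 8 tasks, 0.9 wasted
print(48, round(wasted_fraction(48), 2))                  # 48 tasks, 0.4 wasted
```

That is 6x the concurrency from the same 16GB, at the cost of occasionally losing a task to the OOM killer.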