Hacker News new | ask | show | jobs
by xgb84j 520 days ago
In the article they talk about how printing an error from the anyhow crate in debug format creates a full backtrace, which leads to an OOM error. This happens even with 4 GB of memory.

Why does creating a backtrace need such a large amount of memory? Is there a memory leak involved as well?

3 comments

I don't think the 4 GiB instance actually ran into an OOM error. They merely observed a 400 MiB memory spike. The crashing instances were limited to 256 and later 512 MiB.

(Assuming that the article incorrectly used Mib when they meant MiB. Used correctly b=bit, B=byte)

Well, based on the article, if there was a memory leak then they should see the steady increase in memory consumption, which was not the case.

The only explanation I can see (if their conclusion is accurate) is that the end result of the symbolization is more than 400MB additional memory consumption (which is a lot in my opinion), however the process of the symbolization requires more than 2GB additional memory (which is incredibly a lot).

The author replied with additional explanations, so it seems that the additional 400MB were needed because the debug symbols were compressed.
Sorry if the article is misleading.

The first increase of the memory limit was not 4G, but something roughly around 300Mb/400Mb, and the OOM did happen again with this setting.

Thus leading to a 2nd increase to 4Gi to be sure the app would not get OOM killed when the behavior get triggered. We needed the app to be alive/running for us to investigate the memory profiling.

Regarding the increase of 400MiB, yeah it is a lot, and it was a surprise to us too. We were not expecting such increase. There are, I think 2 reasons behind this.

1. This service is a grpc server, which has a lot of code generated, so lots of symbols

2. we compile the binary with debug symbols and a flag to compress the debug symbols sections to avoid having huge binary. Which may part be of this issue.

> 2. we compile the binary with debug symbols

symbols are usually included even with debuglevel 0, unless stripped[0]. And debuginfo is configurable at several levels[1]. If you've set it to 2/full try dropping to a lower level, that might also result in less data to load for the backtrace implementation.

[0] https://users.rust-lang.org/t/difference-between-strip-symbo... [1] https://doc.rust-lang.org/cargo/reference/profiles.html#debu...

Thanks, was not aware there was granularity for debuginfo ;)
> Sorry if the article is misleading.

I don't think the article is misleading, but I do think it's a shame that all the interesting info is saved for this hackernews comment. I think it would make for a more exciting article if you included more of the analysis along with the facts. Remember, as readers we don't know anything about your constraints/system.

It was a parti pris by me, I wanted the article to stay focus on the how, not much on the why. But I agree, even while the context is specific to us, many people wanted more interest of the surrounding, and why it happened. I wanted to explain the method ¯\_(ツ)_/¯
> we compile the binary with debug symbols and a flag to compress the debug symbols sections to avoid having huge binary.

How big are the uncompressed debug symbols? I'd expected processing uncompressed debug symbols to happen via a memory mapped file, while compressed debug symbols probably need to be extracted to anonymous memory.

https://github.com/llvm/llvm-project/issues/63290

Normal build

  cargo build --bin engine-gateway --release
      Finished `release` profile [optimized + debuginfo] target(s) in 1m 00s

  ls -lh target/release/engine-gateway 
  .rwxr-xr-x erebe erebe 198 MB Sun Jan 19 12:37:35 2025    target/release/engine-gateway

what we ship

  export RUSTFLAGS="-C link-arg=-Wl,--compress-debug-sections=zlib -C force-frame-pointers=yes" 
  cargo build --bin engine-gateway --release
      Finished `release` profile [optimized + debuginfo] target(s) in 1m 04s

  ls -lh target/release/engine-gateway
  .rwxr-xr-x erebe erebe 61 MB Sun Jan 19 12:39:13 2025  target/release/engine-gateway

The diff is more impressive on some bigger projects
The compressed symbols sounds like the likely culprit. Do you really need a small executable? The uncompressed symbols need to be loaded into RAM anyway, and if it is delayed until it is needed then you will have to allocate memory to uncompress them.
I will give it a shot next week to try out ;P

For this particular service, the size does not matter really. For others, it makes more diff (several hundred of Mb) and as we deploy on customers infra, we want images' size to stay reasonable. For now, we apply the same build rules for all our services to stay consistent.

Maybe I'm not communicating well. Or maybe I don't understand how the debug symbol compression works at runtime. But my point is that I don't think you are getting the tradeoff you think you are getting. The smaller executable may end up using more RAM. Usually at the deployment stage, that's what matters.

Smaller executables are more for things like reducing distribution sizes, or reducing process launch latency when disk throughput is the issue. When you invoke compression, you are explicitly trading off runtime performance in order to get the benefit of smaller on-disk or network transmission size. For a hosted service, that's usually not a good tradeoff.

Container images use compression too, so having the debug section uncompressed shouldn't actually make the images any bigger.
Thank you for this in-depth reply! Your answer makes a lot of sense. Also thank you for writing the article!