| HN Mirror


	by mjpa86 1320 days ago
	what happened to a memory leak being some memory that was allocated but had no reference to it so couldn't be freed? If you can copy the map and release it and the memory usage drops, there is no leak?

4 comments

fnord123 1320 days ago

That's the most used example of a memory leak but it is not the definition of a memory leak.

If you put data into a hash map and forget the key, you leaked.

saghm 1319 days ago

This is also why valgrind classifies the leaks it reports with labels like "still reachable" or "possibly lost". It would be pretty hard to programmatically determine whether the memory that's still kept around was intended to be kept around or not, which is why valgrind supports generating "suppressions" (and specifying them in subsequent runs so those reports are ignored).

titzer 1320 days ago

This is the use case for weak maps, which both Java and JavaScript have. In the latter case, the map is not iterable, so one cannot observe JavaScript GC (through WeakMap at least).
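
A rough Rust analogue of the idea, as a sketch only: Rust's standard library has no weak-keyed map like Java's WeakHashMap or JavaScript's WeakMap, so this uses weak values (`std::rc::Weak`) instead. `WeakRegistry` and its methods are hypothetical names made up for illustration.

```rust
use std::collections::HashMap;
use std::rc::{Rc, Weak};

/// Hypothetical registry that remembers values without keeping them alive.
struct WeakRegistry {
    entries: HashMap<u32, Weak<String>>,
}

impl WeakRegistry {
    fn new() -> Self {
        Self { entries: HashMap::new() }
    }

    /// Stores only a weak handle, so the registry is not an owner.
    fn register(&mut self, id: u32, value: &Rc<String>) {
        self.entries.insert(id, Rc::downgrade(value));
    }

    /// Returns the value only if some strong owner still exists.
    fn get(&self, id: u32) -> Option<Rc<String>> {
        self.entries.get(&id).and_then(|weak| weak.upgrade())
    }

    /// Removes entries whose value has already been dropped.
    fn prune(&mut self) {
        self.entries.retain(|_, weak| weak.strong_count() > 0);
    }

    fn len(&self) -> usize {
        self.entries.len()
    }
}

fn main() {
    let mut registry = WeakRegistry::new();
    let session = Rc::new(String::from("session data"));
    registry.register(1, &session);
    assert!(registry.get(1).is_some());

    // The last strong owner goes away, so the String is dropped
    // even though the registry still has an entry for it.
    drop(session);
    assert!(registry.get(1).is_none());

    // The dead entry itself is only removed when we prune.
    registry.prune();
    assert_eq!(registry.len(), 0);
}
```

Unlike a real weak map, the dead entries here linger until `prune` runs, so the map can still grow if nobody calls it.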

simiones 1319 days ago

A memory leak has always meant "this program keeps allocating more memory as it runs, even though it's not being asked to store anything new". That is equivalent to saying that a program has a memory leak when it fails to free memory that is no longer needed, not just memory that is no longer reachable.

For example, a common memory leak is adding items to a "cache" that has no mechanism for ever evicting them. The "cache" is thus not a cache, but a memory leak (a common implementation of this leak is that items are put in a map, but never removed from it).
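
To make the scenario concrete, here is a minimal Rust sketch. `UnboundedCache` and `BoundedCache` are hypothetical types, and the oldest-first eviction is just one simple policy picked for illustration, not a recommendation.

```rust
use std::collections::{HashMap, VecDeque};

/// A "cache" that only grows: every distinct key stays in the map forever.
struct UnboundedCache {
    entries: HashMap<u64, String>,
}

impl UnboundedCache {
    fn new() -> Self {
        Self { entries: HashMap::new() }
    }

    fn put(&mut self, key: u64, value: String) {
        self.entries.insert(key, value);
    }

    fn len(&self) -> usize {
        self.entries.len()
    }
}

/// The same map with a size budget and oldest-first eviction.
struct BoundedCache {
    capacity: usize,
    entries: HashMap<u64, String>,
    order: VecDeque<u64>, // keys in insertion order, oldest at the front
}

impl BoundedCache {
    fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be positive");
        Self { capacity, entries: HashMap::new(), order: VecDeque::new() }
    }

    fn put(&mut self, key: u64, value: String) {
        if self.entries.insert(key, value).is_none() {
            // New key: record its age, then evict the oldest if over budget.
            self.order.push_back(key);
            if self.entries.len() > self.capacity {
                if let Some(oldest) = self.order.pop_front() {
                    self.entries.remove(&oldest);
                }
            }
        }
    }

    fn len(&self) -> usize {
        self.entries.len()
    }
}

fn main() {
    let mut leaky = UnboundedCache::new();
    let mut bounded = BoundedCache::new(100);
    for key in 0..10_000u64 {
        leaky.put(key, format!("value-{}", key));
        bounded.put(key, format!("value-{}", key));
    }
    println!("unbounded entries: {}", leaky.len()); // prints 10000
    println!("bounded entries:   {}", bounded.len()); // prints 100
}
```

Every entry in the first map is still reachable, which is exactly why no garbage collector or ownership system would reclaim it.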

Memory leak has never, as far as I know, referred to the specific case of memory that is no longer accessible from program code to be freed. In fact, this definition doesn't even make sense from a runtime system perspective, since even in C, the memory is actually always still reachable - from malloc()'s internal data structures.

twic 1320 days ago

Those pretty much can't happen in garbage collected languages, so the usage of the term has been widened to include things like this. I agree it's a shame.

Roedy Green coined the name "packratting" for this modern kind of memory leak: https://www.mindprod.com/jgloss/packratting.html

simiones 1319 days ago

Memory leaks have always meant "failing to free memory that is no longer needed".

Garbage collection literature often stresses the difference between "no longer needed" and "not reachable", noting that the former cannot be determined automatically (it amounts to solving the halting problem) and that the latter is only a conservative approximation of it. So the literature always stresses that garbage collectors can't prevent all memory leaks.

twic 1319 days ago

> Memory leaks have always meant "failing to free memory that is no longer needed".

Citation needed - that sounds reasonable, but I have never come across that formulation before.

simiones 1319 days ago

I read about this in the Garbage Collection Handbook [1], an excellent overview of the entire field (at least up to ~2016) that discusses the distinction at length. I don't have it on me to quote, but a very clear distinction is made between "live objects" and "reachable objects", with reachability acting as a computable proxy for the uncomputable property of liveness. Liveness is defined as "this object will be used again by the program in some way", and a memory leak is defined as "failing to free an object that is no longer live". An unreachable object can't be live, but there are many ways of having a reachable object that is not live.

To prove that this is used in the literature at large, here is the abstract of a random GC paper I found [0]:

> Functional languages manage heap data through garbage collection. Since static analysis of heap data is difficult, garbage collectors conservatively approximate the liveness of heap objects by reachability i.e. every object that is reachable from the root set is considered live. Consequently, a large amount of memory that is reachable but not used further during execution is left uncollected by the collector.

[0] https://dl.acm.org/doi/10.1145/3381898.3397208

[1] https://www.google.com/books/edition/_/TKOfDQAAQBAJ?hl=en&gb... (you may try to search for live/reachable/leak to get some idea here as well)

ufo 1320 days ago

I've also seen the term "memory bloat".

rob74 1320 days ago

Yeah, that makes the title pretty much clickbait, because a memory leak in a memory-safe language would really be a big deal...

TheDong 1320 days ago

> a memory leak in a memory-safe language would really be a big deal...

It is not.

Let me show you a memory leak in a memory-safe language, Rust:

    let vec: Vec<u8> = Vec::with_capacity(1024);
    std::mem::forget(vec);

Let me show you a memory leak in another memory-safe language, Go:

    _ = time.Tick(1 * time.Second)

See the docs for time.Tick in the stdlib, which documents that calling it is a memory leak: https://pkg.go.dev/time@go1.19.3#Tick

You can also, if you want to leak memory in go, set the environment variable GOGC=off, and there you go, instant memory leak.

Practically any language, memory safe or otherwise, will let you create a memory leak.

EdwardDiego 1319 days ago

Plenty easy to leak memory in memory-safe languages. I'm assuming we're including GC-ed languages in that set.
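
One well-known way to do it in safe Rust, sketched here with a hypothetical `Node` type: two reference-counted values that own each other never reach a count of zero. The shared `drops` counter exists only to make the leak observable.

```rust
use std::cell::{Cell, RefCell};
use std::rc::{Rc, Weak};

/// Node that bumps a shared counter when dropped, so leaks are observable.
struct Node {
    drops: Rc<Cell<usize>>,
    strong_peer: RefCell<Option<Rc<Node>>>,
    weak_peer: RefCell<Weak<Node>>,
}

impl Node {
    fn new(drops: &Rc<Cell<usize>>) -> Rc<Node> {
        Rc::new(Node {
            drops: Rc::clone(drops),
            strong_peer: RefCell::new(None),
            weak_peer: RefCell::new(Weak::new()),
        })
    }
}

impl Drop for Node {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Two nodes that own each other strongly; returns how many were dropped.
fn dropped_with_strong_cycle() -> usize {
    let drops = Rc::new(Cell::new(0));
    {
        let a = Node::new(&drops);
        let b = Node::new(&drops);
        *a.strong_peer.borrow_mut() = Some(Rc::clone(&b));
        *b.strong_peer.borrow_mut() = Some(Rc::clone(&a));
    } // a and b leave scope, but each is still kept alive by the other
    drops.get()
}

/// Same shape, but the back edge is weak, so the cycle is broken.
fn dropped_with_weak_back_edge() -> usize {
    let drops = Rc::new(Cell::new(0));
    {
        let a = Node::new(&drops);
        let b = Node::new(&drops);
        *a.strong_peer.borrow_mut() = Some(Rc::clone(&b));
        *b.weak_peer.borrow_mut() = Rc::downgrade(&a);
    }
    drops.get()
}

fn main() {
    println!("strong cycle:   {} of 2 nodes dropped", dropped_with_strong_cycle());
    println!("weak back edge: {} of 2 nodes dropped", dropped_with_weak_back_edge());
}
```

The first function leaks both nodes without any `unsafe`; a tracing GC would collect that cycle, but would still keep anything that remains reachable.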

sigg3 1320 days ago

Go is only memory safe until you have a data race, or so I have heard.

suremarc 1320 days ago

Indeed, this person showed that you can read/write to arbitrary memory addresses inside of a Go program: https://blog.stalkr.net/2022/01/universal-go-exploit-using-d...

Although, it's pretty useless as an exploit, since it requires you to be able to run arbitrary Go code to begin with (the author admits as much). It's _very_ unlikely that a remote attacker could exploit a data race in a regular Go program.

Thaxll 1319 days ago

Every GC language is by definition memory safe. Memory safety in programming does not mean that accessing the same resource from two threads is safe.

suremarc 1319 days ago

I don't know how it works in other languages, but accessing a partially overwritten slice in Go (as will happen in the presence of data races) can cause your code to access out-of-bounds memory. And as we all know, once you have read/write access to arbitrary areas in memory, you've basically opened up Pandora's box.

Thaxll 1318 days ago

Go is memory safe; if it is not, then neither are Java/C#/Python/Ruby.

suremarc 1318 days ago

I don't think you can have data races (but certainly you can have race conditions) in Python because of the GIL. I imagine Ruby is similar. Otherwise, no, the other languages you listed are not "memory safe". Once you start reading and writing to arbitrary locations in a process, almost anything can happen. But certainly you can say that there are different degrees of memory safety. All of the languages you mentioned are leaps and bounds above C/C++.

pclmulqdq 1320 days ago

The same goes for Rust and most other "safe" languages. They all have synchronization primitives that make it safe, but you need to use them - the compiler won't always tell you.

bilkow 1319 days ago

For Rust specifically, the compiler does force safe programs to have no data races. That's actually what the ownership system, Send and Sync are about. If you manage to corrupt memory or have undefined behavior in safe Rust, that should be a compiler or library bug.

See https://doc.rust-lang.org/nomicon/races.html
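
As a small illustration (a sketch, with `parallel_count` being a made-up helper): sharing a counter across threads only compiles once it is wrapped in types that are `Send` and `Sync`, such as `Arc<Mutex<_>>`.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Increments one shared counter from several threads and returns the total.
fn parallel_count(threads: usize, per_thread: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            // Each thread gets its own handle to the same counter.
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // The lock is the only way to reach the value.
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}

fn main() {
    // Swapping Arc<Mutex<usize>> for Rc<RefCell<usize>> makes thread::spawn
    // fail to compile, because Rc is not Send.
    println!("total = {}", parallel_count(8, 10_000)); // prints total = 80000
}
```

This only rules out data races; logic-level race conditions and deadlocks are still possible in safe code.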

Dobbs 1319 days ago

That is basically the entire shtick of Rust: data is "owned", and only the owner can write. You can "borrow" something for read access, but while it is borrowed that way it can't be written to.

There are of course workarounds for this like reference counted wrappers and so on.

suremarc 1320 days ago

I have no idea what you mean here. Data races are next to impossible in safe Rust.
