Hacker News new | ask | show | jobs
by CyberDildonics 93 days ago
If people didn't need to do it, they wouldn't generally do it. Not always, but generally.
1 comments

People do stuff they shouldn't all the time.

For example, some code I had to clean up pretty early on in my career was a dev, for unknown reasons, reinventing the `ArrayList` and then using that invention as a set (doing deduplication by iterating over the elements and checking for duplicates). It was done in the name of performance, but it was never a slow part of the code. I replaced the whole thing with a `HashSet` and saved ~300 loc as a result.

This individual did that sort of stuff all over the code base.

Reinventing data structures poorly is very common.

Heap allocation in java is something trivial happens constantly. People typically do funky stuff with memory allocation because they have to, because the GC is causing pauses.

People avoid system allocators in C++ too, they just don't have to do it because of uncontrollable pauses.

> People typically do funky stuff with memory allocation because they have to

This same dev did things like putting what he deemed as being large objects (icons) into weak references to save memory. When the references were collected, invariably they had to be reloaded.

That was not the source of memory pressure issues in the app.

I've developed a mistrust for a lot of devs "doing it because we have to" when it comes to performance tweaks. It's not a never thing that a buffer is the right thing to do, but it's not been something I had to reach for to solve GC pressure issues. Often times, far more simple solutions like pulling an allocation out of the middle of a loop, or switching from boxed types to primatives, was all that was needed to relieve memory pressure.

The closest I've come to it is replacing code which would do an expensive and allocation heavy calculation with a field that caches the result of that calculation on the first call.

"This same dev did things like putting what he deemed as being large objects (icons) into weak references to save memory. When the references were collected, invariably they had to be reloaded."

Well actually, this is what the Apple[1] docs instruct devs to do. https://developer.apple.com/library/archive/documentation/Co...

For .NET on iOS, the difference between managed and unmanaged objects is of particular concern. In the example you provide, the Icon Assets are objects from an Apple Framework, not managed by .NET. You might use them in the UIKit views for list items in a UIKit List View.

iOS creates and disposes these list view items independently of .NET managed code. Because the reference counts can't be updated across these contexts, you'll inevitably end up with dangling references. This memory can't be cleared, so inadvertently using strong references will cause a memory leak that grows until your app crashes.

The following is a great explainer in the context of Xamarin for iOS. https://thomasbandt.com/xamarinios-memory-pitfalls

The above still applies with different languages / frameworks of course, however the difference is less explicit from a syntax perspective IMHO

I'm not sure why you're rationale for how to deal with garbage collected memory is based on a guy that didn't know standard data structures and your own gut feelings.

Any program that cares about performance is going to focus on minimizing memory allocation first. The difference between a GCed language like java is that the problems manifest as gc pauses that may or may not be predictable. In a language like C++ you can skip the pauses and worry about the overall throughput.

Well, let me just circle back to the start of this comment chain.

> Many programs in GC language end up fighting the GC by allocating a large buffer and managing it by hand

That's the primary thing I'm contending with. This is a strategy for fighting the GC, but it's also generally a bad strategy. One that I think gets pulled more because someone heard of the suggestion and less because it's a good way to make things faster.

That guy I'm talking about did a lot of "performance optimizations" based on gut feelings and not data. I've observed that a lot of engineers operate that way.

But I've further observed that when it comes to optimizing for the GC, a large amount of problems don't need such an extreme measure like building your own memory buffer and managing it directly. In fact, that sort of a measure is generally counter productive in a GC environment as it makes major collections more costly. It isn't a "never do this" thing, but it's also not something that "many programs" should be doing.

I agree that many programs with a GC will probably need to change their algorithms to minimize allocations. I disagree that "allocating a large buffer and managing it by hand" is a technique that almost any program or library needs to engage in to minimize GCs.

This is a strategy for fighting the GC, but it's also generally a bad strategy.

Allocating a large buffer is literally what an array or vector is. A heap uses a heap structure and hops around in memory for every allocation and free. It gets worse the more allocations there are. The allocations are fragmented and in different parts of memory.

Allocating a large buffer takes care of all this if it is possible to anything else. It doesn't make sense to make lots of heap allocations when what you want is multiple items next to each other in memory and one heap allocation.

That guy I'm talking about did a lot of "performance optimizations" based on gut feelings and not data.

You need to let this go, that guy has nothing to do with what works when optimizing memory usage and allocation.

But I've further observed that when it comes to optimizing for the GC, a large amount of problems don't need such an extreme measure like building your own memory buffer and managing it directly.

Making an array of contiguous items is not an "extreme strategy", it's the most efficient and simplest way for a program to run. Other memory allocations can just be an extension of this.

I agree that many programs with a GC will probably need to change their algorithms to minimize allocations. I disagree that "allocating a large buffer and managing it by hand"

If you need the same amount of memory but need to minimize allocations how do you think that is done? You make larger allocations and split them up. You keep saying "managing it by hand" as if there is something that has to be tricky or difficult. Using indices of an array is not difficult and neither is handing out indices or ranges to in small sections.

Right? "I had this one contingent experience and I've built my entire world view and set of practices around it."
Premature optimization is the root of all evil.