by Jtsummers 1990 days ago
Ideally, if the language/standard library provides maps but not sets, and you wanted to use the idiomatic set = map of type -> bool approach, you'd create a wrapper so that intent is preserved but users don't have to know about the backing mechanism. Of course, it's obnoxious if everyone has to do this themselves and the language lacks generics so you have to write this once for each potential type.
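A minimal sketch of that wrapper, assuming the language under discussion is Go. `StringSet` and its method names are made up for illustration; the point is only that callers see set operations and never touch the backing `map[string]bool`.

```go
package main

import "fmt"

// StringSet is a hypothetical wrapper: a set of strings backed by a
// map[string]bool. The field is unexported, so code in other packages
// cannot reach the map directly.
type StringSet struct {
	items map[string]bool
}

// NewStringSet returns an empty set ready for use.
func NewStringSet() *StringSet {
	return &StringSet{items: make(map[string]bool)}
}

// Add inserts v. Adding an element that is already present changes nothing.
func (s *StringSet) Add(v string) { s.items[v] = true }

// Remove deletes v if it is present.
func (s *StringSet) Remove(v string) { delete(s.items, v) }

// Contains reports whether v is in the set.
func (s *StringSet) Contains(v string) bool { return s.items[v] }

// Len returns the number of elements in the set.
func (s *StringSet) Len() int { return len(s.items) }

func main() {
	seen := NewStringSet()
	seen.Add("alice")
	seen.Add("bob")
	seen.Add("alice") // duplicate, the set is unchanged

	fmt.Println(seen.Contains("alice"), seen.Contains("carol"), seen.Len())
	// prints: true false 2
}
```

Without generics, a second element type means a second copy of this (`IntSet`, and so on), which is the "write this once for each potential type" cost mentioned above.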

Wrappers that don't wrap very much aren't worthwhile IMO. It's like getting an Amazon box with a FedEx box inside. Just give me the package itself.
Shallow wrappers that don't wrap much (now) but convey intent better are valuable if you buy into the idea of modularity and encapsulation in general. Some reasons (probably more out there):

1. The now-provided interface can more clearly express what the code is intending to do (better names for the operations than the underlying system has, e.g. exposing remove_from_end as pop or dequeue)

2. Hide methods of interacting with the underlying data structures that you don't want people to use (e.g. you use a C++ vector as a stack, but don't want to allow random access)

3. You can replace the underlying mechanisms at will without impacting the users
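As a rough illustration of points 1 and 2, here is a stack built on a Go slice rather than the C++ vector mentioned above. `IntStack` and its methods are hypothetical names. The slice supports indexing, but the wrapper only exposes push and pop.

```go
package main

import "fmt"

// IntStack is a hypothetical stack backed by a slice. The data field is
// unexported, so callers in other packages get Push/Pop/Len and no way
// to index into the middle.
type IntStack struct {
	data []int
}

// Push places v on top of the stack.
func (s *IntStack) Push(v int) { s.data = append(s.data, v) }

// Pop removes and returns the top element. ok is false when the stack
// is empty, in which case v is the zero value.
func (s *IntStack) Pop() (v int, ok bool) {
	if len(s.data) == 0 {
		return 0, false
	}
	last := len(s.data) - 1
	v = s.data[last]
	s.data = s.data[:last]
	return v, true
}

// Len returns the number of elements currently on the stack.
func (s *IntStack) Len() int { return len(s.data) }

func main() {
	var st IntStack
	st.Push(1)
	st.Push(2)

	top, _ := st.Pop()
	fmt.Println(top, st.Len()) // prints: 2 1
}
```

Point 3 follows from the same shape: the slice could later be swapped for a linked list without touching any caller, since nothing outside the type depends on it.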

If you just wrap a vector in your own vector class and otherwise provide the same operations (or a limited set of operations but for no good reason to restrict usage), sure, that's moronic. But if you wrap a vector class in a "BigNumber" class and provide operations like add, subtract, mod, etc. then value has been added. Same thing with the idea of wrapping a map in a set interface.

Wrapping to hide is valuable, but wrapping has a cost which is generally underrated. Every wrapper is a thing itself which must also be understood when trying to understand how things work. And every wrapper is a division between blocks of code, meaning if you have changes which impact multiple layers of wrap, it's harder to determine what to change, and to maintain the understandability of each layer.

For this reason I'm an advocate of lazy wrapping. Create an abstraction at the last moment, when it's painfully obvious what benefit it will provide, when you can see how it ties together disparate pre-existing code blocks, and when you have the highest confidence that it will stick and not need to be unwrapped next week by the senior dev.

> Every wrapper is a thing itself which must also be understood when trying to understand how things work.

I'd offer a different view. Wrapping/abstracting like this should reduce the amount of things a user of the abstraction needs to know. I don't care how Java's BigInteger class works under the hood, only that it does what I need it to do. If I did have to know how it worked to use it, this suggests a failure on the part of whoever created it.

It does increase what the maintainer of the underlying system (including the abstraction) needs to know, but if done in a sane manner this should not be a burden. So we're making a tradeoff. The user gets something simpler, the underlying system maintainer gets something a bit more complex. Or the user gets something more complex and with more boilerplate but the underlying system maintainer gets something simpler (though will be pestered with, "Why don't you offer a generic set yet?" asked for years to come).
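For a sense of what the "generic set" in that question could look like, here is a sketch that assumes a Go toolchain with type parameters (Go 1.18 or later). `Set`, `NewSet`, and the method names are illustrative and are not a standard library API.

```go
package main

import "fmt"

// Set is a hypothetical generic set. It is backed by map[T]struct{}
// rather than map[T]bool; callers cannot tell the difference, which is
// the benefit of hiding the backing mechanism.
type Set[T comparable] struct {
	items map[T]struct{}
}

// NewSet builds a set from the given values, dropping duplicates.
func NewSet[T comparable](vals ...T) *Set[T] {
	s := &Set[T]{items: make(map[T]struct{})}
	for _, v := range vals {
		s.Add(v)
	}
	return s
}

// Add inserts v into the set.
func (s *Set[T]) Add(v T) { s.items[v] = struct{}{} }

// Contains reports whether v is in the set.
func (s *Set[T]) Contains(v T) bool {
	_, ok := s.items[v]
	return ok
}

// Len returns the number of elements in the set.
func (s *Set[T]) Len() int { return len(s.items) }

func main() {
	ids := NewSet(1, 2, 2, 3)
	names := NewSet("a", "b")

	fmt.Println(ids.Len(), ids.Contains(2), names.Contains("z"))
	// prints: 3 true false
}
```

One definition covers every comparable element type, so the per-type boilerplate moves from each user to whoever maintains the set.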

> meaning if you have changes which impact multiple layers of wrap, it's harder to determine what to change, and to maintain the understandability of each layer.

When this happens, in my experience, it has meant one or more of:

1. The wrapper/abstraction was poorly chosen

2. The choice was made too early (before the problem was properly understood)

3. A major change was made that would've been hard to identify/plan for earlier

When writing code, I ignore (3) beyond what's reasonable to plan for. (1) and (2) though mean I mostly agree with this:

> Create an abstraction at the last moment

But rephrased, borrowing the phrase I first saw in some Lean Software book, "last responsible moment." It's not sensible, for instance, to use a map to booleans as a set throughout the project's life and only wrap it at the last moment. If you know it's going to be a set, wrap it early because this offers clarity to your code and reduces boilerplate/noise. If you know you need a stack, and have a vector available, wrap it and hide the random access option. If it later turns out that you also want random access, you can offer it, but if it's been available from the start then users will have abused that and you won't be able to rein it in later (without a lot of effort and heartache).

I'm not sure how I feel about the `set` wrapper. I suppose it's nice to hide some of the detail of how the set works. On the other hand, it is confidence-inspiring to be told "this is just a map, it's really that simple" as a user. I have a similar conflict about string alias types like `type MyId string`.
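To make the `type MyId string` case concrete, here is a small sketch assuming Go. `lookup` is a hypothetical function. In Go this declaration creates a distinct defined type rather than a true alias, so a plain `string` variable has to be converted explicitly before it is accepted.

```go
package main

import "fmt"

// MyId is the named string type from the comment above.
type MyId string

// lookup is a hypothetical function that accepts only MyId values.
func lookup(id MyId) string {
	return "record for " + string(id)
}

func main() {
	raw := "42" // an ordinary string variable

	// lookup(raw) would not compile: a string variable is not a MyId.
	fmt.Println(lookup(MyId(raw))) // explicit conversion is required

	// An untyped string constant is converted implicitly.
	fmt.Println(lookup("7"))
}
```

The type adds one conversion at the boundary and otherwise behaves like a string, which is roughly the tension described: little is hidden, but the signature states the intent.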
But it's not a Map, it's a Set. If I see an API return a Map, I expect it's returning a relationship of keys to values, because that's what a Map is used for. If I see it returning a Set, I expect it's returning a collection of unique values, because that's what a Set is used for.

I mean, you could support only List objects in the language and call it a day because they can be used as anything else. Or only lambdas, for the same reason. At the end of the day, though, having structures for the various ways you want to treat data is helpful. Using the right structure to hold data reduces cognitive load.