by silentplummet 3855 days ago
Sometimes you really do legitimately have a lot of static, global state. For instance, consider a program that needs to reference local, national, and/or global geography and its metadata, on a wide scale, randomly. All the countries have subdivisions, and subdivisions of subdivisions, and so on all the way down, which are all inter-referential. You can easily hit 100 MB of state that is essentially constant, and needs to be indexed 50 different ways for millions of function calls per user action that would access it.

Why not manage access to such things in a singleton class?


Singletons are fine, but it's almost always better to lazily initialize them rather than eagerly, to save on startup time. As a bonus, if you have no eager global initialization in your language, you can make import completely side-effect-free, which is a really nice simplification that I wish more languages adopted.
The slow startup from imports is my biggest annoyance with Python.

We had a decent-sized library at a previous company that pulled in modules that defined huge register maps, wrapped C++ libraries, etc.

I wrapped all imports in a lazy importer that was triggered by the first attribute access. It brought our script startup times from 3 seconds down to a fraction of that.

Blows me away that this isn't default behavior for ALL modules.
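
For anyone who wants to try this: the standard library has a building block for it, `importlib.util.LazyLoader`. The sketch below follows the recipe in the `importlib` docs and is not necessarily how the importer described above was built. The helper name `lazy_import` and the early return for already-imported modules are my own additions.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module object whose code runs on first attribute access."""
    if name in sys.modules:
        # Already imported elsewhere, so there is nothing left to defer.
        return sys.modules[name]
    spec = importlib.util.find_spec(name)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot find module {name!r}")
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    # With LazyLoader this only sets up the deferral; the module body
    # executes when an attribute is first looked up.
    loader.exec_module(module)
    return module
```

Usage is `colorsys = lazy_import("colorsys")` at the top of the file. The cost of running the module moves to the first `colorsys.something` lookup, which is the unpredictability the replies below are worried about.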

That behaviour feels to me like it may result in faster startup, but would also result in less predictable performance for code bases with somewhat random access such as web applications.

You could, I suppose, do some cache warming to make sure the first user request isn't slowed down, but it's one more thing to think about.

>"I wrapped all imports in a lazy importer that was triggered by the first attribute access."

Well, putting code in the root of your file is generally the cause of such problems, I would argue. Granted, I don't know how necessary that is when it comes to "register maps" and "wrapped C++ libraries". But I'd imagine you should be encapsulating them away anyways, and that would include fixing the large startup time by design.

If this was the default, any change could completely upend the initialization order of your app. "Explicit is better than implicit".
As long as these data are immutable, sharing them is easy.
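
A rough illustration of why, in Python, assuming the data fits in plain dicts and tuples. The county data is made up; `types.MappingProxyType` is the standard read-only view over a dict.

```python
from types import MappingProxyType

# Hypothetical sample data; a real table would be loaded once at startup.
_counties_by_state = {
    "CA": ("Alameda", "Marin"),
    "NY": ("Kings", "Queens"),
}

# Read-only view: callers can look things up but cannot assign through it.
COUNTIES_BY_STATE = MappingProxyType(_counties_by_state)

# A second index over the same data, built once: county name -> state.
STATE_BY_COUNTY = MappingProxyType(
    {
        county: state
        for state, names in _counties_by_state.items()
        for county in names
    }
)
```

Any number of callers can hold `COUNTIES_BY_STATE` and `STATE_BY_COUNTY` without locks or copies, since nothing can be written through either view and the values are tuples.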

If you want hundreds of megs of shared mutable state, a database is a proper solution.

And make 300,000 queries over TCP, like getting the list of county names in a state or the list of place names in a county? My actual use case involves fuzzy-matching an arbitrary, user-determined subset of 18,000,000+ unsanitized data records against geographical place names so they can be assigned geometries.

I'd like the program to finish in 15 seconds or less, please.

If you're making 300K queries over TCP to a database in order to do a calculation, then I'd say you need a much better data structure and/or algorithm. Either that, or do the bulk of the calculations on the database in P/T-SQL, or pre-calculate beforehand so that your online queries are just lookups instead of actual calculations.
You know there is such a thing as querying a database without going over a network, right?
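
For example, SQLite runs inside your own process. A small sketch with Python's `sqlite3` module and an in-memory database; the table, column, and function names are made up for illustration.

```python
import sqlite3

# ":memory:" keeps the whole database in this process: no socket, no server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE county (state TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO county VALUES (?, ?)",
    [("CA", "Marin"), ("CA", "Alameda"), ("NY", "Kings")],
)
# Index the column we filter on so lookups do not scan the whole table.
conn.execute("CREATE INDEX county_by_state ON county (state)")


def counties_in(state):
    """Return the county names for one state, sorted alphabetically."""
    rows = conn.execute(
        "SELECT name FROM county WHERE state = ? ORDER BY name", (state,)
    )
    return [name for (name,) in rows]
```

Each call to `counties_in` is a function call into the SQLite library, not a network round trip. Whether that is fast enough for 300,000 lookups inside a 15-second budget is something you would have to measure.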
It's moot.

The train of the discussion, if you go and read the OP's link and inner links, is like this:

- Singletons are bad
- Why are singletons bad?
- They're not "real" OO, they're global state, they obfuscate dependency, etc, etc, etc
- But what if I just legitimately have a ton of global state?
- Use a database! Use a filesystem!

The last point in the chain admits that the first point is mistaken. "Use a database" is just saying "use someone else's code to solve your problem". What if the database is implemented using singletons? What if it uses code that isn't OO at all? All you've accomplished is to say "OO can't solve your problem, use something external". In fact, my problem is solved just fine by using a singleton.

>essentially constant

An immutable singleton is fine. The other concern is performance, but if you don't have to do this, there is no point.