| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by jstimpfle 1723 days ago

> last thing one wants is a language that requires a lot of thought about lifetimes etc.

I challenge you: A lack of understanding about the data lifetimes in a program means lack of understanding about the data.

Not saying you can't have a lot of short-lived data items that you don't want to manage one-by-one. I'm saying that for the vast majority of data items, one should be able to give a reasonably well defined lifetime upper bound. So a good solution is to make a few boxes that group items by lifetime. And from time to time, throw the outdated boxes away.

And of the few items that don't have such an upper bound at creation time, many can be created in a special box that allows migrating boxes later when required.

3 comments

chrisseaton 1723 days ago

> A lack of understanding about the data lifetimes in a program means lack of understanding about the data.

But this argument can extend forever.

Is your program precisely dependently typed? If not is that a lack of understanding about the nature of the data as well and should you challenge yourself to fix that?

You have to trade-off how much you specify things with how valuable it is to get the result more quickly.

link

jstimpfle 1723 days ago

What you say is true. I only brought up "boxes" because the concept is still not widely known.

link

jenny91 1723 days ago

You don't have to challenge the person you're responding to. You have to challenge their quants. And they're not going to want to add that into the million other things they're thinking about while doing research in a Jupyter notebook or something.

You're just not going to get this buy-in from people who want to use a tool to get their work done.

link

short_sells_poo 1723 days ago

Thanks, but I think we may be talking cross purpose here. 99% of the research code ends up being thrown away (well, archived). Not because it's bad code necessarily, but because the idea that was being prototyped is a dead end. This means it's paramount that the language you use has to be as low friction and interactive as possible.

Imagine you are trying to establish whether there's a relationship between timeseries X and timeseries Y. You just want a tool that allows you to quickly calculate some summary statistics of these timeseries, clean them, convince yourself that they behave according to your expectations and then run some form of regression.

Nowhere in this process do you care about lifetimes. It's literally irrelevant. In fact, as long as all your work fits into memory, you don't even care about memory management. Your objective is to answer the primary question, everything else is a costly distraction.

The 1% of ideas that ends up being worthwhile is what gets productionized and needs to be robust. But obviously rewriting everything from language A to radically different language B adds it's own headaches.

link