| HN Mirror


by mholt 259 days ago

Glad you like Caddy!

> How does this handle data updating / fixing?

In the advanced import settings, you can customize what makes an item unique or a duplicate. You can also configure how to handle duplicates. By default, duplicates are skipped. But they can also be updated, and you can customize what gets updated and which of the two values to keep.

But yes, updates do run an UPDATE query, so they're irreversible. I explored schemas that were purely additive, so that you could traverse through mutations of the timeline, but this got messy real fast, and made exploring (reading) the timeline more complex/slow/error-prone. I do think it would be cool though, and I may still revisit that, because I think it could be quite beneficial.
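To make the trade-off concrete, here is a minimal sketch of a purely additive layout using Python's built-in `sqlite3`. The `item_versions` table and the helper names are hypothetical and are not Timelinize's actual schema. Every change appends a row, so history is preserved, but every read has to pick out the latest version first.

```python
import sqlite3

# Hypothetical schema: every change to an item is a new row, never an UPDATE.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE item_versions (
        item_id TEXT    NOT NULL,
        version INTEGER NOT NULL,
        content TEXT    NOT NULL,
        PRIMARY KEY (item_id, version)
    )
""")


def write_version(conn, item_id, content):
    """Append a new version instead of overwriting the old one."""
    (latest,) = conn.execute(
        "SELECT COALESCE(MAX(version), 0) FROM item_versions WHERE item_id = ?",
        (item_id,),
    ).fetchone()
    conn.execute(
        "INSERT INTO item_versions (item_id, version, content) VALUES (?, ?, ?)",
        (item_id, latest + 1, content),
    )


def read_current(conn, item_id):
    """Reads now need a 'latest version' lookup for every item."""
    row = conn.execute(
        "SELECT content FROM item_versions WHERE item_id = ? "
        "ORDER BY version DESC LIMIT 1",
        (item_id,),
    ).fetchone()
    return row[0] if row else None


def read_history(conn, item_id):
    """Walk through every mutation of one item, oldest first."""
    rows = conn.execute(
        "SELECT content FROM item_versions WHERE item_id = ? ORDER BY version",
        (item_id,),
    )
    return [r[0] for r in rows]


write_version(conn, "note-1", "first draft")
write_version(conn, "note-1", "corrected draft")
```

With a plain UPDATE, `read_current` would be a single-row lookup with no ordering, which is the simplicity the current design keeps.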

2 comments

whacked_new 259 days ago

Thanks for the reply! I'll have to try this out... it almost looks like what perkeep was meant to become.

One interesting scenario re time traveling is if we use an LLM somewhere in data derivation. Say there's a secondary processor of e.g. journal notes that yields one kind of feature extraction, but the model gets updated at some point; then the output possibilities expand very quickly. We might also allow human intervention/correction, which should take priority and resist overwrites. Assuming we're caching these data, they'll also land somewhere in the database, and unless provenance is first class, they'll look just as much like ground truth as anything else.
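One way to picture "provenance as first class" is to store who produced each derived value and let the write path enforce priority. This is only a sketch under assumptions: the `derived` table, the `source` and `producer` columns, and the `store` helper are all made up for illustration, and it relies on SQLite's upsert support (3.24 or newer).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE derived (
        note_id  TEXT NOT NULL,
        feature  TEXT NOT NULL,
        value    TEXT NOT NULL,
        source   TEXT NOT NULL,  -- 'model' or 'human' (hypothetical labels)
        producer TEXT NOT NULL,  -- e.g. a model version or an editor name
        PRIMARY KEY (note_id, feature)
    )
""")


def store(conn, note_id, feature, value, source, producer):
    """Upsert a derived value; model output never replaces a human correction."""
    conn.execute(
        """
        INSERT INTO derived (note_id, feature, value, source, producer)
        VALUES (?, ?, ?, ?, ?)
        ON CONFLICT (note_id, feature) DO UPDATE SET
            value = excluded.value,
            source = excluded.source,
            producer = excluded.producer
        WHERE derived.source <> 'human' OR excluded.source = 'human'
        """,
        (note_id, feature, value, source, producer),
    )


def lookup(conn, note_id, feature):
    """Return (value, source, producer) so readers can see where a value came from."""
    return conn.execute(
        "SELECT value, source, producer FROM derived "
        "WHERE note_id = ? AND feature = ?",
        (note_id, feature),
    ).fetchone()


store(conn, "2024-01-05", "mood", "neutral", "model", "model-v1")
store(conn, "2024-01-05", "mood", "happy", "human", "editor")
store(conn, "2024-01-05", "mood", "sad", "model", "model-v2")  # ignored
```

After these three writes the human correction is still in place, and the row itself says it came from a human rather than looking like ground truth.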

Bitemporal databases look interesting but the amount of scaffolding above sqlite makes the data harder to manage.

So if I keep ground truth data as text, looks like I'm going to have an import pipeline into timelinize, and basically ensure that there's a stable pkey (almost certainly timestamp + qualifier), and always overwrite. Seems feasible, pretty exciting!
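A rough sketch of the "stable pkey, always overwrite" idea, using a plain dict as a stand-in for the destination. Nothing here is Timelinize's import API. The `stable_key` and `import_records` helpers, the key format, and the choice to treat naive timestamps as UTC are all assumptions.

```python
from datetime import datetime, timezone


def stable_key(timestamp: str, qualifier: str) -> str:
    """Build a key from timestamp + qualifier that survives re-imports.

    Timestamps are normalized to UTC so the same moment written with a
    different offset still maps to the same key.
    """
    dt = datetime.fromisoformat(timestamp)
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    dt = dt.astimezone(timezone.utc)
    return f"{dt.strftime('%Y-%m-%dT%H:%M:%SZ')}/{qualifier}"


def import_records(store: dict, records) -> dict:
    """Always overwrite: the text files are the ground truth."""
    for timestamp, qualifier, text in records:
        store[stable_key(timestamp, qualifier)] = text
    return store


store = {}
import_records(store, [("2024-01-05T12:00:00+02:00", "journal", "frist entry")])
# Fixing the typo in the text file and re-importing replaces the old value.
import_records(store, [("2024-01-05T10:00:00+00:00", "journal", "first entry")])
```

Because the key depends only on the timestamp and qualifier, running the import again is idempotent: the store ends up with one entry holding the corrected text.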


infogulch 259 days ago

Have you heard of XTDB / Bitemporality? The basic idea is to make time 2-dimensional, where each record has both a System Time range and a Valid Time range. It's designed as an append-only db with full auditability for compliance purposes.

With 2D time you can ask complex questions about what you knew when, with simpler questions automatically extended into a question about the current time. Like:

    "What is the price?" -> "What is the price today, as of today?"
    "What was the price in 2022?" -> "What was the price in 2022, as of today?"
    "What was the price in 2022, as of 2023?"
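The three questions can be sketched in plain Python. This is not XTDB's API or storage model, just an illustration of two time ranges per record. The `PriceFact` class, the `price` function, and the sample prices and dates are invented for the example.

```python
from dataclasses import dataclass
from datetime import date

FOREVER = date.max


@dataclass(frozen=True)
class PriceFact:
    price: int
    valid_from: date   # when the price applies in the real world
    valid_to: date     # exclusive
    system_from: date  # when the database recorded this belief
    system_to: date    # exclusive; FOREVER while still believed


facts = [
    # Recorded 2022-01-01: the price is 100 from 2022 onward.
    PriceFact(100, date(2022, 1, 1), FOREVER, date(2022, 1, 1), date(2024, 6, 1)),
    # Correction recorded 2024-06-01: the 2022 price was really 110.
    PriceFact(110, date(2022, 1, 1), date(2023, 1, 1), date(2024, 6, 1), FOREVER),
    PriceFact(100, date(2023, 1, 1), FOREVER, date(2024, 6, 1), FOREVER),
]


def price(facts, valid_at=None, as_of=None):
    """Both time arguments default to today, so simple questions stay simple."""
    today = date.today()
    valid_at = valid_at or today
    as_of = as_of or today
    for f in facts:
        if (f.valid_from <= valid_at < f.valid_to
                and f.system_from <= as_of < f.system_to):
            return f.price
    return None


current = price(facts)                                      # price today, as of today
in_2022 = price(facts, valid_at=date(2022, 6, 1))           # 2022, as of today
in_2022_as_of_2023 = price(facts, valid_at=date(2022, 6, 1),
                           as_of=date(2023, 6, 1))          # 2022, as of 2023
```

Nothing is overwritten: the correction closes the old record's system range and adds new records, so the "as of 2023" query still returns what was believed back then.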

You probably don't want to just switch to XTDB, but if you pursue this idea I think you should look into 2D time as I think it is schematically the correct conceptualization for this problem.

Docs: https://docs.xtdb.com/concepts/key-concepts.html#temporal-co... | 2025 Blog: https://xtdb.com/blog/diy-bitemporality-challenge | Visualization tool: https://docs.xtdb.com/concepts/key-concepts.html#temporal-co...


mholt 259 days ago

Yeah, I did actually pursue this for a time (heh), and I might revisit it later. It was too much complexity for debatable value-add, though the value is growing on me.
