by mholt
259 days ago
Glad you like Caddy!

> How does this handle data updating / fixing?

In the advanced import settings, you can customize what makes an item unique or a duplicate. You can also configure how to handle duplicates. By default, duplicates are skipped. But they can also be updated, and you can customize what gets updated and which of the two values to keep.

But yes, updates do run an UPDATE query, so they're irreversible. I explored schemas that were purely additive, so that you could traverse through mutations of the timeline, but this got messy real fast, and made exploring (reading) the timeline more complex/slow/error-prone. I do think it would be cool though, and I may still revisit that, because I think it could be quite beneficial.
One interesting scenario re time traveling is if we use an LLM somewhere in data derivation. Say there's a secondary processor of, e.g., journal notes that yields one kind of feature extraction; if the model gets updated at some point, the set of possible outputs expands very quickly. We might also allow human intervention/correction, which should take priority and resist overwrites. Assuming we're caching this derived data, it will also land somewhere in the database, and unless provenance is first class, it will look just as much like ground truth as anything else.
Bitemporal databases look interesting, but the amount of scaffolding they need on top of SQLite makes the data harder to manage.
So if I keep ground truth data as text, it looks like I'll need an import pipeline into Timelinize that ensures there's a stable pkey (almost certainly timestamp + qualifier) and always overwrites. Seems feasible, pretty exciting!
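To make the "stable pkey, always overwrite" idea concrete, here's a rough sketch of what I have in mind, in plain Python against SQLite. This is not Timelinize's actual schema or import API: the `items` table, its columns, and the `source` provenance values ('import', 'llm:v2', 'human') are all made up for illustration. The only extra rule beyond overwriting is that a row marked as a human correction isn't clobbered by a later non-human write.

```python
import sqlite3

# Hypothetical schema for illustration only; not Timelinize's real tables.
SCHEMA = """
CREATE TABLE IF NOT EXISTS items (
    ts        TEXT NOT NULL,  -- ISO 8601 timestamp
    qualifier TEXT NOT NULL,  -- disambiguates items sharing a timestamp
    body      TEXT NOT NULL,
    source    TEXT NOT NULL,  -- provenance, e.g. 'import', 'llm:v2', 'human'
    PRIMARY KEY (ts, qualifier)
)
"""

# Overwrite on key collision, unless the stored row is a human correction
# and the incoming row is not.
UPSERT = """
INSERT INTO items (ts, qualifier, body, source)
VALUES (?, ?, ?, ?)
ON CONFLICT (ts, qualifier) DO UPDATE SET
    body = excluded.body,
    source = excluded.source
WHERE items.source != 'human' OR excluded.source = 'human'
"""


def open_db(path=":memory:"):
    """Open a database and make sure the sketch table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def upsert(conn, ts, qualifier, body, source):
    """Insert or overwrite the item identified by (ts, qualifier)."""
    with conn:  # commits on success, rolls back on error
        conn.execute(UPSERT, (ts, qualifier, body, source))


def get(conn, ts, qualifier):
    """Return (body, source) for one item, or None if it is absent."""
    return conn.execute(
        "SELECT body, source FROM items WHERE ts = ? AND qualifier = ?",
        (ts, qualifier),
    ).fetchone()
```

Re-running the import with the same timestamp + qualifier just replaces the row, so the text files stay the source of truth. The overwrite is still a destructive UPDATE with no history, same as mentioned upthread.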