Hacker News
by eterps 1036 days ago
> I want to feed LLMs (and friends) messy data from my house and let it un-mess as best it can.

What would be your goal for doing that?


To then be able to have secondary steps of lookup. I often want a search engine for "my life". Most commonly from text messages where i discussed something. In a perfect world it would be able to link context between emails, browser history, chat conversations, etc. I'd love a flexible system that could record what i have in boxes, in the fridge, etc.

Sounds a bit silly, but of course it's mostly just for fun. However on the more practical side, i do often find myself needing to dig through old text conversations trying to find that one message. Not having a flexible, deep search behind it sucks. I often find myself wanting to do the same with my browser history. Find that one website i visited, etc.
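To make the "search engine for my life" idea concrete, here is a minimal sketch of keyword search across mixed sources. Everything in it is an assumption for illustration: the record layout (`source`, `text`), the sample data, and the function names are hypothetical, and a plain inverted index stands in for whatever a real system (LLM-backed or not) would use.

```python
import re
from collections import defaultdict

def tokenize(text):
    # Lowercase and split into alphanumeric words.
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(records):
    # Map each word to the set of record positions that contain it.
    index = defaultdict(set)
    for i, rec in enumerate(records):
        for tok in tokenize(rec["text"]):
            index[tok].add(i)
    return index

def search(index, records, query):
    # Return records containing every word of the query, in original order.
    toks = tokenize(query)
    if not toks:
        return []
    hits = set.intersection(*(index.get(t, set()) for t in toks))
    return [records[i] for i in sorted(hits)]

# Hypothetical records pulled from different sources.
records = [
    {"source": "sms", "text": "Did you order the bike chain yet?"},
    {"source": "browser", "text": "Rust async book - chapter 3"},
    {"source": "email", "text": "Your order: bike saddle, shipped"},
]
index = build_index(records)
results = search(index, records, "bike order")
```

Here `results` holds the text message and the email, since both mention "bike" and "order". Exact keyword matching is the dumbest possible baseline; the fuzzy "find that one message" case is where an LLM or embeddings would presumably come in.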

I think it would be great to make my data points richer. Don't just tag my browser history with isolated tags, such as Programming, Rust, etc - but infer meaning from my searching. Be able to see that i'm working on ProjectX actively via CLI Git activity, and that i'm searching for Y. Be able to correlate commit Z with search Y, etc.
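As a rough sketch of the "correlate commit Z with search Y" idea, the simplest approach is probably to link each commit to the searches made shortly before it. The data shapes, the 30-minute window, and the `correlate` function are all assumptions for illustration; a real setup would read from Git logs and browser history, and timestamp proximity is only a stand-in for the inference an LLM might do.

```python
from datetime import datetime, timedelta

def correlate(commits, searches, window=timedelta(minutes=30)):
    # For each commit, collect the queries made within `window` before it.
    linked = {}
    for commit in commits:
        linked[commit["hash"]] = [
            s["query"]
            for s in searches
            if timedelta(0) <= commit["time"] - s["time"] <= window
        ]
    return linked

# Hypothetical activity log entries.
commits = [
    {"hash": "Z", "repo": "ProjectX", "time": datetime(2023, 1, 1, 12, 0)},
]
searches = [
    {"query": "Y", "time": datetime(2023, 1, 1, 11, 45)},
    {"query": "unrelated", "time": datetime(2023, 1, 1, 9, 0)},
]
linked = correlate(commits, searches)
```

With this data, commit `Z` gets linked to search `Y` (15 minutes earlier) but not to the search from three hours before.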

It feels to me there's a ton of small, edge case utility that can be gained by dumping everything to a local server and having it link the data. But i don't want to do any of that manually.

Likewise, i've wanted to manage "Home Inventory" before - what's in boxes, etc. Managing that myself is tedious, though. LLMs seem ripe for figuring out associations - even dumb LLMs. My hope is that i eventually can start wiring things together and having the LLMs start making rich data out of messy untagged data.

Would be neat, /shrug

This would be truly great. I have been putting off the task of itemizing all of my belongings into a spreadsheet in case of a natural disaster, fire, or theft. An assistant that I could tell, "I have this road bike I built," and that would look through my emails, gather all the components and associated costs, and add them to a spreadsheet would be a boon to many people. Of course, having it do it automagically from my purchases would be even better.

Have you heard of rewind.ai? It sounds like it might be a possible solution for what you're looking for (I'm not affiliated with it, though, and I don't have it on my Mac, so I'm not sure how well it works in reality).

Just leaving this here: there was an effort to build a truly open source one: https://github.com/dhamaniasad/cytev2

Yeah, though it's not local. They claim it is, but then use ChatGPT, which is odd.

Personally i want to build a fairly dumb system though. I.e., make a system which can be useful with Llama 2 13B or whatever. Something that doesn't require state-of-the-art GPT-4+.

If that means compromising on some features that's fine, but at least then it can be truly and fully local.