| Was just talking about this on Reddit like two days ago. Instead of data going to models, we need models to come to our data, which is stored locally and stays local. While there are many OSS tools for loading personal data, they don't do images or videos. In the future everyone may get their own model, but for now the tech is there while the product/OSS is missing for everyone to get their own QLoRA or RAG or summarizer. Not just messages/docs: what we read or write, and our thoughts, are part of what makes an individual unique. Our browsing history tells a lot about what we read, but no one seems to make use of it other than Google for ads. Almost everyone has a habit of reading x news site, x social network, x YouTube videos, etc. "OK, here is the summary for you from these 3 today." Was just watching this yesterday https://www.youtube.com/watch?v=zHLCKpmBeKA and thought, why do we still not have a computer secretary like her, one who is a step ahead of us, after almost 30 years? |
Local models for images are getting pretty good.
LLaVA is an LLM with multi-modal image capabilities that runs pretty well on my laptop: https://simonwillison.net/2023/Nov/29/llamafile/
Models like Salesforce BLIP can be used to generate captions for images too - I built a little CLI tool for that here: https://github.com/simonw/blip-caption