|
Hey! Thanks for taking the time to comment.
I don't think it's all that magical, given the modern multimodal embedding models available today. As you mentioned:
> /…/ The same with photos: you organize them once in the input stage or periodically, and then you can search for them easily later on. /…/

As a hobby photographer, I take lots of photos. For example, I know I've taken photos of my cats, tractors, bridges, forests, etc., but I never bother manually tagging them beyond basic editing (contrast, white balance, etc.). A system should be able to recognize what's in these photos and allow me to search for them not only by their content but by vibe as well. And once I find a photo I like, I'd really like to see similar photos (this in particular is very helpful for photographers curating their exhibitions). This is possible to achieve these days.

Also, I fully understand your point of view on `find`, `fd`, `grep`, `cat`, etc., but in reality it's only us nerds who ever open a terminal. |