Yep, exactly. I just looped through each image with the same prompt and stored the results in a SQLite database, so I can search through them and maybe present something more than a simple web UI in the future.
It's wrapped up in a bunch of POC code around talking to LLMs, so it's very very messy, but it does work. Probably will even work for someone that's not me.
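For anyone curious what that loop looks like in outline, here is a minimal sketch. It is not the code from the gist: the table layout, `index_images`, `search`, and the `describe_image` stand-in for the actual LLM call are all made up for illustration.

```python
import sqlite3
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def describe_image(path: Path, prompt: str) -> str:
    """Hypothetical placeholder: swap in a real call to your vision model."""
    return f"[description of {path.name} for prompt: {prompt}]"


def index_images(db, folder, prompt, describe=describe_image):
    """Run the same prompt over every image in a folder and store the answers."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS descriptions ("
        "path TEXT PRIMARY KEY, prompt TEXT, description TEXT)"
    )
    count = 0
    for path in sorted(Path(folder).iterdir()):
        if path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        # INSERT OR REPLACE keeps one row per image if the folder is re-indexed.
        db.execute(
            "INSERT OR REPLACE INTO descriptions VALUES (?, ?, ?)",
            (str(path), prompt, describe(path, prompt)),
        )
        count += 1
    db.commit()
    return count


def search(db, term):
    """Plain substring search over the stored descriptions."""
    rows = db.execute(
        "SELECT path FROM descriptions WHERE description LIKE ? ORDER BY path",
        (f"%{term}%",),
    )
    return [row[0] for row in rows]
```

Usage would be something like `index_images(sqlite3.connect("photos.db"), "holiday/", "Describe this photo in detail.")`, then `search(db, "beach")`. A `LIKE` query is the simplest thing that works; SQLite's full-text search would be the obvious next step for a larger collection.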
Nice! How complicated do you think it would be to summarize all the photos in a folder, e.g. a collection of holiday photos, or the photos from an event where the images are grouped?
Very simple. You could either do what I did and ask for details on each image, then ask for some sort of summary of the group of summaries, or just send all the images in one go:
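A rough sketch of the two options, assuming a hypothetical `ask(prompt, image_paths)` callable that wraps whatever model API you use and returns its text reply. The function names and prompt wording are illustrative, not taken from the gist.

```python
from typing import Callable, Sequence

# Hypothetical signature: ask(prompt, image_paths) -> model's text reply.
Ask = Callable[[str, Sequence[str]], str]


def summarize_two_pass(ask: Ask, image_paths: Sequence[str]) -> str:
    """Describe each image on its own, then summarize the descriptions."""
    per_image = [ask("Describe this photo in detail.", [p]) for p in image_paths]
    notes = "\n".join(f"- {text}" for text in per_image)
    # The second pass is text only, so no images are attached.
    return ask(
        "These are descriptions of photos from one event. "
        "Write a short summary of the event:\n" + notes,
        [],
    )


def summarize_one_shot(ask: Ask, image_paths: Sequence[str]) -> str:
    """Send every image in a single request and ask for one summary."""
    return ask(
        "These photos are from one event. Write a short summary of the event.",
        list(image_paths),
    )
```

The two-pass version costs one extra call but lets you cache the per-image descriptions and reuse them; the one-shot version only works while the whole folder fits in a single request.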
You might want to extract the location from the image's EXIF data and include it in the prompt as well. There are reverse geocoding libraries and services that take coordinates and return a city, which would probably make for a better summary of a trip.
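As a sketch of the EXIF part: GPS coordinates are stored as degrees/minutes/seconds plus a hemisphere letter, so they need converting to decimal degrees before a geocoder can use them. The helper names below are made up, `read_gps` assumes Pillow is installed, and the reverse geocoding step is left out: `place` is whatever city name your chosen library or service returns.

```python
GPS_IFD = 0x8825  # EXIF tag that points at the GPS block
LAT_REF, LAT, LON_REF, LON = 1, 2, 3, 4  # standard GPS tag IDs


def dms_to_decimal(dms, ref):
    """Convert (degrees, minutes, seconds) plus N/S/E/W to signed decimal degrees."""
    degrees, minutes, seconds = (float(x) for x in dms)
    value = degrees + minutes / 60 + seconds / 3600
    return -value if ref in ("S", "W") else value


def gps_from_exif_ifd(gps):
    """Return (lat, lon) from a GPS tag mapping, or None if the data is missing."""
    try:
        lat = dms_to_decimal(gps[LAT], gps[LAT_REF])
        lon = dms_to_decimal(gps[LON], gps[LON_REF])
    except (KeyError, TypeError, ValueError, ZeroDivisionError):
        return None
    return lat, lon


def read_gps(path):
    """Read the GPS block from an image file. Assumes Pillow is installed."""
    from PIL import Image

    with Image.open(path) as img:
        return gps_from_exif_ifd(img.getexif().get_ifd(GPS_IFD))


def prompt_with_location(base_prompt, place):
    """Append a place name (from a reverse geocoder) to the prompt, if known."""
    if not place:
        return base_prompt
    return f"{base_prompt} The photo was taken in or near {place}."
```

Plenty of photos have no GPS block at all (screenshots, images that went through a messaging app), which is why the helpers return `None` and the prompt falls back to the plain version.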
If you want to see, here it is:
https://gist.github.com/Q726kbXuN/f300149131c008798411aa3246...
Here's an example of the kind of detail it built up for me for one image:
https://imgur.com/a/6jpISbk