

by arecurrence 1420 days ago

I suspect this is where an API and further cost reductions will move the needle, even before the models themselves improve (and that seems to be happening at a rapid pace right now). I can see a workflow like this working well in the future:

1. Get close via prompt debugging to what you want (effectively where you are now)

2. Run an image generation pipeline that creates 10,000 images or an infinite stream

3. Run each image through an 'image to text' step for vector similarity filtering

4. Take the images whose 'image to text' output scores as very similar to the original prompt and present them to the user.

Once we can run models of this quality locally, it can even be a job that runs overnight and you wake up in the morning to a set of results to look at.
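
To make steps 3 and 4 concrete, here is a rough sketch of the filtering stage in Python. Everything named here is an assumption for illustration: `caption_image` is a hypothetical callable standing in for whatever 'image to text' model you use, `bag_of_words` is a toy replacement for a real text-embedding model, and the `threshold` value is arbitrary. Image generation itself (step 2) is left out; the function just consumes whatever iterable of images you hand it.

```python
import math
from collections import Counter
from typing import Callable, Iterable, List, Tuple


def bag_of_words(text: str) -> Counter:
    """Toy stand-in for a real text-embedding model: lowercase word counts."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_by_caption(
    prompt: str,
    images: Iterable[str],
    caption_image: Callable[[str], str],  # hypothetical 'image to text' step
    threshold: float = 0.8,               # arbitrary cutoff, tune to taste
    embed: Callable[[str], Counter] = bag_of_words,
) -> List[Tuple[float, str]]:
    """Caption each image, score the caption against the prompt,
    and return (score, image) pairs at or above the threshold, best first."""
    prompt_vec = embed(prompt)
    kept = []
    for image in images:
        score = cosine_similarity(prompt_vec, embed(caption_image(image)))
        if score >= threshold:
            kept.append((score, image))
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept


# Usage with made-up captions in place of a real captioning model.
fake_captions = {
    "img_001": "a red fox in the snow",
    "img_002": "a city street at night",
    "img_003": "a red fox sleeping in snow",
}
results = filter_by_caption(
    "a red fox in the snow", fake_captions, fake_captions.get, threshold=0.6
)
for score, image in results:
    print(f"{image}: {score:.2f}")
```

This version collects and sorts everything before returning, so it only suits a finite batch like the 10,000 image case. For an infinite stream you would presumably yield each match as it passes the threshold instead of sorting at the end.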