Instruction-Based Image Editing via LLM

	Instruction-Based Image Editing via LLM (github.com)
	98 points by phront 871 days ago

7 comments

JamilD 870 days ago

What's interesting to me is that the project feels very "un-Apple", despite being open-sourced under the Apple org: some typos and missing punctuation in the README, Jupyter notebooks for the data processing instead of scripts or a CLI, poor repo organization, and no comments even in the demo: https://github.com/apple/ml-mgie/blob/main/demo.ipynb

Apple is truly becoming an ML company when they release ML-engineer-quality code ;)

latchkey 870 days ago

Clicking on the names at the top of the readme, it appears as though it is a collaboration of researchers both inside and outside of Apple.

kkukshtel 871 days ago

I came up with a similar idea (also before DALL-E's edits-via-instruction), based on the thought that prompting generators kinda sucks (and chat interfaces for image editing aren't great either), and that really you just want to explore the latent space "around" an initial prompt.

Here's an overview of the tool (Dreamwalker): https://www.youtube.com/watch?v=k_mJgFmdWWY

And you can download/use it for free here (mac/pc): https://forums.afterschool.studio/t/dreamwalker-alpha-2-rele...

achalkley 871 days ago

It's incredible to see Apple contributing here. Excited to see what they bring to their platforms.

frenchie4111 871 days ago

Why is this being downvoted? Agreed. They have acquired so much talent [1] in this space, it's exciting to start seeing things come out of that.

[1] https://www.youtube.com/watch?v=Uj9Jg4WldJg

itake 871 days ago

I wish they had more examples. The image doesn't seem to be that much better than if you generate an image with Stable Diffusion and then tweak the prompt.

rodoxcasta 870 days ago

> Notices: Apple's rights in the attached weight differentials are hereby licensed under the CC-BY-NC license. Apple makes no representations with regards to LLaMa or any other third party software, which are subject to their own terms.

Wait, they can do that? Assuming weights have copyright, shouldn't the finetuning be a modification of the original work and so have the same license?

drdeca 870 days ago

The “weight differentials” are like, a patch that can be applied to the original weights, right?
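
To make the "patch" idea concrete, here is a minimal sketch. It assumes the differentials are simply per-parameter differences (fine-tuned minus base); the thread does not confirm how the released files are actually stored, and the names `make_delta` and `apply_delta` are hypothetical. Plain lists stand in for tensors.

```python
# Hypothetical illustration: weights as dicts of parameter name -> list of floats.
# Real checkpoints hold tensors, but the arithmetic would be the same.

def make_delta(base, finetuned):
    """Compute per-parameter differences (finetuned - base)."""
    return {
        name: [f - b for f, b in zip(finetuned[name], base[name])]
        for name in base
    }

def apply_delta(base, delta):
    """Recover the fine-tuned weights by adding the delta to the base weights."""
    if base.keys() != delta.keys():
        raise ValueError("parameter names do not match")
    return {
        name: [b + d for b, d in zip(base[name], delta[name])]
        for name in base
    }

# Toy values chosen for illustration only.
base = {"layer.weight": [1.0, 2.0, 3.0]}
finetuned = {"layer.weight": [1.5, 2.0, 2.0]}

delta = make_delta(base, finetuned)   # what would be distributed
restored = apply_delta(base, delta)   # what a user with the base weights rebuilds
```

Under that assumption, the delta alone is not enough to reconstruct the fine-tuned model; whoever applies it has to obtain the base weights separately.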

vunderba 870 days ago

How similar is this to InstructPix2Pix?

https://github.com/timothybrooks/instruct-pix2pix

stcredzero 871 days ago

Has there been any work done on charts, graphs, and data visualizations produced by large AI generative models?

jefc1111 871 days ago

I came across this yesterday: https://news.ycombinator.com/item?id=39265127
