Hacker News
Instruction-Based Image Editing via LLM (github.com)
98 points by phront 871 days ago
7 comments

What's interesting to me is that the project feels very "un-Apple", despite being open-sourced under the Apple org: typos and missing punctuation in the README, Jupyter notebooks for the data processing instead of scripts or a CLI, poor repo organization, and no comments even in the demo (https://github.com/apple/ml-mgie/blob/main/demo.ipynb).

Apple is truly becoming an ML company when they release ML-engineer-quality code ;)

Clicking on the names at the top of the README, it appears to be a collaboration between researchers both inside and outside of Apple.
I came up with a similar idea (before DALL-E offered edits-via-instruction), on the premise that prompting generators kinda sucks (chat interfaces for image editing aren't great either) and that really you just want to explore the latent space "around" an initial prompt.

Here's an overview of the tool (Dreamwalker): https://www.youtube.com/watch?v=k_mJgFmdWWY

And you can download/use it for free here (mac/pc): https://forums.afterschool.studio/t/dreamwalker-alpha-2-rele...

It's incredible to see Apple contributing here. Excited to see what they bring to their platforms.
Why is this being downvoted? Agreed. They have acquired so much talent [1] in this space that it's exciting to start seeing things come out of it.

[1] https://www.youtube.com/watch?v=Uj9Jg4WldJg

I wish they had more examples. The image doesn't seem to be that much better than what you'd get by generating an image with Stable Diffusion and then tweaking the prompt.
> Notices: Apple's rights in the attached weight differentials are hereby licensed under the CC-BY-NC license. Apple makes no representations with regards to LLaMa or any other third party software, which are subject to their own terms.

Wait, they can do that? Assuming weights have copyright, shouldn't the finetuning be a modification of the original work and so have the same license?

The “weight differentials” are like, a patch that can be applied to the original weights, right?
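
If that reading is right, the differential is useless on its own: you need the original weights to reconstruct the finetuned model. Here is a minimal sketch of the idea, assuming the differential is a plain elementwise difference (an assumption; the notice doesn't say how the deltas are actually computed or stored). Toy lists stand in for tensors, and `make_delta` / `apply_delta` are hypothetical names, not functions from the repo.

```python
# Toy illustration only: real checkpoints hold tensors, not lists of floats.
# make_delta and apply_delta are hypothetical helpers, not part of any repo.

def make_delta(base, finetuned):
    """Per-parameter difference: finetuned minus base."""
    return {
        name: [f - b for f, b in zip(finetuned[name], base[name])]
        for name in base
    }

def apply_delta(base, delta):
    """Apply the 'patch': base plus delta recovers the finetuned weights."""
    return {
        name: [b + d for b, d in zip(base[name], delta[name])]
        for name in base
    }

# Made-up values chosen to be exactly representable as floats.
base = {"layer.weight": [0.5, -1.0, 2.0]}
finetuned = {"layer.weight": [0.75, -1.0, 1.5]}

delta = make_delta(base, finetuned)   # what would be distributed
restored = apply_delta(base, delta)   # what a user with the base rebuilds
```

In this picture only `delta` is shipped, and anyone who wants `restored` has to obtain `base` separately, under whatever terms it carries.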
How similar is this to InstructPix2Pix?

https://github.com/timothybrooks/instruct-pix2pix

Has there been any work on charts, graphs, and data visualizations produced by large generative AI models?