Hacker News
by pj_mukh 867 days ago
In theory, most OpenCV-style image pre/post-processing is available as blocks, and all the major multimodal and diffusion AI blocks are available too. Here's a sampling of what we've recently added:

AI Blocks:

- Multimodal LLM (GPT-4V)

- Remove objects in Images

- AI Upscale 4x

- Prompted Segmentation (SAM w/ text prompting)

Editing Blocks:

- Change format

- Rotate

- Invert Color

- Blur

- Resize

- Mask to Alpha

If we've missed something, please let us know. We just went through a big exercise to make sure we can add new blocks quickly.

1 comment

Is this all AI, or is it using something like ImageMagick for the lower-level tasks?
It's a combination of things. The idea is that you can build workflows that chain functionality from AI models with lower-level image processing tasks. For the lower-level tasks we use the usual suspects: PIL, ImageMagick, OpenCV, etc.
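To make the chaining idea concrete, here is a rough sketch of how a workflow of blocks might be composed. It is not how the product is implemented: the `Block` signature, `run_workflow`, and the nested-list image representation are all assumptions made up for illustration, and a real pipeline would hand the pixel work to PIL, ImageMagick, or OpenCV rather than plain Python.

```python
from typing import Callable, List, Tuple

Pixel = Tuple[int, int, int, int]   # RGBA, 0-255 per channel
Image = List[List[Pixel]]           # rows of pixels (toy representation)
Block = Callable[[Image], Image]    # hypothetical block signature


def invert_color(img: Image) -> Image:
    """Invert the RGB channels and leave alpha untouched."""
    return [[(255 - r, 255 - g, 255 - b, a) for r, g, b, a in row] for row in img]


def rotate_90(img: Image) -> Image:
    """Rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]


def mask_to_alpha(mask: List[List[int]]) -> Block:
    """Build a block that copies a grayscale mask into the alpha channel."""
    def block(img: Image) -> Image:
        return [
            [(r, g, b, m) for (r, g, b, _), m in zip(row, mask_row)]
            for row, mask_row in zip(img, mask)
        ]
    return block


def run_workflow(img: Image, blocks: List[Block]) -> Image:
    """Apply each block in order, feeding its output to the next one."""
    for block in blocks:
        img = block(img)
    return img


# A 1x2 image: one red pixel and one blue pixel, both opaque.
image = [[(255, 0, 0, 255), (0, 0, 255, 255)]]
# After the rotation the image is 2 rows by 1 column, so the mask is too.
mask = [[0], [255]]

result = run_workflow(image, [invert_color, rotate_90, mask_to_alpha(mask)])
print(result)
```

Because every block takes an image and returns an image, an AI block (say, a segmentation model that produces the mask) could sit in the same list as the editing blocks, as long as it is wrapped to match the same signature.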