by chaoz_ 873 days ago
Love the idea. However, it's not clear whether I will get access to a large collection of components for building such workflows, or what is currently possible. It would be nice to get this info before proceeding with auth.

Theoretically, most OpenCV-type image pre/post-processing operations are available as blocks, and all the major multimodal and diffusion AI blocks are available too. As a sampling of what we've recently added:

AI Blocks:
- Multimodal LLM (GPT4v)
- Remove objects in Images
- AI Upscale 4x
- Prompted Segmentation (SAM w/ text prompting)

Editing Blocks:
- Change format
- Rotate
- Invert Color
- Blur
- Resize
- Mask to Alpha

If we've missed something, please let us know. We just went through a big exercise in making sure we can quickly add new blocks.

Is this all AI, or are you using something like ImageMagick for the lower-level tasks?
It's a combination of things. The idea is that you can build workflows that chain functionality from AI models as well as lower-level image processing tasks. For the lower-level tasks we use the usual suspects: PIL, ImageMagick, OpenCV, etc.
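To give a rough feel for what chaining the lower-level blocks could look like, here is a minimal sketch using Pillow (the maintained PIL fork). This is not the product's actual implementation: the block functions (`rotate`, `invert`, `blur`, `resize`, `mask_to_alpha`) and `run_workflow` are hypothetical names made up for illustration, and an AI block would presumably slot into the same chain as just another image-to-image step.

```python
from PIL import Image, ImageFilter, ImageOps

# Hypothetical "blocks": each factory returns a function from image to image.
def rotate(degrees):
    return lambda img: img.rotate(degrees, expand=True)

def invert():
    # ImageOps.invert does not accept every mode, so normalize to RGB first.
    return lambda img: ImageOps.invert(img.convert("RGB"))

def blur(radius):
    return lambda img: img.filter(ImageFilter.GaussianBlur(radius))

def resize(width, height):
    return lambda img: img.resize((width, height))

def mask_to_alpha(mask):
    # Use a grayscale mask as the alpha channel of the image.
    def apply(img):
        out = img.convert("RGBA")
        out.putalpha(mask.convert("L").resize(out.size))
        return out
    return apply

def run_workflow(img, blocks):
    # Feed the output of each block into the next one, in order.
    for block in blocks:
        img = block(img)
    return img

# Synthetic inputs so the sketch runs without any files on disk.
source = Image.new("RGB", (64, 32), (255, 0, 0))
mask = Image.new("L", (32, 32), 128)

result = run_workflow(
    source,
    [rotate(90), invert(), blur(2), resize(32, 32), mask_to_alpha(mask)],
)
```

Keeping every block as a plain image-in, image-out function is just one possible design; it makes reordering or swapping steps trivial, which seems to be the point of a block-based workflow.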
To add to pj's comment -

We are adding more blocks constantly. We're also considering allowing the community to push their own blocks using an open api schema.