Love the idea; however, it's not clear whether I will get access to a large collection of components for building such workflows, or what is currently possible. It would be nice to get this info before proceeding with auth.
Theoretically, most OpenCV-type image pre/post-processing is available as blocks, and all the major multimodal and diffusion AI blocks are available too. As a sampling of what we've recently added:
AI Blocks:
- Multimodal LLM (GPT-4V)
- Remove objects in Images
- AI Upscale 4x
- Prompted Segmentation (SAM w/ text prompting)
Editing Blocks:
- Change format
- Rotate
- Invert Color
- Blur
- Resize
- Mask to Alpha
If we've missed something, please let us know; we just went through a big exercise to make sure we can quickly add new blocks.
It's a combination of things. The idea is that you can build workflows that chain functionality from AI models as well as lower-level image processing tasks. For the lower-level tasks we use the usual suspects: PIL, ImageMagick, OpenCV, etc.
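To make the chaining idea concrete, here is a rough sketch of how blocks could be composed. It is not the product's actual implementation: the `Block` signature, `run_workflow`, and the three block functions are hypothetical names, and images are plain nested lists of pixel tuples so the example has no dependencies. In a real setup each block would presumably wrap a PIL or OpenCV call, or a request to a model.

```python
from functools import reduce
from typing import Callable, List, Tuple

Pixel = Tuple[int, ...]           # (r, g, b) or (r, g, b, a)
Image = List[List[Pixel]]         # rows of pixels; stand-in for a real image object
Block = Callable[[Image], Image]  # hypothetical block signature: image in, image out


def invert_color(img: Image) -> Image:
    """Invert each RGB channel, leaving any alpha channel untouched."""
    return [[tuple(255 - c for c in px[:3]) + px[3:] for px in row] for row in img]


def rotate_90(img: Image) -> Image:
    """Rotate the pixel grid 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def mask_to_alpha(mask: List[List[int]]) -> Block:
    """Build a block that attaches a grayscale mask as the alpha channel."""
    def apply(img: Image) -> Image:
        return [[px[:3] + (m,) for px, m in zip(row, mask_row)]
                for row, mask_row in zip(img, mask)]
    return apply


def run_workflow(img: Image, blocks: List[Block]) -> Image:
    """Feed the output of each block into the next one, in order."""
    return reduce(lambda acc, block: block(acc), blocks, img)


if __name__ == "__main__":
    image = [[(0, 0, 0), (255, 0, 0)],
             [(0, 255, 0), (0, 0, 255)]]
    mask = [[255, 0],
            [0, 255]]
    # Chain three blocks: invert, rotate, then attach the mask as alpha.
    print(run_workflow(image, [invert_color, rotate_90, mask_to_alpha(mask)]))
```

Because every block takes an image and returns an image, adding a new block is just adding another function to the list, and an AI block (segmentation, upscaling) could slot in next to the editing ones as long as it follows the same shape.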