Computation time increases quadratically with image size, so we had to set some limit. For now that's 500x500 px, but we are looking into ways to increase the limit!
Video: Possible. We've already prototyped this, but it would need more optimization for good results (e.g. to avoid flickering between frames).
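To make the scaling concrete, here is a rough back-of-the-envelope sketch. It is not the service's actual cost model: the function name `relative_cost` and the `size_means` switch are made up for illustration, and since the comment does not say whether "image size" means pixel count or side length, the sketch covers both readings.

```python
def relative_cost(width, height, base=(500, 500), size_means="pixels"):
    """Estimate cost relative to the base size, assuming quadratic growth.

    size_means="pixels": time grows with the square of the pixel count.
    size_means="side":   time grows with the square of the longer side.
    Both readings are assumptions; the thread does not say which applies.
    """
    base_w, base_h = base
    if size_means == "pixels":
        ratio = (width * height) / (base_w * base_h)
    elif size_means == "side":
        ratio = max(width, height) / max(base_w, base_h)
    else:
        raise ValueError("size_means must be 'pixels' or 'side'")
    return ratio ** 2


print(relative_cost(1000, 1000))                     # 16.0
print(relative_cost(1000, 1000, size_means="side"))  # 4.0
```

Under either reading, doubling each side of the image costs at least 4x the compute, which is why a hard cap is a reasonable first step.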
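One common way to damp frame-to-frame flicker is to smooth the per-frame masks over time. The sketch below uses a simple exponential moving average; it is only an illustration of the general idea, not what the prototype does, and `smooth_masks` and `alpha` are hypothetical names.

```python
import numpy as np


def smooth_masks(masks, alpha=0.5):
    """Exponentially smooth a sequence of soft masks (values in [0, 1]).

    alpha close to 1 follows each new frame closely (more flicker);
    alpha close to 0 favours the history (less flicker, more lag).
    """
    smoothed = []
    state = None
    for mask in masks:
        mask = np.asarray(mask, dtype=float)
        # The first frame initialises the running average.
        state = mask if state is None else alpha * mask + (1 - alpha) * state
        smoothed.append(state)
    return smoothed
```

The trade-off is lag: a heavily smoothed mask trails behind fast-moving subjects, so a real pipeline would likely need something motion-aware.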
For higher resolution, you should seriously consider charging, either by subscription or on a pay-for-credits basis. Many organizations, especially those whose in-house design talent is limited or in high demand (ranging from finance to marketing to funded startups), could easily justify this product at rates far greater than your server costs. Unlike Fiverr freelancers manually tracing boundaries, this has near-instant turnaround, and that is HUGE for people with deadlines and infinite Uber budgets who just need stock images combined together.
Just want to jump in on this - I spent a significant amount of time in the print ad design world and something like this would be an easily justified expense.
I do video effects work and I'd suggest this could be very useful even with jitter, if you allow expanding the selection. A lot of time is spent creating "garbage mattes" for green screen footage, basically just roughly rotoscoping out the background so you can do key removal on just the important bits. So you could even massively downsample the video for your processing and still have a good enough matte.
Although, with your tech and the more limited problem space of green screens (including poorly lit ones), you could probably make a pretty amazing tool to do the entire green screen removal.
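The garbage-matte idea above (downsample, then expand the selection) can be sketched roughly like this. It assumes you already have a boolean foreground mask per frame from some segmentation step; the function names, `factor`, and `radius` are all made up for illustration, and it uses only NumPy.

```python
import numpy as np


def downsample(mask, factor):
    """Shrink a boolean mask; a block is True if any pixel in it is True."""
    h, w = mask.shape
    padded = np.pad(mask, ((0, -h % factor), (0, -w % factor)),
                    mode="constant", constant_values=False)
    blocks = padded.reshape(padded.shape[0] // factor, factor,
                            padded.shape[1] // factor, factor)
    return blocks.any(axis=(1, 3))


def dilate(mask, radius):
    """Expand the selection by `radius` pixels (square neighbourhood)."""
    h, w = mask.shape
    padded = np.pad(mask, radius, mode="constant", constant_values=False)
    out = np.zeros_like(mask, dtype=bool)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out |= padded[dy:dy + h, dx:dx + w]
    return out


def upsample(mask, factor, shape):
    """Blow the mask back up by pixel repetition and crop to `shape`."""
    big = np.repeat(np.repeat(mask, factor, axis=0), factor, axis=1)
    return big[:shape[0], :shape[1]]


def garbage_matte(foreground_mask, factor=4, radius=1):
    """Rough, generous matte: coarse mask, grown, back at full resolution."""
    small = downsample(np.asarray(foreground_mask, dtype=bool), factor)
    return upsample(dilate(small, radius), factor, foreground_mask.shape)
```

Because the downsampling keeps a block whenever any pixel in it is foreground, and the dilation only grows the mask, the matte always covers the original selection. That errs on the side of keeping too much, which is what you want before keying.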