by nickserv
30 days ago
Gave it a try for structured data extraction: I tested returning a JSON object from images. The output was correct and seemed deterministic, although I only ran it 2-3 times on the same image. The main problem is response time: it took about 20-25 seconds for a simple structure of 5 fields. That makes it unusable at scale, let alone for "real time" processing. The other problem is cost: for the same document it is considerably more expensive than more established models, like Flash-Lite. Shame, the architecture is very interesting.
We're working a lot more on speed in the coming few weeks :) More GPUs and more optimizations.
Our focus has been on quality of output first, and we'll make optimizations as we grow :)
The lite models are great for simple use cases but won't do well in more complex OCR use cases.