Wouldn't "Serverless OCR" mean something like running tesseract locally on your computer, rather than creating an AI framework and running it on a server?
You might be conflating "cloud" with serverless. Serverless is where developers can focus on code, with little care of the infrastructure it runs on, and is pay-as-you-go.
> You might be conflating "cloud" with serverless. Serverless is where developers can focus on code, with little care of the infrastructure it runs on, and is pay-as-you-go.
That's not what serverless means at all. Most function-as-a-service offerings require developers to deal with infrastructure aspects, such as runtimes and even the underlying OS.
They just don't bother about managing it. They deploy their code on their choice of infrastructure, and go on with their lives.
A runtime is notably NOT infrastructure. Had you said instruction set, you might have come closer to making a compelling argument. But the whole point is that AWS (and other providers) abstract away the underlying infrastructure and allow developers to, as I said, have "little care of the infrastructure it runs on". There is often advanced networking that CAN be configured, as well as other infrastructure components developers can choose to configure.
Unless the engineer takes steps to spin down EC2 infrastructure after execution, it is absolutely persistent compute that you're billed for whether you are doing actual processing or not. Lambda and similar services, by contrast, are billed only for execution time.
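For a rough sense of the difference, here is a back-of-the-envelope sketch. The rates and function names are made-up placeholders, not actual AWS pricing, and the model is simplified (real function pricing usually also depends on memory size and request count).

```python
# Back-of-the-envelope billing comparison.
# All rates below are hypothetical placeholders, NOT real AWS pricing.
HOURS_PER_MONTH = 730


def always_on_cost(hourly_rate: float, hours: float = HOURS_PER_MONTH) -> float:
    """Cost of an instance billed for every hour it is up, busy or idle."""
    return hourly_rate * hours


def per_execution_cost(invocations: int, seconds_each: float,
                       rate_per_second: float) -> float:
    """Cost when only actual execution time is billed."""
    return invocations * seconds_each * rate_per_second


if __name__ == "__main__":
    vm = always_on_cost(hourly_rate=0.05)
    fn = per_execution_cost(invocations=10_000, seconds_each=2.0,
                            rate_per_second=0.00002)
    print(f"always-on: ${vm:.2f}/month, per-execution: ${fn:.2f}/month")
```

With a light, bursty workload the per-execution number stays small; with sustained load the always-on instance can come out cheaper, so it depends on how busy the thing actually is.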
You can still be excited! Recently, GLM-OCR was released, which is a relatively small OCR model (2.5 GB unquantized) that can run on CPU with good quality. I've been using it to digitize various hand-written notes and all my shopping receipts this week.
(Shameless plug: I also maintain a simplified version of GLM-OCR without dependency on the transformers library, which makes it much easier to install: https://github.com/99991/Simple-GLM-OCR/)
When people mention the number of lines of code, I've started to become suspicious. More often than not it's X lines calling a massive library that loads a large model, either locally or remotely. We're just waiting for someone to claim they can spin up your entire company infrastructure in two lines of code, only to be presented with a shell script wrapper around Terraform.
I do agree with the use of serverless though. I feel like we agreed long ago that serverless just means that you're not spinning up a physical or virtual server, but simply asking some cloud infrastructure to run your code, without having to care about how it's run.
> When people mention the number of lines of code, I've started to become suspicious.
A low LoC count is a telltale sign that the project adds little to no value. It's a claim that the project integrates third-party services and/or modules, and does a little plumbing to tie things together.