by souvik3333
241 days ago
We have developed DocStrange to create LLM-ready data from images and PDFs. We have also open-sourced a fine-tuned 3B model. You can try both the open-source and private models in the demo.

HF: https://huggingface.co/nanonets/Nanonets-OCR2-3B
Demo: https://docstrange.nanonets.com/
Blog: https://nanonets.com/research/nanonets-ocr-2/

This model is an improvement over our last open-source model. We have fixed some of the issues the community faced and added some of the requested features (handwriting, multilingual support). The models are trained on 3 million documents, including handwritten documents, financial reports, complex tables, and documents with watermarks and stamps. Feel free to try it and share feedback.