by kapitalx
472 days ago
Yes, I used the API. They have examples here: https://docs.mistral.ai/capabilities/document/

I sent a base64-encoded image of the PDF page. The output was an object containing the markdown and the coordinates of the images:

  [OCRPageObject(index=0, markdown='', images=[OCRImageObject(id='img-0.jpeg', top_left_x=140, top_left_y=65, bottom_right_x=2136, bottom_right_y=1635, image_base64=None)], dimensions=OCRPageDimensions(dpi=200, height=1778, width=2300))] model='mistral-ocr-2503-completion' usage_info=OCRUsageInfo(pages_processed=1, doc_size_bytes=634209)
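For anyone trying to reproduce this, here is a rough sketch of the call, not necessarily the exact code I ran. It assumes the `mistralai` Python SDK, a `MISTRAL_API_KEY` environment variable, and the `mistral-ocr-latest` model alias; `image_to_data_uri` and `ocr_page` are hypothetical helper names, and the request shape follows the examples on the docs page linked above, so check it against the current docs.

```python
import base64
import os


def image_to_data_uri(image_bytes: bytes, mime: str = "image/jpeg") -> str:
    """Encode raw image bytes as a base64 data URI."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def ocr_page(image_path: str):
    """Send one page image to the OCR endpoint (needs network and an API key)."""
    # Imported here so the helper above works without the SDK installed.
    from mistralai import Mistral

    client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])
    with open(image_path, "rb") as f:
        data_uri = image_to_data_uri(f.read())

    # Assumption: the document is passed as an image_url holding a data URI,
    # and include_image_base64 asks for the cropped images to be returned.
    return client.ocr.process(
        model="mistral-ocr-latest",
        document={"type": "image_url", "image_url": data_uri},
        include_image_base64=True,
    )
```

Calling `ocr_page("page.jpg")` (a placeholder path) should return an object with a `pages` list like the one printed above.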
Feels like something is missing in the docs, or in the API itself.
https://imgur.com/a/1J9bkml
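A quick sanity check on the numbers in that output, as a sketch: this computes how much of the page the single detected image box covers. `image_coverage` is a hypothetical helper, and it assumes the box coordinates and the page dimensions are in the same pixel units.

```python
def image_coverage(top_left_x, top_left_y, bottom_right_x, bottom_right_y,
                   page_width, page_height):
    """Return the fraction of the page area covered by one bounding box."""
    box_area = (bottom_right_x - top_left_x) * (bottom_right_y - top_left_y)
    return box_area / (page_width * page_height)


# Values copied from the OCRImageObject and OCRPageDimensions above.
coverage = image_coverage(140, 65, 2136, 1635, page_width=2300, page_height=1778)
print(f"{coverage:.1%}")  # prints 76.6%
```

So roughly three quarters of the page came back as one image region, with an empty markdown string alongside it.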