Hacker News
by alexcg1 1430 days ago
The version in the notebook only handles simple text-based PDFs. I wrote some posts on our company blog [1] about the sheer agony of dealing with PDF as a data format, so I wanted to keep things as simple as possible for now.

That said, I'm planning future notebooks where you can perform text-to-image or image-to-image search, integrate OCR, scale it up, serve it, deploy it, etc.

[1] https://medium.com/jina-ai

1 comment

Awesome, will be on the lookout for that!
We've got quite a few other notebooks for other kinds of search on the blog. Would love to hear your thoughts!