by InvidFlower 466 days ago
It is confusing, but they have different calls for PDFs vs. images. See their example Google Colab: https://colab.research.google.com/drive/11NdqWVwC_TtJyKT6cmu...

The first couple of sections are for PDFs, so skip past them (search for "And Image files...") to find the image extraction portion. Basically, an image needs to be passed as an ImageURLChunk instead of a DocumentURLChunk.
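
To make the distinction concrete, here is a rough sketch of picking the chunk type from the file extension. The two dataclasses are stand-ins I wrote to keep it runnable, not the real SDK classes; the field names (document_url, image_url, type) are my assumption about their shape, and make_chunk is a hypothetical helper, not something from the notebook.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from urllib.parse import urlparse


# Stand-ins for the SDK's chunk classes (assumed shape, not the real imports).
@dataclass
class DocumentURLChunk:
    document_url: str
    type: str = "document_url"


@dataclass
class ImageURLChunk:
    image_url: str
    type: str = "image_url"


# Extensions treated as images; everything else is treated as a document.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".webp"}


def make_chunk(url: str):
    """Hypothetical helper: choose the chunk type from the URL's file extension."""
    # Parse the URL first so a query string does not hide the extension.
    suffix = PurePosixPath(urlparse(url).path).suffix.lower()
    if suffix in IMAGE_SUFFIXES:
        return ImageURLChunk(image_url=url)
    return DocumentURLChunk(document_url=url)
```

With the real library you would import the two classes instead of defining them, and pass the result of make_chunk as the document argument of the OCR call shown in the notebook.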