Show HN: One-liner CLI for batched PDF-to-Markdown at $1 per ~6k pages
(github.com)
8 points
by monatis
409 days ago
Extracting clean text from PDFs is still a mess. Tools like docling and marker do a decent job, but they're slow and resource-hungry. pymupdf4llm is fast, but it's AGPL-licensed, which can oblige you to open-source software built on it, even when that software is only reached over a network. Gemini Batch Prediction gives you high throughput and very low pricing: roughly $1 per 6,000 pages. The catch? It's a pain to use. That is, until now. We wrapped it up in a few friendly CLI commands, simple enough for your grandparents to enjoy.