by amindiro
470 days ago
After battling Python dependencies, slow processing, and deployment headaches with tools like unstructured, I finally snapped and built Ferrules, a blazing-fast document parser in Rust.

Why Ferrules?
- Speed – Native PDF parsing (pdfium), hardware-accelerated ML inference
- No Python – Single binary, zero-dependency deployment, built-in tracing
- Smart Processing – Layout detection, OCR, intelligent element merging
- Flexible Outputs – JSON, HTML, Markdown (ideal for RAG pipelines)

Tech Highlights
- Runs layout detection on Apple Neural Engine/GPU
- Apple Vision API for high-quality OCR (macOS)
- Multithreaded, CLI + HTTP API server
- Debug mode with visual parsing output

If you're tired of Python-based parsers in production, check it out!

(P.S. Named after those metal rings on pencils, because it keeps your documents structured.)