Hi HN, creator of EmbedPDF here.

I recently posted my open-source PDF viewer here, and one thing I really value is that it runs completely offline. I started wondering if we could push that further: could we do full ML layout analysis (detecting tables, headers, columns) directly in the browser? To my surprise, it actually works.

The catch: it is far from production-ready. It crashes on most phones, and on older computers it can be incredibly slow.

The why: I believe the future of document processing is local. Many users work with sensitive documents (bank statements, legal contracts) and simply do not want to upload them to a cloud endpoint just to parse a table or analyze layout. This is a proof of concept for that future, where models get smaller, WASM/WebGPU gets faster, and we can keep data entirely on the client side.

Demo: https://www.embedpdf.com/layout-analysis
Repo: https://github.com/embedpdf/embed-pdf-viewer

I'd love to hear your thoughts on the performance and where you think browser-based ML is heading.