Hacker News
by vlmutolo 4 days ago
Pretty cool, rendering PowerPoint files to an image is probably the only way for LLMs to make sense of them.

Does this work in Cloudflare’s workerd environment? Would be nice to have a cheap serverless render -> LLM (GLM-OCR / PaddleOCR) -> Markdown pipeline for the various MS Office formats.

This code creates a JSON intermediate representation that LLMs could probably consume. You might want to simplify it to focus on content and reduce token usage.
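
A rough sketch of that kind of simplification, under loud assumptions: the `slides` / `elements` / `text` / `position` / `style` field names below are invented for illustration and are not taken from the project's actual JSON schema. The idea is just to keep slide order and text, and drop layout and styling before handing it to an LLM.

```python
import json

# Hypothetical IR shape, made up for this example; the real schema may differ.
SAMPLE_IR = {
    "slides": [
        {
            "index": 1,
            "elements": [
                {"type": "title", "text": "Q3 Results",
                 "position": {"x": 40, "y": 30},
                 "style": {"font": "Calibri", "size": 44}},
                {"type": "body", "text": "Revenue up 12%",
                 "position": {"x": 40, "y": 120},
                 "style": {"font": "Calibri", "size": 24}},
                {"type": "image", "src": "chart.png",
                 "position": {"x": 300, "y": 120}},
            ],
        }
    ]
}


def simplify(ir):
    """Keep only slide order and non-empty text; drop layout and styling."""
    slides = []
    for slide in ir.get("slides", []):
        texts = [
            el["text"].strip()
            for el in slide.get("elements", [])
            if isinstance(el.get("text"), str) and el["text"].strip()
        ]
        slides.append({"slide": slide.get("index"), "text": texts})
    return slides


# Compare serialized sizes as a crude proxy for token usage.
full = json.dumps(SAMPLE_IR)
compact = json.dumps(simplify(SAMPLE_IR), separators=(",", ":"))
print(len(full), len(compact))
```

Character count is only a stand-in for tokens here; the actual savings depend on the tokenizer and on how much layout data the real IR carries. Elements without text (images, shapes) are dropped entirely in this sketch, which may or may not be what you want.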