by arvind_k
255 days ago
At Zipphy, I worked on solving similar problems in on-prem environments: building an OCR + NLP + CV pipeline to generate spatial layouts and classify documents at scale. One persistent challenge was generalizing across “wild” PDFs, especially multi-page tables. Your mention of agentic OCR correction and semantic chunking really caught my attention. I’m curious: how did you architect those to stay consistent across diverse layouts without relying on massive rule sets?