by HannesWes 488 days ago

This looks very interesting. I ran some experiments on whether LLMs can be used to extract information from handwritten forms [0][1]. Such a system could let users snap pictures of forms and other legal documents, automatically extract structured information, and then use that information to, for example, fill out new forms or determine whether the user is entitled to a government benefit.

The initial results were quite promising: GPT-4o could reliably identify the correct place in the form for each piece of information, and it extracted the values with moderate reliability, even when the image was blurry or the handwriting was sloppy. Excited to see how Gemini 2.0 would do on this task!
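To make the "extract, then fill out a new form" idea concrete, here is a minimal sketch of the step that comes after the model call. It is not the code from [1]. It assumes the model was prompted to return JSON of the shape `{"fields": [{"label", "value", "confidence"}]}`; that schema, the self-reported `confidence` score, and the names `ExtractedField`, `parse_model_output`, and `fill_form` are all hypothetical and chosen only for illustration.

```python
import json
from dataclasses import dataclass


@dataclass
class ExtractedField:
    label: str         # label as printed on the source form
    value: str         # handwritten value as read by the model
    confidence: float  # assumed self-reported score in [0, 1]


def parse_model_output(raw: str) -> list[ExtractedField]:
    """Parse the (assumed) JSON response of a vision model into fields."""
    data = json.loads(raw)
    return [
        ExtractedField(
            label=item["label"],
            value=item["value"],
            # A missing score is treated as 0.0 so the field gets reviewed.
            confidence=float(item.get("confidence", 0.0)),
        )
        for item in data["fields"]
    ]


def fill_form(
    template: dict[str, str],
    fields: list[ExtractedField],
    min_confidence: float = 0.8,
) -> tuple[dict[str, str], list[str]]:
    """Fill a new form from extracted fields.

    `template` maps each target field name to the source-form label it
    should be copied from. Target fields that are missing or below the
    confidence threshold are returned separately for manual review.
    """
    by_label = {f.label: f for f in fields}
    filled: dict[str, str] = {}
    needs_review: list[str] = []
    for target, source_label in template.items():
        field = by_label.get(source_label)
        if field is not None and field.confidence >= min_confidence:
            filled[target] = field.value
        else:
            needs_review.append(target)
    return filled, needs_review
```

Sending low-confidence or missing values to a review list instead of filling them in silently matters here, since value extraction was only moderately reliable and the output may feed a legal or benefits decision.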

[0] https://arxiv.org/abs/2412.15260

[1] https://github.com/hwestermann/AI4A2J_analyzing_images_of_le... (code and data)