by themanmaran 447 days ago
> Never change the original language of any text. Keep Korean in Korean, Japanese in Japanese, and English in English.

I love the double prompting to keep GPT from translating the text. I've definitely had this problem before, and spent ages trying to prompt it into not randomly translating the text.

Yeah — I ran into that exact problem during early testing. The prompt has since been adjusted to prevent GPT from auto-translating non-English text (Korean, Japanese, etc.).

If it still misbehaves in any edge cases, feel free to open an issue on GitHub — happy to patch it up.
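
For anyone hitting the same auto-translation problem, here is a minimal sketch of one way to guard against it. It is not the project's actual prompt code. The rule text is the one quoted above; the prompt layout (rule stated before and after the text), the `<text>` wrapper, the helper names, and the `min_ratio` threshold are all assumptions made for illustration. The check only counts Hangul and kana characters, so it is a rough heuristic, not a language detector.

```python
import unicodedata

# Rule text quoted earlier in the thread.
KEEP_LANGUAGE_RULE = (
    "Never change the original language of any text. "
    "Keep Korean in Korean, Japanese in Japanese, and English in English."
)


def build_prompt(ocr_text: str) -> str:
    """Hypothetical layout: state the rule before and after the text."""
    return (
        f"{KEEP_LANGUAGE_RULE}\n\n"
        f"<text>\n{ocr_text}\n</text>\n\n"
        f"Reminder: {KEEP_LANGUAGE_RULE}"
    )


def script_counts(text: str) -> dict:
    """Count characters per script using their Unicode names."""
    counts = {"hangul": 0, "kana": 0, "latin": 0}
    for ch in text:
        name = unicodedata.name(ch, "")
        if name.startswith("HANGUL"):
            counts["hangul"] += 1
        elif name.startswith(("HIRAGANA", "KATAKANA")):
            counts["kana"] += 1
        elif name.startswith("LATIN"):
            counts["latin"] += 1
    return counts


def looks_translated(source: str, output: str, min_ratio: float = 0.5) -> bool:
    """Flag output that lost most of the source's Hangul or kana characters."""
    src, out = script_counts(source), script_counts(output)
    for script in ("hangul", "kana"):
        if src[script] > 0 and out[script] < src[script] * min_ratio:
            return True
    return False
```

A caller could run `looks_translated(ocr_text, model_output)` after each model call and retry or log the page when it returns `True`.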

What's the point of using generative AI to OCR the text?
Great question. I'm using traditional OCR engines (e.g., MathPix, Google Vision) for the initial text extraction, then applying generative AI models in a second stage to refine the output. That stage removes noisy or irrelevant elements, normalizes formatting inconsistencies, and improves alignment across multimodal inputs.

For figures and diagrams, I also use Gemini Pro Vision, not just to extract the content but to generate context-aware, structured descriptions. Those are better suited as ML training input than a dump of raw image text.

So in short, generative AI is used here more as a smart post-processing layer to enhance the usability and semantic clarity of the OCR outputs.
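
The two-stage shape described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's code: `run_pipeline`, `Page`, `fake_ocr`, and `rule_based_refine` are hypothetical names, and both stages are stand-ins. In a real setup the `ocr` callable would wrap an OCR engine and the `refine` callable would wrap a generative model call; neither API is shown here.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass
class Page:
    raw_text: str      # stage 1 output, kept for debugging and comparison
    refined_text: str  # stage 2 output


def run_pipeline(
    image_bytes: bytes,
    ocr: Callable[[bytes], str],
    refine: Callable[[str], str],
) -> Page:
    """Stage 1: extract text. Stage 2: post-process the extracted text."""
    raw = ocr(image_bytes)
    return Page(raw_text=raw, refined_text=refine(raw))


def fake_ocr(image_bytes: bytes) -> str:
    """Stand-in for an OCR engine; returns typical noisy output."""
    return "Title\n\n\n  Some   body text \nPage 3\n"


def rule_based_refine(raw: str) -> str:
    """Stand-in for the generative stage: drops page-number lines and
    blank lines, and collapses runs of whitespace."""
    lines = []
    for line in raw.splitlines():
        line = re.sub(r"\s+", " ", line).strip()
        if not line or re.fullmatch(r"Page \d+", line):
            continue
        lines.append(line)
    return "\n".join(lines)


page = run_pipeline(b"", fake_ocr, rule_based_refine)
print(page.refined_text)
```

Passing the stages in as callables keeps the OCR engine and the model swappable, and keeping `raw_text` next to `refined_text` makes it easy to check what the second stage changed.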