by darren_
5657 days ago
Depending on how their OCR works, Mandarin/Japanese -> English (or Spanish, &c.) could be a _lot_ more difficult, since there are thousands of characters to distinguish. In the case of Japanese the translation is also a lot trickier (completely different grammar; Mandarin's not so bad, though). This could just be a case of needing to optimize and train their recognizer to deal with a much larger set of possible characters, or it could require implementing kanji-specific OCR techniques, like decomposing the characters into their constituent strokes and recognizing them by classifying those strokes (orientation, position, direction).
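To make the stroke idea a bit more concrete, here's a rough Python sketch of what classifying a single stroke by direction and position might look like. Everything in it is my own assumption for illustration: the function names, the 8-way direction bins, the unit-square coordinates, and the idea that strokes have already been extracted as start/end points (which is the genuinely hard part and isn't shown).

```python
import math

# Hypothetical direction labels, in 45-degree steps starting from "east".
# Image-style coordinates are assumed: x grows rightward, y grows downward,
# so an angle of 90 degrees points down ("S").
DIRECTIONS = ["E", "SE", "S", "SW", "W", "NW", "N", "NE"]


def classify_direction(start, end):
    """Bin the direction of travel from start to end into 8 compass labels."""
    dx = end[0] - start[0]
    dy = end[1] - start[1]
    angle = math.degrees(math.atan2(dy, dx)) % 360
    # Shift by half a bin so each label is centred on its compass direction.
    index = int(((angle + 22.5) % 360) // 45)
    return DIRECTIONS[index]


def classify_position(start, end):
    """Coarse position: which quadrant of the unit square holds the midpoint."""
    mid_x = (start[0] + end[0]) / 2
    mid_y = (start[1] + end[1]) / 2
    vertical = "top" if mid_y < 0.5 else "bottom"
    horizontal = "left" if mid_x < 0.5 else "right"
    return vertical, horizontal


def stroke_features(start, end):
    """Combine direction and position into one feature tuple per stroke."""
    return (classify_direction(start, end),) + classify_position(start, end)


# A falling-left stroke, drawn from upper right to lower left.
print(classify_direction((0.8, 0.1), (0.2, 0.9)))
# A short horizontal stroke in the upper-left part of the character box.
print(stroke_features((0.1, 0.2), (0.4, 0.2)))
```

A real recognizer would then have to match the whole sequence of stroke features against a dictionary of characters, which this sketch doesn't attempt.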
Of course, guessing the correct word when translating hanzi word for word is almost impossible, so even the extremely primitive product I'm thinking of would be very difficult.