|
|
|
|
|
by ithayer
4323 days ago
|
|
Probably way beyond the scope of that project, but actually this is a pretty interesting and well-studied problem called segmentation, where you train a system to insert the correct breaks. This is used in natural language problems like OCR or machine translation. Maybe google "machine translation segmentation" or OCR for references to papers on the topic, but methods for doing this are really successful (especially on english). |
|