by breuderink
4268 days ago
One method that I have used in the past was über-simple, yet extremely effective. It exploits ZIP compression, based on the insight (or assumption) that two concatenated texts compress better when they share a language. I think I found it in this paper [1]. The implementation was about 13 lines of Python code. I wonder how it would compare.

[1] http://www.ccs.neu.edu/home/jaa/CSG399.05F/Topics/Papers/Ben...
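For anyone curious, here is a rough sketch of the idea, not the original 13 lines. It assumes `zlib` (DEFLATE, the same algorithm ZIP uses) as the compressor and one reference text per language; the names `compressed_size` and `guess_language` are made up for illustration. The guess is the language whose reference text grows the least, in compressed bytes, when the sample is appended to it.

```python
import zlib
from typing import Mapping


def compressed_size(text: str) -> int:
    """Number of bytes the text occupies after DEFLATE compression."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def guess_language(sample: str, references: Mapping[str, str]) -> str:
    """Return the key of the reference text that 'absorbs' the sample best.

    For each language we measure how many extra compressed bytes the sample
    costs when appended to that language's reference text. A shared language
    means shared words and letter patterns, so the cost should be lower.
    """
    def extra_bytes(lang: str) -> int:
        ref = references[lang]
        return compressed_size(ref + " " + sample) - compressed_size(ref)

    return min(references, key=extra_bytes)
```

You would call it as `guess_language(snippet, {"en": english_corpus, "nl": dutch_corpus})`, where the corpora are placeholders you supply yourself. In practice the reference texts probably need to be at least a few kilobytes each, and short samples will be noisy. Note also that DEFLATE only looks back 32 KB, so very large references add nothing beyond that window.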