HTML size and HTML tag count were a natural choice. If they hadn't worked out, the next step would have been to try something else. You're right that it's a very naive example and that, in a way, it was solved before any ML was applied. The surprising part for me was that the two features turned out to be interchangeable, i.e. either of them could be used on its own. I would have expected HTML tag count to be much more accurate and reliable. Another interesting part for me was the threshold. It's fairly clear that it should be somewhere between 1 and 20 sec, but where exactly?
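To make the "where exactly?" question concrete, here is a minimal sketch of picking a cut point on a single numeric value by brute force: try the midpoint between every pair of neighbouring values and keep the one with the fewest misclassifications. This is not necessarily what the original experiment did; the function name `best_threshold` and the sample numbers are made up purely for illustration.

```python
def best_threshold(values, labels):
    """Return (threshold, errors) for the rule `value >= threshold -> True`.

    Hypothetical helper: scans midpoints between consecutive distinct
    values and keeps the one that misclassifies the fewest samples.
    Returns None if there are fewer than two distinct values.
    """
    distinct = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]

    best = None
    for t in candidates:
        # Count samples where the rule's prediction disagrees with the label.
        errors = sum((v >= t) != y for v, y in zip(values, labels))
        if best is None or errors < best[1]:
            best = (t, errors)
    return best


# Invented sample data: values in seconds, labels are an arbitrary yes/no class.
seconds = [1.0, 2.0, 3.0, 8.0, 12.0, 20.0]
is_positive = [False, False, False, True, True, True]
print(best_threshold(seconds, is_positive))
```

With cleanly separable data like this, any cut between the two groups gives zero errors, which is one reason the exact threshold feels arbitrary: the data only pins it down to a range, and the midpoint is just a convention.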