Hacker News
by networked 1228 days ago
Thanks for your reply. I agree about (1). I have checked the datasets I have set up search for: in each of them, either no documents or under 1% of documents have more than 65535 words. (This is without any processing to break up the documents into sections.)
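
For anyone who wants to run the same kind of check on their own data, here is a rough sketch in Python. It assumes each document is available as a string and counts a "word" as a whitespace-separated token, which may not match how a given search engine tokenizes. The name `fraction_over_limit` is made up for illustration and is not part of any tool discussed here.

```python
def fraction_over_limit(documents, limit=65535):
    """Return the share of documents with more than `limit` words.

    Assumption: a word is any whitespace-separated token.
    """
    total = 0
    over = 0
    for text in documents:
        total += 1
        # str.split() with no arguments splits on runs of whitespace.
        if len(text.split()) > limit:
            over += 1
    # Avoid dividing by zero for an empty dataset.
    return over / total if total else 0.0
```

Passing a generator that reads one document at a time keeps memory use flat on large datasets, since the function only iterates once.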