by shakna 310 days ago
You should be able to do a K-means type thing, where your query is for an entire group and you grab the field you want from the chunk locally.

But you might still be able to use some frequency sampling to predict the words used, unless those chunks are very carefully constructed.
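
A rough sketch of what I mean, not a real implementation. Everything here is made up for illustration: the toy `RECORDS`, the `Server` class, `fetch_group`, and `group_of`, which buckets by first letter as a stand-in for a real cluster assignment (it is not actual K-means). The client asks for a whole group and picks the field out locally, so the server only ever sees group ids. The last few lines show the caveat: those group ids still pile up into a frequency distribution.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: record key -> field value.
RECORDS = {
    "apple": "fruit",
    "avocado": "fruit",
    "bear": "animal",
    "bison": "animal",
    "cedar": "tree",
}


def group_of(key):
    """Assumed grouping rule: first letter, standing in for a cluster id."""
    return key[0]


def build_chunks(records):
    """Pack records into one chunk per group."""
    chunks = defaultdict(dict)
    for key, value in records.items():
        chunks[group_of(key)][key] = value
    return dict(chunks)


class Server:
    """Hypothetical server that only answers whole-group requests."""

    def __init__(self, chunks):
        self.chunks = chunks
        self.access_log = []  # everything the server gets to observe

    def fetch_group(self, group_id):
        self.access_log.append(group_id)
        return self.chunks.get(group_id, {})


def lookup(server, key):
    """Fetch the whole chunk, then select the field on the client side."""
    chunk = server.fetch_group(group_of(key))  # server sees only the group id
    return chunk.get(key)  # the actual key never leaves the client


server = Server(build_chunks(RECORDS))
result = lookup(server, "bison")

# The caveat: repeated lookups still leak how often each group is requested.
for word in ["apple", "avocado", "apple", "cedar"]:
    lookup(server, word)
group_frequencies = Counter(server.access_log)
```

With groups this small and this skewed, the counts in `group_frequencies` narrow down the likely words quite a lot, which is why the chunk construction matters.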