by Scene_Cast2 894 days ago
Check out sampling with lightweight coresets if your data is big. It's a principled approach with theoretical guarantees, and it only takes a couple of lines of numpy. Do check whether the assumptions hold for your data, though, as they are stronger than for regular coresets.
1 comment

Do you have a link to any implementations for this?
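
Not a link, but here is a minimal numpy sketch of what the sampling step could look like, assuming the construction from the lightweight coresets paper (Bachem, Lucic and Krause, "Scalable k-Means Clustering via Lightweight Coresets"): each point is drawn with probability that is half uniform and half proportional to its squared distance from the data mean, then reweighted by the inverse of that probability. The function name `lightweight_coreset` and its parameters are made up for illustration and do not come from any library.

```python
import numpy as np

def lightweight_coreset(X, m, rng=None):
    """Sample a weighted subset of m rows from X (hypothetical helper).

    Assumes squared Euclidean distance, as in k-means.
    """
    rng = np.random.default_rng(rng)
    n = X.shape[0]

    # Squared distance of every point to the mean of the full data set.
    mean = X.mean(axis=0)
    sq_dist = ((X - mean) ** 2).sum(axis=1)
    total = sq_dist.sum()

    # Sampling distribution: half uniform, half proportional to sq_dist.
    if total == 0:
        q = np.full(n, 1.0 / n)  # all points identical, fall back to uniform
    else:
        q = 0.5 / n + 0.5 * sq_dist / total

    # Draw m points i.i.d. from q, then weight each by 1 / (m * q).
    idx = rng.choice(n, size=m, replace=True, p=q)
    weights = 1.0 / (m * q[idx])
    return X[idx], weights
```

The returned weights are meant to be passed to a weighted clustering routine, for example the `sample_weight` argument of scikit-learn's `KMeans.fit`. Because of the uniform half of the mixture, no weight can exceed `2 * n / m`, which keeps a single sampled point from dominating.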