by sandwell
3475 days ago
This is a great post, and it overlaps somewhat with a project I'm working on (scraping and classifying large amounts of text). I know it's not the focus of the post, but was there any particular reason why you went with the MultinomialNB classifier? I've been getting pretty good results recently with LinearSVC, which seems to be a lot faster and, in my case, a bit more accurate too. An interesting metric for a future post might be how your proxy compares with Scrapy + the httpcache middleware.
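For anyone who wants to try the same comparison, here is a rough sketch of how the two classifiers could be timed and scored side by side with scikit-learn. The toy corpus, the labels, and the `compare` helper are all made up for illustration; they are not from the post or from my project, and results on a handful of sentences say nothing about real data.

```python
import time

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy corpus, invented purely for illustration.
train_docs = [
    "cheap flights and hotel deals",
    "book your holiday package today",
    "python library for parsing html",
    "new release of the web scraping framework",
    "last minute beach resort offers",
    "unit tests for the http client",
]
train_labels = ["travel", "travel", "tech", "tech", "travel", "tech"]
test_docs = ["discount hotel offers", "html parsing in python"]
test_labels = ["travel", "tech"]


def compare(train_x, train_y, test_x, test_y):
    """Fit both classifiers on TF-IDF features; record fit time and accuracy."""
    results = {}
    candidates = [("MultinomialNB", MultinomialNB()), ("LinearSVC", LinearSVC())]
    for name, clf in candidates:
        # Same vectorizer settings for both, so only the classifier differs.
        model = make_pipeline(TfidfVectorizer(), clf)
        start = time.perf_counter()
        model.fit(train_x, train_y)
        fit_seconds = time.perf_counter() - start
        results[name] = {
            "fit_seconds": fit_seconds,
            "accuracy": model.score(test_x, test_y),  # mean accuracy on test set
        }
    return results


results = compare(train_docs, train_labels, test_docs, test_labels)
for name, stats in results.items():
    print(name, stats)
```

On a real corpus you would want a proper train/test split or cross-validation, and the timing should be averaged over several runs before drawing any conclusion about which one is faster.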