HN Mirror



	by vslira 47 days ago
	Hm, that’s a multinomial classification with a very high cardinality. It’s really weird it works. I’m sure it does as the author states, but for how many authors (out of the whole web) does this work?

4 comments

dmd 47 days ago

It worked on me, and I would be shocked if my blog (dmd.3e.org) has more than a dozen readers. I am stunned.


skeledrew 47 days ago

It's not about the readers, just the fact that there's enough of a sample that it can use, with sufficient differentiation from other content.


dmd 47 days ago

I’ve posted on average 3 things a year.


londons_explore 47 days ago

There are ~8 billion people. Sounds big, but it's only about 2^33. I.e., if you can find 33 things about the text that each halve the number of possible writers, you have narrowed it down to 1 person.

Add just a couple more things and you can also tolerate some of them being mistaken, wrong, or uncertain.
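
To make the arithmetic concrete, here is a rough sketch. It assumes each feature is a perfect, independent yes/no split of the remaining candidates, which real stylistic features won't be. The function names are made up for illustration.

```python
import math

# Rough world population figure quoted in the comment.
WORLD_POPULATION = 8_000_000_000


def bits_to_identify(population: int) -> int:
    """Number of perfect yes/no splits needed to isolate one member."""
    return math.ceil(math.log2(population))


def remaining_candidates(population: int, features: int) -> float:
    """Expected candidates left after `features` splits that each halve the pool."""
    return population / 2 ** features


if __name__ == "__main__":
    print(bits_to_identify(WORLD_POPULATION))
    print(remaining_candidates(WORLD_POPULATION, 32))
    print(remaining_candidates(WORLD_POPULATION, 33))
```

With 32 halvings there is still more than one candidate left on average; the 33rd brings the expected count below one.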


kelseyfrog 47 days ago

Sure the cardinality is high, but the model isn't using a uniform prior. What do you suppose the values of each of the terms are in P(Kelsey Piper | Text sample) = P(Text sample | Kelsey Piper) * P(Kelsey Piper) / P(Text sample)?
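
A toy illustration of why the prior matters, using Bayes' rule in the form P(author | text) = P(text | author) * P(author) / P(text). Every number and name below is invented for the example; nothing here is measured from any real model.

```python
def posterior(likelihoods: dict[str, float], priors: dict[str, float]) -> dict[str, float]:
    """Bayes' rule over a finite set of candidate authors.

    likelihoods[a] = P(text | a), priors[a] = P(a).
    Returns P(a | text) for every candidate a.
    """
    # P(text) is the sum over authors of P(text | a) * P(a).
    evidence = sum(likelihoods[a] * priors[a] for a in likelihoods)
    return {a: likelihoods[a] * priors[a] / evidence for a in likelihoods}


# Hypothetical values: a widely published writer gets far more prior mass
# than a uniform 1-in-8-billion share, and the text fits their style well.
priors = {"prolific_writer": 0.05, "everyone_else": 0.95}
likelihoods = {"prolific_writer": 0.02, "everyone_else": 0.0001}

if __name__ == "__main__":
    for author, p in posterior(likelihoods, priors).items():
        print(f"{author}: {p:.3f}")
```

Under these made-up numbers the prolific writer ends up with most of the posterior mass, even though "everyone else" started with 95% of the prior.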


astrange 47 days ago

Maybe it just says all writing is Kelsey Piper.
