| HN Mirror



	by jdrock 6074 days ago
	(1/26)^7, without getting too fancy with any sort of linguistic probabilities...
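As a quick check of that figure, here is a minimal sketch that evaluates (1/26)^7 exactly. It assumes all 26 letters are equally likely at the start of a line and that the seven lines are independent; the function name is made up for illustration.

```python
from fractions import Fraction

ALPHABET_SIZE = 26      # assumption: every letter is equally likely
ACROSTIC = "fuckyou"    # the seven line-initial letters discussed in the thread


def uniform_acrostic_probability(word, alphabet_size=ALPHABET_SIZE):
    """Chance that independent, uniformly random first letters spell `word`."""
    return Fraction(1, alphabet_size) ** len(word)


p_uniform = uniform_acrostic_probability(ACROSTIC)
print(f"1 in {p_uniform.denominator:,}")  # prints "1 in 8,031,810,176"
```

Using `Fraction` keeps the result exact, so the denominator can be read off directly as the "1 in N" figure.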

4 comments

ars 6074 days ago

Well, let's get fancy:

  f	2.228%	0.02228
  u	2.758%	0.02758
  c	2.782%	0.02782
  k	0.772%	0.00772
  y	1.974%	0.01974
  o	7.507%	0.07507
  u	2.758%	0.02758

  =

  1/185,399,389,457

This uses global letter frequencies; what I really need is a table of first-letter frequencies.
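The product above can be reproduced with a short sketch. The frequencies are the ones quoted in this comment; treating the seven lines as independent draws is the same simplifying assumption the comment makes, and the function name is hypothetical.

```python
import math

# Global letter frequencies as quoted in the comment (fractions, not percent).
GLOBAL_FREQ = {
    "f": 0.02228,
    "u": 0.02758,
    "c": 0.02782,
    "k": 0.00772,
    "y": 0.01974,
    "o": 0.07507,
}


def acrostic_probability(word, freq):
    """Multiply the per-letter frequencies, assuming independent lines."""
    return math.prod(freq[letter] for letter in word)


p_global = acrostic_probability("fuckyou", GLOBAL_FREQ)
print(f"about 1 in {1 / p_global:,.0f}")
```

The result should land very close to the 1/185,399,389,457 figure quoted above; small differences in the last digits would come from floating-point rounding.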


ars 6074 days ago

Could not find a good first-letter table, so I made my own from Gutenberg:

  f	3.779%	0.0378
  u	1.487%	0.0149
  c	3.511%	0.0351
  k	0.690%	0.0069
  y	1.620%	0.0162
  o	6.264%	0.0626
  u	1.487%	0.0149

  =

  1/487,158,294,227
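The comment does not say how the first-letter table was built, so the following is only a guess at one way to do it: count the first letter of every word in a plain-text corpus. The function name and the sample sentence are stand-ins, not the actual Gutenberg data.

```python
import re
from collections import Counter


def first_letter_frequencies(text):
    """Relative frequency of each letter as the first letter of a word.

    Assumption: "first letter" means the first letter of each word, used
    here as a proxy for the first letter of each line.
    """
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(word[0] for word in words)
    total = sum(counts.values())
    return {letter: n / total for letter, n in counts.items()}


# Stand-in corpus; a real run would read one or more Gutenberg text files.
sample_text = "the quick brown fox jumps over the lazy dog"
first_freq = first_letter_frequencies(sample_text)
print(sorted(first_freq.items()))
```

A table built this way depends heavily on the corpus, so the percentages would not be expected to match the ones listed above exactly.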


gort 6074 days ago

That's the odds of it happening in 7 specific lines. But what are the odds of it happening by chance at any point in time to any person of Schwarzenegger's importance or higher?


dangoldin 6074 days ago

Just multiply those odds by the percentage of people that qualify as important.


gort 6074 days ago

And by the number of 7-line sequences produced by each.
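To make that scaling concrete, here is a rough sketch of the chance that the acrostic shows up at least once across many opportunities. The single-sequence odds are the first-letter estimate from earlier in the thread; the number of qualifying people and the number of 7-line sequences each produces are invented placeholders, and every opportunity is assumed to be independent.

```python
import math


def chance_of_at_least_one(p_single, opportunities):
    """1 - (1 - p)^n, computed stably for very small p."""
    return -math.expm1(opportunities * math.log1p(-p_single))


P_SINGLE = 1 / 487_158_294_227   # first-letter estimate quoted in the thread
IMPORTANT_PEOPLE = 1_000         # hypothetical placeholder
SEQUENCES_PER_PERSON = 10_000    # hypothetical placeholder

opportunities = IMPORTANT_PEOPLE * SEQUENCES_PER_PERSON
p_any = chance_of_at_least_one(P_SINGLE, opportunities)
print(f"chance of at least one occurrence: {p_any:.3e}")
```

While the product of the odds and the number of opportunities stays far below 1, the result is close to simply multiplying the two, which is what the replies above suggest.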


pjvandehaar 6074 days ago

One has to factor in that there are thousands of things he could have printed that would have had roughly the same effect (e.g. "Piss off").


10ren 6074 days ago

= about 8,000,000,000 to 1

But capitalization and spacing were correct too.


yan 6074 days ago

Capitalization would be correct in almost all cases. You always capitalize the first letter, and chances are the first word of each line won't be the beginning of a sentence. Ditto for spacing: 3-5 line paragraphs are fairly common.


jcl 6074 days ago

How do you figure "almost all"? Even if the text could only contain paragraphs of length 3 and 4, the spacing alone would be incorrect in half the cases (assuming that a 3-4 and 4-3 split is equally likely).

(Of course, we are leaving aside the fact that the full body of the message contains an additional one-line paragraph, which could also be considered part of a correctly capitalized and spaced message.)
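The counting argument can be checked by enumerating every ordered way to split 7 lines into paragraphs of the allowed lengths. This is a sketch only: the function name is made up, each split is assumed equally likely (as in the comment), and the target split of 4 lines followed by 3 is inferred from the two words of the acrostic.

```python
def paragraph_splits(total_lines, allowed_lengths):
    """All ordered ways to split `total_lines` into paragraphs of allowed lengths."""
    if total_lines == 0:
        return [()]
    splits = []
    for length in allowed_lengths:
        if length <= total_lines:
            for rest in paragraph_splits(total_lines - length, allowed_lengths):
                splits.append((length,) + rest)
    return splits


splits = paragraph_splits(7, (3, 4))
target = (4, 3)  # assumed: a 4-line paragraph followed by a 3-line one
share = splits.count(target) / len(splits)
print(splits, share)  # prints "[(3, 4), (4, 3)] 0.5"
```

Allowing 5-line paragraphs as well does not change the answer, since a 5-line paragraph would leave only 2 lines, which no allowed length can fill.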


rbanffy 6074 days ago

One would have to analyze all previous relevant correspondence out of AS's office to measure vocabulary and other structural elements.

It would be a fun project.
