Nice. I wonder what methods and tools you used to define pronounceability. Care to share some tips? Just curious about language-oriented programming.
Off the cuff I'd use a phonetic algorithm such as Soundex or Metaphone for mapping strings to abstract phonetic representations. These could then be run through simple regular expression pattern matching. There's a fairly limited set of syllable structures in English that make for pronounceable words.
As with many linguistic algorithms this approach is not language-agnostic though. If you wanted to predict the pronounceability for a language other than English you'd need different algorithms and patterns.
A much more sophisticated approach (as opposed to the heuristic one above) would involve training a Markov model (with characters as states). More probable words (i.e. those containing a likely sequence of characters) are more likely to be easily pronounceable.
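To make the Markov idea concrete, here is a minimal sketch of a character-level bigram model with add-one smoothing. The tiny word list, the function names and the choice of smoothing are all my own assumptions for illustration; a real model would be trained on a proper dictionary.

```python
import math
from collections import defaultdict

ALPHABET = "abcdefghijklmnopqrstuvwxyz"
START, END = "^", "$"  # hypothetical markers for word boundaries

# Toy training data (an assumption): use a real word list in practice.
SAMPLE_WORDS = ["banana", "lemon", "melon", "tomato",
                "potato", "salami", "panama", "canola"]


def train(words):
    """Count character-to-character transitions, including word boundaries."""
    counts = defaultdict(lambda: defaultdict(int))
    for word in words:
        chars = [START] + list(word.lower()) + [END]
        for prev, nxt in zip(chars, chars[1:]):
            counts[prev][nxt] += 1
    return counts


def score(counts, word):
    """Average log-probability per transition; higher means more word-like."""
    chars = [START] + list(word.lower()) + [END]
    vocab = len(ALPHABET) + 1  # every letter plus the END marker
    total = 0.0
    for prev, nxt in zip(chars, chars[1:]):
        row = counts.get(prev, {})
        row_total = sum(row.values())
        # Add-one smoothing so unseen transitions get a small, non-zero probability.
        prob = (row.get(nxt, 0) + 1) / (row_total + vocab)
        total += math.log(prob)
    return total / (len(chars) - 1)


model = train(SAMPLE_WORDS)
```

Comparing `score(model, "banana")` with `score(model, "xqzvk")` should rank the first higher. Averaging over the number of transitions keeps long candidates from being penalised purely for their length.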
It appears to me (from a quick glance through the list) that all of the domains alternate between consonants and vowels, with one additional rule: two identical consonants in a row are allowed, e.g. "mm" or "ll". Letter combinations that alternate between vowels and consonants are typically at least somewhat pronounceable.
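If that guess is right, the rule can be checked with a single regular expression. This is only a sketch of the pattern as I read it, not the site's actual method; the function name is made up, and treating "y" as a consonant is an assumption.

```python
import re

VOWEL = "[aeiou]"
CONSONANT = "[bcdfghjklmnpqrstvwxyz]"  # assumption: "y" counts as a consonant

# Optional leading vowel, then repeated (consonant, optionally doubled, + vowel),
# then an optional trailing consonant that may also be doubled.
PATTERN = re.compile(
    rf"{VOWEL}?(?:({CONSONANT})\1?{VOWEL})*(?:({CONSONANT})\2?)?"
)


def looks_alternating(word):
    """True if the word alternates consonants and vowels, allowing doubled consonants."""
    word = word.lower()
    return bool(word) and PATTERN.fullmatch(word) is not None
```

For example, `looks_alternating("summit")` passes because "mm" is a doubled consonant, while `looks_alternating("street")` fails on the "str" cluster.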
I've recently compromised and started using a hyphen in my domains. Any ideas on the potential downsides of this? It certainly helps with finding reasonable .coms without resorting to dropping vowels or weird spelling combinations.
The guys on the fizzle show recently opined that you shouldn't use hyphens. From memory, the reasoning was mainly based on the domain being slightly harder to share (try saying your thing with the hyphen in it).
I like the phone test. You should be able to tell someone your domain over the phone and they should understand what it is without any clarification. Hyphens and misspellings fail this test.
If only the root cause of these issues could be solved. The domain squatting problem is a real one. My understanding is that 15 years ago, when it became obvious to some that this internet thing might take off, many shady characters (oops, I mean entrepreneurs!) set up shell companies as registrars for the sole purpose of bulk registering domains as quickly as possible.
I don't know what the solution could have been, and the point is moot anyway since we can't go back in time to fix things :(