I find it hard to believe that HTML sanitization is central to the Stack Overflow site. (At least, not central in the "worth-reinventing-the-wheel" sense. Note that Atwood is depending on Markdown rather than inventing his own markup syntax and implementing a library for translating it into HTML.) Does the quality or efficiency of HTML sanitization make such a difference in the overall performance of the site?
Let me see if I have this straight: Markdown doesn't have its own HTML sanitization mechanism, so Stack Overflow is rolling its own general HTML sanitization solution?
Why use Markdown in the first place? Is this a case where Markdown doesn't solve the problem Stack Overflow has? If so, why not look for another markup language?
Markdown doesn't even try to solve the HTML sanitization issue, because it was designed for situations where you have complete control over the content. It passes all HTML through untouched, so you can use Markdown to make the usual, trivial stuff easier and fall back on plain HTML for the complex stuff.
Because it ignores HTML altogether, you need a separate sanitization step if you want only a subset of HTML to be usable.
The story is the same with most other simple markup languages, like Textile.
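To make the "separate sanitization step" concrete, here is a rough sketch of a whitelist filter you could run over whatever HTML the Markdown converter spits out. It uses only Python's standard library. The names `ALLOWED_TAGS`, `WhitelistSanitizer`, and `sanitize` are made up for illustration, and the tag list is an assumption, not Stack Overflow's actual policy or code.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist; which tags to allow is a policy decision.
ALLOWED_TAGS = {"b", "i", "em", "strong", "code", "pre", "p", "blockquote"}


class WhitelistSanitizer(HTMLParser):
    """Re-emit only whitelisted tags, with every attribute stripped."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        # Attributes (onclick, style, href, ...) are dropped entirely.
        if tag in ALLOWED_TAGS:
            self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Text is re-escaped so it can never be read back as markup.
        self.out.append(escape(data))


def sanitize(html_text):
    """Return html_text with non-whitelisted tags and all attributes removed."""
    parser = WhitelistSanitizer()
    parser.feed(html_text)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.out)


print(sanitize('<b onclick="x()">hi</b>'))
```

Tags that are not on the list are dropped but their text is kept, so a `<script>` block ends up as inert, escaped text. This sketch throws away every attribute, which means links and images don't survive; a real filter would need to whitelist attributes and check URL schemes too, and a well-tested sanitizer library is probably a safer bet than anything this small.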