by jws
4596 days ago
In rewriting a site from PHP to Go, I'm about to head down that same road. The plan is to use the HTML parser from the "not quite in the real distribution but kind of official" net repository: http://godoc.org/code.google.com/p/go.net/html. Parse the HTML, walk the result, and write out only what is acceptable. I have to restrict by tag, URL scheme, and URL server name in various contexts.
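For what it's worth, the restriction part (tag, URL scheme, server name) can be sketched with nothing but the standard library. This is only a rough illustration, not anyone's actual implementation: the `Policy` type, its field names, and the example hosts are all made up here, and the parsing and tree walk would still have to come from the HTML package.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// Policy is a hypothetical allowlist; the type and field names are
// illustrative assumptions, not part of any existing package.
type Policy struct {
	Tags    map[string]bool // permitted element names, lowercase
	Schemes map[string]bool // permitted URL schemes, lowercase
	Hosts   map[string]bool // permitted server names; empty means any host
}

// TagAllowed reports whether an element name is on the allowlist.
func (p Policy) TagAllowed(tag string) bool {
	return p.Tags[strings.ToLower(tag)]
}

// URLAllowed reports whether raw parses as a URL whose scheme and host
// are both on the allowlist. Relative URLs have no scheme, so this
// sketch rejects them.
func (p Policy) URLAllowed(raw string) bool {
	u, err := url.Parse(strings.TrimSpace(raw))
	if err != nil {
		return false
	}
	if !p.Schemes[strings.ToLower(u.Scheme)] {
		return false
	}
	if len(p.Hosts) == 0 {
		return true
	}
	return p.Hosts[strings.ToLower(u.Hostname())]
}

func main() {
	// Example policy for links in user comments (values are assumptions).
	p := Policy{
		Tags:    map[string]bool{"a": true, "p": true, "em": true},
		Schemes: map[string]bool{"http": true, "https": true},
		Hosts:   map[string]bool{"example.com": true},
	}
	fmt.Println(p.TagAllowed("SCRIPT"))
	fmt.Println(p.URLAllowed("http://example.com/a"))
	fmt.Println(p.URLAllowed("javascript:alert(1)"))
}
```

Different contexts (say, `a href` versus `img src`) would each get their own `Policy` value, which keeps the per-context rules as data rather than code.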
ParseFragment returns an error on bad input, but I just want the bad part stripped so that processing can carry on. If a user has put in a mostly usable piece of HTML and got something wrong by mistake rather than through bad intent, then permissiveness in how we handle that should rule.
And then I wondered about the wisdom of building a potentially large security library on an API that is not quite nailed down.
Ultimately, given that this is a security thing, I figured it's best to go with the proven, many-eyeballed solution that has had widespread acceptance.
Feel free to use the package we've provided. The bit of Go code you need for it is: