by jerf
4012 days ago
Just checked, and \v is illegal in the characters of an XML document (http://www.w3.org/TR/REC-xml/#dt-text). But you should have gotten an error, of course, not the silent truncation you imply.

If you need to salvage the character, your XML library may let you specify it as &#x0b;. That is still a violation, but a lot of libraries seem to let it through: http://www.w3.org/TR/REC-xml/#sec-references (see "Well-formedness constraint"... you are specifically not allowed to use this to do what I'm suggesting here).

Anyways, the moral here is that XML CAN NOT carry arbitrary binary, and EVERY TIME you output something in XML, something in the system needs to run some sort of encoding & illegal-character cleaning pass on the output text. The moral equivalent of "<tag>$content</tag>" in your language is ALWAYS wrong, unless you specifically processed $content into XML character content earlier. This is true even when you're really sure $content is "safe". Even if you're right... and statistically speaking, you're not... do it correctly anyhow and call the right encoding function.
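For what it's worth, here is a minimal Python sketch of the kind of cleaning-plus-encoding pass described above. The names xml_text and wrap are hypothetical, and swapping each illegal character for U+FFFD is just one assumed policy (you could drop the character or raise an error instead). The character ranges follow the Char production in the XML 1.0 spec, and escape comes from the standard library's xml.sax.saxutils.

```python
import re
from xml.sax.saxutils import escape

# Matches anything outside the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_ILLEGAL_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def xml_text(content: str, replacement: str = "\uFFFD") -> str:
    """Turn arbitrary text into legal XML character content (hypothetical helper).

    Step 1: replace characters XML cannot carry at all (e.g. \\v).
    Step 2: escape the markup characters &, < and >.
    """
    cleaned = _ILLEGAL_XML_CHARS.sub(replacement, content)
    return escape(cleaned)


def wrap(tag: str, content: str) -> str:
    """The safe version of "<tag>$content</tag>" (tag itself is assumed trusted)."""
    return f"<{tag}>{xml_text(content)}</{tag}>"


if __name__ == "__main__":
    print(wrap("tag", "x\x0by & z"))
```

Note that this only makes the text legal; the original \v is gone for good. If you need it back on the other side, you still have to encode the value yourself before it goes into the XML.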
|
It's a hack, sure, having to encode/decode all the time, but if you need to store those characters, it's the only bulletproof way I've found.