| The biggest problem appears once you start wanting to combine such strings. Say a user inputs some raw text in a form that is intended to be the title of a button in an HTML form that will be sent in a JSON file to be stored in a SQL db. The expectation is that you can later retrieve this HTML snippet from the DB and display it on the screen. You have to first escape the raw text from the user so that it can be safely used in HTML - so you will go from user_input_string to html_pcdata_escaped_user_input_string. Then you compose a bit of HTML that contains the button and this part; let's say you store it in some HTML DOM object. Then you want to send this HTML object as JSON, so you have to know to convert html_pcdata_escaped_user_input_string into json_string_escaped_user_input_string - but that loses type information which may hurt us later, so maybe we want to actually store it as json_string_escaped_html_pcdata_escaped_user_input_string. Then, if we want to use this as part of an SQL query string, by the same considerations, we want to put it in a mysql_like_filter_escaped_json_string_escaped_html_pcdata_escaped_user_input_string - which is getting really ugly, and easy to mess up. Of course, the order of escaping matters, so a mysql_like_filter_escaped_json_string_escaped_html_pcdata_escaped_user_input_string and a json_string_escaped_mysql_like_filter_escaped_html_pcdata_escaped_user_input_string are different things that need to be decoded differently (of course, for SQL in particular we could use prepared queries instead). Also, we can't ever concatenate this with any other string-like type until perhaps the final use point (such as sending a query string to the DB), since we need to remember which part of the string is escaped in which way, and for what types of uses it is safe (an HTML-escaped string may still contain SQLi or JSON injection). The point is that even with proper types, this is not easy to manage or fix.
It also requires quite advanced type systems to be able to use these in normal contexts - say, you want to store several such strings with different provenances in a Map or Set or even List, without "forgetting" the provenance. |
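To make the quoted point about ordering concrete, here is a small sketch using only Python's standard `html` and `json` modules. The sample input and the variable names are made up for illustration; the only claim is that the two layers do not commute, so whoever decodes needs to know which order was used.

```python
import html
import json

# Hypothetical user input containing characters special to both formats.
user_input = 'Save "A&B" <now>'

# Order 1: HTML-escape first, then encode as a JSON string literal.
html_then_json = json.dumps(html.escape(user_input))

# Order 2: encode as a JSON string literal first, then HTML-escape.
json_then_html = html.escape(json.dumps(user_input))

# Decoding has to peel the layers off in reverse order of encoding.
decoded_1 = html.unescape(json.loads(html_then_json))
decoded_2 = json.loads(html.unescape(json_then_html))

print(html_then_json)
print(json_then_html)
```

Both values round-trip to the original only when decoded with the matching order; feeding `json_then_html` straight to `json.loads` fails, because after HTML escaping it no longer starts with a JSON quote character.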
It's then up to us to decide how to best make use of the type system of whatever language we end up implementing it in (or, indeed, to treat the ability to deal with this well as a requirement when we're choosing a language).
For me, effects like "we can't ever concatenate this with any other string-like type" are desirable features, not problems with this approach: either it's possible to convert both strings to a common form, or I shouldn't be trying to combine them.
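As a rough sketch of what I mean, assuming two made-up wrapper types `Raw` and `SafeHtml` (neither comes from any library): concatenation is only defined between values in the same form, and the only route from raw text to HTML goes through escaping. Python enforces this at runtime here; a statically typed language would ideally reject the mix at compile time.

```python
from __future__ import annotations

import html
from dataclasses import dataclass


@dataclass(frozen=True)
class Raw:
    """Untrusted text with no escaping applied (hypothetical type)."""
    text: str


@dataclass(frozen=True)
class SafeHtml:
    """An HTML fragment considered safe to emit (hypothetical type)."""
    text: str

    @staticmethod
    def from_raw(raw: Raw) -> SafeHtml:
        # Converting to the common form is the only way in for raw text.
        return SafeHtml(html.escape(raw.text))

    def __add__(self, other: SafeHtml) -> SafeHtml:
        # Refuse to combine with anything that is not already SafeHtml;
        # returning NotImplemented makes Python raise TypeError.
        if not isinstance(other, SafeHtml):
            return NotImplemented
        return SafeHtml(self.text + other.text)


title = Raw("Tom & <Jerry>")
label = SafeHtml("<b>Title:</b> ") + SafeHtml.from_raw(title)
print(label.text)
```

With this, `SafeHtml("x") + Raw("y")` and `SafeHtml("x") + "y"` both raise `TypeError`, which is exactly the "either convert to a common form or don't combine" behaviour I'd want.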