|
"A type system isn't going to save you from users submitting all kinds of potentially different encodings."

Yes, it is, because you give that input a type that indicates you don't know what the encoding is, like RawInput or something. You then cannot pass this type to any function that doesn't explicitly call for it. If you have some function that accepts it, blindly casts it to UTF-8, and slams it out into a file, well, that's not the type system's fault [1]. Of course a type system won't prevent you from still just being wrong or writing bugs; nobody promises that, not even the formal methods advocates. But it will prevent you from accidentally shoveling it out somewhere it doesn't belong without ever examining it or thinking about it.

I think you may be believing in a popular myth about strong type systems: that they are designed to somehow prevent bad data from coming into your system at all. You correctly identify that as impossible. But what strong type systems can do is force you to deal with the fact that bad data may be coming in. On the outside, you have the chaos of, say, a bag of bytes that may or may not be JSON. On the inside, you have a "type SomeStruct { int a; int b }". A strong type system forces you to write some sort of adapting code between those two, and guarantees that the result of that adapting code will be only and exactly the type it declares, no "whoops, sometimes this dynamic code just returns a string, or maybe a network socket, or who knows what".

Nothing can prevent your HTTP API from receiving a JPG of an anime character instead of JSON specifying a user to delete, but a strong type system can make you deal with that immediately and fully, instead of garbage data of indeterminate type floating through the system for an indeterminate period of time.
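To make the "adapting code" idea concrete, here is a rough sketch of what I mean, not a definitive implementation. RawInput and SomeStruct are the names from above; parse_some_struct and BadInput are names I made up for illustration. I'm also assuming the bytes are supposed to be UTF-8 JSON with exactly the keys "a" and "b". One caveat: Python only enforces the annotations if you run a static checker such as mypy over the code, so treat this as an illustration of the shape rather than of the compile-time guarantee you would get in a statically typed language.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class RawInput:
    """Bytes from outside the program; encoding and format are unknown."""
    data: bytes


@dataclass(frozen=True)
class SomeStruct:
    """The internal type; the rest of the program only ever sees this."""
    a: int
    b: int


class BadInput(ValueError):
    """Hypothetical error raised when raw input cannot be adapted."""


def parse_some_struct(raw: RawInput) -> SomeStruct:
    """Adapter: returns exactly a SomeStruct or raises BadInput."""
    try:
        text = raw.data.decode("utf-8")  # strict decoding: invalid bytes raise
        obj = json.loads(text)
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        raise BadInput(f"not UTF-8 JSON: {exc}") from exc
    if not isinstance(obj, dict) or set(obj) != {"a", "b"}:
        raise BadInput("expected an object with exactly the keys 'a' and 'b'")
    a, b = obj["a"], obj["b"]
    # bool is a subclass of int in Python, so check the exact type.
    if type(a) is not int or type(b) is not int:
        raise BadInput("'a' and 'b' must be integers")
    return SomeStruct(a=a, b=b)
```

With this in place, parse_some_struct(RawInput(b'{"a": 1, "b": 2}')) gives you SomeStruct(a=1, b=2), and the anime JPG gets rejected with BadInput at the boundary instead of wandering further into the program.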
[1]: Also note there are a lot of "strong type systems" in the world that still fail to take advantage of their own capabilities and let bare string types and such float around too much. There are reasons why libraries must support the lowest common denominator; a file is a series of bytes with no further constraints, so the lowest level API has no choice but to accept that, but higher level APIs should more often take more restricted types. That strong type systems can save you from this doesn't mean they all do. I have a number of wrapper types in various languages just to add the guarantees to my programs that the underlying libraries don't provide, though I also have some code that just wraps the underlying libraries, which can't help but correctly take raw bytes at the lowest level. |
Unfortunately, if you interact with services you didn't write, you're usually back to getting "strings" of unknown encoding, and typically requirements that force some blind or semi-blind guessing.