I have sympathy for its authors. There is no way to really know what the right encoding is. Your options are to guess based on heuristics, allow the user to specify, or demand a particular format. Even the friendliest applications just guess and allow the user to override.