by coldtea 4305 days ago
>So...here's the question. I don't think these are broken, so what are you fixing?

First, it's not just about something not working. It's about creating tools that are extensible, understandable, and hackable. Open Source is not just about "working"; it's about being modifiable by the end user. All this cruft (a mess of 200 obsolete architectures, dead code, and deprecated library support that nobody has used since 1988) works against that goal.

Second, there are things that are essential for some people, such as international users (e.g. proper multibyte support), that cannot be added because of the dependency on custom methods of handling encodings. That's not some wishy-washy magical unicorn feature request; it's essential to the main operation of what less does for those who have to deal with these encodings.

Third, there's nothing wrong with taking pride in your tools and crafting them finely. UNIX is supposed to be made of things that "do one thing and do it well". Less having its own UTF-8 support breaks this division of responsibility. We have libraries for that. Same for getopt vs. its custom options parsing.


What's the library for UTF-8?
At least for programs written in C, most (all?) modern Unix-like platforms should include the functionality in the base install. On the language side, C89 requires basic support for wide and multibyte characters in a conforming libc implementation (the 1995 amendment extended it with <wchar.h>). POSIX furthermore requires a locale system and iconv to specify and convert between encodings. Neither of those strictly requires that UTF-8 be one of the supported encodings (C89 predates Unicode), but any reasonably modern implementation will include Unicode locales. And if it doesn't, I think at this point you can just consider that to be the system's problem: the current assumption for POSIXy programs is that they will use the system locales, not try to implement their own encoding machinery.
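To make "use the system locales" concrete, here's a minimal sketch of counting characters (rather than bytes) with the standard mbrtowc() from <wchar.h>. The helper name count_mb_chars is made up for illustration, and this is not how less itself does it; it just shows that the decoding can be left to libc and whatever encoding the current locale names.

```c
#include <locale.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: count characters (not bytes) in s, decoding with
 * the encoding of the current LC_CTYPE locale. Returns (size_t)-1 if s
 * is not a valid multibyte string in that encoding. */
size_t count_mb_chars(const char *s)
{
    mbstate_t state;
    memset(&state, 0, sizeof state);    /* start in the initial shift state */

    size_t remaining = strlen(s);
    size_t count = 0;

    while (remaining > 0) {
        wchar_t wc;
        size_t n = mbrtowc(&wc, s, remaining, &state);

        if (n == (size_t)-1 || n == (size_t)-2)
            return (size_t)-1;          /* invalid or truncated sequence */
        if (n == 0)
            break;                      /* decoded a NUL; stop defensively */

        s += n;                         /* skip the bytes just consumed */
        remaining -= n;
        count++;
    }
    return count;
}
```

A program would call setlocale(LC_CTYPE, "") once at startup so the user's environment picks the encoding. Under a UTF-8 locale, the 6-byte string "h\xc3\xa9llo" then counts as 5 characters; under the plain "C" locale the same bytes may be rejected as invalid, which is exactly the "system's problem" case.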
Well, ICU http://site.icu-project.org/ would be a good start.