by tibbe 3620 days ago
I wouldn't recommend following the Haskell approach. It hasn't worked well for us. (I took part in creating the Haskell Platform and the process used to add packages to it. I also used to maintain a few of our core libraries, like our containers packages and networking.)

Small vs large standard library:

A small standard library with most functionality in independent, community-maintained packages has given us API friction, as types, traits (type classes), etc. are hard to coordinate across maintainers and separate package release cycles. We ended up with lots of uncomfortable conversions at API boundaries.

Here are a number of examples of problems we currently have:

- Conversions between our 5(!) string types are very common.

- Standard library I/O modules cannot use new, de-facto standard string types (i.e. `Text` and `ByteString`) defined outside it because of dependency cycle.

- Standard library cannot use containers, other than lists, for the same reason.

- No standard traits for containers, like maps and sets, as those are defined outside the standard library. Result is that code is written against one concrete implementation.

- Newtype wrapping to avoid orphan instances. Having traits defined in packages other than the standard library makes it harder to write non-orphan instances.

- It's too difficult to make larger changes as we cannot atomically update all the packages at once. Thus such changes don't happen.

Empirically, languages that have large standard libraries (e.g. Java, Python, Go) seem to do better than their competitors.

7 comments

I don't think most of these are applicable to Rust.

> - Conversions between our 5(!) string types are very common.

> - Standard library I/O modules cannot use new, de-facto standard string types (i.e. `Text` and `ByteString`) defined outside it because of dependency cycle.

We have one string type defined in std, and nobody is defining new ones (modulo special cases for legacy encodings which would not be worth polluting the default string type with).

> - Standard library cannot use containers, other than lists, for the same reason.

> - No standard traits for containers, like maps and sets, as those are defined outside the standard library. Result is that code is written against one concrete implementation.

Hash maps and trees are in the standard library already. Everyone uses them.

> - Newtype wrapping to avoid orphan instances. Having traits defined in packages other than the standard library makes it harder to write non-orphan instances.

This is true, but this hasn't been much of a problem in Rust thus far.
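For readers who haven't hit it, here is a minimal Rust sketch of the newtype workaround being discussed. `CommaSeparated` is a made-up name for illustration; the only assumption is that both the trait (`fmt::Display`) and the type (`Vec<u32>`) come from another crate, here std.

```rust
use std::fmt;

// `fmt::Display` and `Vec` are both defined outside this crate, so writing
// `impl fmt::Display for Vec<u32>` here would be rejected by the orphan rule.
// Wrapping the foreign type in a local newtype makes the impl legal.
struct CommaSeparated(Vec<u32>);

impl fmt::Display for CommaSeparated {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        for (i, n) in self.0.iter().enumerate() {
            if i > 0 {
                write!(f, ", ")?;
            }
            write!(f, "{}", n)?;
        }
        Ok(())
    }
}
```

The cost is the wrapping and unwrapping at every call site, which is the friction the parent comment is describing.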

> - It's too difficult to make larger changes as we cannot atomically update all the packages at once. Thus such changes don't happen.

That only matters if you're breaking public APIs, right? That seems orthogonal to the small-versus-large-standard-library debate. Even if you have a large standard library, if you promised it's stable you still can't break APIs.

But if you have a large standard library and want to break the API, you can.

If you have 100 different libs that are basically "standard" (who doesn't have `mtl` in their applications at this point), now you have to coordinate 100 different library updates roughly at the same time. If you forget even one of them, then you've broken everything.

I think the argument for a large Prelude/standard lib is similar to Google's "single repo" argument: Easy to catch usages and fix them all at once. Plus you're making the language more useful out of the box. People coming from Python can understand this feeling of opening a Python shell and being productive super quickly from the get-go.

Arguments for a small std lib exist, of course. But giant standard libraries are more useful than not.

EDIT: I think the failure of the Haskell Platform has a lot more to do with how Haskell deals with dependencies, and the difficulties it entails, than with the "batteries included" approach itself.

Standard libraries - types, in particular - are the lingua franca between unrelated libraries. The more that's in your standard library, the easier it is to integrate different libraries.

The higher level the library (e.g. containing content specific to an application domain), the more magical-seeming libraries can be added to the ecosystem. The counter-risk is the standard library growing in undesirable directions that you can never change because you can't remove stuff.

The interstitial glue that lets third party libraries integrate with one another and be usable by your app: that's the single biggest reason for having a bigger standard library than a smaller one. It has very little to do with including the batteries in the box.

If you think it has something to do with including the batteries in the box, you'll be lured into the trap of making it easy to fetch the batteries from across the internet (that's almost the same, right?). The trouble is, the internet has 100 different batteries to choose from, and not only have you offloaded the choice onto the user, but the batteries use mutually incompatible terminals and you have to jerry-rig interfaces between them. Let a thousand flowers bloom, say some people: trouble is, waiting for the biggest flower can take years, and people pick different ones in the early days. A bad choice is better than indecision.

Low effort updates are even less what large standard libraries are about. Large standard libraries are much harder to update, not easier: there's much more surface area, so it's far easier to break an application - and since every application uses the standard library, you could potentially break them all. Easier versioning and updates are a strong argument for extracting out things into third-party libraries.

But even then, languages that have great, thriving easy to use dependency systems and package managers with small standard libraries still run into problems

see: javascript

The issue with comments written this way is that there are no details to support the claim.

Writing "see: JavaScript" doesn't really help without context. Without context, one does not know if you meant "JavaScript in browser" or "JavaScript via Node.js" or "I simply don't like npm".

I'm not claiming there aren't any problems; however, "problems" are situational and one person's "problem" is another person's meh.

I just think it's irresponsible to not provide detail when making such claims.

> But if you have a large standard library and want to break the API, you can.

We have a policy of no breakage for stable libraries post 1.0 (as does Python, and Go, etc.). So no, we can't.

The size of the standard library has nothing to do with it.

I'm curious: which language features or tooling do other languages have that make them better at dealing with dependencies than Haskell?
> We have one string type defined in std

The standard library also includes Path/PathBuf and OsStr/OsString. And third-party libraries also use [u8] for bytestrings.

It'd be nice to improve handling for user-supplied text where you can't assume UTF-8. For instance, git2-rs provides the contents of diffs as [u8], because it can't assume the diffed files use UTF-8. That led to this commit today: https://github.com/ogham/rust-ansi-term/pull/19/commits/a0da...

That felt like a lot of boilerplate to abstract between str and [u8]. Is there a better way to solve that problem?

(As much as I'd love to just say "use UTF-8", that would break on many git repositories, including git.git and linux.git.)
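One partial answer, as a sketch rather than a recommendation: for read-only, byte-level algorithms you can take `AsRef<[u8]>`, which std implements for `str`, `String`, `[u8]` and `Vec<u8>`. `count_newlines` is a hypothetical helper, not something from git2-rs or ansi-term, and this approach doesn't help once the algorithm has to hand back a `&str`.

```rust
// Counts b'\n' bytes in anything that can be viewed as a byte slice.
// Works on UTF-8 text and on arbitrary bytes, since '\n' is a single
// byte in UTF-8 and cannot appear inside a multi-byte sequence.
fn count_newlines(text: impl AsRef<[u8]>) -> usize {
    text.as_ref().iter().filter(|&&b| b == b'\n').count()
}
```

Callers can then pass `"a\nb\n"`, a `String`, or a `&[u8]` containing invalid UTF-8 without converting first.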

> The standard library also includes Path/PathBuf and OsStr/OsString.

Right. And you want people to explicitly convert between them.

Having to convert between string types isn't a problem. String encoding is hard, and you're going to have to pay that cost somehow.

Having to convert isn't a problem. Having to write some algorithms multiple times for different string types is a problem.
Fair. Most of these algorithms could be written generically I guess.
> That felt like a lot of boilerplate to abstract between str and [u8]. Is there a better way to solve that problem?

This doesn't have anything to do with large vs. small standard libraries, because all of these string types are defined in libstd.

libstd defines varying amounts of string manipulation and abstraction for those string types, though.

I'd love to see additional support for handling bytestrings in libstd, to make it easier to write code that handles both &str and &[u8].

I think Rust needs to slow down in this regard. I have been with Python since 1999 and the stdlib has held it back. I have also used Scala and Haskell and have witnessed the mess that platform libs on each have caused.

What Rust has right now is pretty amazing. What needs to happen is a way for devs to easily break the dependency cycle and include multiple versions of the same crate. Something that has plagued Haskell. I dunno what the answer is, trait only crates, struct only crates?

If people want to 'curate' (shop) a set of packages, they can make a meta package that exports its deps.

There is literally no reason to ship libs with the compiler aside from the basic verbs and nouns.

With versioned and properly name-spaced imports, one could use different curated libs.

If you can, could you elaborate more on Python's stdlib holding it back? I think the batteries-included experience is one of the reasons why so many people (including myself) use Python.

It's also one of the features I sorely miss when using Rust. Luckily, Rust's stdlib is starting to tend towards being more practical with recent additions like system time.

The 'std lib is where libraries go to die' was invented by Python. The libs are shallow, don't break backwards compat and provide a substandard experience. Things that continue to improve provide an out of tree alternative package name. Python codebases that are resilient don't use much of "core", arrow for time, requests for http, simplejson, etc. Using core is an antipattern that will get you stuck on a version of the language which is ridiculous.

Linking the language and the libraries together is a mistake.

I disagree.

In the enterprise space it is quite common that we only get to use what it is in the computer and access to anything else is strictly controlled by IT.

So if it isn't in the standard library or some internal library mirror, we don't get to use it, as simple as that.

I think it would be terrible for Rust's design/evolution/policy to be constrained by that kind of enterprise badness that basically bans crates.io, and crates.io is an awesome aspect of the Rust ecosystem.
So maybe there's value in shipping a "standard bundle" that includes popular libraries or some such. But it's not worth distorting the whole language design to accommodate bad policies.
I see where you're coming from, but I feel like it would be a mistake to expect the language or std lib to try to solve problems that are effectively organizational/cultural issues.
Plus, Python's std lib is a mumbo jumbo of all sorts; there is no API coherence.
That's a failure at the moment of inclusion. I'm guessing it was done for convenience and to increase adoption (getting decent libraries in the standard library faster).
Just as a data point... I like and heavily use the core libs... And not once have I used arrow, requests or simplejson, despite knowing about them, because I didn't feel the need.
Then you most likely have various security or logic problems in your application unfortunately.
Arrow seems particularly useless as it just wraps stdlib datetime and its awful 10 byte size rather than moving to an 8 byte representation like np.datetime64 uses.
But isn't requests built off of urllib?

The thing I like about Python is that it gives library writers tools to build things without going too low level.

Application writers will always write with better libs, but don't have to worry about third party lib compatibility on platforms because of the stdlib serving as a virtual machine (most of the time).

requests is built on urllib3, and it includes its own version of urllib3 (to avoid dependency problems).

If I'm not mistaken, the stdlib contains urllib and urllib2, but not urllib3.

The fact that there are 3 "urllib" packages shows that the Python way is not so good.

Many libraries in the stdlib have much better alternatives, because libraries with their own release cycle can evolve much quicker. But people get stuck on the "standard" version because it's what's in the stdlib. Worse, people write for compatibility with whatever was in stdlib 2.4 because that's what RHEL6 ships.
Rust will already allow you to have multiple versions of transitive dependencies.
This needs to be screamed from the hill tops!
To be honest, this is the kind of thing that would be great to highlight immediately on https://github.com/rust-lang/cargo or http://doc.crates.io/guide.html.

I go to those pages and I am given 0 information on how the thing actually works.

Cargo's docs need a bunch of work, it's true. I have so much to do :(
Which I guess is normal since it does not create any dependency cycle. A new version might as well be thought of as a completely different package (of perhaps similar functionality).
One of the things I love above all about Python and Ruby is their kitchen-sink standard libraries. The node ecosystem is deeply frustrating in this respect.
It has been a while since I did anything with Python, but I did like its standard library. It was reasonably comprehensive without feeling bloated, and the documentation was pretty good (mostly).

Having a good standard library also makes deployment easier. (In Go, OTOH, I tend to care less, even though its standard library is quite good, because thanks to static linking, deployment is always easy, no matter how many third-party libraries I use.)

> We have one string type defined in std, and nobody is defining new ones (modulo special cases for legacy encodings which would not be worth polluting the default string type with).

There's also `inlinable_string`, `string_cache`, `tendril`, `intern` if you need inlining for performance.

The bigger problem is with other things like 2D/3D points which can be (f32, f32), [f32; 2] or a custom struct.
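The usual way to paper over this in Rust is `From`/`Into` conversions at the API boundary. A small sketch, where `Point2` and `length` are invented for illustration and don't come from any particular crate:

```rust
// A local point type that accepts the two other common representations.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point2 {
    x: f32,
    y: f32,
}

impl From<(f32, f32)> for Point2 {
    fn from((x, y): (f32, f32)) -> Self {
        Point2 { x, y }
    }
}

impl From<[f32; 2]> for Point2 {
    fn from([x, y]: [f32; 2]) -> Self {
        Point2 { x, y }
    }
}

// Accepts a tuple, an array, or a Point2 (every type is Into<itself>).
fn length(p: impl Into<Point2>) -> f32 {
    let p = p.into();
    (p.x * p.x + p.y * p.y).sqrt()
}
```

This works, but every library that defines its own point type has to write the same glue, which is the interoperability cost of not having a shared type.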

I really really would advise having a word with Snoyberg about this. The Haskell Platform has been a pretty deadly experience. It's also ridiculously beginner-hostile (sounds like it won't be, is in practice).
Hash maps and trees: fine. What about database interfaces (e.g. a JDBC/ODBC/whatever equivalent)? What about HTTP servers - even the minimal declaration for what a synchronous request handler might look like? How about threadpooling - if you have multiple libraries that have parallelizable work, you certainly don't want multiple threadpools each thinking they have X many cores to work with, and you don't want the user to have to partition these things either - that's not a happy problem.

All things you can delegate to third parties, but not without lots of cross-talk and confusion until things settle down to winners and losers, which may be a long time in the future. Indecision can be costly.

Consider standard library profiles, with progressively higher levels of abstraction supported. It's the right decision for creating a good ecosystem. C and C++ took decades to build consensus on the more complicated libraries, and C++ eventually grew a pseudo-standard library in the form of Boost to centralize efforts, simply because it is more efficient that way.

  > Empirically, languages that have large standard 
  > libraries (e.g. Java, Python, Go) seem to do better than 
  > their competitors.
You seem to be overlooking the ultimate counterexample: C. :P
You were being down voted, maybe for perceived snark, but I think you raise an interesting point.

To me, C did have a standard library: Unix. It's a runtime system too! Due to the nature of the original C bootstrapping process it just happens to be possible to remove this standard library, and Windows was evidence of this.

There is another interesting potential counterexample: Lua. Its minimalistic standard library is part of what makes it so attractive for embedding, e.g. in game engines. However, Lua's embedding API is so good, you could almost say that it comes with a large standard library too: Your existing C code!

I guess my larger point is that languages are rarely able to stand completely on their own. They need some sort of valuable body of code to justify people choosing the language and libraries together. It might have been the case 40 years ago that you'd reasonably choose to build something "from scratch", but today, if you start on an island, you need to build a bridge, lest you remain on an island forever. Better to start on the mainland.

It's one thing to build a layered system with a small core. It's another thing to completely ignore the fact that the libraries and community _are_ the language, in the only ways that actually matter.

> To me, C did have a standard library: Unix. It's a runtime system too!

Fully agree. We just ended up with ANSI C + POSIX, because the standards bodies refused to put everything into the same bag.

In the early days, most C compilers were anyway shipping partial UNIX APIs on top of their K&R and ANSI implementations.

Lua's lack of a stdlib is also a curse. I can't imagine how many incompatible versions of string.trim and OOP libraries are out there in the wild right now...

Things have been getting better lately because of Luarocks but it's still an uphill battle.

String trim is just:

  foo:gsub("%s*$", "")
or

  foo:gsub("^%s*", "")
The standard idiom for OOP in Lua is a one-liner:

  return setmetatable(self, mt)
where mt.__index has all the methods. How you assign to mt.__index can vary across modules according to style, but that's a _purely_ aesthetic issue. The mechanics are identical. Using a module to accomplish it creates a useless dependency.

There are many criticisms one could make of Lua, but I don't think those two particular criticisms are legit. They're classic bikeshedding.

The function you presented that trims to the right has quadratic runtime behavior if your string has a long sequence of spaces that is not at the end of the string. For example, "a", then a long run of spaces, then "b". A similar performance bug was behind a 30 minute downtime at stackoverflow.com, because a code snippet with 20 thousand spaces inside a comment showed up on their frontpage.

See, it's not that simple :) http://lua-users.org/wiki/StringTrim
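To illustrate the complexity point, here is a Rust sketch of a right trim that scans backwards from the end, so it only ever looks at the trailing whitespace and stays linear no matter what sits in the middle of the string. `rtrim_ascii` is a hypothetical helper; it treats only ASCII whitespace as trimmable, which is narrower than Unicode whitespace and may differ from Lua's `%s` depending on locale.

```rust
// Returns `s` without trailing ASCII whitespace.
// Walks from the end and stops at the first non-whitespace byte, so
// interior runs of spaces are never examined.
fn rtrim_ascii(s: &str) -> &str {
    let bytes = s.as_bytes();
    let mut end = bytes.len();
    while end > 0 && bytes[end - 1].is_ascii_whitespace() {
        end -= 1;
    }
    // `end` is always a char boundary: only single-byte ASCII was skipped.
    &s[..end]
}
```

In real Rust code you would just call `str::trim_end`, which also handles Unicode whitespace; the sketch is only meant to show why the scan direction matters.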

Anyway, I wasn't trying to say bad things about Lua with my examples. It's just that if you go to any large Lua project out there, there is a very good chance you will find some "utils" module in there with yet another reimplementation of a lot of these common functions. Ideally we should have people reusing more stuff from Luarocks than they are right now.

If you're reading a pile of string processing code, seeing

    s.rstrip()
helps make code self-documenting, compared to

    s:gsub("%s*$", "")
I don't want to argue for a massive standard library (for instance, I don't think Python should have shipped modules for dbm, bdb, sqlite, or XML-RPC), but simple string processing seems like a good thing to standardize.
String processing is never simple. Simply identifying "what is whitespace?" is a big undertaking in Unicode.

Lua's philosophy seems to be to include the absolute minimum that is unacceptably painful to omit. This is a perfectly reasonable tradeoff for Lua's primary use case: embedding.

With respect to strings in particular, most systems that Lua is embedded in have their own string type, or inherit one from a framework. This is an unfortunate reality of the C/C++ world.

Returning to my point about language standard libraries: The lack of a traditional "standard library" is a feature for Lua, but only because Lua has a strong FFI and C API that acts as a "bring your own standard library" mechanism. It's less about needing a standard library, and more about admitting a language is only one piece of the puzzle. For a language to flourish, you need to have some story for interfacing with the rest of the world in a rich way.

  > It's another thing to completely ignore the fact that 
  > the libraries and community _are_ the language, in the 
  > only ways that actually matter.
I'm unclear, what is this aimed at? Who's ignoring anything?
JS, too, right? Forget "large" standard library, there really isn't any standard library at all
You have these built-in objects like Math and String and Array. Are those not the standard library?
I'm not sure I'd classify them as a standard library; they're essentially just pervasive global variables. For a comparison, think of Java; the standard library is things like `java.util` and `javax.swing`, which go far beyond having the `System` and `Math` classes available in `java.lang`.
You don't need a standard library to win if you don't have any competitors (in the browser). :)

Fitness for purpose is relative to the other options.

Well, JS was originally competing with Java applets in the browser, but, like you said, fitness for purpose is pretty significant!

My point (or rather, the point of the parent comment that I'm agreeing with) is that there's a lot more than just the presence and characteristics of a standard library that determines how widespread a language becomes.

That's overly dismissive, Node.js has the same issue as JS in the browser and does very well. Small core doesn't matter there.
You don't consider the dynamic, rich document presentation engine that is HTML to be a standard "library"? Seems like it is to me.
HTML doesn't do anything for JS other than provide a way to create visual interfaces. It might be comparable to the role that `tkinter` plays for Python's stdlib, but HTML alone is emphatically not a standard library.
My point is that most languages are totally useless on their own. JavaScript the _language_ doesn't offer any FFI or other mechanism to call outside services. Without a browser or something like Node's libuv, JavaScript wouldn't be useful at all. The capabilities provided in the box are part of the language in terms of what actually matters in motivating people to choose to use the language, no matter what form those capabilities come in.
It is called UNIX and re-branded as POSIX.

C would not have gotten where it is today if it wasn't for the rise of UNIX in the industry, fueled up by free UNIX clones.

> You seem to be overlooking the ultimate counterexample: C. :P

I think one reason (of many) that C++ has replaced C almost completely for new development is the STL. Of course, the STL fundamentally depends on the language feature of templates, which you can only approximate in C, but considering that Java and Objective-C, among other languages, lasted pretty long with no generics and only non-type-safe containers, I think C could have benefitted greatly from basic things like resizable arrays, hash tables, trees, better strings, etc. in the standard library. Now it is probably too late for it to matter (which most people consider a good thing).

FWIW, the last time I cooked up something in C, I liked Judy very much: http://judy.sourceforge.net/

It has slightly awkward but very simple API, and it's very fast.

But C didn't have competitors with large standard libraries, so it didn't suffer as much for it.
It did: Modula-2 and Pascal dialects usually had richer libraries.

For example check Turbo Pascal libraries, including Turbo Vision, already on MS-DOS.

C took off thanks to UNIX's adoption; like JavaScript in browsers nowadays, it became the language to use for anyone working in the enterprise on those new shiny UNIX boxes.

In Europe it was just another systems language to choose from, back when CP/M and other 8 / 16 bit systems were common.

C was usable on MS-DOS before Modula-2 or Turbo Pascal were available.
Both C compilers and Turbo Pascal already existed in CP/M, which preceded MS-DOS.

Also there were C, Pascal and Modula-2 compilers available for ZX Spectrum.

And in my tiny part of the globe I can guarantee that everyone only cared about x86 Assembly, Turbo Basic and Turbo Pascal, with Clipper for business stuff.

I only got to learn C in 1993, after having been a Turbo Pascal 3, 5.5 and 6.0 user.

Being able to compile stuff on CP/M wasn't much help if you wanted to develop MS-DOS applications.

I first used C in 1983 on MS-DOS, I didn't use UNIX until a couple of years later. I bought Turbo Pascal 1.0 when it was released but already had a C compiler at that point.

This is exactly the main problem with Haskell. A stunning language with a lousy standard library. In my opinion, Haskell should offer arrays and maps as built-ins (like Go) and ship with crypto, networking, and serialization in the standard library (I know serialization is already there, but everyone seems to prefer Cereal, so...)
> (I know serialization is already there, but everyone seems to prefer Cereal, so...)

This is precisely why shipping things in the standard library is a bad idea. It ends up full of cruft that no-one uses because there are better alternatives.

>Haskell should offer arrays and maps as built-ins

Why? What does that gain?

The standard platform provides Data.Map for maps, Data.Vector for arrays, and Data.Sequence for fast-edit sequences.

It's not even clear what a "built-in" array or map in Haskell would even look like, or what semantics it should have. Especially in a pure functional language, you need to be clearer about what your intentions are. A regular mutable packed array won't work most of the time.

    > This is exactly the main problem with Haskell.
    > A stunning language with a lousy standard library.
I dream of the day where we can say that the main problem of Haskell is which libraries are included in the standard library. To me, we would already have reached programming nirvana at that point.
Agreed, except for the "like Go" part, which is unnecessarily ad-hoc.
Haskell's actual problem isn't the lack of a comprehensive standard library, but rather the presence of core language features that actively hinder large-scale modular programming. Type classes, type families, orphan instances and flexible instances all conspire to make it as difficult as possible to determine whether two modules can be safely linked. Making things worse, whenever two alternatives are available for achieving roughly the same thing (say, type families and functional dependencies), the Haskell community consistently picks the worse one (in this case, type families, because, you know, why not punch a big hole in parametricity and free theorems?).

Thanks to GHC's extensions, Haskell has become a ridiculously powerful language in exactly the same way C++ has: by sacrificing elegance. The principled approach would've been to admit that, while type classes are good for a few use cases, (say, overloading numeric literals, string literals and sequences), they have unacceptable limitations as a large-scale program structuring construct. And instead use an ML-style module system for that purpose. But it's already too late to do that.

How are type families worse than fundeps? That's a pretty ridiculous assertion; the things you can do with fundeps are strictly fewer than the things you can do with type families.

> The principled approach

You're dead wrong. The principled approach here is dependent types and full-featured type-level functions. Fundeps are a hack that let you implement a small subset of such functions (while type families gets us a bit closer to the ideal).

> they have unacceptable limitations as a large-scale program structuring construct.

Such as?

> And instead use an ML-style module system for that purpose.

How about we just use C macros for parametricity?

ML-style modules have their uses, but they aren't nearly as elegant as a clean type-level solution.

> How are type families worse than fundeps? That's a pretty ridiculous assertion; the things you can do with fundeps are strictly fewer than the things you can do with type families.

It's not about how much you can do (otherwise, just use a dynamic language, you can do everything, even shoot yourself in the foot!), it's about whether the result makes sense, and how much effort it takes to make sense of it.

> You're dead wrong. The principled approach here is dependent types and full-featured type-level functions. Fundeps are a hack that let you implement a small subset of such functions (while type families gets us a bit closer to the ideal).

You wanna play the dependent type theory card? Type families as provided in Haskell are incompatible with univalence.

    type instance Foo Bool = Int
    type instance Foo YesNo = String
Please kindly provide the isomorphism between `Int` and `String`.

Case analysis only makes sense when performed on the cases of an inductive type, which the kind of all types is not.

> Such as?

The insistence on globally unique instances?

> How about we just use C macros for parametricity?

What does this even mean?

> ML-style modules have their uses, but they aren't nearly as elegant as a clean type-level solution.

See here for how modular type classes, as proposed for ML, would actually prevent the issues caused by Haskell-style type classes: http://blog.ezyang.com/2014/09/open-type-families-are-not-mo...

> You wanna play the dependent type theory card? Type families as provided in Haskell are incompatible with univalence.

Hi. As someone that knows type theory and knows homotopy type theory and also knows Haskell well I would pose the following question to you: what purpose on god's green earth would be served by introducing univalence directly to haskell?

(Oh, and furthermore, you realize that fundeps have precisely the same issues in this setting?)

Contrariwise, don't you find it _useful_ that we can have two monoids, say And and Or, which have different `mappend` behaviour?

Now, can you imagine having that feature and _also_ respecting the idea that set-isomorphic things should be indistinguishable? How?

> what purpose on god's green earth would be served by introducing univalence directly to haskell?

Generally, when I want to reason about tricky data structures, what I do is:

(0) Define a set-isomorphic auxiliary type that's easier to analyze, and whose operations are easier to implement, but have worse asymptotic performance.

(1) Prove that transporting the operations on the auxiliary type along the isomorphism yields the operations on the original tricky type.

I need univalence for this argument to hold water.

> (Oh, and furthermore, you realize that fundeps have precisely the same issues in this setting?)

Type classes are already Haskell's controlled mechanism for adding ad-hoc polymorphism “without hurting parametricity too much”. I consider it healthier to reuse and extend this mechanism (which is what functional dependencies do) rather than add a second one for exactly the same purpose (type families).

> Contrariwise, don't you find it _useful_ that we can have two monoids, say And and Or, which have different `mappend` behaviour?

Sure. In ML, I'd just make two structures having the MONOID signature. Haskellers have this wrong idea that the monoid is just the type - it's not! A monoid is a type plus two operations. Same carrier, different operations - different monoids.

> Now, can you imagine having that feature and _also_ respecting the idea that set-isomorphic things should be indistinguishable? How?

Yes. Acknowledging that an algebraic structure is more than its carrier set.
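A rough Rust analogue of the ML-structure view, as a sketch: the monoid is a value carrying the identity and the operation, so `bool` can have two monoids without any newtype. `Monoid`, `AND`, `OR` and `fold` are names made up for this example, and a record of functions is only an approximation of an ML structure (no abstract types, no functors).

```rust
// A monoid as an explicit value: a carrier type `T` plus an identity
// element and an associative combining operation.
struct Monoid<T> {
    empty: T,
    combine: fn(T, T) -> T,
}

fn and(a: bool, b: bool) -> bool {
    a && b
}

fn or(a: bool, b: bool) -> bool {
    a || b
}

// Same carrier (`bool`), different operations: two different monoids.
const AND: Monoid<bool> = Monoid { empty: true, combine: and };
const OR: Monoid<bool> = Monoid { empty: false, combine: or };

// Folds a slice using whichever monoid the caller passes in.
fn fold<T: Copy>(m: &Monoid<T>, xs: &[T]) -> T {
    xs.iter().fold(m.empty, |acc, &x| (m.combine)(acc, x))
}
```

The trade-off is that the caller has to name the monoid at every use, whereas a type class or trait picks the instance from the type alone.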

> I need univalence for this argument to hold water.

No, you don't. Univalence is the axiom that transporting operations across such equivalences _always_ works. If you're doing equational reasoning directly it doesn't arise.

Furthermore, all you need to do is to establish that the _type operations_ regarding one type respect the equivalence to the other type as an additional step.

As you say "a monoid is a type plus two operations" -- so fine, we can treat the monoid And as the type bool and the dictionary of operations on it, and all this still works out.

> otherwise, just use a dynamic language, you can do everything, even shoot yourself in the foot!

Type classes allow huge flexibility while maintaining type safety, to a much greater degree than fundeps allow.

> it's about whether the result makes sense

Which they do. Perhaps you have some examples of when type families confused you or made you perform an error?

> Type families as provided in Haskell are incompatible with univalence.

TFs aren't dependent types. However, they are on the right track. Fundeps are farther away from the right idea. Could you explain to me what's wrong with your example? I'm not up to date on HoTT, but it seems like there's nothing in principle wrong with pattern matching on elements of *. That seems like an important feature of type-level functions.

>The insistence on global unique instances?

Why is this a problem? It makes sense from a theoretical perspective (we don't associate multiple ordering properties with the things we call "the integers"), and it's very easy to use newtype wrappers to create new instances if needed.

> What does this even mean?

ML modules are flexible, but backwards from a theoretical perspective. Parametricity is something that should be embedded in the type system, not the module system.

> See here

Interesting example. However, I doubt that the syntactic cost of using such a system is less than the syntactic cost of enforcing global instance uniqueness and using newtype wrappers.

> Type classes allow huge flexibility while maintaining type safety, to a much greater degree than fundeps allow.

Um, aren't functional dependencies an add-on to multiparameter type classes? I don't see where the opposition is.

> Which they do. Perhaps you have some examples of when type families confused you or made you perform an error?

I already gave an example above. I defined two type instances that violate the principle of not doing evil: https://ncatlab.org/nlab/show/principle+of+equivalence

> TFs aren't dependent types. However, they are on the right track.

Dependent types are a good idea. The way Haskell attempts to approximate them is not. Parametricity is too good to give up. With the minor exception of reference cells (`IORef`, `STRef`, etc.), if two types are isomorphic, applying the same type constructor to them should yield isomorphic types.
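
A hypothetical illustration of the kind of instance pair I mean (not necessarily the exact one from earlier; the names `Two` and `F` are invented here): two isomorphic types that an open type family sends to non-isomorphic types.

```haskell
{-# LANGUAGE TypeFamilies #-}

import Data.Void (Void)

-- Two is isomorphic to Bool: toBool and fromBool are mutually inverse.
data Two = A | B
  deriving (Eq, Show)

toBool :: Two -> Bool
toBool A = False
toBool B = True

fromBool :: Bool -> Two
fromBool False = A
fromBool True  = B

-- An open type family is free to treat the two types differently.
type family F a
type instance F Bool = ()
type instance F Two  = Void

-- F Bool has an ordinary inhabitant...
inhabitant :: F Bool
inhabitant = ()

-- ...whereas F Two is Void, which has no constructors, so no total
-- function from F Bool to F Two can be written.
```

An ordinary type constructor such as `Maybe` can't do this: `fmap toBool` and `fmap fromBool` carry the isomorphism across.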

You know what type families actually resemble? What C++ calls “traits”: ad-hoc specialized template classes containing type members.

> Fundeps are farther away from the right idea.

Functional dependencies are a consistent extension to type classes, which don't introduce a second source of ad-hoc polymorphism, unlike type families.

> Why is this a problem? It makes sense from a theoretical perspective (we don't associate multiple ordering properties with the things we call "the integers"),

What if I want to order them as Gray-coded numbers? In any case, the integers are far from the only type that can be given an order structure, and many types don't have a clear “bestest” order structure to be preferred over other possible ones.

> and it's very easy to use newtype wrappers to create new instances if needed.

Creating `newtype` wrappers is easy at the type level, but using them is super cumbersome at the term level.

> ML modules are flexible, but backwards from a theoretical perspective.

ML modules are plain System F-omega: http://www.mpi-sws.org/~rossberg/1ml/ . Where's the backwardness?

> Parametricity is something that should be embedded in the type system, not the module system.

It's type families, as done in Haskell, that violate parametricity! Standard ML has parametric polymorphism, uncompromised by questionable type system extensions.

> Interesting example. However, I doubt that the syntactic cost of using such a system is less than the syntactic cost of enforcing global instance uniqueness and using newtype wrappers.

I can't imagine it being more cumbersome than wrapping lots of terms in newtype wrappers just to satisfy the type class instance resolution system.

>Um, aren't functional dependencies an add-on to multiparameter type classes?

You're right, I meant "type families".

> I defined two type instances that violate the principle of not doing evil:

We're not doing abstract category theory; we're writing computer programs (well, I am). Have you ever run into a problem with type families in that capacity?

>if two types are isomorphic, applying the same type constructor to them should yield isomorphic types.

Agreed, but there's a difference between type functions and type constructors. TFs are (a limited form of) type functions. Value-level constructors admit lots of nice properties that value-level functions do not, and I see no reason to be uncomfortable with this being reflected at the type level.

> What if I want to order them as Gray-coded numbers

Use a newtype wrapper. Even if a language allowed ad-hoc instances, I would consider it messy practice to apply some weird non-intuitive ordering like this without specifically making a new type for it.
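
For reference, a sketch of what that looks like, under one possible reading of "order them as Gray-coded numbers" (compare the reflected binary Gray codes of non-negative `Int`s). The names `GrayOrdered`, `toGray`, `sortViaNewtype` and `sortViaKey` are invented for the example.

```haskell
import Data.Bits (shiftR, xor)
import Data.List (sort, sortOn)

-- Reflected binary Gray code; assumed to be applied to non-negative Ints.
toGray :: Int -> Int
toGray n = n `xor` (n `shiftR` 1)

-- A new type whose Ord instance compares by Gray code.
newtype GrayOrdered = GrayOrdered { getGrayOrdered :: Int }
  deriving (Eq, Show)

instance Ord GrayOrdered where
  compare (GrayOrdered a) (GrayOrdered b) = compare (toGray a) (toGray b)

-- Using the instance means wrapping and unwrapping at the term level.
sortViaNewtype :: [Int] -> [Int]
sortViaNewtype = map getGrayOrdered . sort . map GrayOrdered

-- The same ordering passed explicitly, with no new type involved.
sortViaKey :: [Int] -> [Int]
sortViaKey = sortOn toGray
```

Both functions give `[0,1,3,2,7,6,4,5]` for `[0..7]`. The `map GrayOrdered` / `map getGrayOrdered` pair is the term-level cost being argued about.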

> Creating `newtype` wrappers is easy at the type level, but using them is super cumbersome at the term level.

And using ML-style modules is easy at the term level, but cumbersome at the type level.

It's a tradeoff, and I suspect that newtypes are usually the cleaner/easier solution.

> ML modules are plain System F-omega

I hadn't seen the 1ML project. That's pretty cool.

> It's type families, as done in Haskell, that violate parametricity!

How so? I really don't understand your argument here, if you just take TFs to be a limited form of type function.

> Parametricity is too good to give up. With the minor exception of reference cells (`IORef`, `STRef`, etc.), if two types are isomorphic, applying the same type constructor to them should yield isomorphic types.

You know that's not what parametricity means, right? Like, at all?

Here's a challenge.

`foo :: forall a. a -> a`

Now, by parametricity that should have only one inhabitant (up to iso). Use your claimed break in parametricity from type families and provide me two distinct inhabitants.

There can be more than one problem, including having a small standard library.
The lack of a standard library can be fixed relatively easily: write libraries! OTOH, the existence of anti-modular language features that are used extensively in several major libraries is a more serious problem, because:

(0) It means that libraries in general won't play nicely with each other, unless they're explicitly designed to do so.

(1) It can't be fixed without throwing away code.

This whole thread is exactly about how "write libraries!" (if done outside the standard library) doesn't work (see my top post).

I do agree that lack of modularity features certainly doesn't help though.

One of the common mantras I've heard among Rust core devs is "std is where code goes to die". Where do you feel the line should be drawn between standard lib and external libraries?
Maybe a change in attitude? Stability can be a good thing. Go's standard library doesn't change that much, and that's a strength.

How about: "std is where code goes when it's done".

As in, really done. The APIs won't need changing.

The counterpoint being Python simplejson vs json. Most working Python developers I know try simplejson first (when they are not controlling dependencies in the environment) and fall back to stdlib json because simplejson got much faster as it evolved outside of the standard library[0]. Most who don't know this go the other way[1].

There are a number of counterpoints in Python, in fact, which epitomizes the "standard library is where code goes to die" thing. Adding modules to the standard library in Python is, more often than not, overall a bad thing for the module. Python has not historically been awesome with standard library quality, either; see Java-style logging and unittest (I mean naming, not "Java idiomatic," which I think is fine for both).

This comes down to release cycles for the language, mostly. So I think API stability is a bit of a red herring when discussing Python, at least.

I tend to appreciate languages where I can remove the entire standard library and "start over," like C. (Yes, you can.) This can be good for a number of things: porting, embedding, frameworks, and so on.

[0]: http://artem.krylysov.com/blog/2015/09/29/benchmark-python-j...

[1]: https://github.com/search?q=simplejson+ImportError&type=Code...

Case by case. If you absolutely need the speed, go with what gives you the boost. Otherwise, I always encourage people to use json and avoid an additional external dependency. One of the reasons some people were using simplejson isn't really speed IMO, but because the json module was not in the stdlib until what, late 2.6?

I try to keep my dependency list as tiny as possible, and use what makes sense for my development and for future maintenance. Also, look at the result, in Python 3, json module beats simplejson.

But speed isn't the only thing that matters, it seems ujson would use more memory (https://news.ycombinator.com/item?id=9326499).

It wasn't "late" 2.6 (that's not how Python releases work for changes like that), it was 2.6, which was October 2008. Nearly eight years ago. Most distributions are even on Python 2.6 now.

Anyway, my point isn't the specific example. That you and I even have this discussion at all and that there are hundreds of thousands of caught ImportErrors on that specific example on GitHub is my point regarding standard library stability; folks seem to think the standard library is the end-all (wherein we wouldn't be having this conversation at all), but Python has shown it is anything but when not carefully maintained. I think Rust is wise to approach this with caution.

Honestly, I'm not extremely familiar with Rust, but it seems it elected the C approach where you can gut the language. A+. Good. How it should be for a systems language like that, because now it can be ported, embedded, and so on.

> That you and I even have this discussion at all and that there are hundreds of thousands of caught ImportErrors on that specific example on GitHub is my point regarding standard library stability;

The fact that thousands of files catch ImportError does not necessarily imply folks are questioning the stdlib's stability. It merely means some people deliberately prefer simplejson over json. The benchmarks showed that the json module before Python 3 could be slower than simplejson, but the json module in Python 3 has beaten simplejson in execution speed. Furthermore, there are old Stack Overflow threads on ujson vs simplejson vs json regarding performance. All of the above suggests that folks who prefer simplejson over json do so out of concern for speed, rather than an opinion on stability.

Also, stability is the wrong term for the problem you are describing. Agility is probably the better word. Python releases tend to be backward compatible (except, of course, Python 2 vs Python 3 and a few modules like asyncio). Python core developers try not to break applications. If anything, non-core libraries will break compatibility more frequently without having to face larger opposition; I could break simplejson if I were its maintainer. The consequence would be maybe a couple of angry GitHub issues and a few blog posts, unlike Python 3, which still gets a lot of angry media coverage to this day.

The problem with the stdlib is absolutely about agility. The core community is extremely small. It can take many weeks and sometimes months to get your commit merged. The reason I like to keep the stdlib around is good citizenship. I would love to have requests in the stdlib, but with more agile and more frequent releases. Python isn't the only player. OS distros are also responsible for the slowness. There's been discussion on python-dev regarding more frequent releases, and even potentially breaking up the stdlib could be an option for the Python community.

The way I see it, the packages that support both do so because they know over 90% of users are satisfied with the performance of the standard library package and don't want to install extra dependencies to get the library or utility to work.

Even more code just uses the standard json package without any fallback. The ease of development or deployment is clearly worth more to them than whatever small speed advantage they could get from going with the external dependency.

The calculus will be different for Rust, of course, with different build and deployment system.

> Maybe a change in attitude? Stability can be a good thing.

I haven't seen anybody say otherwise.

New standard library APIs are stabilized at a steady clip with every new release: https://github.com/rust-lang/rust/blob/master/RELEASES.md

In the Ruby world, very few people use the standard library because it's got so many flaws, and they can't be fixed. So you end up with Nokogiri rather than REXML, all the various HTTP libs rather than net/*, etc. So it just ends up being bytes sent over the wire, wasting disk and bandwidth...
I wonder if identifying the atomic aspects of what you intend your language to be used for ultimately helps in narrowing down what should be in std lib.

Go prioritizes network programming and bundles the necessary components, like http & rpc servers and json.

The http libraries are extensible enough to allow for customization where it's wanted (like http mux) while still creating a canonical implementation that's still viable.

Has Rust identified the core demographics of who they're targeting in order to provide the most applicable platform? Is the target everyone and all application types, therefore there is no default platform?

Edit: To put it another way, is there a set of packages that is either necessary for rust, rust development, or most development in rust? If std lib includes everything necessary, then who are you targeting with the default platform?

  > Has Rust identified the core demographics of who they're targeting
  > in order to provide the most applicable platform? Is the target
  > everyone and all application types, therefore there is no default platform?
Our target audience is still a bit too broad; "systems programming" can mean a lot of things. Application developers build a _lot_ of different applications, those who embed Rust in other languages have different set of requirements, OS/embedded devs have another. There's a lot of stuff in common, but there's also significant differences.
Well, the trick is to actually get it right before standardizing it - much easier said than done. Keeping the standard library small helps with that since the bar is higher.
> Well, the trick is to actually get it right before standardizing it - much easier said than done.

And that's what we're doing.

> Keeping the standard library small helps with that since the bar is higher.

But weren't you just advocating for a large standard library?

I guess I didn't understand what was meant by "where code goes to die". Graduation != dying.
I wouldn't say very few people use the standard library?

CSV, logger, json, fileutils, tempfile, pp, and on and on are used all the time...

There are some parts that are good, and some parts that are bad, for sure.
Go isn't immune to the problem either. See the `flag` package, which is something that new users are encouraged to avoid in favor of e.g. https://github.com/jessevdk/go-flags .
I am a Go programmer and I've never seen anyone anywhere encouraging people to use anything over the flag package. How did you get such an impression?
Docker, for example, says "seriously just don't use it".

http://www.slideshare.net/jpetazzo/docker-and-go-why-did-we-...

Everybody I have ever seen do CLI applications in Go will recommend heavily against it.
The package you mentioned has about 200 imports. Compare that with the standard package. https://godoc.org/?q=Flag
I think that Go can pull off a good standard library because there's a big corporate sponsor behind it, whereas Ruby may have had difficulty with its standard library for the lack of a sponsor.

Standard doesn't mean completely done. Standard should be able to accommodate things like HTTP2, as Go has done, whether that means expanding the API or whatever.

The "big corporate sponsor" argument often comes up when discussing language success. Google doesn't really put more than a few people's time into Go, the rest is open source. Other languages like Python didn't have any real backing until way after success.
Guido and a few other core developers were employed to work full time on Python for quite some time, even before it became successful.
It's too early to draw conclusions about Go's standard library. Python's standard library seemed like a good idea at the time too. Come back in 15 years and let's see how good it looks then.
Sql?

It does all the wrong things; singletons, no testability, cgo for implementations, side effects and you have to use every database differently based on their individual semantics.

Virtually everyone I've ever spoken to either uses a high level wrapper around the sql library or a no-sql solution.

That's the definition of 'stdlib is where packages go to die'.

It's not that the API is unusable, it's just basically not used by the community because there are other better things out there...but you're stuck with it forever, because it's there and some people do use it, and changing or removing it would be a breaking change.

Anyhow, we're just speculating. Does anyone actually collect metrics about the usage of different parts of the stdlib for any language?

Without hard data to back it up, you couldn't really make a strong argument either way.

I didn't see anyone actually mention sql so I'll just assume your first line is to be interpreted as "sql is the counterexample of why Go's standard library is not as great as it may seem."

>Virtually everyone I've ever spoken to either uses a high level wrapper around the sql library or a no-sql solution.

How does that reflect the quality of the std lib implementation? All the high-level wrappers I've seen still utilize database/sql, they just provide convenience methods on top of the existing functionality. Are people using NoSQL databases because database/sql is so bad or merely because that technology fits their project's requirements?

>That's the definition of 'stdlib is where packages go to die'.

steveklabnik's example of Ruby XML parsing libraries is a better example of this, if only because the std lib implementations are almost completely ignored by all other gems. Go's database/sql is actively used outside of the std lib to great effect, whether in wrappers and ORMs or in implementing other SQL databases (like Postgres).

> "Sql? It does all the wrong things; singletons, no testability, cgo for implementations, side effects and you have to use every database differently based on their individual semantics."

SQL has its flaws, but it is testable. The testing approaches available vary depending on the implementation. For example, you can write unit tests for SQL Server (using tSQLt, to give one example: http://tsqlt.org/ ).

yeah no. If you have an interface and you need a separate test suite for each implementation of that interface, it's a terrible interface.
The point is, there's nothing inherent in the design of SQL that stops it being testable, it just hasn't reached the SQL standards yet.

Plus, there are plenty of ways to test standard SQL, you can easily do so through stored procedures.

Out of curiosity: what do the 5 string types do differently?
String: Linked list of Char. Nice for teaching, horrible in every other aspect. Text and lazy Text: modern strings, with Unicode handling and so on. ByteString and lazy ByteString: these are actually arrays of bytes, used to represent binary data. Because Haskell is lazy by default, and sometimes you want strictness (mostly for performance), there are two variants each of Text and ByteString, and going from one flavor to the other requires manual conversion.
At the risk of going a bit off-topic, I think the lazy versions of Text and ByteString wouldn't have been needed if we had nice abstractions for streams (lists are not; they cause allocation we cannot get rid of), so that you don't need to implement a concrete stream type (e.g. lazy Text and lazy ByteString) for every data type.
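
A small sketch of what the manual conversions look like in practice, assuming the `text` and `bytestring` packages. The library calls are real; the helper names (`textToBytes`, `stringToLazyBytes`, and so on) are made up for the example.

```haskell
import qualified Data.ByteString      as BS
import qualified Data.ByteString.Lazy as BL
import qualified Data.Text            as T
import qualified Data.Text.Encoding   as TE
import qualified Data.Text.Lazy       as TL

-- Strict Text <-> strict ByteString goes through an explicit encoding.
textToBytes :: T.Text -> BS.ByteString
textToBytes = TE.encodeUtf8

-- decodeUtf8 throws an exception if the bytes are not valid UTF-8.
bytesToText :: BS.ByteString -> T.Text
bytesToText = TE.decodeUtf8

-- String -> strict Text -> strict ByteString -> lazy ByteString.
stringToLazyBytes :: String -> BL.ByteString
stringToLazyBytes = BL.fromStrict . textToBytes . T.pack

-- And all the way back again.
lazyBytesToString :: BL.ByteString -> String
lazyBytesToString = T.unpack . bytesToText . BL.toStrict

-- Strict and lazy Text are distinct types with their own conversions.
strictToLazyText :: T.Text -> TL.Text
strictToLazyText = TL.fromStrict

lazyToStrictText :: TL.Text -> T.Text
lazyToStrictText = TL.toStrict
```

Each hop is a one-liner, but a value crossing a few library boundaries can easily pick up three or four of them.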

Rust does this well with iterators, for example.

The problem is that streams actually have very complicated semantics when they interact with the real world. What does it mean to traverse an effectful stream multiple times? Can you even do that?

Data.Vector provides a very efficient stream implementation for vector operation fusion, but it's unsuitable for iterators/streams that interact with the real world. Pipes, on the other hand, combined with FreeT, provides good, reasonable semantics for effectful streams.

As with many other things, Haskell forces you to be honest with what your code is actually doing (e.g. streaming things from a network) and this means that there's no one-size-fits-all implementation we can stuff everything into.

Just sticking with the pure types there's currently no generic stream model that works well. No stream fusion system fuses all cases (even in theory) and they also fail to fuse the cases they're supposed to handle too often in practice.

I haven't looked at pipes, but I'm guessing it doesn't all fuse away either.

You're right, I believe Haskell's fusion framework could be greatly improved (although it is the best production solution I'm aware of). However, how would you go about solving this? I don't think there's any generalized solution to the problem of creating no-overhead iteration from higher-level iterative combinators.
>Conversions between our 5(!) string types are very common.

All five of those string types do different things. This isn't a problem; we just have increased expressivity. We couldn't fix this by having a more coordinated standard library. 5 is also a very manageable number IMO.

>It's too difficult to make larger changes as we cannot atomically update all the packages at once.

That's what Stack is for, no?