by jeffdavis 5009 days ago
I think blaming the user here is partially valid (he didn't read the docs), but that's not the whole story.

There is a discontinuity between the ease-of-use story and the blame-the-user story, regardless of how well documented the async insert behavior is.

And it doesn't have to be this way. There are ways of designing interfaces, APIs, and even naming that go a long way to prevent your users from shooting themselves in the foot.

Take postgres. It also supports at least a couple kinds of async insert, one of which is part of libpq (the postgres C client library). It's called "PQsendQuery" and it's documented under the "Asynchronous Command Processing" section. It's hard to imagine a user trying to use that and expecting it to return an error code or exception for the query itself. Even if the user doesn't read the docs, or reads some fragment from a blog post, they will still see that the name suggests async and that it returns an int rather than a PGresult (which means it obviously doesn't fit into the normal sync pattern).

There is no reason mongo couldn't be clear about this distinction -- say, rename "insert" to "async_insert" and have "insert" be a wrapper around async_insert and getLastError. But instead, it's the user's fault because they didn't read the docs.
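To make the naming suggestion above concrete, here is a minimal sketch of what it might look like. Everything in it is hypothetical: `FakeCollection` is an in-memory stand-in, not a real MongoDB driver class, and `async_insert`, `get_last_error`, and the duplicate-key rule are assumptions made only for illustration.

```python
from typing import Dict, Optional


class FakeCollection:
    """Hypothetical in-memory stand-in for a collection; not a real driver API."""

    def __init__(self) -> None:
        self.docs: Dict[object, dict] = {}
        self._last_error: Optional[str] = None

    def async_insert(self, doc: dict) -> None:
        """Fire-and-forget: a failure is recorded but never reported to the caller."""
        self._last_error = None
        if doc["_id"] in self.docs:
            # Assumed failure mode for the sketch: duplicate primary key.
            self._last_error = "duplicate key: {}".format(doc["_id"])
            return
        self.docs[doc["_id"]] = doc

    def get_last_error(self) -> Optional[str]:
        """Return the error from the most recent write, or None if it succeeded."""
        return self._last_error

    def insert(self, doc: dict) -> None:
        """The plainly named call: send the write, check the outcome, raise on failure."""
        self.async_insert(doc)
        err = self.get_last_error()
        if err is not None:
            raise RuntimeError(err)
```

With this shape, a caller who writes `coll.insert(doc)` twice with the same `_id` gets an exception on the second call, while a caller who types `async_insert` has at least been told by the name that nothing will come back.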

Careful API design is important to reduce the frequency of these kinds of errors. In postgres, it's relatively hard to shoot yourself in the foot this badly in such a simple case. I'm sure there are gotchas, but there is a conscious effort to prevent surprises of this sort.


> There is no reason mongo couldn't be clear about this distinction -- say, rename "insert" to "async_insert" and have "insert" be a wrapper around async_insert and getLastError. But instead, it's the user's fault because they didn't read the docs.

Because if you don't read enough of the docs to understand that 'insert' is asynchronous insert, you don't understand MongoDB and haven't done your research.

Why should 'insert' default to synchronous? Why shouldn't we have a sync_insert function instead? The only reason is that you're assuming familiarity for people coming from SQL/synchronous-oriented DBMS, but why should they be forced into an awkward design just because it's what people are familiar with from other DBMS?

A good system is forgiving; it encourages exploration; if there's a choice between safety and performance it defaults to safety. If/when profiling shows the safe behaviour to be a bottleneck, then users can Google the issue and discover "Oh, I just need to set flag X; I can live with the consequences here".

Expecting the user to be an expert in your product from the start is simply not realistic; a well-designed system facilitates use by people of varying levels of expertise.

> A good system is forgiving; it encourages exploration; if there's a choice between safety and performance it defaults to safety.

Not if you're choosing a system that's explicitly marked for performance over safety.

> Expecting the user to be an expert in your product from the start

The 'product' in this case is a non-relational database, not an iGadget. The user can and should be expected to be familiar with the main strengths and weaknesses of the database as a whole.

There is no way you can convince me that someone who has done a reasonable level of due-diligence in investigating MongoDB can be surprised when it behaves asynchronously.

Kudos to you for doing your research. If you're saying "don't use MongoDB without doing at least N days of research first", then you're very much at odds with (my perception of) the 10gen marketing message.

I think you're right though: MongoDB should not be used without _lots_ of research into its limitations.

> I think you're right though: MongoDB should not be used without _lots_ of research into its limitations.

That's true about any database, not just MongoDB; nothing new here.

> then you're very much at odds with (my perception of) the 10gen marketing message.

10gen is fairly straightforward about the original issue, having blogged openly several times about their decisions - but at the end of the day, any engineer should do research beyond the simple marketer's pitch.

I won't doubt that there are people who make snap judgements about fundamental architecture based on marketing pitches[1], but that's very unfortunate, and the marketers really can't be blamed, especially when they make no effort to conceal the truth or deceive you!

[1] http://www.pinaldave.com/bimg/dilbert5.jpg

> That's true about any database, not just MongoDB; nothing new here.

That's exactly the point where we started. A well-designed system fails "safe"; it should obey the principle of least surprise. Specifically: MongoDB should default to synchronous writes to disk on every commit; official drivers should default to acknowledging every network call; MongoDB shouldn't allow remote access from the network by default. Once you want higher performance or remote access, you can read about the configuration options to change and learn on-the-fly, evaluating the trade-offs as needed.
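The "safe by default, opt out explicitly" idea above could be sketched roughly as follows. This is only an illustration: `ServerConfig`, its field names, and the `risks` helper are hypothetical and do not correspond to actual MongoDB or driver settings.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class ServerConfig:
    """Hypothetical settings; every default is the safe choice."""
    acknowledge_writes: bool = True    # driver waits for the server's reply
    sync_to_disk: bool = True          # write reaches disk before it is confirmed
    bind_address: str = "127.0.0.1"    # local access only


def risks(cfg: ServerConfig) -> List[str]:
    """List the guarantees a config has given up relative to the safe defaults."""
    found = []
    if not cfg.acknowledge_writes:
        found.append("write errors may go unnoticed")
    if not cfg.sync_to_disk:
        found.append("recent writes may be lost on a crash")
    if cfg.bind_address != "127.0.0.1":
        found.append("server is reachable from the network")
    return found


# Trading safety for speed is a visible, deliberate step rather than the default.
fast = replace(ServerConfig(), acknowledge_writes=False, sync_to_disk=False)
```

Here `risks(ServerConfig())` is empty, and each trade-off only shows up once someone has changed a setting on purpose.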

Other systems are safe by default (e.g. PostgreSQL), and their out-of-the-box performance and setup complexity suffer because of it. MongoDB could ship "safe" (with the same trade-offs), but chooses not to. That sort of marketing-led decision-making has no place in my technology stack.

It's not that way because somebody in the 70's flipped a coin and decided that sync was heads.

It's because it's a reasonable assumption to make. Data loss shouldn't be a surprise. If I need speed and am willing to risk data loss, I should have the option, but should explicitly choose to use it.

> if I need speed and am willing to risk data loss I should have the option, but should explicitly choose to use it.

You did, by choosing to use MongoDB.

(And if you chose MongoDB without being aware of that implication, you didn't choose MongoDB for the right reasons or didn't do your due diligence, because you cannot understand MongoDB's use case and tradeoffs if you were unaware of this.)