


	by jabagonuts 4883 days ago
	At what point do you abandon mongodb and just use postgresql?


danielpal 4883 days ago

At no point. Stripe is doing it right. They are using the right tool for each job: Mongo for storage speed, etc., and then Postgres for analysis, querying, etc.

This kind of comment shows how little knowledge you have about NoSQL and SQL. It's not SQL vs. NoSQL; it's about using the right technology for the job.

seanwoods 4883 days ago

> This kind of comment shows how little knowledge you have about NoSQL and SQL.

The question is perfectly valid. In many scenarios (not necessarily Stripe's), PostgreSQL is fast enough to do the job. Stop putting people down for legitimate engineering questions.

eksith 4883 days ago

>This kind of comment shows how little knowledge you have about NoSQL and SQL.

Try not to be condescending and your point will be better received. "Right technology" as I'm sure you're aware, has as much to do with subjectivity as appropriateness. Familiarity, workflow, ease of use (and did I mention familiarity?) cannot be overstated even when the perceived benefits are considered.

Read: religion.

Some of the people who rally against NoSQL may be deriding it as a knee-jerk reaction; others, however, are simply frustrated with developers who, as Ted Dziuba would say, "value technological purity over gettin' shit done".

dennis82 4883 days ago

Are you kidding me? There is absolutely NO reason whatsoever to use a NoSQL database for a financial services company. Postgres is more than capable of sustaining the necessary speeds of a startup.

Relational databases were created in the first place to solve these very problems around transactionality and analytics for finance.

This library is a beautiful example of reinventing the wheel, and otherwise creating a patchwork of unnecessary - and ultimately brittle - infrastructure.

gdb 4883 days ago

(I work at Stripe.)

Where we use MongoDB, it's not because of speed. PostgreSQL is certainly capable of fast performance. MongoDB is useful for its ability to log freeform data as well as for its replication model. (We use sharded MongoDB in a few places, but mostly use straight replica sets.)

We use MySQL, MongoDB, PostgreSQL, and Impala. They're all useful in different places.

pvh 4883 days ago

Mongo's probably still got the edge as a JSON store overall, but definitely check out the new JSON object dereferencing functionality coming in 9.3. There's a Russian indexing posse consisting of Oleg, Teodor, and Sasha who have been looking at doing proper indexes for JSON but haven't managed to secure funding. (Disclosure: I think they should get funded.)

These are the same guys who built hstore, full text search, GIN and GIST indexes and I think are working on a generic regular expression index type right now.

dennis82 4883 days ago

> "We use MySQL, MongoDB, PostgreSQL, and Impala."

Thanks for the clarification, but this makes it even more obvious your engineering team is introducing needless complexity into your organization.

Postgres can store unstructured data just fine, so you have a 'solution' that uses 3 OLTP stores instead of one.

taligent 4883 days ago

PostgreSQL is awful for storing unstructured data. It is the most cumbersome, clunky syntax I've seen for a while and it lacks ORM support meaning you are forced to manually write it.

Making developers productive is an important aspect for choosing a database.

gfodor 4883 days ago

Choosing a data store based upon syntax and slightly limited ORM support isn't exactly a great idea. Both of these things can be improved rapidly with a little code.

More important questions are how is the data stored, how is it accessible, how can you scale the system, what operational constraints are there, how fast is it, what types of data modeling can be done, what consistency/transaction guarantees does it provide, etc. These are the things that will make developers productive because they will not be putting out fires all the time.

spicyj 4883 days ago

Why do you use MySQL over Postgres and vice versa?

monstrado 4883 days ago

(Clouderan here)

How are you liking Impala? We just dropped the 0.5 release yesterday, which includes the JDBC driver :D!

Edit: Awesome job on the Ruby client, it's great!

gdb 4883 days ago

It's been great -- setup was a bit of work (we're on Ubuntu, so had to build from source), but once up and running it's allowed us to do lots of ad-hoc analysis that would have been too hard otherwise.

I've been meaning to write a MoSQL equivalent for our Impala data, but at the moment we're doing a more traditional ETL.

odellk 4883 days ago

gdb - If you have Impala, Hadoop, and Hive right now, why not use HBase instead of MongoDB and make it all work in happy harmony?

monstrado 4883 days ago

Awesome! Great to hear it's working out for you guys, looking forward to MoSQL for Impala :-)

nelhage 4883 days ago

We've been pretty happy so far. There have been a few rough edges getting it up and keeping it running, but we've been very impressed with the performance so far.

I've passed your comment on to Colin, who wrote the Ruby client -- I'm sure he'll appreciate it!

monstrado 4883 days ago

I got myself a little Impala Herd server setup, pointed it at my Impala cluster and it's working great ;).

rdouble 4883 days ago

Everywhere I've worked that did high volume transaction processing had an architecture that required a piece like this. Even if you use a relational database for intake, you still need to move the data to another database for analytics. Moving the data automatically via replication sounds a lot better than the typical batch process running at 4am.
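To make the replication-versus-batch point concrete, here is a minimal sketch of applying a stream of change events to an analytics table as they arrive, rather than reloading everything at 4am. It is only an illustration: the `changes` list, the `charges` table, and the event shape (`op`, `doc`) are made up for the example, and SQLite stands in for whatever analytics database is actually on the receiving end.

```python
import sqlite3

def apply_change(conn, change):
    """Apply one change event (hypothetical shape) to the analytics table."""
    op, doc = change["op"], change["doc"]
    if op in ("insert", "update"):
        # Upsert keyed on the primary key, so replaying an event is harmless.
        conn.execute(
            "INSERT OR REPLACE INTO charges (id, amount) VALUES (?, ?)",
            (doc["id"], doc["amount"]),
        )
    elif op == "delete":
        conn.execute("DELETE FROM charges WHERE id = ?", (doc["id"],))
    else:
        raise ValueError(f"unknown op: {op!r}")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE charges (id TEXT PRIMARY KEY, amount INTEGER)")

# Hypothetical change feed; a real one would be read from the source
# database's replication log instead of a hard-coded list.
changes = [
    {"op": "insert", "doc": {"id": "ch_1", "amount": 500}},
    {"op": "insert", "doc": {"id": "ch_2", "amount": 900}},
    {"op": "update", "doc": {"id": "ch_1", "amount": 700}},
    {"op": "delete", "doc": {"id": "ch_2"}},
]
for change in changes:
    apply_change(conn, change)
conn.commit()

rows = conn.execute("SELECT id, amount FROM charges ORDER BY id").fetchall()
print(rows)
```

Because each event is applied as an idempotent upsert or delete, the analytics copy stays close to current without a nightly full reload.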

Ingaz 4883 days ago

Tell that to FIS Global.

There is absolutely no reason to build a banking system on GT.M, but they did.

Although: GT.M is the only(?) NoSQL database that is ACID-compliant.

taligent 4883 days ago

> There is absolutely NO reason whatsoever to use a NoSQL database for a financial services company

Yes there is. PostgreSQL doesn't support multi-master replication, which makes it a terrible choice if you really want to make sure every transaction gets written. I really wonder at what point people that keep recommending PostgreSQL are going to wake up and realise what is happening in the industry.

People are scaling OUT not UP. Especially startups.

shawn-butler 4883 days ago

I'm sorry, postgres-xc doesn't work for your needs? [0] It has worked for me in the past.

[0] http://postgres-xc.sourceforge.net/

knightni 4883 days ago

I would imagine that for your average startup, using solutions that don't even support transactionality will cause greater complexity issues. Especially given the enormous window before db scale out/up becomes an issue on well-designed applications.

taligent 4883 days ago

Enormous window?

Many startups would be using AWS and it is not inconceivable that you would have Multi-AZ/Multi-Region VPSs. Scaling out != Expensive.

j-kidd 4883 days ago

> People are scaling OUT not UP. Especially startups.

Startups need to scale out because many of them like to deploy on mediocre EC2 instances with the slowest SAN storage ever.

People that keep recommending PostgreSQL are rightfully ignoring this industry.

taligent 4883 days ago

> Startups need to scale out because many of them like to deploy on mediocre EC2 instances.

No. They need to scale out because providers like AWS have outages. And so startups et al. need to deploy in multiple AZs/regions in order to have as close to 100% uptime as possible. You can't do that without a well-considered multi-master-style replication strategy, which PostgreSQL frankly doesn't have.

>People that keep recommending PostgreSQL are rightfully ignoring this industry.

Sure. And soon enough they will be relegated to the dustbins of history. The trends don't lie.

gbog 4883 days ago

"The trends don't lie"

Wah. And you do not even seem to be ironic. Trends always lie, there is always a next thing that will take the opposite direction, in philosophy, in science, and particularly so in computing stuff.

nirvdrum 4883 days ago

In all fairness, you could use something other than Postgres that's also ACID.

lucian1900 4883 days ago

The only advantage MongoDB has over Postgres is built-in sharding, and even that is of dubious value.

nelhage 4883 days ago

To pick one, we like the fact that MongoDB lets you change your schema and add new fields to your documents without having to worry about migrations or keeping track of schema versions, or any of that.

You could build something like that on top of SQL, but it's nice to have a tool where you don't have to.

psaintla 4883 days ago

Serious question to you or anyone else who uses schemaless databases. Why is the ability to change schemas on the fly a good thing? Having worked at two companies that did, it was nothing but a recipe for disaster in large groups. Code that was dependent on expecting an integer or a string and not a collection would constantly break because a developer in some other group decided to store a collection instead of the original data type that was expected. Schemaless databases required more documentation to track changes made between groups and led to more bugs because we could never be guaranteed what kind of data we would be receiving. I've always thought of a database schema as a contract that makes guarantees to all applications. Why would you want to be able to break that contract?

PommeDeTerre 4883 days ago

There's no such thing as a "schemaless" database. There are, however, different ways of handling the storage and management of the schema.

In the situations you describe, and when using most NoSQL databases, there's still a schema. It's just stored in the minds of developers, in documentation that's correct and up-to-date, in documentation that is incorrect and outdated, throughout application code, and numerous other places.

Then there's the sensible approach taken by most relational database systems, where the schema is centralized, it is described with some degree of rigor, and it can be more safely modified and managed.
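As a small illustration of a centralized schema acting as a contract, the sketch below has the database itself reject a value of the wrong type, so the mistake surfaces at write time instead of in some other team's code. SQLite is used purely as a convenient stand-in, and the `accounts` table and its `CHECK` constraint are invented for the example; other relational systems enforce column types in their own ways.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE accounts (
        id      INTEGER PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (typeof(balance) = 'integer')
    )
    """
)

# A well-typed row is accepted.
conn.execute("INSERT INTO accounts (id, balance) VALUES (?, ?)", (1, 100))

# Someone tries to store a serialized collection where an integer is expected.
try:
    conn.execute(
        "INSERT INTO accounts (id, balance) VALUES (?, ?)", (2, "[100, 200]")
    )
    rejected = False
except sqlite3.IntegrityError:
    # The CHECK constraint fails, so the bad row never reaches readers.
    rejected = True

count = conn.execute("SELECT COUNT(*) FROM accounts").fetchone()[0]
print(rejected, count)
```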

lucian1900 4883 days ago

I've found that a good SQL library (like SQLAlchemy), a good migration library (like Alembic) and a DB with non-blocking migrations are much nicer to use, since together they make data migrations very easy.
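For anyone who hasn't seen what such a migration boils down to, here is a rough sketch of an "add a column with a default" step. It deliberately does not use Alembic's API; it is plain `sqlite3` with a made-up `users` table and a hypothetical `upgrade` helper, just to show that existing rows pick up the new field without being rewritten by application code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO users (name) VALUES ('ada')")

def column_names(conn, table):
    """Return the column names of a table via SQLite's table_info pragma."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

def upgrade(conn):
    """Hypothetical migration step: add a 'plan' column if it is missing."""
    if "plan" not in column_names(conn, "users"):
        conn.execute(
            "ALTER TABLE users ADD COLUMN plan TEXT NOT NULL DEFAULT 'free'"
        )

upgrade(conn)
upgrade(conn)  # Running it twice is a no-op, which keeps deploys repeatable.

rows = conn.execute("SELECT name, plan FROM users").fetchall()
print(rows)
```

A migration library adds version tracking and downgrade paths on top of this, but the underlying schema change is the same kind of statement.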

djb_hackernews 4883 days ago

How does MoSQL handle schema changes and new fields in mongo?

I'm imagining with this tool you start to need to be a bit more careful with the flexibility which initially drew you to mongo.

nelhage 4883 days ago

MoSQL will just throw any fields it doesn't recognize into a JSON "extra_props" field (if you ask it to). So everything will work fine, and existing SQL code (which doesn't know about those fields) will continue to be fine.

If you need the data in SQL, you can either parse the JSON somehow, or rebuild the SQL table with a MoSQL schema that knows about the new fields.
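Roughly, the idea looks like the sketch below. This is not MoSQL's actual code (MoSQL is a Ruby tool with its own schema file format); `KNOWN_COLUMNS` and `to_row` are hypothetical names used only to show the shape of the split between mapped columns and the catch-all JSON field.

```python
import json

# Hypothetical list of fields the SQL schema knows about.
KNOWN_COLUMNS = ("_id", "amount", "currency")

def to_row(document, known_columns=KNOWN_COLUMNS):
    """Split a document into known column values plus a JSON blob of the rest."""
    row = {name: document.get(name) for name in known_columns}
    extras = {k: v for k, v in document.items() if k not in known_columns}
    # Unrecognized fields are kept, serialized, in a single catch-all column.
    row["extra_props"] = json.dumps(extras, sort_keys=True)
    return row

# A document that has grown a field the SQL schema has never heard of.
doc = {"_id": "ch_1", "amount": 500, "currency": "usd", "refunded": False}
row = to_row(doc)
print(row)
```

Existing queries that only touch `amount` or `currency` are unaffected, and the new field is still recoverable by parsing `extra_props`.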

jcase 4883 days ago

Automatic failover is a pretty big feature though. I wish Postgres had a built-in solution. Sure, I could use Pacemaker, but it's nowhere near as painless.

lucian1900 4883 days ago

You should be aware however that Mongo's failover incurs downtime.

A Postgres bouncer + WAL replication achieves a similar result: There is no downtime on failover, but there is a single slave.