| From another comment I made, on why I don't think this is a good article, even granting the proposed thesis of "mongo doesn't work for graph-like relationships":

Even though their data doesn't fit well in a document store, this article smacks so much of "we grabbed the hottest new database on hacker news and threw it at our problem" that any beneficial parts of the article get lost. The few things that stuck out at me:

* "Some folks say graph databases are more natural, but I’m not going to cover those here, since graph databases are too niche to be put into production." - So you did absolutely no research.
* "What could possibly go wrong?" - The one line above the image saying those green boxes are the same gets lost. Give the image a caption, or better yet, use "Friends: User" to indicate type.
* "Constructing an activity stream now requires us to 1) retrieve the stream document, and then 2) retrieve all the user documents to fill in names and avatars." - Yep, and since users are indexed by their ids, this is extremely easy.
* "What happens if that step 2 background job fails partway through?" - Write concerns. Or, on top of doing no research, did you also not read the mongo documentation (write concern has been there since at least 2.2)?

Finally, why not post the schemas they used? They make it seem like there are joins all over the place, when what I mainly see is: look at some document, then retrieve the users that match an array. Pretty simple mongo stuff, and extremely fast since user ids are indexed.

Even though graph databases are better suited for this data, without seeing their schemas I can't really tell why it didn't work for them. I keep thinking "is it too hard to do sequential asynchronous operations in your code?" |
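For anyone who wants the two-step lookup spelled out, here's a rough sketch in plain Python, with dicts standing in for collections. The field names (`activities`, `user_id`, `avatar`) are made up for illustration, since the article never posted its schemas. Against a real MongoDB, step 2 would presumably be a single query using `$in` on `_id`.

```python
# Hypothetical in-memory stand-ins for two collections.
# The article's real schemas were never posted, so these shapes are assumptions.
users = {
    1: {"_id": 1, "name": "Alice", "avatar": "alice.png"},
    2: {"_id": 2, "name": "Bob", "avatar": "bob.png"},
}
streams = {
    "s1": {
        "_id": "s1",
        "activities": [
            {"user_id": 1, "verb": "posted"},
            {"user_id": 2, "verb": "liked"},
            {"user_id": 1, "verb": "commented"},
        ],
    },
}


def build_activity_stream(stream_id):
    # Step 1: retrieve the stream document.
    stream = streams[stream_id]

    # Step 2: collect the distinct user ids and fetch them in one batch.
    # In MongoDB this would be roughly: find({"_id": {"$in": list(ids)}})
    ids = {activity["user_id"] for activity in stream["activities"]}
    found = {uid: users[uid] for uid in ids if uid in users}

    # Fill in names and avatars on the application side.
    return [
        {
            "name": found[a["user_id"]]["name"],
            "avatar": found[a["user_id"]]["avatar"],
            "verb": a["verb"],
        }
        for a in stream["activities"]
        if a["user_id"] in found
    ]
```

Calling `build_activity_stream("s1")` does one lookup for the stream and one batched lookup for the users, however many activities the stream holds.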
Did Sarah model the data poorly ("We stored each show as a document in MongoDB containing all of its nested information, including cast members")?
Or is there an easy way to extract that information that Sarah just doesn't know about yet?
Keep in mind the constraints in the article, for example: some shows have 20,000+ episodes, actors show up in 100s of shows, and "We had no way to tell, aside from comparing the names, whether they were the same person".
The last part seems like a really straightforward relational critique to me. If you don't break the actors out into unique entities, then you can't compare them across shows. But if you do break them out into unique entities, then how do you present the show information without doing joins?
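To make that trade-off concrete, here's a small sketch in plain Python (dicts and lists standing in for collections). All titles, names, ids, and field names are invented for illustration; this is not the schema from the article.

```python
# Embedded model: cast members are copied into each show document,
# so the only identity an actor has is the name string.
embedded_shows = [
    {"title": "Show A", "cast": [{"name": "Pat Lee"}]},
    {"title": "Show B", "cast": [{"name": "Pat Lee"}]},  # same person? no way to tell
]


def shows_by_name(name):
    # Matching on name lumps together everyone who shares that name.
    return [
        show["title"]
        for show in embedded_shows
        if any(member["name"] == name for member in show["cast"])
    ]


# Referenced model: actors are unique entities, shows point at them by id.
actors = {
    "a1": {"_id": "a1", "name": "Pat Lee"},
    "a2": {"_id": "a2", "name": "Pat Lee"},  # a different person with the same name
}
shows = [
    {"title": "Show A", "cast_ids": ["a1"]},
    {"title": "Show B", "cast_ids": ["a2"]},
]


def shows_by_actor(actor_id):
    # Ids make cross-show comparison unambiguous.
    return [show["title"] for show in shows if actor_id in show["cast_ids"]]


def render_show(show):
    # The cost: presenting a show now needs a second lookup,
    # which is a join done in application code.
    return {
        "title": show["title"],
        "cast": [actors[actor_id]["name"] for actor_id in show["cast_ids"]],
    }
```

With the embedded shape, `shows_by_name("Pat Lee")` returns both shows even if they feature two different people. With the referenced shape, `shows_by_actor("a1")` is precise, but `render_show` has to go back to the actors for every show it displays.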