|
|
|
|
|
by dasil003
6284 days ago
|
|
I sort of see where you're coming from in terms of the parity mismatch between RDBMSs and typical application code. Certainly we can avoid some coding headaches by just dumping things into a data store that is optimized for what we want to do with that data right now. But then you say that most applications have a well-defined data model and rarely run "queries". This is where I think you've gone terribly terribly wrong. The value of a relational database is that it most agnostically represents the reality of what the data represents. It's not about "big piles of data" or "making sense of your data", relational databases are about making your data as expressive as possible. You're selling this idea that most applications only use data in a few predefined ways. I have to say that sounds like a complete pipe dream. Requirements change all the time. Reporting needs often are not even conceived until you have hundreds of megabytes of data. Let's not even get into multiple applications using the same database. In every business I've ever been involved with, the data is always more valuable than the code, and it always outlives the code. Too much of the hype around these alternative database technologies are throwing the baby out with the bathwater. The idea that "most" applications don't need structured data just strikes me as incredibly naive and short-sighted. Far more applications need structured data than need to scale. |
|
Let's say you have two types of data, customers and orders. Customers have many orders, an order belongs to a customer. This is easy enough with a typical relational database. You have a customers table and an orders table. You can join the tables and ask questions like "how many customers spent more than $300 last year?"
Now let's consider the graph/object database equivalent. You have two classes, Order and Customer. A Customer has a set of Orders, and the Order has a customer. (Cycles are fine, this is a graph, not a tree.) Creating an order works with some method like $customer->new_order_for('some pants'). You store this in the database, and the graph structure is stored and indexed. (Usually, object databases index on class name, but you can always specify other conditions. This makes it basically equivalent to the relational database.) Note that this structure works very well; the in-memory relationship is the same as the in-storage relationship. You can also write the same query as with the relational database. Get all customers, find their orders from this year, and sum the totals. (Instead of writing SQL, you would just write a script here. You can index things like the order year to speed up the query, as well. Otherwise, it's O(n), but so is the relational database without an index. There is no magic after all.)
Anyway, there is no lack of flexibility with the graph database. If you want to query your data, you can. It's just less convenient, since you have to write a program to do it, instead of letting your database management engine do it. (This is actually not true in general, AllegroGraph has a querying engine based on prolog.)
Back when I did data warehousing, we had to move all our data from the web app servers to a warehousing server in a specialized schema so that some GUI software could manipulate the data. Even though we used a relational database, we had to convert anyway. Using an object database would have made the app code simpler, and the warehousing code equally complex. So I think that would be a gain, not a loss.