by jackowayed 4916 days ago
Databases. Oracle, Microsoft, etc. have figured out a lot about how to build high-performance query execution engines and transactional storage systems, and have published very little of it.

Research has caught up some, but it definitely lags.

3 comments

You're confusing solid engineering with research.

SQL Server started as a fork of Sybase (Microsoft bought Sybase's source code and started hacking). Sybase, in turn, was based on Ingres and "Ingres was first created as a research project at the University of California, Berkeley, starting in the early 1970s and ending in the early 1980s" (http://en.wikipedia.org/wiki/Ingres_(database)).

SQL Server is literally based on technology from the 70s.

Ingres was started by Michael Stonebraker, who then did PostgreSQL (which added extensions to the relational model that were novel at the time), then Aurora, then C-Store and Vertica (column-oriented databases), then Morpheus, then H-Store and then VoltDB.

Stonebraker did more research (as in: creating novel things) than Microsoft as a whole did in SQL Server.

SQL Server is a great database but it's a result of Microsoft paying an army of programmers for 24 years to work on improving a single product. It's a result of running a profiler often, not some unknown-to-the-world algorithms.

That has happened in the past too, in many areas. E.g., compiler research used to be like this. But many of the companies died, and stuff was reinvented again, probably wastefully (or differently, who knows).
I am sure they have a lot of tiny performance improvements. But since when is CS research about tiny performance improvements?

It's not like Oracle or anyone else has any secret algorithm which runs in linear time when all of academia only knows of exponential time solutions for the same class of problems.

If "CS research" includes "Software Engineering" then yes, query plan optimisers are definitely currently researched.
Do you think quicksort was an unimportant advance? After all, it's only O(n log n) in the average case, and is O(n^2) in the worst case. By your standards, it shouldn't have been seen as any kind of improvement over mergesort.
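To make the average-case vs. worst-case point concrete, here is a minimal sketch that counts comparisons. It assumes the naive first-element pivot (real implementations pick pivots more carefully), and the function name `quicksort_comparisons` is just something made up for this illustration:

```python
import random


def quicksort_comparisons(items):
    """Sort a copy of items with a first-element-pivot quicksort.

    Returns (sorted_list, number_of_comparisons_against_pivots).
    """
    comparisons = 0

    def sort(xs):
        nonlocal comparisons
        if len(xs) <= 1:
            return xs
        pivot, rest = xs[0], xs[1:]
        smaller, larger = [], []
        for x in rest:
            comparisons += 1  # one comparison against the pivot per element
            (smaller if x < pivot else larger).append(x)
        return sort(smaller) + [pivot] + sort(larger)

    return sort(list(items)), comparisons


if __name__ == "__main__":
    n = 200  # kept small so the worst-case recursion depth stays modest
    already_sorted = list(range(n))
    shuffled = already_sorted[:]
    random.seed(0)
    random.shuffle(shuffled)

    _, worst = quicksort_comparisons(already_sorted)
    _, typical = quicksort_comparisons(shuffled)
    print("sorted input:  ", worst, "comparisons")   # n*(n-1)/2, the quadratic case
    print("shuffled input:", typical, "comparisons")
```

On already-sorted input every partition is maximally lopsided, so the count is exactly n(n-1)/2; on shuffled input it comes out far lower, which is the O(n log n) average case.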