by selcuka 966 days ago
> In 2019, when we were conceiving of Dolt, MySQL was the most popular SQL-flavor. Over the past 5 years, the tide has shifted more towards Postgres, especially among young companies, Dolt's target market.

Really? I was under the impression that PostgreSQL had already won by then.


Among one-off toy projects shared on HN, probably. In the industry, MySQL is still the dominant system in my experience. Just to check though, Statista has it as dominant [1] as well as 2 other sources that appear to be geared towards sales & marketing research [2] [3].

[1] https://www.statista.com/statistics/809750/worldwide-popular...

[2] https://6sense.com/tech/relational-databases/mysql-market-sh...

[3] https://www.datanyze.com/market-share/databases--272

MySQL is still dominant in the PHP world, and even if HN will laugh at you for using PHP, it's still really popular.
> Statista has it as dominant [1] as well as 2 other sources

I'm pretty sure that's the Wordpress effect. How many Wordpress users would buy a database engine that is 99% compatible with MySQL, because it supports branching?

The engine is free and open-source. You can buy support, but these are free products.

https://github.com/dolthub/dolt

In fact, we've got a blog post about Dolt and Wordpress.

https://www.dolthub.com/blog/2023-08-04-wordpress-on-dolt/

> The engine is free and open-source. You can buy support, but these are free products.

Sure, but my question was about the feasibility of selling support to Wordpress users vs PostgreSQL users. The latter usually care more about advanced features such as the ones Dolt offers.

Don't get me wrong; I think Dolt is a great product. I was simply replying to the 2023 stats posted by k1ns.

Yeah, by 2019 it had definitely been a Postgres world for at least 5 years.
Back then, when we were surveying the landscape, PostgreSQL was definitely in widespread use, but we were seeing MySQL being used in more professional/non-hobby spaces. Nowadays that's not necessarily the case, and we are seeing shops that were once MySQL-only now adopt PostgreSQL for some of their newer projects. With Dolt supporting MySQL and DoltgreSQL supporting PostgreSQL, we're hoping to appeal to both audiences.
Let's put it this way - someone who would use Dolt would have been using Postgres in 2019.
The adoption of Dolt has not pointed to this being the case. Many of our (paying) customers have not had PostgreSQL setups, and only around last year have we started to run into potential customers who truly needed a PostgreSQL variant of Dolt. Dolt and DoltgreSQL are open source and always will be, as we believe in true open source. As a business, however, we prioritize the features that our customers need.

Either way, with DoltgreSQL now in development, we will have solutions for both sides! (Other popular databases with their own syntax, such as Oracle and Microsoft SQL Server, are not planned at this time.)

> The adoption of Dolt has not pointed to this being the case.

That's incredibly self-fulfilling. Counter status quo signals self-select out before even talking with status quo vendors.

This reply suggests you are able to see point-in-time among your self-selected subset of the world, but not able to see trend-through-time across the overall market.

It's like driving by the rear-view mirror based on the cars you notice around you, rather than having a traffic helicopter high overhead looking across all the roads and ahead.

Inability to see difference in trending versus point in time is how "incumbents" end up having their lunch eaten. Looking at point in time instead of trend, the challenger isn't a threat until they've crossed over, and by then, well, you've lost 5 years. Of course only looking ahead, you might never get anywhere most people want to go today, and you have to get paid. So the trick is always considering both!

All that said, strong pivot now and we'll be checking your take out.

probably has more to do with people counting the number of mature projects vs greenfield development

by sheer numbers, MySQL is probably still dominant but if projects are starting today it's likely postgresql

Why a fork? Couldn't you just add a separate front-end? Will the underlying storage format stay the same so one can flip over or do you want to get closer to PG semantics?
This isn't a fork of PostgreSQL, it's a completely bespoke database solution. Its only tie to PostgreSQL is that we've chosen to appear as a PostgreSQL server to clients. If a user didn't use any versioning features, then the goal is that they should be unable to tell that they're not on an actual PostgreSQL server.

The versioning features are an important distinction though. Dolt (production ready, MySQL protocol) and DoltgreSQL (pre-alpha, PostgreSQL protocol) are built specifically to address the lack of versioning support in databases, and gaining these versioning features is as easy as swapping out the database you are using for Dolt and DoltgreSQL (once it's finished). MySQL and PostgreSQL are written using C/C++, while Dolt and DoltgreSQL are using Go, so there is no shared code. The storage format is implemented using prolly trees (https://docs.dolthub.com/architecture/storage-engine/prolly-...), which are based on merkle trees (used by Git and Bitcoin), so there is no overlap with any existing database solutions.
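To illustrate the Merkle tree idea mentioned above, here is a minimal sketch of a plain binary Merkle tree over a list of rows. It is not Dolt's prolly tree implementation; the function names (`leaf_hash`, `node_hash`, `merkle_root`), the use of SHA-256, and the way an odd node is promoted are all assumptions made for illustration. It only shows the property that versioning relies on: the root hash changes whenever any row changes, and identical data always yields the same root.

```python
import hashlib


def leaf_hash(data: bytes) -> bytes:
    # Prefix leaves with 0x00 so a leaf can never collide with an inner node.
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    # Prefix inner nodes with 0x01 and hash the two child hashes together.
    return hashlib.sha256(b"\x01" + left + right).digest()


def merkle_root(rows: list[bytes]) -> bytes:
    # Hypothetical helper: returns one hash that summarizes all rows.
    if not rows:
        return hashlib.sha256(b"").digest()
    level = [leaf_hash(row) for row in rows]
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                next_level.append(node_hash(level[i], level[i + 1]))
            else:
                # Odd node out: promote it unchanged to the next level.
                next_level.append(level[i])
        level = next_level
    return level[0]


if __name__ == "__main__":
    before = merkle_root([b"row-1", b"row-2", b"row-3"])
    after = merkle_root([b"row-1", b"row-2 (edited)", b"row-3"])
    print(before.hex())
    print(after.hex())
    print("changed:", before != after)
```

Comparing two roots is enough to tell whether two versions of the data differ, and comparing child hashes level by level narrows down where they differ without scanning every row.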

Hmm, when choosing a database solution for a new project, I'm not selecting a SQL dialect (like PostgreSQL's), but stability and ecosystem. So having this as an extension to PostgreSQL, with the possibility of combining it with other extensions, would be great, but a reimplementation is a no-go here.

And I cannot use it for existing projects, again because of extensions, and I surely don't want to find out how your implementation differs from mainstream Postgres...

Knowing that extensions are very important to you is great feedback for us. As we hear more about what users' requirements are, it helps us better plan for the future.

Regarding any differences from mainstream Postgres, you can look to how we've handled Dolt, which is production-ready. It targets MySQL, just as DoltgreSQL targets PostgreSQL, and it recently achieved 99.99% correctness according to a set of roughly 6 million tests (https://www.dolthub.com/blog/2023-10-11-four-9s-correctness/). This test is not a definitive stance that we are exactly 99.99% the same as MySQL, but it's a good general guide to how we approach our compatibility, and how serious we are in that regard.
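For a rough sense of scale, the percentage above can be turned into a count of failing tests. This is a minimal sketch that assumes exactly 6,000,000 tests and exactly 99.99% passing; the post only says "roughly 6 million", so the result is an order-of-magnitude figure, and the function name is made up for illustration.

```python
TOTAL_TESTS = 6_000_000      # assumption: the post says "roughly 6 million"
PASS_BASIS_POINTS = 9_999    # 99.99%, expressed in units of 0.01%


def failing_tests(total: int, pass_bp: int) -> int:
    # Integer arithmetic avoids floating-point rounding on the percentage.
    return total * (10_000 - pass_bp) // 10_000


if __name__ == "__main__":
    print(failing_tests(TOTAL_TESTS, PASS_BASIS_POINTS))  # prints 600
```

Under those assumptions, the remaining 0.01% corresponds to about 600 tests.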

Sadly though, the full versioning capabilities would not work as an extension to Postgres. We looked into it before we settled on our current approach. I talk a bit about it in the blog post as well. To truly allow versioning in the same capacity that Git does for source code, it required us to either fork Postgres and spend years reimplementing all of the work that we've done in Dolt just to get to where we are today, or choose the path that gets something out quickly, and allows us to have the very conversation that we're having right now.

That was actually the very reason for deciding to host an announcement that we're working on it. Many people have said that they'd like Dolt but for Postgres; however, they've not said whether they need the Postgres binary specifically, the ecosystem, the syntax/wire protocol, etc. This announcement gives us the opportunity to receive that feedback.

Where are the 0.01% differences? When I'm trying to commit transactions? During select? Just not supporting some esoteric stored function syntax?

And most importantly, how does Dolt compare under heavy load, at the limits of server memory or bandwidth or CPU or disk throughput? You can assume an SSD and a multicore processor for purposes of answering.

Thank you very much. You are competing in a field where trust is extremely difficult to acquire - and the consequences to a lead dev for choosing Dolt[greSQL] could end his career. Nobody ever got fired for choosing the incumbent, as variations on the saying go.

It sounds like the question is: why fork Dolt to make DoltgreSQL?
Dolt was built with MySQL in mind, and we're creating DoltgreSQL with PostgreSQL in mind. We've gotten interest in a "Dolt for Postgres", and so we're finally starting development on that exact thing.
Is there any compatibility between the underlying storage systems of Dolt and DoltgreSQL? Yes, the interface is slightly different, but is the storage (partially) compatible? Why aren't the two SQL dialects different interfaces to the same underlying storage?

The reason that I am asking is because I have a hard time trusting a newcomer to the database competition. And for purposes of discussion, DoltgreSQL will be a newcomer for the first ten years of its existence. Sharing the underlying storage model with Dolt (still a newcomer, sorry) would greatly increase my confidence in it.