Hacker News new | ask | show | jobs
by SwellJoe 3657 days ago
I never stop being impressed at how often people will jump to odd, unsupportable, conclusions like, "using MySQL will make this thing faster".

I've seen it so many times over the years regarding users and email configurations. I can't count the number of times I've dropped into someone's badly behaving mail configuration and found they had MySQL hosting the users, and explained it was for "performance" reasons. Somehow they didn't grasp that /etc/passwd fits entirely in memory, and the map files Postfix uses for various lookups and stuff are already a database (just one specifically designed for the task at hand) and also fit entirely in memory. Putting that MySQL layer in there is disastrous if performance matters; it is orders of magnitude slower for any case I've seen...still plenty fast for most cases, but it's ridiculous that this idea gets cargo-culted around that if you store your mail users in MySQL your mail server will be faster.

A little knowledge is a dangerous thing, is what I'm trying to say, and people who know MySQL is "fast" may not know enough to know that it's not the right tool for the job in a lot of cases...and is probably slower for many use cases. I'm pretty confident this is one of those cases. SQLite is wicked fast on small data sets, and being smaller means more of it will fit in memory; I can't think of any way MySQL could be a more performant choice for this workload.

Also, I don't even want to try to imagine shipping an installable desktop application for non-technical users that relies on MySQL!

8 comments

Also, I don't even want to try to imagine shipping an installable desktop application for non-technical users that relies on MySQL!

It's really not that hard to imagine. There is an embedded library version of MySQL called libmysqld which is made for this very purpose. Of course as you point out it's quite overkill for something that would work perfectly well with Berkeley DBs.

I've never seen it used like that, but Microsoft's equivalent SQL Server Express is a nightmare in deployment and support, because you need to administer a full-blown SQL Server on every customer PC (backups, migration debugging, …). I've no idea why people don't use SQLite instead.
_SQL Server Express_ is not the MS equivalent of SQLite.

_SQL Server Express_ is just a free (as in beer) limited version of _SQL Server_. It's still a full-blown traditional server like MySQL, Oracle and other that require services and administration tools to be installed.

Mirosoft has 2 equivalents to SQLite: The Jet database engine is pre-installed in all versions of Windows and allows creation of .mdb databases (those used by older versions of MSAccess). You don't need to include any dll file to use it.

The other is _SQL Server Compact_ which, like SQLite, is an embedded engine that you can bundle with your application by including a library.

All of these embedded databases are able to do multi-user writes to some extent. Jet is actually quite good if careful with locking (emphasis on careful). SQLIte is a de-facto standard because it's simple, performant, cross-platform and flexible.

While it can replace full-blown databases in some cases, it's far from being always true. There are still many cases where using something like SQL Server Express may make more sense, for instance if you want to offer a path to your customer for drop-in replacement of the database based on the growth of their needs over time. Not saying it's not a costly lock-in, but it's an easy one to sell.

It should be noted, of course, that both Jet and SQL Server Compact are essentially considered "deprecated" technology and are not recommended for new development efforts. The compact database format Microsoft recommends these days (and which comes bundled with the Universal Windows Platform SDK these days) is now actually SQLite.
SQLite is popular enough for Windows applications that SQLite had to rename its temporary files to "etilqs_..." (SQLite backwards) to avoid getting unnecessary bug reports from naive users (https://answers.microsoft.com/en-us/windows/forum/windows_7-...)
Certianly from my experience the main reason for using it is that once it becomes to large for a desktop machine it's trivial to migrate to a 'proper' MsSql server.

I also don't find the admin side much of a faff. Provide your users a backup/restore option within your app and ensure you install your own instance.

SQL Express has some rather shitty default configuration when you install it, that you usually have to change to actually use it for anything useful.
I've done also some tests, and so far in most of cases SQLite3 has been fastest option and it's also trivial to manage compared to other options. SQLite3 is absolutely awesome. Until you'll need concurrent writes. It's also important to notice that configuration options will make a big performance difference. http://www.sami-lehtinen.net/blog/sqlite3-performance-python... http://www.sami-lehtinen.net/blog/database-performance-tests... http://www.sami-lehtinen.net/blog/sqlite3-performance-testin...
SQLite is a practical option if concurrent writes are not required. It is not meant for that, and that is also one of the reasons why it is both fast and simple, since without concurrency, a whole set of really complex problems goes away. SQLite is great when the communication and arbitration between two services takes place at the application level, as is the case with DNS AXFR requests, for example, and each of those services may have its own private data store. In such a situation, an immediate benefit is realized by having a domain specific, uniform data manipulation tool (SQL) without having to write custom data manipulation code. This is especially well suited to scenarios where configuration management is performed via OS packaging.
Yup, I did some benchmarking (for a webapp which runs a separate process for each customer) and SQLite had the highest TPS and lowest memory usage by a large margin.

Probably because it runs in the same address space as each server process and there is no message passing (TCP/unix socket) overhead.

> I never stop being impressed at how often people will jump to odd, unsupportable, conclusions like, "using MySQL will make this thing faster".

That is a good opportunity to ask questions and learn about the person, their knowledge and their use case.

Yeah sometimes the answer is they don't know how stuff works underneath and are just following a blog post or a trend or heard someone talk about it at meetup.

Programmers love programming tools / frameworks / languages. There are so many new shiny things out there. New ones pop up every month. New language, new build system, new db, new UI library, new message queue etc. Interstingly over the years, lots of those have had great performance improvements. Maybe they took advantage of new kernel API or some new algorithm and so on. That feeds the hype -- put new shiny things in and performance doubles.

So people have been trained to expect all the new things to just magically be better. But like said, you have to probe and find out the reason they propose the change, only then you can only see if it is justified or not (one of the resonable responses could be "I've actually measured with a production load and it does look better"),

Its mostly about concurrency, sqllite serialises all writes, for an embedded database in a single user application that is ok, but for a web application that may have a lot of concurrent writes going on its a performance disaster area. Note that for good perfomamce on a write heavy web application even mysql can be a problem unless you use innodb. As myisam has the same write table locking behaviour.
I'm a little ignorant of the situation on the sqlite side, so this may be a dumb question...

Say you have an MVC web app, could you have asynchronous calls to a thread performing the actual DB writes? From your app's perspective, concurrent writes would be placed into a queue that performs them serially.

Are there any functional issues to this approach? Thanks in advance for your thoughts or info.

That might work, as long as you don't ever need to read the data you've just written (logging?).

Otherwise you'll end up serving stale data that might be several minutes out of date.

Ah, longer caching is the factor with web I always forget after focusing on apps / games. Thanks for pointing it out.
So many times I see people handing out a root password to edit /etc/password. Sure, it may be crazy for performance but, even a trivial number of users and aliases are more easily managed with a remote mysql client. Everyone that has used postfix in a production environment knows the pain otherwise, the rest feel free to vote down.
There are practically countless ways to delegate management of mailboxes and users without granting root. That's not even a hard problem to solve, and certainly not one that justifies introducing a hugely complex additional variable to the equation.

Anyway, I'm not saying "never use MySQL for mail users" (though, I think the percentage of deployments where it makes sense is closer to none than it is to one), I'm just trying to make the point that MySQL is, in some folks minds, a magical solution to performance problems. Often, it not only introduces needless complexity, it won't even improve performance. It's a classic example of "when all you have is a hammer, everything starts to look like a nail". MySQL is a very fine hammer. It just isn't the right tool for every job.

The article we're talking about is another case where a little knowledge is a dangerous thing. A desktop app serving one user with a tiny data set (as I understand it, we're talking about the metadata for a person's music collection) is exactly the right workload for SQLite. I'd be shocked if a naive port to MySQL were faster (though they acknowledge that there's room for query optimization), and not at all surprised if it were slower. And, I know it'll require more memory and disk space for the same working set.

> So many times I see people handing out a root password to edit /etc/password.

Shouldn't user groups give a bit more accountability there?

The next step up is userdb though, not MySql
Why MySQL and not LDAP?
> Also, I don't even want to try to imagine shipping an installable desktop application for non-technical users that relies on MySQL!

Amarok music player on KDE linux desktop environment (at least on version 3.5) kept the music collection metadata in database. You could choose sqlite, mysql or postgress during installation if I remember correctly. It worked ok, because of the package system.

I guess if you used separately installed databases there would be some conflicts, though.

Not faster perhaps but with sqlite3 I had locking issues and after I switched to mysql I don't. It's really that simple. The app was used by a small office of 40 people at first, then started being used by close to a hundred people and that's when the locking errors began.
That's a pretty big difference in scale. The app in the original article seems to be a desktop application, not a client/server system with many users. There's plenty of use cases where MySQL (or PostgreSQL) is a great choice. Yours is probably one of them.
Were you using wal mode with sqlite? It helps a lot with locking.
I don't even know what it is so I wish I had looked it up first but now it's too late. Also mysql will aide in later replicating the application between different locations to ensure availability for each office.
Yeah the WAL basically makes "concurrent processes reading/writing to the same database" work magically, where as without you'll get all sorts of issues and timeouts. I don't know why it isn't enabled by default, or at least more prominently advertised as an option.
Initially, WAL mode was off by default because older versions of SQLite do not support it, and it is a property of the database file. Thus, a database created in WAL mode would be unreadable by older SQLites. But WAL has been available for 6 years now, so it might be reasonable to make it the default. We will take your suggestion under consideration. Thanks.
Kodi media player uses mysql and runs on most platforms. As an end user mysql is completely invisible
No, Kodi uses SQLite internally (I've fixed garbage in its databases with the sqlite CLI tool before). It does also support external MySQL, though.