Hacker News
by lopsidedBrain 2140 days ago
https://sqlite.org/limits.html

> SQLite was originally designed with a policy of avoiding arbitrary limits. [...] Unfortunately, the no-limits policy has been shown to create problems. Because the upper bounds were not well defined, they were not tested, and bugs were often found when pushing SQLite to extremes.

1 comments

Thanks -- that makes total sense. I'm all for limits that are guaranteed to be tested.

Though I shudder to imagine the effort that goes into testing a 281 TB database.

Do they purchase all those hard drives and make one enormous concatenated RAID set? Or is there a cheaper way to virtualize that by combining a bunch of max-sized 16 TB AWS EBS volumes? What types of errors are even likely to come up at that point -- errors in SQLite's internal logic, or errors in the operating system or drivers?

The TH3 test harness for SQLite (https://www.sqlite.org/th3.html) supports a virtual filesystem in which we can create test database files that appear to be very large but that don't actually contain much data or use much space.

We also have a simple utility program in the SQLite source tree (https://www.sqlite.org/src/file/tool/enlargedb.c) that lets you create a massive database file using a sparse file (https://en.wikipedia.org/wiki/Sparse_file) on systems that support that kind of thing.
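For anyone who hasn't played with sparse files, here is a minimal Python sketch of the general idea. To be clear, this is not how enlargedb.c works and it does not produce a valid SQLite database; it only shows that a file can report a large size while occupying almost no disk space. The helper names (`make_sparse_file`, `disk_usage`) and the 1 GiB size are my own choices for illustration, and whether the file is actually stored sparsely depends on the filesystem.

```python
import os
import tempfile


def make_sparse_file(path, apparent_size):
    """Create a file with the given logical size without writing any data."""
    with open(path, "wb") as f:
        # Extending a file with truncate() leaves a "hole" on filesystems
        # that support sparse files; elsewhere the space may really be allocated.
        f.truncate(apparent_size)


def disk_usage(path):
    """Return (apparent size, allocated bytes or None if unavailable)."""
    st = os.stat(path)
    blocks = getattr(st, "st_blocks", None)  # not available on every platform
    # On POSIX systems st_blocks is counted in 512-byte units.
    allocated = blocks * 512 if blocks is not None else None
    return st.st_size, allocated


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "big.bin")  # hypothetical file name
    make_sparse_file(path, 1 << 30)      # 1 GiB apparent size
    apparent, allocated = disk_usage(path)
    print("apparent size:", apparent)
    print("allocated on disk:", allocated)
```

On a filesystem with sparse-file support, the allocated figure should come out far smaller than the apparent size.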

If they have customers who need that kind of database, I'm sure they will be willing to help out with the testing. The setup doesn't have to be too crazy. Storage appliances with 20-30 hard drives are fairly common, actually. Even for individual humans, not just corporations. LVM allows you to easily create a logical volume out of multiple physical volumes, e.g. https://www.redhat.com/sysadmin/creating-logical-volumes.

You might find it interesting to look at the actual SQLite source commits where this change was introduced: https://sqlite.org/src/timeline?r=larger-databases It turns out the number comes from having a maximum of 2^32 pages in the database. The default page size is 4 KB: https://www.sqlite.org/pgszchng2016.html

Working backwards, they must be assuming 64 KiB pages rather than the default: `python -c 'print(1024 * 64 * 2**32)'` produces 281,474,976,710,656.
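Spelled out a bit more, here is a rough sketch of the same arithmetic for both page sizes. It uses the round 2^32 page count from this thread as an assumption, not the exact limit in the SQLite source, and the constant and function names are just mine.

```python
DEFAULT_PAGE_SIZE = 4 * 1024   # 4 KiB default page size
LARGE_PAGE_SIZE = 64 * 1024    # 64 KiB pages
MAX_PAGES = 2**32              # round figure used in this thread (assumption)


def max_db_bytes(page_size, pages=MAX_PAGES):
    """Upper bound on database size: page size times page count."""
    return page_size * pages


default_limit = max_db_bytes(DEFAULT_PAGE_SIZE)
large_limit = max_db_bytes(LARGE_PAGE_SIZE)

print(f"{default_limit:,}")  # 17,592,186,044,416 bytes (2^44, about 17.6 TB)
print(f"{large_limit:,}")    # 281,474,976,710,656 bytes (2^48, about 281 TB)
```

So with the default 4 KiB pages the same page count only gets you to about 17.6 TB; the 281 TB figure needs the 64 KiB page size.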