| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by TekMol 999 days ago

I can't remember what exactly was slow, but some characteristics of the workload:

Some of the tables were so big that neither the table itself nor the indexes fit into memory.

The important tables where rather narrow. Like 5 integer fields or so. But long (tens of millions of rows? hundreds of millions? Can't remember).

The queries were a continous mix of selects, updates and inserts.

Some of the selects were based on multiple criteria (SELECT ... FROM ... WHERE x=... AND y=...).

Some of the updates were too (UPDATE ... WHERE x=... AND y=... SET ...).

None of the queries took very long. The machine was doing hundreds of queries per second if I remember correctly.

The whole process took weeks (billions of queries, if I remember correctly).

2 comments

sgarland 998 days ago

It'll probably take me longer than you'll maintain interest in this thread, but I'll reply when done anyway.

I've created 3 tables with an auto-incrementing unsigned bigint as the PK, and then 4 unsigned int fields with random values. Each table has 25,000,000 rows (and has unique values, modulo the PK). Once they're loaded, and I come up with some queries, I'll create some indices, and then reduce the buffer pool to < index size to mimic those conditions.

I won't be doing billions of queries, though :p

link

djantje 998 days ago

That could be of the transactions, innodb will keep your data reads intact (of that is not of importance, if the update is more important than the result based on the current state of the data, for example with UPDATE SET field = field + 1)

You could maybe have matched with a change in the transaction isolation level for innodb

link