by leif
5146 days ago
"and if there's enough data, the relevant b-tree pages will most certainly be hot" That is not the case if you have indexes. Most secondary indexes are high-entropy (if they weren't, you probably wouldn't be storing them; call that an emotional proof), so inserts on larger-than-memory data sets almost always incur a disk I/O, in a b-tree at least. Shameless plug: I have a talk about how to deal with this: http://www.youtube.com/watch?v=q6BnG74FZMQ
First up, his example is 10k rows of 35 bytes each. At 8050 bytes of available space per page, that gives 8050/35 = 230 rows per page, so 10k/230 * 8 KB ≈ 350 KB. At 350 KB of inserts (hobt data; he doesn't go into secondary indexes), memory is completely irrelevant: the only thing that matters here is the latency of writing the log to disk.
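For anyone who wants to check that arithmetic, here is a rough sketch in Python. It uses only the figures quoted above (35-byte rows, 8050 usable bytes per 8 KB page, 10k rows) and ignores per-row overhead, so treat it as a back-of-envelope estimate rather than an exact page count.

```python
import math

ROW_BYTES = 35             # row size quoted in the comment
USABLE_PAGE_BYTES = 8050   # usable bytes per page, as quoted in the comment
PAGE_KB = 8                # page size in KB
ROWS = 10_000

# Assumption: no per-row overhead, so rows pack perfectly into a page.
rows_per_page = USABLE_PAGE_BYTES // ROW_BYTES
pages_needed = math.ceil(ROWS / rows_per_page)
total_kb = pages_needed * PAGE_KB

print(f"{rows_per_page} rows/page, {pages_needed} pages, {total_kb} KB")
```

Rounding up to whole pages gives 44 pages, or 352 KB, which matches the roughly 350 KB figure.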
If we had a huge data set (as in, one that did not fit in memory at all) with high cardinality, then sure, we'd have a lot of cold leaf-level pages. With no further info on his case, I can only assume most of the hobt and secondary indexes will fit in memory. At worst, we'll have to read a cold leaf-level page into memory and then perform the insert in memory.
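To make the hot-versus-cold distinction concrete, here is a toy simulation in Python. It is not a model of any real storage engine: the LRU cache, the key space of 100k leaf pages, and the 1,000-page cache are all made-up assumptions, and a "miss" just stands in for touching a leaf page that is not in memory.

```python
import random
from collections import OrderedDict

def count_page_misses(keys, keys_per_page, cache_pages):
    """Count leaf-page misses for a stream of keys under a simple LRU cache."""
    cache = OrderedDict()
    misses = 0
    for key in keys:
        page = key // keys_per_page        # leaf page this key lands on
        if page in cache:
            cache.move_to_end(page)        # mark page as most recently used
        else:
            misses += 1                    # page was cold
            cache[page] = True
            if len(cache) > cache_pages:
                cache.popitem(last=False)  # evict least recently used page
    return misses

rng = random.Random(0)                     # fixed seed for repeatability
N_INSERTS = 10_000
KEYS_PER_PAGE = 230
KEY_SPACE = KEYS_PER_PAGE * 100_000        # hypothetical: 100k leaf pages
CACHE_PAGES = 1_000                        # hypothetical: 1% of pages fit in memory

sequential = range(N_INSERTS)              # low-entropy keys
scattered = [rng.randrange(KEY_SPACE) for _ in range(N_INSERTS)]  # high-entropy keys

seq_misses = count_page_misses(sequential, KEYS_PER_PAGE, CACHE_PAGES)
rand_misses = count_page_misses(scattered, KEYS_PER_PAGE, CACHE_PAGES)
print(f"sequential: {seq_misses} misses, scattered: {rand_misses} misses")
```

With sequential keys the 10k inserts touch only 44 distinct pages. With keys scattered over a space far larger than the cache, nearly every insert lands on a cold page, which is the larger-than-memory case described above.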
As it stands, there's no mention of even a clustered index, which makes all of these heap inserts, arguably one of the fastest insert methods there is (barring certain very special cases).