by VladRussian2 4582 days ago
>The log writes don't saturate the array itself--but the log file has a limit to how many blocks can be appended--even on fast arrays

Yes, the issue usually isn't the transaction log append speed. Far more often, the log is simply configured too small. A log file switch triggers a flush of the accumulated modified data blocks of tables and indexes (a buffer cache flush, in Oracle parlance) from RAM to disk. With a small log file, the flush happens too often and for too little modified data each time. This is where the random IO the GP mentioned bites you in the neck.
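To make the frequency argument concrete, here is a toy model, not how Oracle or any real engine is implemented. It assumes each update appends one record to the log and dirties one random data block, and that a checkpoint happens only when the log fills. The function name `checkpoint_sizes` and all the numbers are made up for illustration.

```python
import random

def checkpoint_sizes(log_capacity, num_updates, num_blocks, seed=0):
    """Return the number of dirty blocks flushed at each checkpoint.

    Toy assumptions: one log record per update, one dirtied block per
    update, and a checkpoint only when the log file is full.
    """
    rng = random.Random(seed)
    dirty = set()          # distinct modified blocks waiting in RAM
    records_in_log = 0
    sizes = []
    for _ in range(num_updates):
        dirty.add(rng.randrange(num_blocks))
        records_in_log += 1
        if records_in_log == log_capacity:
            # Log switch: flush every dirty block, then start a fresh log.
            sizes.append(len(dirty))
            dirty.clear()
            records_in_log = 0
    return sizes

small_log = checkpoint_sizes(log_capacity=100, num_updates=10_000, num_blocks=1_000)
big_log = checkpoint_sizes(log_capacity=5_000, num_updates=10_000, num_blocks=1_000)
print(len(small_log), len(big_log))  # 100 2
```

With the small log the same workload triggers 100 flushes of at most 100 blocks each. With the big log it triggers 2 flushes.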


I think you're talking about an insert buffer, not a transaction log, and in that case, no matter how big your insert buffer is, it will eventually saturate and you'll end up hitting the performance cliff of the B-tree. You really need better data structures (like fractal trees or LSM trees) to get past it.
No, I'm talking about the transaction log ("redo log" in Oracle parlance). Switching log files causes a checkpoint (i.e. a flush: that is when the index data blocks changed by the inserts you mention finally hit the disk).

http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUES...

MS SQL Server and DB2 have similar behavior.

Oh ok, now I see what you're saying; it's still similar to an insert buffer in that case. B-tree behavior is still to blame for this. Making your log file bigger lets you soak up more writes before you need to checkpoint, but eventually you'll either have even longer checkpoints or run out of memory before you have to checkpoint.

We also checkpoint before trimming the log, but our checkpoints are a lot smaller because of write optimization.

>even longer checkpoints

Yes, that is the point: one big flush instead of many small ones takes either the same or, usually, less time than the cumulative time of the small flushes, because of IO ordering and the probability of some writes hitting the same data block.
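A rough sketch of both effects, under simplifying assumptions: updates hit uniformly random blocks, each flush writes every distinct dirty block once in ascending block order, and seek cost is approximated by the distance between consecutively written block numbers. `flush_cost` and the workload numbers are hypothetical, not measurements from any database.

```python
import random

def flush_cost(block_ids, batch_size):
    """Flush updates in batches of `batch_size` and return
    (blocks_written, head_travel).

    Within one batch, repeated updates to the same block collapse into a
    single write, and writes are issued in ascending block order.
    head_travel sums the distances between consecutively written block
    numbers, a crude stand-in for seek cost.
    """
    written = 0
    travel = 0
    head = 0  # assumed starting position of the disk head
    for start in range(0, len(block_ids), batch_size):
        batch = sorted(set(block_ids[start:start + batch_size]))
        for block in batch:
            travel += abs(block - head)
            head = block
        written += len(batch)
    return written, travel

rng = random.Random(42)
updates = [rng.randrange(1_000) for _ in range(10_000)]

small = flush_cost(updates, batch_size=100)    # many small flushes
big = flush_cost(updates, batch_size=5_000)    # a few big flushes
print("small flushes:", small)
print("big flushes:  ", big)
```

In this model the big flushes write fewer blocks in total, because more updates land on a block that is already dirty, and the head sweeps across the disk only a couple of times instead of once per small flush.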