by isosphere
688 days ago
Was thinking the same thing when I saw those zeros in the checksum field. Perhaps the consequences are significant. Here's a benchmarking exercise I found: https://www-staging.commandprompt.com/uploads/images/Command...
With a tidy summary:
> Any application with a high shared buffers hit ratio: little difference.
> Any application with a high ratio of reads/writes: little difference.
> Data logging application with a low ratio of reads/inserts, and few updates and deletes: little difference.
> Application with an equal ratio of reads/inserts, or many updates or deletes, and a low shared buffers hit ratio (for example, an ETL workload), especially where the rows are scattered among disk pages: expect double or greater CPU and disk I/O use.
> Run pg_dump on a database where all rows have already been previously selected by applications: little difference.
> Run pg_dump on a database with large quantities of rows inserted to insert-only tables: expect roughly double CPU and disk I/O use.
|
There is no CPU-specific code in the PG source in src/include/storage/checksum_impl.h.
It is written as a plain nested loop in C, so performance depends entirely on the compiler being able to parallelize or vectorize the code.
I would not be surprised if hand-written SIMD code turned out to be faster.
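
To make the "plain nested loop" point concrete, here is a rough sketch of that kind of loop. It is not the PostgreSQL source: the names (N_LANES, mix_lane, checksum_sketch), the seeds parameter and the exact mixing step are my own assumptions for illustration. The idea is that several independent partial sums are updated in the inner loop, and since no lane depends on another, the compiler is free to turn that loop into SIMD instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define N_LANES   32u        /* number of independent partial sums (hypothetical name) */
#define MIX_PRIME 16777619u  /* the 32-bit FNV prime, used here as the multiplier */

/* One mixing step for one lane: XOR in the word, multiply, fold high bits down. */
static inline uint32_t mix_lane(uint32_t sum, uint32_t word)
{
    uint32_t tmp = sum ^ word;
    return (tmp * MIX_PRIME) ^ (tmp >> 17);
}

/*
 * Checksum n_rows rows of N_LANES 32-bit words each.
 * seeds holds one starting value per lane. These are illustrative inputs,
 * not the constants PostgreSQL uses.
 */
uint32_t checksum_sketch(const uint32_t *words, size_t n_rows,
                         const uint32_t seeds[N_LANES])
{
    uint32_t sums[N_LANES];
    uint32_t result = 0;

    assert(words != NULL && seeds != NULL);

    for (size_t j = 0; j < N_LANES; j++)
        sums[j] = seeds[j];

    /*
     * The outer loop walks the rows. The inner loop updates each lane
     * independently of the others, which is what lets a compiler
     * vectorize it if it chooses to.
     */
    for (size_t i = 0; i < n_rows; i++)
        for (size_t j = 0; j < N_LANES; j++)
            sums[j] = mix_lane(sums[j], words[i * N_LANES + j]);

    /* Fold the partial sums into a single 32-bit value. */
    for (size_t j = 0; j < N_LANES; j++)
        result ^= sums[j];

    return result;
}
```

Whether the inner loop actually gets vectorized depends on the compiler and flags. With GCC, building with -O3 -fopt-info-vec should report which loops were vectorized; with Clang, -O2 -Rpass=loop-vectorize does something similar. Comparing that output against a hand-written intrinsics version would be the way to test the SIMD hunch.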