| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by isosphere 688 days ago

Was thinking the same thing when I saw those zeros in the checksum field. Perhaps the consequences are significant.

Here's a benchmarking exercise I found: https://www-staging.commandprompt.com/uploads/images/Command...

With a tidy summary:

> Any application with a high shared buffers hit ratio: little difference. > Any application with a high ratio of reads/writes: little difference. > Data logging application with a low ratio of reads/inserts, and few updates and deletes: little difference. > Application with an equal ratio of reads/inserts, or many updates or deletes, and a low shared buffers hit ratio (for example, an ETL workload), especially where the rows are scattered among disk pages: expect double or greater CPU and disk I/O use. > Run pg_dump on a database where all rows have already been previously selected by applications: little difference. > Run pg_dump on a database with large quantities of rows inserted to insert-only tables: expect roughly double CPU and disk I/O use.

1 comments

silvestrov 688 days ago

On my M1 mac "dd ... | cksum" takes 3 seconds while "dd | shasum" (sha1) takes 2 seconds. So cksum might not be the best tool for performance checking.

There is CPU specific code in the PG source in src/include/storage/checksum_impl.h

It is written as a plain nested loop in C. So performance is fully dependent on the compiler being able to parallelize or vectorize the code.

I would not be surprised if manually written SIMD code would be faster.

link

anarazel 687 days ago

The bottleneck isn't at all the checksum computation itself. It's that to keep checksums valid we need to protect against the potential of torn pages even in cases where it doesn't matter without checksums (i.e. were just individual bits are flipped). That in turn means we need to WAL log changes we don't need to without checksums - which can be painful.

link

Joe_Cool 687 days ago

Interesting. I guess M1 doesn't have the 'crc32' "acceleration" that is included in SSE4.2.

link

loeg 687 days ago

M1 has CRC32 acceleration intrinsics.

https://dougallj.wordpress.com/2022/05/22/faster-crc32-on-th...

https://github.com/corsix/fast-crc32?tab=readme-ov-file#appl...

link

silvestrov 687 days ago

So when using these intrinsics an Intel Core i7 can do 30 GB/s but the performance check linked above (by isosphere ) says only 300 MB/s, i.e. 1%

Something is amiss here.

If a CPU can do 30 GB/s then a CRC check should not have any real performance impact.

link

loeg 687 days ago

I don't know where you're getting 300 MB/s from.

link

silvestrov 687 days ago

Page 5 of https://www-staging.commandprompt.com/uploads/images/Command... says "This system can checksum data at about 300 MB/s per core."

It lacks page numbers. Page 5 is first page with gray box at the top of the page.

link