|
|
|
|
|
by rramadass
2459 days ago
|
|
This is great! These are some concrete and non-trivial architectural techniques for "Systems Programming" :-) I have had opportunity to work on/with some of these techniques on Fast Network Protocol/Security Appliances and so have some familiarity with them. However some of your hints(breadcrumbs?) are not known to me and hence i have something to research and study. Thank you. PS: Can you add some more details on the above techniques? Like System/Library/API calls to look into, books/papers/articles to read etc? |
|
Speaking of breadcrumbs, if the ring buffer has fixed-size entries, a reader can come in later and start reading old entries first, say halfway back. This is helpful if you want to start a new reader and then kill an old one, and not skip any entries.
It helps if the ring has power-of-two size, and the head pointer/index is 64 bits and increases monotonically. Then the high bits are easily masked off on each use, so that arithmetic on pairs of positions is simpler.
For variable-sized entries, an array of N "breadcrumbs", past positions near 1/Nth indices, allows jumping in at earlier positions. If traffic is low enough, you might be able to buffer a whole day's traffic, and get random access starting from breadcrumbs; otherwise, you can log old entries to a sequential file, and also log the breadcrumbs, translated to file offsets, as a global index.
Downstream processes can each sequentially log an individual field of each record, with a breadcrumb index to enable full records to be reconstructed. Often these column logs can be compressed, with enormous efficiency, between breadcrumbs: 98% compression may be easy to achieve for slowly-changing or limited-alphabet values.
Lz4 and Zstd are excellent compression engines. Lz4 really shines for fast decompression. There is no excuse for zlib/gz compression anymore.