|
|
|
|
|
by galeforcewinds
3166 days ago
|
|
Two big pitfalls seem to hit almost all of these "low-level optimization of how the database handles..." type posts: the importance in optimizing the system, and the use of other technologies. Write activity impacts database query caches. Many applications do not require realtime-accurate results from the database and generate a substantial number of the same paginated queries. For these cases, it is very important to consider higher-level caches within the system -- CDN, page-level caches, page block-level caches, etc. as caching your 99% traffic pattern will provide DB platform headroom to support your 1% traffic pattern. Where the results you are paginating are based on any sort of matching (SQL WHERE), most read applications see a sizable benefit in integrating a search platform. Data selection for display is handled in the search layer, and underlying data retrieval for display happens either from the search layer or via the backing database using inexpensive lookups via primary key. One of the key considerations not covered in the article is the need for result consistency when paginating, e.g. if the underlying data changes. It is the need for this consistency, not the desire for performance, that I see as the primary reason to include primary key identifiers or timestamp values in your pagination strategy. |
|
If folks knew how to get the most out of their databases, 90% of them wouldn't even need the other systems and the complexity of the system as a whole would be reduced.