
by password4321 636 days ago

Is anyone willing to share general guidance on where to draw the line between using DB configuration to speed things up (almost "buy") and doing things manually in the application ("build")? In my limited experience the work often falls to app developers because competent DB admins are all getting paid much more to work elsewhere (as mentioned above, it is important to know the DB).

My canonical example is large volumes of data that accrue over time, with the most recent records accessed most often. DB admins can partition tables or add partial indexes to keep access fast, but app developers can instead move older records into a separate archive table, sometimes behind the scenes, while still supporting things like (eventual) search of the whole data set. (It feels like a tool could do a lot of the initial heavy lifting to automate splitting one table into many when it makes sense, perhaps when limited by a cloud DB's missing features.)

Another management option that the DB sometimes accommodates, vs. doing it manually, is storing all large blobs/files in their own separate database (filesystem?!) so they can have a different storage configuration, etc.
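A minimal sketch of the manual version, assuming the blobs go to a plain directory and the DB keeps only a small metadata row. The `attachments` table, the content-hash file naming, and the `store_blob`/`load_blob` helpers are my own assumptions for illustration, not anything a particular DB provides.

```python
import hashlib
import sqlite3
from pathlib import Path


def init(conn: sqlite3.Connection) -> None:
    # Only small metadata lives in the DB; the bytes live on disk.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS attachments ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, "
        "sha256 TEXT NOT NULL, size INTEGER NOT NULL)"
    )


def store_blob(conn: sqlite3.Connection, blob_dir: str, name: str, data: bytes) -> int:
    """Write data to blob_dir under its SHA-256 hex digest and record it in the DB."""
    digest = hashlib.sha256(data).hexdigest()
    path = Path(blob_dir) / digest
    if not path.exists():  # identical content is stored only once
        path.write_bytes(data)
    with conn:
        cur = conn.execute(
            "INSERT INTO attachments (name, sha256, size) VALUES (?, ?, ?)",
            (name, digest, len(data)),
        )
    return cur.lastrowid


def load_blob(conn: sqlite3.Connection, blob_dir: str, attachment_id: int) -> bytes:
    """Look up the digest for attachment_id and read the bytes back from disk."""
    row = conn.execute(
        "SELECT sha256 FROM attachments WHERE id = ?", (attachment_id,)
    ).fetchone()
    if row is None:
        raise KeyError(attachment_id)
    return (Path(blob_dir) / row[0]).read_bytes()
```

The trade-off is that the file write and the DB row are no longer in one transaction, so backups and cleanup of orphaned files become the app's problem, which is exactly the kind of thing the "buy" option handles for you.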

I imagine it can go as far as basically implementing an index manually: one massive table with just an auto-incrementing primary key and tons of columns, plus a second table holding that ID and a few searchable columns (up to and including full-text search/vectors, I guess).

Edit: one useful tip for manually implementing the Materialized View pattern with MSSQL 2016+: use partition switching, as well explained and implemented by https://github.com/cajuncoding/SqlBulkHelpers?tab=readme-ov-... (incidentally the most commercially useful out-SEO'd tiny-star-count library I've ever found, focused on bulk inserts into MSSQL using .NET). I think automating the partition switching is a good example of drawing the buy/build line in the right place.