|
|
|
|
|
by jamesblonde
479 days ago
|
|
We are seeing more and more specialized query engines.
This is a query engine specialized for training pipelines. It is not general purpose - it is for providing batches of training data at workers. It uses Ray for parallelization. The kind of queries you need are random reads (to implement shuffling across epochs), arrow support (zero copy to Pandas DataFrames), and efficient checkpointing. |
|
The hierarchical model is applicable to many problems and actually in part why moving off mainframes is challenging because IMS is so much more efficient than the relational model for applications like airline tickets.
There have been several efforts to leverage object stores in the way they did that I am aware of but it was a hard sell.
The hierarchical model really only works for many to one relationships, and it's integrity model differs and is not as DRY.
There are lessons to learn here but it requires some relearning.
When you have a shopping cart, having data local to the server handling the transaction is also a benefit.
Codd's relational model has advantages, but has held back some efforts because we are just use to dealing with the painful parts that we often don't consider other options.