by plain4 2173 days ago
I didn't finish reading the article because it didn't give a succinct summary of what predictive databases are. But at a glance this seems to be a SQL interface to an AutoML system. Is that a correct summary? I don't get the distinction between ML and predictive databases. It seems predictive databases use ML.
2 comments

Here's an early article on predictive databases:

https://aito.ai/blog/introducing-a-new-database-category-the...

The big difference between predictive queries / databases and trained models is that predictive databases are prepared / optimized to make arbitrary predictions without a pretrained model. So you can basically ask it to predict any X based on any A, B and C and expect a more or less immediate answer.

The benefit of not having pre-trained models relates to workflows and architecture, as described in the article. The disadvantage of having such instant generic prediction capability is that it's technically hard to implement, as described in the 'Quality' chapter.
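
To make "predict any X based on any A, B and C" concrete, here is a minimal sketch of a query-time predictor. It is not how Aito or any real predictive database is implemented; it just uses naive Bayes with add-one smoothing, computed from the raw rows at the moment of the query. The table, the column names and the `predict` function are all made up for illustration.

```python
from collections import Counter

# Hypothetical toy table; columns and values are invented for this example.
ROWS = [
    {"product": "laptop", "category": "electronics", "buyer": "alice"},
    {"product": "phone", "category": "electronics", "buyer": "alice"},
    {"product": "laptop", "category": "electronics", "buyer": "bob"},
    {"product": "milk", "category": "groceries", "buyer": "carol"},
    {"product": "bread", "category": "groceries", "buyer": "carol"},
    {"product": "milk", "category": "groceries", "buyer": "bob"},
]


def predict(rows, target, where):
    """Rank the values of `target` given the known fields in `where`.

    Everything is counted at query time (naive Bayes, add-one smoothing),
    so no model is trained or stored in advance.
    """
    target_counts = Counter(row[target] for row in rows)
    total = len(rows)
    scores = {}
    for value, count in target_counts.items():
        score = count / total  # prior P(target = value)
        for field, known in where.items():
            matches = sum(
                1 for row in rows
                if row[target] == value and row[field] == known
            )
            distinct = len({row[field] for row in rows})
            # Smoothed likelihood P(field = known | target = value)
            score *= (matches + 1) / (count + distinct)
        scores[value] = score
    norm = sum(scores.values())
    ranked = [(value, score / norm) for value, score in scores.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# The same data answers different questions without any retraining step.
by_product = predict(ROWS, "category", {"product": "laptop"})
by_category = predict(ROWS, "buyer", {"category": "groceries"})
```

The point is only that the target and the conditions are chosen per query. A real system would need indexing and much better statistics to get usable latency and quality, which is the hard part mentioned above.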

Seems like "predictive database queries" is more about the queries and less about the database, and there's nothing relational (or RDBMS/SQL) about it.
The title implies that it's going to replace ML models. But it seems that it still uses ML models and just provides a different interface. It also seems that it's using some AutoML training system, so that in theory little ML expertise is required to use the system.
> in theory little ML expertise is required to use the system

Maybe I'm just a cynical data scientist, but this is how we get people using and interpreting models that they don't necessarily understand the complexities of. If some data violates an underlying assumption or has some complexities around representation and meaning then there's nothing really stopping someone getting a model that appears to fit correctly but gives answers that are meaningless or just wrong.

It's not just a different interface, but a different workflow.

In traditional ML you need to a) define a model to do the prediction A -> B, b) train the model, which may take minutes, and c) then do the predictions from A -> B, which take (1, 10, 100) microseconds.

With predictive queries, you: a) ask for a prediction of any X based on any A, B and C and expect an answer in (1, 10, 100) milliseconds.

You basically trade throughput and latency for higher productivity, faster iteration and a simpler overall system.
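
A rough back-of-the-envelope for that trade, as a sketch only: the 60 second training time is my reading of "minutes", and 10 microseconds vs 10 milliseconds are just the middle values from the ranges above. The function name is made up.

```python
def break_even_predictions(train_seconds, model_latency_s, query_latency_s):
    """Number of predictions after which train-then-predict is faster overall.

    Returns None if the per-query latency is not higher than the model's,
    in which case the query approach never falls behind.
    """
    if query_latency_s <= model_latency_s:
        return None
    return train_seconds / (query_latency_s - model_latency_s)


# Assumed figures: 60 s of training, 10 microseconds per model prediction,
# 10 milliseconds per predictive query.
n = break_even_predictions(60.0, 10e-6, 10e-3)
print(round(n))  # 6006
```

So under these assumed numbers the pretrained model only wins on total time after roughly six thousand predictions of the same A -> B shape, and every new question shape restarts the training clock.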