
The strategy is similar to MADlib. I tried to deploy MADlib a year or so ago, but it didn't have the features I wanted, and development seemed somewhat slow or stalled, with VMware deprecating support for pymadlib. I got a wild hair last week and wanted to see how hard it would be to build something comparable, except leveraging the work of the Python ML ecosystem instead of reimplementing algos from scratch. I think we've been able to cover a significant amount of ground quickly.

Untrusted extensions are not necessarily unsafe; the main difference is that you need to be a database superuser to install them (and hopefully vet them). This is really a problem for most hosted database services like AWS RDS, Azure, etc., but not for people who are running their own Postgres instances. In the future, I think we'll need a solution that can spin up a replica with superuser permissions to install the extension on, which would also have safety and scalability advantages compared to installing it on the primary.

It looks like we have some similar thinking about how easy it should be to use ML in orgs, vs. the current reality. I'd love to hear more about dsKube and what you're learning with that approach.