|
Looking to get feedback for a code-first platform for data: instead of custom frameworks, GUIs, notebooks on a chron, bauplan runs SQL / Python functions from your IDE, in the cloud, backed by your object storage. Everything is versioned and composable: time-travel, git-like branches, scriptable meta-logic. Perhaps surprisingly, we decided to co-design the abstractions and the runtime, which allowed novel optimizations at the intersection of FaaS and data - e.g. rebuilding functions can be 15x faster than the corresponding AWS stack (https://arxiv.org/pdf/2410.17465). All capabilities are available to humans (CLI) and machines (SDK) through simple APIs. Would love to hear the community’s thoughts on moving data engineering workflows closer to software abstractions: tables, functions, branches, CI/CD etc. |
It mentions "Serverless pipelines. Run fast, stateless Python functions in the cloud." on the home page... but it took me a while of clicking around looking for exactly what the deployment model is
e.g. is it the cloud provider's own "serverless functions"? or is this a platform that maybe runs on k8s and provides its own serverless compute resources?
Under examples I found https://docs.bauplanlabs.com/en/latest/examples/data_product... which shows running a cli command `serverless deploy` to deploy an AWS Lambda
for me deploying to regular Lambda func is a plus, but this example raises more questions...
https://docs.bauplanlabs.com/en/latest/commands_cheatsheet.h... doesn't show any 'serverless' or 'deploy' command... presumably the example is using an external tool i.e. the Serverless framework?
which is fine, great even - I can presumably use my existing code deployment methodology like CDK or Terraform instead
Just suggesting that the underlying details could be spelled out a bit more up front.
In the end I kind of understand it as similar to sqlmesh, but with a "BYO compute" approach? So where sqlmesh wants to run on a Data Warehouse platform that provides compute, and only really supports Iceberg via Trino, bauplan is focused solely on Iceberg and defining/providing your own compute resources?
I like it
Last question is re here https://docs.bauplanlabs.com/en/latest/tutorial/index.html
> "Need credentials? Fill out this form to get started"
Should I understand therefore that this is only usable with an account from bauplanlabs.com ?
What does that provide? There's no pricing mentioned so far - what is the model?