Hacker News new | ask | show | jobs
by weiznich 569 days ago
You get the same safety guarantees with diesel by using the [`embed_migration!`](https://docs.diesel.rs/2.2.x/diesel_migrations/macro.embed_m...) macro and running your applications on startup. Diesel decouples that as there are situations where you might want to work with a slightly different database as defined, for example if you don't control the database.
1 comments

I don't get the impression that there's any interaction or data sharing between the embedded migrations and the query generation part, that would validate the queries against the schema as defined by the migration -- is that wrong? The documentation doesn't seem to mention it at least.
That depends on the workflow. The query checking is based on the information in your `schema.rs` file. That file is usually generated via a CLI tool provided as part of diesel in your local development workflow. It's generally assumed to match the migrations in the repository, but you also can use the CLI tool to verify that this is actually the case. To get the full way around: The migrations then describe the database state that's used to define your `schema.rs` file, which in turn are embedded into the binary via the `embed_migration!` macro. If you now have the requirement to make sure that the database matches the provided schema file you can ensure that at build time by:

* Running your migrations in a build.rs file against a test database

* Running `diesel print-schema --locked-schema` afterwards to make sure that the provided `schema.rs` file matches the state of the database

* Use the `embed_migrations!` macro to embed and run the migrations on application startup

What rust-query does is just the "same" as these steps outlined above. Arguably they do all in one step, but that has the disadvantage that it forces rust-query to always have the full control over the database. Other than that there is no verification at any point happening that the schema actually matches what's declared there, they just apply migrations as well and reasonable assume that the database will match the declared state. This does not guard against cases where you for example manually modify the schema later or something like this.

For diesel we want to make this more robust at some point by providing an actual check function as part of the `schema.rs` files that allows to verify that the declared schema matches the actual database state. That one then could be called at different points, depending on the users requirements. If you are interested in such a feature I suggest reaching out to us in the diesel support channels.

Oh thanks! So is this the expected workflow?

1. Write initial db migrations

2. Start a test db, apply the migrations, generate schema.rs via the CLI tool

3. Write code based on schema.rs

4. Write new migrations

5. Test the changes on test db instance, re-generate schema.rs via the CLI tool

6. Update code based on the new schema.rs

When I first looked at Diesel I thought the safety was built into the code itself, but if it comes from out-of-code procedures using the CLI and a real database that makes a lot more sense. I don't think I got a good picture of how all these tools and libraries came together from the Diesel docs themselves. So, in this sense it seems very similar to SQLx.

My take is that this is a "database-first" approach vs rust-query's "code-first" approach to safety. I think the benefits of a code-first design are that you don't need any additional CLI tooling, manual procedures outside of code, to test migrations, or a running database at any point in development.

I guess from some perspective, writing raw SQL with tests that run against a real database is perfectly safe too. I think it's a spectrum of how database-first or code-first the database layer is.

> that has the disadvantage that it forces rust-query to always have the full control over the database

I think this goes for any database layer, doesn't it? If you write migrations for Diesel and then someone goes and makes manual changes to the production database you're similarly out of luck.

The rust-query author also mentions checksumming, but I think at the end of the day people always have the ability to go in and hollow out assumptions - but IMO the code-first approach which discourages any manual database interaction draws a clear line.

> So is this the expected workflow?

It's impossible to write what is the expected workflow, because that heavily depends on your requirements. Overall there is certain functionality that exists in diesel and that can be combined in different ways to build different kind of workflows.

For example: If you have a database that is controlled by someone else you won't want to use any migration functionality at all, you would want to use only `diesel print-schema` there to generate the `schema.rs` file for you. Similarly different workflows consisting of any of the existing parts are possible.

> So, in this sense it seems very similar to SQLx.

There is an important difference here: Diesel provides a operate CLI to generate rust code for you, instead of connecting to a database from the "compiler" (or reading files). That's really important as you don't have any non-deterministic proc-macros, which are really not that great with the rust compiler (and officially something that's at least in some grey zone in terms of support).

> My take is that this is a "database-first" approach vs rust-query's "code-first" approach to safety. I think the benefits of a code-first design are that you don't need any additional CLI tooling, manual procedures outside of code, to test migrations, or a running database at any point in development.

This brings me back to the non-standard workflow point raised before. You also can have a code first approach with diesel. The cli tool supports generating the migrations for you from the given `schema.rs` file and a up and running database. So you basically would write the `schema.rs` file in that case and the tool generates SQL to move the database to that `schema.rs` state.

> > that has the disadvantage that it forces rust-query to always have the full control over the database > > I think this goes for any database layer, doesn't it? If you write migrations for Diesel and then someone goes and makes manual changes to the production database you're similarly out of luck.

My point here is more: rust-query really forces you to have control over the database. Diesel is totally fine with not being able to control migrations or whatever. For running code you only need to provide a schema.rs file, which might be generated by running migrations or which might be hand written or which might be generated from an existing stable database.

> The rust-query author also mentions checksumming, but I think at the end of the day people always have the ability to go in and hollow out assumptions - but IMO the code-first approach which discourages any manual database interaction draws a clear line.

That's likely only about migrations, not about the actual database state. It can help to make sure that migrations are really the same as used to setup the database, but it won't help with cases where you change the database manually. I don't think it's even meaningful to check on each database interaction that the schema hasn't change so far, as that would be quite expensive.

Author here: I just wanted to point out that rust-query does in fact read the schema from the database and compares it to the expected schema. There is no checksumming or comparison of migrations. It is checked efficiently that the schema hasn't changed at the start of every transaction by just checking the `schema_version` sqlite pragma.

https://news.ycombinator.com/item?id=42283030

I just want to point out that reading the value of this pragma is not the same as verifying that the schema hasn't change, as this does not guard you against manual modifications of either the value of the pragma (after all I can just do a `pragma set schema_version=42`) or the schema itself (I can also manually change the schema, without changing the value of the pragma).

So while this is a nice way to verify which of your migrations have been applied this doesn't give any guarantees around the schema at all. Essentially this is the same as what diesel does with reading the `__diesel_migration_version` table.