Little Bobby Tables would like to have a word with you...
I honestly wonder if you could make a functioning database API that either does not accept strings as arguments, or can detect string concatenation and reject it. Not just a builder pattern to greatly discourage it, but a straight up exception on bad input type. Bind variables or GTFO.
The book "Building Secure and Reliable Systems" from Google's series on SRE actually talks about two examples of this in C++ and Go which forbid using anything but string literals in the query string of an SQL API.
In Go, the solution was very tidy: it aliases string to an unexported internal type that consumers cannot instantiate. String literals can be coerced to that type, but variables that already have type information associated with them are rejected at compile time.
The C++ solution was a bit more complicated and involved templates.
You can probably do it trivially in any language with operator overloading or a very expressive type system
However you would also probably want a language with some syntactic sugar that let users use your special string type easily otherwise the burden on users will be too high
> However you would also probably want a language with some syntactic sugar that let users use your special string type easily otherwise the burden on users will be too high
In the instance under discussion in that section of the book I'm referencing, allowing the user to instantiate the special string type was precisely the behavior intended to avoid. The idea being that in the following three examples, the first succeeds and the rest fail at compile time:
Query("SELECT * FROM Users WHERE id = ?;")
id := "6"; Query("SELECT * FROM Users WHERE id = " + id + ";")
id := 6; Query(fmt.Sprintf("SELECT * FROM Users WHERE id = %d;", id))
But the output is actually effectively emitting special string types (or flags on the main string type — whatever) in the background transparently (either at compile-time or runtime). And that’s possible due to a language design with sufficiently expressive typing with good syntactic sugar.
I’m talking about the alternative world where people use tainted strings and have to manually taint or untaint strings explicitly everywhere and that is nuts. You have to do that when your language has nothing to help you with. If you have to resort to that, I think you need to give up.
Sure you can. Just make your transport protocol only support taking in a stored procedure name and parameters for DMLs, and some typed representation for DDLs.
But while that prevents people from concatenating strings to form DML queries as a whole, it obviously doesn't prevent the kind of concatenation wereHamster mentioned.
I would argue the opposite, actually. The storage model or database layer should be in a single repo. And thus there would be less code to grep through.
That doesn't work because that repo needs to expose an API that accesses that table, now you have to prove the API is never called, and you're back to searching the entire code base for that function call (hopefully you've used meaningful function names instead of DRY ones...)
Add: log.info("get_user_by_user_id has been deprecated, please use the alternative API.")
Into the repo method, push up the change, redeploy the projects that depend on the library and wait a while and grep the logs to see if it ever gets used.
Dude, that just tells you if the table is used on the happy path. It does not tell you if it's only used by a customer whose godson is the CEO's favorite kid. Or if customers only use it right before a federal deadline for filing forms, or at the end of the quarter when it's time to generate reports. Which you didn't collect for 2 months.
The other responder liked monorepos for this. It’s not perfect, but it does help with making sure you had all of the code to search from, and once in a while the commit history adds additional clues.