|
|
|
|
|
by cerved
1860 days ago
|
|
While schema creation SQL can be a be a bit unwieldy, I'm not sure I appreciate which part is the problem or what you're trying to achieve. Obviously you can SELECT * INTO FROM .. if you're just temporarily inserting data. I'm not sure I see the value in automatically importing arbitrary data into a schemad database object. I think it's too complicated to be carried out by the database and should probably be done by some other piece of software and under human supervision/guidance. Are you suggesting that INSERT/UPDATE statements also have the ability to modify objects? That sounds like it would add complexity without much gain as opposed to just running and alter table query. I'm not saying that the process of writing schema modifying queries is painless but I'm not sure I'm convinced that we can have the cake and eat it For your example I personally would've AWKED into a predefined schema or inserted the data as JSON |
|
Okay, bad example. I should have specified: Insert into an existing table, add the missing columns automatically. There are some special-cases where most database engines can automatically generate schemas, and "SELECT... INTO" is one of those few. Usually only allowed with an empty destination table.
> I'm not sure I see the value in automatically importing arbitrary data into a schemad database object.
There are lots of use-cases for this. You might not have them, but other people do all the time. Just about any large-scale enterprise software needs to be extensible in the field, for example. Think the likes of SAP or Siebel. I've seen similar problems crop up in CMDBs, job and ticket management software, etc... Famously, Jira is slow precisely because it is so bad at handling this kind of extensibility efficiently.
> Are you suggesting that INSERT/UPDATE statements also have the ability to modify objects?
Not necessarily, although that could be an option. What I mean is that changes to the table schema should be made using insert/update/delete statements, where the only "data" is things such as the column names, types, constraints, etc...
> That sounds like it would add complexity without much gain as opposed to just running and alter table query.
It would dramatically reduce complexity, removing an entire language from database engines, along with all the associated vendor-specific syntax, quirks, and limitations.
In fact, the same "schema" that is used in the wire protocol could be directly equal to the actual schema, and its update language. So if you get back a query result (with data), and want to create a copy of that schema elsewhere (e.g.: a local cache database), then you just take the "header" from the result set and "insert" it into the destination database schema. No conversion, no escaping, nothing.
Seriously: Try this as an exercise. Use Java or C#, write a select statement from a query (that has join, views, etc...), and then write the code that generates a table to cache this data in a separate, local database instance.
Do it. Sit down and give it a good go. You won't appreciate how hard this is until you do!
Now write the code to update the cache table dynamically if the source table changes. Assume that table has a petabyte of data. (That's why it's a table, it's a cache for something that's far too big to fit in memory!)
Now write the code to do all of the above with foreign constraints.
Good luck!
> I'm not sure I'm convinced that we can have the cake and eat it
There is a reason NoSQL databases exist and are wildly popular. It's not that "schemaless" is truly better so often, it's more that modifying the schema in most DBMS offerings is so fiddly that it's essentially impossible to do programmatically.
> or inserted the data as JSON
NoSQL in a nutshell! You haven't solved the problem, you've given up and resorted to schemaless tables instead, exactly the "bad" example above with all of its limitations and issues.
I've seen some good attempts at solving this problem, but they barely scratch the surface.