I think one of the biggest missed opportunities in language design is the integration of powerful relational database and query models directly into a modern language. Not as a bolt-on or an ORM but as a first class part of the language in the same way as maps and arrays. Make a language that deals with data relationally and where relational queries are a core part of the language and relational query execution is baked into its runtime.
If persistence hooks were also baked in then you'd have something a little bit like stored procedures in databases but far more powerful and with a modern syntax. Couple this with a distributed database layer supporting either eventual consistency built on CRDTs or synchronization via Raft/Paxos and you'd have an amazing application platform.
It's always seemed dumb to me that data, which is at the very center of everything we do, feels like a bolted-on second-class citizen from the perspective of pretty much all programming languages and runtime environments. "Oh, you want to work with your data? Well we didn't think about that..." Accessing the data requires weird incantations and hacks that feel like you're entering a 1970s time warp back to a PDP-11.
Instead the language and runtime environment should be built around the data. Put the data in the center like Copernicus did with the sun.
Data access is of primary concern to all programming languages, using either memory or disk. All files on disk can be considered a form of database. Reading and writing files is standard in all languages.
Once your files get fancier and the data grows large, you need special ways to read it. A Postgres database can be thought of as one big opaque blob on disk (in practice it is a directory of files). You need the Postgres server to store and randomly access enormous amounts of general data in it efficiently.
SQLite is interesting in that there is no server: it's just a library that enables efficient random access to a single file, which can be thought of as a black box that only SQLite knows how to interpret.
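To make the "no server, just a file" point concrete, here is a minimal sketch using Python's standard `sqlite3` module. The file name `notes.db` and the `notes` table are made up for illustration.

```python
import os
import sqlite3
import tempfile

# Hypothetical file name and schema, for illustration only.
db_path = os.path.join(tempfile.mkdtemp(), "notes.db")

# No server process: connecting simply opens (or creates) the file.
conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO notes (body) VALUES (?)",
    [("first",), ("second",), ("third",)],
)
conn.commit()
conn.close()

# The whole database is one ordinary file on disk.
file_size = os.path.getsize(db_path)

# Reopen it later and fetch one row by key; the library handles
# the file layout, so we never parse the file ourselves.
conn = sqlite3.connect(db_path)
row = conn.execute("SELECT body FROM notes WHERE id = ?", (2,)).fetchone()
conn.close()
```

The file is opaque to everything except the library, but from the program's side it is just a path you can copy, back up, or delete.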
Unless you mean making something like SQL a first-class citizen built directly into the language. MUMPS did something like this: https://en.wikipedia.org/wiki/MUMPS
SQL has type safety. Its primary purpose is to have typed schemas. I can't imagine why you would say this, other than some pedantic reason. SQL implementations like Postgres will happily throw errors when your types are off.
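A small sketch of a schema rejecting a badly typed value, again with Python's `sqlite3` so it runs without a server. One caveat: SQLite is looser about column types than Postgres by default, so this sketch adds an explicit `CHECK` on `typeof(...)`. Postgres would reject the same insert on the column type alone. The `orders` table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema. The CHECK is only needed because SQLite's
# default typing is permissive; it is not required in Postgres.
conn.execute("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        total_cents INTEGER NOT NULL
            CHECK (typeof(total_cents) = 'integer')
    )
""")

# A well-typed row is accepted.
conn.execute("INSERT INTO orders (total_cents) VALUES (?)", (1250,))

# A string that is not a number violates the constraint.
rejected = False
try:
    conn.execute(
        "INSERT INTO orders (total_cents) VALUES (?)", ("not a number",)
    )
except sqlite3.IntegrityError:
    rejected = True

stored = conn.execute("SELECT total_cents FROM orders").fetchall()
```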
Testability - you use a general purpose language to execute SQL. Again, I don't know what you mean.
Composability - I suppose, but remember SQL is a language to retrieve data. I reuse fragments everywhere in a general purpose language.
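Roughly what the last two points look like in practice, as a sketch: a SQL fragment is kept as an ordinary string in the host language, reused in two queries, and exercised against a throwaway in-memory database. The `users` table, `ACTIVE_USERS`, and the helper functions are all invented for the example.

```python
import sqlite3

# Reusable fragment, held as a plain string in the host language.
ACTIVE_USERS = "SELECT id, name FROM users WHERE active = 1"


def count_active(conn):
    # Composes the fragment as a subquery.
    sql = f"SELECT COUNT(*) FROM ({ACTIVE_USERS}) AS active_users"
    return conn.execute(sql).fetchone()[0]


def active_names(conn):
    # Reuses the same fragment in a different query.
    sql = f"SELECT name FROM ({ACTIVE_USERS}) AS active_users ORDER BY name"
    return [name for (name,) in conn.execute(sql).fetchall()]


def make_test_db():
    # Throwaway in-memory database used as a test fixture.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, active INTEGER)"
    )
    conn.executemany(
        "INSERT INTO users (name, active) VALUES (?, ?)",
        [("ann", 1), ("bob", 0), ("cy", 1)],
    )
    return conn
```

String composition like this is only safe when the fragments are constants you control; values from users should still go through `?` parameters.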
The query planner optimizes it. Why would you want a compiler to optimize SQL? The nature of your data affects how it is optimized! The declarative statement to retrieve data must be interpreted based upon the nature of that data. You can't pre-optimize without knowing something about your data, in which case, you are basically storing some of the information outside of the database.
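You can watch a planner decide at run time. This sketch uses SQLite's `EXPLAIN QUERY PLAN` through Python's `sqlite3`: the same query text gets a different plan once the database contains an index and fresh statistics. It shows the plan depending on the state of the database rather than on the query text alone. Postgres's planner is far more elaborate, and this is not a model of it. The table, index name, and `plan` helper are made up, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total INTEGER)"
)
conn.executemany(
    "INSERT INTO orders (customer, total) VALUES (?, ?)",
    [(f"c{i % 100}", i) for i in range(1000)],
)

query = "SELECT total FROM orders WHERE customer = ?"


def plan(conn, sql, params):
    # The human-readable plan step is the last column of each row.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


# Without an index the only option is a full table scan.
before = plan(conn, query, ("c7",))

# Change the database, not the query.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")
conn.execute("ANALYZE")  # gather statistics for the planner

# The same query text is now planned as an index lookup.
after = plan(conn, query, ("c7",))
```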
I don't understand why anyone would prefer SQL to that for anything beyond a simple SQL query. And it's not just my opinion: industry at large uses Spark for production with complex queries. SQL is for analysts.
SQL is just a DSL; it is not the only or primary API for Spark, and there's nothing magical about it. If you ditch it you can get your type safety, composability, and testability back, like so:
See those case classes that neatly encapsulate business objects? Add to that functional transforms that concisely express typical operations like filtering, mapping, and so on, and you get something that is simply superior to SQL.
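For a rough feel of the style being described (typed records plus small functional transforms), here is a sketch in plain Python rather than Spark or Scala. Frozen dataclasses stand in for case classes, and `Order`, `CustomerTotal`, and both helper functions are invented names, not anything from Spark's API.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Order:
    # Typed record for one business object.
    customer: str
    total_cents: int


@dataclass(frozen=True)
class CustomerTotal:
    customer: str
    total_cents: int


def large_orders(orders: Iterable[Order], min_cents: int) -> List[Order]:
    # Filter step: a plain function that is easy to unit test.
    return [o for o in orders if o.total_cents >= min_cents]


def totals_by_customer(orders: Iterable[Order]) -> List[CustomerTotal]:
    # Group-and-sum step, returning typed results sorted by customer.
    sums: Dict[str, int] = {}
    for o in orders:
        sums[o.customer] = sums.get(o.customer, 0) + o.total_cents
    return [CustomerTotal(c, t) for c, t in sorted(sums.items())]


orders = [Order("ann", 500), Order("bob", 2000), Order("ann", 3000)]

# Transforms compose like ordinary function calls.
report = totals_by_customer(large_orders(orders, 1000))
```

Each step can be tested on a handful of in-memory records, which is the testability argument in miniature. What this sketch leaves out is everything a query engine adds, such as planning and distribution.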
Why has nobody done this? Has anyone even tried?