by monero-xmr 880 days ago
All data query languages eventually reduce themselves into SQL, or something equivalent to it.

Ok? Even if that were true — and I’m not entirely sure how you would even prove that — SQL still lacks type safety, testability, and composability.
SQL has type safety. Its primary purpose is to have typed schemas. I can't imagine why you would say this, other than for some pedantic reason. SQL implementations like Postgres will happily throw errors when your types are off.

Testability - you use a general purpose language to execute SQL. Again, I don't know what you mean.

Composability - I suppose, but remember SQL is a language to retrieve data. I reuse fragments everywhere in a general purpose language.
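
To make the testability and reuse points concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `users` table, the `ACTIVE_USERS` fragment, and the helper names are all made up for illustration; the idea is just that a query can be run against an in-memory database in a test, and a fragment can be defined once and spliced into several queries.

```python
import sqlite3

# Hypothetical reusable fragment, defined once. Only trusted, hard-coded SQL
# is interpolated into other queries; values still go through bound parameters.
ACTIVE_USERS = "SELECT id, name, age FROM users WHERE active = 1"


def adults_query() -> str:
    """Compose the shared fragment into a larger query."""
    return (
        f"SELECT name FROM ({ACTIVE_USERS}) AS active_users "
        "WHERE age >= ? ORDER BY name"
    )


def count_query() -> str:
    """Reuse the same fragment for a different question."""
    return f"SELECT COUNT(*) FROM ({ACTIVE_USERS}) AS active_users"


def make_test_db() -> sqlite3.Connection:
    """Build a throwaway in-memory database with a few fixture rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, "
        "age INTEGER NOT NULL, active INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO users (name, age, active) VALUES (?, ?, ?)",
        [("ann", 34, 1), ("bob", 17, 1), ("cat", 52, 0)],
    )
    return conn


def adult_names(conn: sqlite3.Connection, min_age: int = 18) -> list:
    """Run the composed query and return plain Python values."""
    return [row[0] for row in conn.execute(adults_query(), (min_age,))]
```

A test is then an ordinary assertion on `adult_names(make_test_db())`, with no running server needed.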

So use a better language, and let the compiler optimize it??
The query planner optimizes it. Why would you want a compiler to optimize SQL? The nature of your data affects how it is optimized! The declarative statement to retrieve data must be interpreted based upon the nature of that data. You can't pre-optimize without knowing something about your data, in which case, you are basically storing some of the information outside of the database.
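
A small sketch of the planner choosing at run time, using SQLite through Python's sqlite3 module. The table and index names are made up, and an index is only the simplest stand-in for "something about your data" (Postgres, for example, also uses collected table statistics). The same declarative query gets a different plan once the database knows more:

```python
import sqlite3


def plan(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> str:
    """Return SQLite's plan for a query as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable description is the fourth column of each row.
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"user{i}") for i in range(1000)],
)

QUERY = "SELECT id, name FROM users WHERE email = ?"
ARGS = ("user42@example.com",)

# Same query text both times; only the database's knowledge changes.
plan_before = plan(conn, QUERY, ARGS)
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = plan(conn, QUERY, ARGS)

print(plan_before)  # a full table scan
print(plan_after)   # a search that uses idx_users_email
```

The exact wording of the plan text varies between SQLite versions, so the comments only describe its general shape.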
I mean don't use SQL at all. Use a real programming language like Scala, and let Spark (or Flink etc.) do the translation and optimization: https://www.databricks.com/glossary/catalyst-optimizer

I don't understand why anyone would prefer SQL to that for anything beyond a simple SQL query. And it's not just my opinion: industry at large uses Spark for production with complex queries. SQL is for analysts.

Now I’m totally confused. SQL is a syntax for querying data. Spark SQL is SQL. You are talking about a different implementation of a server.

SQL is for analysts? Everyone uses SQL.

How familiar are you with Spark and the like? This is what it looks like:

https://spark.apache.org/examples.html

SQL is just a DSL; it is not the only or primary API for Spark, and there's nothing magical about it. If you ditch it you can get your type safety, composability, and testability back, like so:

https://medium.com/@sergey.kotlov/unit-testing-of-spark-appl...

See those case classes that neatly encapsulate business objects? Add functional transforms that concisely express typical operations like filtering and mapping, and you get something that is simply superior to SQL.
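
A rough Python analogue of that style, to show the shape of it. This is not Spark's API: the `Order` class and the function names are invented for illustration, a frozen dataclass stands in for a Scala case class, and the type hints are only checked if you run a static checker over them.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator


@dataclass(frozen=True)
class Order:
    """Hypothetical business object; plays the role of a case class."""
    customer: str
    amount_cents: int
    shipped: bool


def shipped_only(orders: Iterable[Order]) -> Iterator[Order]:
    """Filter step: keep only shipped orders."""
    return (o for o in orders if o.shipped)


def totals_by_customer(orders: Iterable[Order]) -> Dict[str, int]:
    """Aggregate step: sum order amounts per customer."""
    totals: Dict[str, int] = {}
    for o in orders:
        totals[o.customer] = totals.get(o.customer, 0) + o.amount_cents
    return totals


def shipped_revenue(orders: Iterable[Order]) -> Dict[str, int]:
    """Compose the two steps; each one can also be tested on its own."""
    return totals_by_customer(shipped_only(orders))
```

Each transform is a plain function over typed values, so a unit test just passes in a list of `Order` objects and checks the returned dictionary.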

Any ORM provides the same features for any SQL database. There is nothing special going on here. If perhaps the database autogenerated a bunch of classes, maybe that’s interesting? I think some projects have introspected a database and created all the boilerplate language classes before.

There is nothing magical about wrapping database objects in language classes. This has been happening forever.

https://docs.sqlalchemy.org/en/20/orm/quickstart.html#select...

Nothing magical about using a function call rather than raw SQL.
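
For what it's worth, a hand-rolled sketch of the same idea with only the Python standard library. This is not SQLAlchemy's API: the `User` class, the `users` table, and `users_named` are hypothetical, and the function simply hides the SQL behind a typed call.

```python
import sqlite3
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class User:
    """Hypothetical language-level wrapper for a row of the users table."""
    id: int
    name: str


def users_named(conn: sqlite3.Connection, prefix: str) -> List[User]:
    """Function call on the outside, plain parameterized SQL on the inside."""
    rows = conn.execute(
        "SELECT id, name FROM users WHERE name LIKE ? ORDER BY id",
        (prefix + "%",),
    )
    return [User(id=row[0], name=row[1]) for row in rows]
```

Callers get `User` objects back and never see the query string, which is roughly what an ORM automates at larger scale.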