Hacker News new | ask | show | jobs
by Justsignedup 1240 days ago
My biggest issue with sql has and always will be the lack of definitions.

There's no way to express data structure knowledge in sql, only relations and keys. Discoverability can be quite lacking. And every time I want to join table A to B I have to re-define everything, because SQL doesn't store that.

ORMs help. They help because they encode relationships in meaningful ways. A has many Bs, so A.B works, and I don't need to repeat this join logic every damn time. But ORMs have down sides too. Sometimes they generate queries that are really sub-optimal.

I think my favorite ORM usage was with Hibernate. I KNOW!!! THAT DEVIL! But honestly we wrote SQL in hibernate, and then invoked it to populate our data. Yes it was a bit more work than RoR's "order.items.where("price > 40")" but when complex things happened, it was always easier.

I've always looked at nosql as "absolutely, sounds great, how do you represent relationships?"

5 comments

You can sort of fake the joins between two tables and avoid the ceremony with `natural join` assuming you've named the columns the same. (This doesn't work if your team has followed a normal pattern of "table1.id" with a FK in "table2.table1_id".) It's more like a hack and it falls apart as soon as someone starts mucking with column names [what evil person would ever do a thing like that...] or happens to share unrelated columns with the same name across two tables. So yeah, probably don't use it, except maybe to save yourself some time writing OLAP queries but increase your time spent debugging those same queries. :)
> There’s no way to express data structure knowledge in sql, only relations and keys.

Relations, keys, and (though you forgot them) constraints express…quite a lot of data structure knowledge.

> And every time I want to join table A to B I have to re-define everything, because SQL doesn’t store that.

SQL stores that if you tell it to; the usual way being view CREATE VIEW.

A database system could implement a feature that automatically uses the foreign key to join to a table. Maybe some RDBMS out there does this.

For example, you have a many to one relationship between posts and users. Instead of this:

  select *
  from user as u
  join post as p
  on p.user_id = u.user_id;
You could do:

  alter table post
  add constraint fk_user_id foreign key (user_id) 
  references user.user_id;

  select *
  from user as u
  left referent join post as p;
Any good sql auto-completer will also look up the foreign key information and auto-generate the on clause for you as well. Redgate SQL Prompt (mssql only) is one of the best tools out there for this reason
SQL already covers this use case with the `using` keyword. But you need to specify the shared column name. If you didn't need to specify the column then adding a new foreign key between the tables would make existing queries ambiguous and break backwards compatibility. See:

https://www.postgresql.org/docs/current/queries-table-expres....

> SQL already covers this use case with the `using` keyword.

No, “using” doesn’t automatically join by foreign key, it joins by explicitly-provided column name which must be identical between the two tables.

> If you didn’t need to specify the column then adding a new foreign key between the tables would make existing queries ambiguous and break backwards compatibility.

Not if you specified with constraint name rather than the column name (which also works for multi-column foreign keys without having to reiterate all the columns, being much more DRY than current SQL USING.)

Better, with that approach you could also allow fully implicit joins, using the foreign key constraint name as if it were an row-valued field in the referencing table.

E.g.:

  CREATE TABLE employees (
    id BIGSERIAL PRIMARY KEY,’
    name VARCHAR,
    manager_id BIGINT CONSTRAINT manager REFERENCES employees
  );
would let you do:

  SELECT name as employee_name, manager.name as manager_name
  FROM employees
as syntax sugar for:

  SELECT ee.name as employee_name, mgr.name as manager_name
  FROM employees ee 
  LEFT JOIN employees mgr ON (
    employees.manager_id = mgr.id
  )
wow i absolutely did not know this. amazing.
Using is a bad standard. It doesn't do the obvious thing (that is following foreign keys), it requires a heavy amount of bad conventions, it fails for the not too rare case where you need more than one FK between the same two tables, and if you go strictly with the standard (good for Postgres that it doesn't), it breaks the table-aware semantics of column names.

Meanwhile, the obvious natural way to make joins isn't available.

Hi, just curious - could you or someone else be more specific about the way in which an ORM encodes relationships that SQL doesn't?
An ORM allows you to encode a relationship once then use it for many different queries, I think that's the idea.
But a view also does that? Like, if you want to assemble information about a user from several different tables, you can have a view that does the join for you.
Sure, but people never use those. Also it gets back to the discoverability issue. The ORM documents relationships with the rest of the model code, if you have a poorly named view in a large, complex schema it may be hard to find. You could reinvent the wheel easily.

Like everything with SQL, you can solve the problem but sometimes the solution isn't elegant. People want elegance.

This is an accurate description of how I think of this problem. And it would be tough to describe all this:

- table a joins b yay a view

- table b joins c yay a view

- table c join d yay a view

- table a joins to d. well technically it can, but are you really gonna write all the permutations of views for every possible join?

ORMs encode that nicely so I can easily walk the relationships and get to the query I need.

Essentially all of the actual information is encoded in the foreign key constraints (in combination with uniqueness constraints.) What ORMs provide that SQL doesn’t isn’t much encoding of information (they usually have facilities to distinguish data tables from pure join tables, and to distinguish 1:1 and 1:M, (M>0) relations from 1:(0-1) and 1:M, (M>=0) ones, so they do encode some additional information), but provide convenient syntactic shorthands for the client to use the encoded relationships.

This would be easy to add to SQL, as syntax sugar, https://news.ycombinator.com/item?id=34587412

> Sure, but people never use those

Why not? I use views all the time, and they seem pretty elegant and simple to me.

> Discoverability can be quite lacking.

A richer type system would help there, but to my knowledge this isn't offered by any major relational DBMS.

https://www.postgresql.org/docs/current/sql-createtype.html

Postgres allows you to define custom types at least; making some enums or composites might give you some measure of sanity..?

Thanks.

The Ada programming language is known for encouraging use of specialised numerical types, where the range can be expressed and automatically checked at runtime, and the compiler can help protect against nonsensical comparisons or arithmetic. At a glance I think Postgres range types offer something similar. Presumably strong typing could also be achieved with Postgres composite types (of just one member).

It would be useful to have such checks so that a nonsensical comparison like WHERE person1.height = person2.net_worth would result in an error. Presumably it could also be helpful for autocompletion.