| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by whatever1 858 days ago

The trial and error is the worst part.

In traditional languages, you can print iteration by iteration the intermediate result and understand if there is something wrong.

In SQL you sample output, and you keep changing the query until you think you get it right. And then 2 years later someone else finds that the query was wrong all this time.

5 comments

jalk 858 days ago

Common Table Expressions (CTE) do help a little, as you can query each “table” and inspect the output. Debugging a giant query with deeply nested sub queries is very painful indeed

link

Ma8ee 858 days ago

So do table variables and temp tables.

link

remus 858 days ago

> The trial and error is the worst part.

I don't know about anyone else, but I do this kinda naturally when writing SQL queries. Usually start with a base table, query the first 100 rows to see what the data looks like, start joining on other tables to get info I need, querying as I go to check join conditions, perhaps build out some CTEs if I need to do some more complex work, query those to check the format of the data ... And so on.

It doesn't feel that different to any other programming in that sense. Querying is printing.

link

fifilura 858 days ago

> you can print iteration by iteration the intermediate result

You would not be able to do that with a multi-threaded/multi-process application.

And this is the reason why e.g. Trino/Presto is so powerful together with SQL.

Instead of telling the computer how to go by to get your result, you tell it what result you want and let it do it in the best way.

The most up-front way of telling a computer "how" is a for-loop. And SQL does not have it. It may seem limiting, but avoiding explicit for loops gives the freedom to the computer. If it sees it fit to distribute that calculation over 200 distributed CPUs it can do that. With an imperative language you need to tell the computer exactly how it should distribute it. And from there it gets really hairy.

link

whatever1 858 days ago

In development I don't need it to be multi-threaded. 1 thread is fine, as long as I can explain, step-by-step, how the calculations produced the output.

link

fifilura 858 days ago

If you don't need threads in development OR production, you might as well do SELECT * from users and do the join in your imperative code.

If you need threads in production I think you will end up getting rid of your for loops anyway (or possibly, if you really want to, end up in a mutex/semaphore quagmire).

I must say, though, that there are other benefits with a declarative approach than just avoiding threading issues. But I guess it takes some getting used to.

I would say that the same "I cant step through my code" argument also goes for functional style code.

link

Ma8ee 858 days ago

> If you don't need threads in development OR production, you might as well do SELECT * from users and do the join in your imperative code.

Except that it most likely will be orders of magnitude slower. Most databases are very good at what they are doing.

link

fifilura 858 days ago

Yes. Kind of my point to. But the OP missed the possibility to step through the code.

link

vrighter 857 days ago

sure you can. set the concurrency limit to 1. If you're debugging the logic and not some race condition then this works perfectly fine. Remember to profile afterwards though

link

Ma8ee 858 days ago

Trial and error is usually a bad idea in all kinds of programming.

link

r00fus 858 days ago

I mean, I never build a query from front to back. Usually I build it FROM -> JOIN -> WHERE -> SELECT.

link

mycall 858 days ago

Start off with SELECT * then once the joins are working, filter * down to the essentials.

link