Show HN: Visualize SQL Queries | HN Mirror


Show HN: Visualize SQL Queries (animatesql.com)

259 points by bvm-101 1546 days ago

My co-worker and I were debugging a SQL issue; having not seen SQL in two years, I embarrassed myself by confusing union vs. join. After this episode, I tried refreshing my SQL memory, but there are few websites that animate SQL for you. Most of them just have a series of images to help you visualize. There are a few tools that are quite good and robust (especially for large/complex use cases) but require installation and are too complex for my simple purpose.

So, just created a small tool to help visualise SQL. Most of the animations are just my understanding of how SQL works. Would love to know what you think? Do you also visualise some of the queries like that in your head? Any feedback would be gold. Btw you can also edit queries and see different results (but it's a bit limited).

Have fun ;)

13 comments

bob1029 1545 days ago

We have started looking at something along the same axis of "improving understanding of your queries". Our product has nearly 10k SQL queries that need to be managed for each logical installation.

By converting a SQL query into an AST, you can start applying business logic to the actual syntax of the query. Put another way, you can query your queries. You can also run reports across all SQL to determine things like "show me everything in the product which references this table & column", or "Which queries reference a specific magic string constant?". More advanced reports can be made too, such as "Which queries join tables A, B & C together?"

We haven't taken it to the next step yet, but hypothetically we can go from AST back into SQL and start doing some super crazy shit like patching hand-written queries programmatically. Once something is in AST form, you are basically working with playdoh that another tool like LINQ (and a bit of recursion) can trivially cut through.
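To make the "query your queries" idea concrete, here is a minimal sketch of the kind of report described above. It is not a real SQL parser: a regular expression stands in for a proper AST walk, and the query catalogue, table names, and function names are all invented for illustration. A production setup would presumably use a real SQL-to-AST library instead.

```python
import re

# Hypothetical catalogue of named queries; names and SQL text are made up.
QUERIES = {
    "orders_by_customer": "SELECT o.id FROM orders o JOIN customers c ON c.id = o.customer_id",
    "open_invoices": "SELECT * FROM invoices WHERE status = 'OPEN'",
    "customer_invoices": "SELECT c.name FROM customers c JOIN invoices i ON i.customer_id = c.id",
}

# Crude stand-in for an AST walk: capture the identifier that follows FROM or JOIN.
TABLE_REF = re.compile(r"\b(?:FROM|JOIN)\s+([A-Za-z_][A-Za-z0-9_]*)", re.IGNORECASE)


def tables_referenced(sql):
    """Return the set of table names a query appears to reference."""
    return {name.lower() for name in TABLE_REF.findall(sql)}


def queries_touching(table):
    """Report: which queries reference this table?"""
    return sorted(name for name, sql in QUERIES.items()
                  if table.lower() in tables_referenced(sql))


def queries_with_literal(literal):
    """Report: which queries contain this quoted string constant?"""
    return sorted(name for name, sql in QUERIES.items() if f"'{literal}'" in sql)


print(queries_touching("customers"))
print(queries_with_literal("OPEN"))
```

A regex like this breaks on subqueries, quoted identifiers, and comments, which is exactly why working on a real AST is the more robust route.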

gavinray 1544 days ago

  > but hypothetically we can go from AST back into SQL and start doing some super crazy shit like patching hand-written queries programmatically.

And you've stumbled into rewrite-rules and the realms of query-planners/optimizers!

If you enjoy this stuff, it's really fun to learn about.

greggsy 1544 days ago

I’m not a DBA, but is there any IP around what OP and others are attempting to do? Surely anything that makes your life easier would have been patented by Oracle or the likes to eke every dollar out of the market?

hermitdev 1544 days ago

The animated step by step is new to me, but any DB worth using is going to include tools that explain the execution of any query you give to the DB. It's known as a "Query Plan". And, despite the name, it is not limited to queries. Query plans aren't friendly to the uninitiated, but they give you far more of the technical information you need to actually tune the query and check things like "am I actually using the indices I have on this table?".

I'm not a DBA, but I have over 20 years experience using RDBMSes of Oracle, DB2, SQL Server, Sybase, etc.
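As a rough illustration of asking the database for its query plan, here is a minimal sketch using SQLite through Python's standard sqlite3 module. SQLite is an assumption made for convenience; the engines named above each have their own plan tools and output formats. The ChessGames columns are borrowed from a query elsewhere in this thread, while the column types and the index name idx_games_winner are made up.

```python
import sqlite3


def plan(conn, sql):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ChessGames ("
    "id INTEGER PRIMARY KEY, gameWinner TEXT, "
    "whiteRating INTEGER, blackRating INTEGER)"
)

query = "SELECT whiteRating, blackRating FROM ChessGames WHERE gameWinner = 'black'"

# No index on gameWinner yet, so the planner has to scan the whole table.
before = plan(conn, query)

# Hypothetical index; with it in place the planner can search instead of scan.
conn.execute("CREATE INDEX idx_games_winner ON ChessGames (gameWinner)")
after = plan(conn, query)

print(before)
print(after)
```

The first plan should report a scan and the second a search that names the index; the exact wording varies between SQLite versions.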

v-erne 1545 days ago

Just out of curiosity - where do you get those queries from? By which I mean: are they static templates that you get from source code, or are they dynamic and gathered from logs (at the risk that some rare queries will be left out)?

bvm-101 1544 days ago

Believe it or not, did not know there were tools to parse SQL to AST before this project. I was trying to parse sql manually XD.

ComodoHacker 1545 days ago

I'm not sure a procedural visualization is appropriate for a declarative language like SQL. It can be misleading both on underlying concepts (sets and relations) and actual query execution.

I'd rather see a visualization in terms of tuples, sets and operations on them leading to the result set.
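One way to read this suggestion is to treat each table as a set of tuples and each SQL clause as an operation on sets. The sketch below does that in plain Python; the relations and their contents are invented for illustration. It also happens to show the union vs. join distinction from the original post: a union stacks same-shaped rows, while a join pairs rows from different relations that agree on a key.

```python
# Relations modelled as sets of tuples; all sample data is invented.
players = {(1, "Ana"), (2, "Ben"), (3, "Cy")}   # (player_id, name)
guests = {(3, "Cy"), (4, "Dee")}                # same shape as players
titles = {(1, "GM"), (3, "IM")}                 # (player_id, title)

# UNION: same-shaped relations merged into one set; duplicates collapse.
union = players | guests

# Selection (WHERE player_id > 1): keep the tuples that satisfy a predicate.
selected = {p for p in players if p[0] > 1}

# Projection (SELECT name): keep only some attributes of each tuple.
names = {name for (_, name) in players}

# Join on player_id: pair tuples from two relations that agree on the key.
joined = {
    (pid, name, title)
    for (pid, name) in players
    for (tid, title) in titles
    if pid == tid
}

print(sorted(union))
print(sorted(joined))
```

Nothing here says how an engine would execute the query; it only describes which tuples belong in the result, which is the declarative reading.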

rockostrich 1545 days ago

I guess this could be a teaching tool if you focus on just selecting keywords from the dropdown, but in general most SQL engines' EXPLAIN features are leagues more useful than this.

bvm-101 1544 days ago

Idea was to supplement the many websites that do explain SQL this way. But yes, I agree with that statement.

flarg 1545 days ago

Did you ever see http://revj.sourceforge.net/ ?

edallme 1545 days ago

Even more powerful (though an academic prototype): https://queryvis.com/

gavinray 1545 days ago

Neat tool too, thanks!

hackernewds 1545 days ago

very tough to comprehend..

zozbot234 1545 days ago

Can't you also do this with Microsoft Access or LibreOffice Base? I think it's called QBE, or Query By Example.

dsmmcken 1544 days ago

Really cool for joins. It would be nice to be able to adjust the speed after you already started an animation, and maybe play/pause. I also expected to be able to type into the duration box, maybe make it just a label rather than an input that can get focus.

FR10 1545 days ago

Thank you for building this! I now understand the LEFT JOIN story shared a couple days ago.

Natfan 1544 days ago

Bug report: The buttons used to change the animation duration do nothing, probably because the buttons have the `disabled` attribute applied.

Benlights 1544 days ago

You can only change them before an animation runs. If you are in the middle of an animation, hit Stop/Reset and then adjust.

ieee2 1545 days ago

LIMIT not working. Try:

SELECT whiteRating, blackRating FROM ChessGames WHERE gameWinner='black' LIMIT 2 OFFSET 8;
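For reference, here is a small sketch of what LIMIT 2 OFFSET 8 is expected to do, run against SQLite via Python's sqlite3 module. The schema and the twelve rows are invented; only the table and column names come from the query above. An ORDER BY is added as an assumption, because without one the order of rows (and therefore which rows are skipped) is not guaranteed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ChessGames ("
    "id INTEGER PRIMARY KEY, gameWinner TEXT, "
    "whiteRating INTEGER, blackRating INTEGER)"
)

# Twelve invented games, all won by black; the ratings encode the row number.
conn.executemany(
    "INSERT INTO ChessGames (gameWinner, whiteRating, blackRating) "
    "VALUES ('black', ?, ?)",
    [(1000 + i, 2000 + i) for i in range(1, 13)],
)

# OFFSET 8 skips the first eight matching rows; LIMIT 2 keeps the next two.
rows = conn.execute(
    "SELECT whiteRating, blackRating FROM ChessGames "
    "WHERE gameWinner='black' ORDER BY id LIMIT 2 OFFSET 8"
).fetchall()

print(rows)  # rows 9 and 10: [(1009, 2009), (1010, 2010)]
```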

bvm-101 1545 days ago

You're the best :)! Let me fix that.

23202586 1545 days ago

Very nice! I would be happy to see more complex queries in the future. Do you plan any updates?

bvm-101 1544 days ago

Initially, the idea was to not have keywords selection at all. Wanted to allow for typing any query and just visualise. But it was a bit too complex for the timeline. Was looking to plan this out shortly, but there are so many good suggestions here, I feel might need to overhaul a few other parts too.

hackernewds 1545 days ago

Promising. Unusable on mobile since the examples span multiple pages

gozeloglu 1545 days ago

Overall, it seems good. I'll use it while studying SQL.

jve 1545 days ago

This may help you mentally map in brain what happens. But this is all table scanning and doesn't include indexes, which organize data in a tree. When you do SQL, you should consider whether table will contain thousands or millions of rows. If millions, you have to think how that JOIN or WHERE will take indexes into account.
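To put rough numbers on the difference, the sketch below counts comparisons for a lookup by full scan versus a lookup over sorted keys. Binary search over a sorted list is used as a stand-in for descending a tree-shaped index; a real B-tree has a different layout and cost model, so this is only an approximation of the idea. The function names and the one-million-row table are made up.

```python
def scan_lookup(keys, target):
    """Full table scan: compare against every row until the key is found."""
    comparisons = 0
    for key in keys:
        comparisons += 1
        if key == target:
            return True, comparisons
    return False, comparisons


def index_lookup(sorted_keys, target):
    """Binary search over sorted keys, standing in for a tree index descent."""
    lo, hi, comparisons = 0, len(sorted_keys), 0
    while lo < hi:
        mid = (lo + hi) // 2
        comparisons += 1
        if sorted_keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    found = lo < len(sorted_keys) and sorted_keys[lo] == target
    return found, comparisons


keys = list(range(1_000_000))          # invented table of one million keys
target = 999_999                       # worst case for the scan: the last row

found_scan, scan_steps = scan_lookup(keys, target)
found_index, index_steps = index_lookup(keys, target)

print(scan_steps, index_steps)
```

The scan needs one comparison per row, while the sorted lookup needs about log2(n) of them, which is why the gap only becomes painful once tables reach thousands or millions of rows.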

jiggawatts 1545 days ago

In a way, it is even deceptive, because naive programmers might be led to think that repeated(!) table scanning is what real database engines do.

It's actually very rare that this happens. For example, SQL Server will create a temporary index on the fly if one is missing. It can create B-Tree, Hash, and Bitmap indexes which might be unexpected for some people because only B-Tree indexes can be created "permanently".

So in some ways database engines do even more than just use statically defined indexes.

tluyben2 1545 days ago

> It's actually very rare that this happens

So I have had a little consultancy gig for a few decades now where I spend a few days a month optimising bad software for performance (it is what I like; I don’t do anything else but ‘make shit faster’). I can tell you that in the past 10 years, 99% of the optimisations I did were fixing MySQL queries and indexes that table scan. I had projects where table-scanning queries were literally over 50% of the queries run. The result, as you know but which apparently is not very common knowledge, is that these sites and apps grind to a halt (after incurring bizarre bills on AWS RDS; I moved many apps from $100k/month bills to $10/month) when even a little traffic comes in.

Or; table scans should be rare but are not.

Edit; removed ‘time’ as that was not a good way of expressing this

Brakenshire 1545 days ago

How do you build your intuition about creating queries in a way to avoid this?

Is it a matter of having a conceptual model of relational algebra and the way the different db engines work, or is it more an accumulation of heuristics over time for what probably will cause problems, and an iterative process of using EXPLAIN, adjusting the query and seeing what happens?

tluyben2 1545 days ago

I am old :) But experience is a big thing; just by skimming the table definitions I can sniff out where something is probably very wrong. In uni in the 90s I studied both relational theory and formal methods and I had to spend a lot of time figuring out and fixing complexity; if you take a university-level book on big O complexity and work through it, you will have a good feeling for what software can and cannot do and in what way. That has not really changed; we have more, and more efficient, cache, we have improved algorithms, but things that cannot be looked up in O(1) are still dangerous and can incur enormous IO even with only a few million records. Naive developers see that things are blazingly fast locally on their laptops and that’s it. I have met, especially in the last few years (in my bubble this is getting worse, quite fast), quite a lot of lead devs who do not actually know what an index is for, and so I see entire DBs without any indexes, or with one only on the id field. I know people (for some reason) do not like ORMs that create tables and indexes, but it would prevent many rookie mistakes if they did.

jiggawatts 1545 days ago

I can’t remember the last time I came across a non-CotS database schema that has secondary indexes in a significant number. Like more than half a dozen for a hundred tables or more.

I’ve never seen a database use “advanced” features like clustered columnstore or even just page compression.

I just have an email in my inbox from this morning from a small vendor that “doesn’t recommend” columnstore for a database containing 10 TB of numeric metrics in one table.

That would compress to a few gigabytes and query times would go from minutes to milliseconds.

But they “don’t support it”.

Which I now translate to: “we haven’t even flipped through the manual and when we googled it in a panic we didn’t understand it.”

This is how your data is being managed at huge enterprises and government agencies around the world.

pfarrell 1545 days ago

First realize that SQL is not a procedural language; you are only describing the result set. The data store will then create an execution plan, which is the actual code that gets run. Learn to read the EXPLAIN output in your data store of choice (very few SWEs do this). If you have access to a database administrator in your company, befriend and learn from them. Read about how databases store and retrieve data: from SQL to data pages. Measure, measure, measure. Learn about different types of indexes and their trade-offs. Remember that “it depends” is the answer to almost every DB question and that you should be thinking through all your code's interactions. That is the path to mastery of DBs.

zasdffaa 1545 days ago

IME just run the DB, pick up and look at the query plans for the most common/time consuming, then add indexes. That's 80% of it. So...

> Is it a matter of having a conceptual model of relational algebra and the way the different db engines work

...no....

> or is it more an accumulation of heuristics over time for what probably will cause problems

...no....

> an iterative process of using EXPLAIN, adjusting the query and seeing what happens?

...that's more like it!

Once you understand the underlying data structures, all the magic goes away. As it should.

tluyben2 1545 days ago

Yeah, that works. My process is very different, but your advice is better as it hangs more on actionable tactics than learning intuition.

> Once you understand the underlying data structures, all the magic goes away. As it should.

Once you actually understand them, I feel you don’t need explain in most cases; you will simply ‘see’ why certain queries or definitions or structures are bad.

gunshowmo 1545 days ago

Would love to read about some examples / strategies in a blog post some day!

zasdffaa 1545 days ago

> For example, SQL Server will create a temporary index on the fly if one is missing

err, I may be misunderstanding but can you explain why you feel this? I have never (IME) seen MSSQL do this and it wouldn't make sense because constructing an index needs a table scan plus a lot of work on top. Just doing a hash join is simply the better option.

I mean it would be nice at times but there are traps to this which is why (again AFAICS) it's not done and would be unsafe to do without a lot of info about resources and future expected queries which the query planner just doesn't have.

Happy to be set right on this.

jiggawatts 1544 days ago

It won't create an index you can see in the GUI or query via "sys.indexes" or anything like that.

It's a temporary object, much like a temporary table that exists only in the scope of the query.

As another comment mentioned, this is what a hash-join does internally: it builds a temporary "hash index" of one input, and then uses it to look up rows while scanning through the other input.

If you look at the query plans in SSMS, you'll occasionally see bitmap indexes as well.

The equivalent of a standard B-Tree index that you would create permanently is the "Index Spool" operator. You'll also see "Table Spool", which is basically a temporary heap.

The example in the original article was the equivalent of this loop:

    foreach( a in table_a ) {
        foreach( b in table_b ) {
            if ( a.id == b.aid ) ...
        }
    }

That's hideously inefficient. Most databases will automatically do something like:

    var a_hash = new Hashtable( table_a.row_count )
    foreach( a in table_a ) {
        a_hash.add( a.id, a )
    }

    foreach( b in table_b ) {
        if ( a_hash.lookup( b.aid )) ...
    }

The clever part in all of this is that you can do this two ways: build a hashtable of "a" and lookup "b" rows in it, OR build a table of "b" and lookup "a" rows in it. They're equivalent, but the performance can be wildly different.

RDBMS query planners have the job of figuring out which to pick. Even if you think you can outperform the database by writing code like the above in Java or C# or whatever, you won't write out every combination and have the statistics available to choose. The database engine can and does.
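The two loops above, plus the build-side choice, can be sketched as runnable Python. This is a simplification under stated assumptions: rows are plain dicts with the id and aid fields from the pseudocode, the sample rows are invented, and the "planner" here only compares row counts, whereas a real engine would rely on statistics and cost estimates.

```python
def nested_loop_join(table_a, table_b):
    """Compare every row of A with every row of B: O(len(A) * len(B))."""
    return [(a, b) for a in table_a for b in table_b if a["id"] == b["aid"]]


def hash_join(table_a, table_b):
    """Build a hash table on the smaller input, then probe it with the larger."""
    if len(table_a) <= len(table_b):
        # Build on A keyed by a.id, then probe with each row of B.
        buckets = {}
        for a in table_a:
            buckets.setdefault(a["id"], []).append(a)
        return [(a, b) for b in table_b for a in buckets.get(b["aid"], [])]
    # Otherwise build on B keyed by b.aid, then probe with each row of A.
    buckets = {}
    for b in table_b:
        buckets.setdefault(b["aid"], []).append(b)
    return [(a, b) for a in table_a for b in buckets.get(a["id"], [])]


# Invented sample rows.
table_a = [{"id": 1, "name": "x"}, {"id": 2, "name": "y"}]
table_b = [{"aid": 1, "v": 10}, {"aid": 1, "v": 11}, {"aid": 3, "v": 12}]

print(hash_join(table_a, table_b) == nested_loop_join(table_a, table_b))
```

Both functions return the same pairs; the hash join touches each row once for the build and once for the probe instead of comparing every pair.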

SQL Server can do both steps in parallel across all CPU cores, which is a topic of several PhD-level research papers. For example, hash tables can have performance issues if the same key occurs too often (e.g. many NULL columns). Balancing this across multiple cores is... complicated.

jve 1545 days ago

> Just doing a hash join is simply the better option.

Maybe he meant just that - hash join has to populate hashtable before joining.

Regarding the B-Tree index, perhaps he was thinking of automatic index management - https://docs.microsoft.com/en-us/sql/relational-databases/au... - but it must be enabled explicitly. And I don't feel like it is "on-the-fly"; rather, it runs in the background.

Cthulhu_ 1545 days ago

I'm going to be that guy and just say that I wince at the use of emoji instead of words. It's not very accessible and I surely hope nobody actually uses emoji word replacements professionally.

chockchocschoir 1545 days ago

I don't like it either. I've got good news for you and bad news. The good news is that emoji usage seems higher in open source/side-projects than in professional environments. The bad news is that yes, some people, even professionally, do use emojis instead of words. It is horrible, but I'm afraid it's only gonna get more popular as the population of young people becomes... not young and joins the professional workforce.

0des 1545 days ago

You mean stands beside the professional workforce.

Natfan 1544 days ago

Are you claiming that "young" people can't join the professional workforce? Are they joining some other, "unfessional" workforce? Is there an age limit at which you're Old(tm) and then are able to be a true professional?

surfTide 1545 days ago

Think of emojis as Egyptian hieroglyphs - we've been there, done that.

On the other hand, emoji are far more international - everyone knows what a thumbsup means ....

zozbot234 1545 days ago

> everyone knows what a thumbsup means ....

In some places, a thumbs up means "Up yours!" Not very friendly at all.

surfTide 1545 days ago

Mea Culpa - not my intention :)

I guess the victory/peace sign would also be a bad example (and that amongst English speakers).

klohto 1545 days ago

I’m sure the 3 developers from Iran/Iraq who don’t know English and would be offended by that are fine.

0des 1545 days ago

It does have an air of "yeah fuck you buddy" sometimes to get a thumbs up

qorrect 1545 days ago

<thumbs_up/>

bvm-101 1544 days ago

I hope at least one emoji is universal...<3

samatman 1545 days ago

Are you under the impression that screen readers can't read... emoji?

You can call them unprofessional, that's like, your opinion, man, but "not accessible"? Admit it, you made that up because you don't like them and you hoped it'll stick.

robofanatic 1545 days ago

some emojis are fine like this one -> :-)

teh_klev 1545 days ago

That's an emoticon ;)