Hacker News new | ask | show | jobs
by 542458 1751 days ago
Personally I feel that it took me a while to get really comfortable with more complex joins. There’s an problem they used in the study that required joining a table with itself, and honestly I would probably take a while to come up with that answer, if at all.
1 comments

A basic approach(probably what they are going for in a basics study) would be something like this off the top of my head

    select c.cid, c2.cid
    from customer as c
    inner join customer as c2 on c.street = c2.street
    where c.city <> c2.city
though that has reflective duplicates say (1, 5) would also have (5, 1) in the output. So I'm not sure if that's "allowed"
There are lots of cases where a join with yourself is applicable, although they are mostly superseded by window functions these days.

For example normalisation (join with a groupby/sum of yourself) or rank (join each row with all rows that have lower value than yourself and count those rows).

But as I mentioned above. A good start is to sketch that out in excel. You will realize that what you need is another column (e.g. total sum for this id). And from that you can work yourself backwards to figure out what is the table you need to join with to create that column.

I'm not sure if you are replying to the wrong person but the question has nothing to do with a total sum of ids...

the question was: "List all pairs of customer IDs who live on a street with the same name but in a different city." listed under self-join

that said i haven't wrangled with raw sql in a spell so the reading on window functions is interesting.

1. GP mentioned hairy self joins 2. You replied with an example 3. I pointed out that there are other examples.

That said, I have a feeling your duplicates can be fixed by adding the requirement that c.cid < c2.cid

Not sure a window function would help in this particular situation, but they are there to help in more mundane examples.

I see how it can be read that way yeah. they mentioned the study used a self join problem that they couldn't likely figure out so I looked at the paper and quickly did it off the top of my head. the less than predicate does indeed do the trick.