Hacker News new | ask | show | jobs
by fifilura 1755 days ago
There are lots of cases where a join with yourself is applicable, although they are mostly superseded by window functions these days.

For example normalisation (join with a groupby/sum of yourself) or rank (join each row with all rows that have lower value than yourself and count those rows).

But as I mentioned above. A good start is to sketch that out in excel. You will realize that what you need is another column (e.g. total sum for this id). And from that you can work yourself backwards to figure out what is the table you need to join with to create that column.

1 comments

I'm not sure if you are replying to the wrong person but the question has nothing to do with a total sum of ids...

the question was: "List all pairs of customer IDs who live on a street with the same name but in a different city." listed under self-join

that said i haven't wrangled with raw sql in a spell so the reading on window functions is interesting.

1. GP mentioned hairy self joins 2. You replied with an example 3. I pointed out that there are other examples.

That said, I have a feeling your duplicates can be fixed by adding the requirement that c.cid < c2.cid

Not sure a window function would help in this particular situation, but they are there to help in more mundane examples.

I see how it can be read that way yeah. they mentioned the study used a self join problem that they couldn't likely figure out so I looked at the paper and quickly did it off the top of my head. the less than predicate does indeed do the trick.