SQL Pivot: Converting Rows to Columns | HN Mirror

Y	Hacker News new \| ask \| show \| jobs

	SQL Pivot: Converting Rows to Columns (databricks.com)
	77 points by dmatrix 2789 days ago

6 comments

danso 2789 days ago

I vaguely understand (on a theoretical level) why the syntax for pivoting in databases is necessarily convoluted, but it does underscore to me how spreadsheets have a great advantage for everyday data-exploring, at least when it comes to ease of pivoting.

(by "everyday", I mean working with datasets small enough to fit within a spreadsheet's limits, and aren't yet important enough to need the reliability/regularity that comes with importing it to a database)

nerdponx 2789 days ago

The ability to use Excel as a "front-end" for other data sources is underrated. For all its warts, Excel is a great data manipulation and exploration tool.

BariumBlue 2789 days ago

Due to system limitations, I made a stats tracker for my team in AccessDB. Due to the awfulness of access, I made an excel&vba front-end that would push data to and pull data from access.

It's been working surprisingly well so far, with no reported issues

bhandziuk 2789 days ago

I'm curious what you found to be the brickwall for you when using Access?

BariumBlue 2788 days ago

The existing "solution" was a gigantic excel where people would put it times and names.

In Access, when you enter times, you have to use a specific format, (the datetime group specified by the Windows default) rather than just "23:40" or w/e. I've seen a hack where the hour and minutes are separate boxes, but it wasn't pretty. Just this issue was enough to make Access unappealing.

Access's interface is a bit unintuitive, and given that most people I worked with weren't particularly technically capable, I preferred to present them with the familiar Excel interface (and a few extra buttons), rather than teach them an application they've never used before.

Plus, the existing workflow already used Excel, so I hijacked their workflow and their routine wouldn't be modified much at all (introducing a completely new system would cause a lot of friction and pushback, but one or two extra buttons was fine).

bhandziuk 2788 days ago

Huh. Next time you can set the date format to "mm/dd/yyyy hh:nn" and you can enter 24 hour times with a date.

But I understand wanting to not disrupt an existing workflow too much.

karkisuni 2789 days ago

I wish there was something like this in postgres. tablefunc is a lot more complicated than this syntax.

netcraft 2789 days ago

There are a couple ways to pivot in postgres, here is one that uses json functionality https://gist.github.com/ryanguill/101a19fb6ae6dfb26a01396c53...

forkandwait 2789 days ago

I would like something with simple functionality, where I can just specify the table name, the category column, and the data column and get a new table transposed ibto columns. I would be happy to do the aggregation outside the pivot process. PROC TRANSPOSE in SAS would be a good model.

GorgeRonde 2788 days ago

Dynamic row to column pivotation/transpose in Postgres made simple.

https://github.com/hnsl/colpivot

nevi-me 2789 days ago

I had a quick look at some online material on implementations of pivot, as well as what I consider my go-to reference [0] for Apache projects' SQL support (Apache Calcite).

Looks like each DB has a proprietary way of pivoting, and I couldn't find anything on pivot in Calcite. Does anyone know if the SQL impl of pivot was created specifically for Spark, or whether it'd be upstreamed to Calcite? The benefit would be that other projects that rely on Calcite for SQL support would also have pivot capabilities. I'm thinking of Apache Beam.

- [0] https://calcite.apache.org/docs/reference.html

reilly3000 2789 days ago

This article enlightened me on the underlying logic, as well as some different implementations. https://modern-sql.com/use-case/pivot

nevi-me 2789 days ago

I was lazy to add its link, that's where I learnt about the proprietary implementations of various databases.

I wonder if we'll ever get an ANSI-2019 SQL version that tries to merge what various DB vendors have branched off and done. Maybe it already exists? https://www.whoishostingthis.com/resources/ansi-sql-standard...

rxin 2789 days ago

Apache Spark doesn't use Calcite.

IshKebab 2789 days ago

This is very nice. I've currently had to resort to MariaDB's dynamic columns for my purposes (much nicer than a traditional entity-attribute-value solution) but this looks even nicer. Is this a Spark-only thing?

51Cards 2789 days ago

mySQL needs this as well. We do a lot of pivot style reports for clients and while we have it down to a bit of a science now the resulting SQL is a debugging mess and not terribly efficient. (though it still runs fairly quick)

m0dest 2789 days ago

I'd love to see this in Presto. It would make my life 1000x easier.

gildas 2789 days ago

Off topic: the GPDR banner on this site is a good example of dark pattern. Scrolling into a webpage does not mean I consent collecting personal info about me.

gdhbcc 2789 days ago

It's a good example of GDPR non-compliance too.

dspillett 2788 days ago

I didn't even get a banner, so presumably I'm assumed to have consented without having taken any action.

ediardo 2789 days ago

I can't see the banner...

gildas 2788 days ago

Downvoting won't make it more compliant...

Edit: I thought we could discuss about dark patterns on HN. I was obviously wrong. Sorry for that...

dspillett 2788 days ago

The downvotes on the original post are probably because the comment is being seen as off topic. Sometimes comments like that are accepted, sometimes not, I can't say I've noticed much of a patter there other than they are far less welcome on HN than, for example, Reddit.

I usually play safe and reserve side-topic comments to a post-script after something more immediately relevant (and if I don't have something to say on the thread topic, keep them to myself).

The downvotes on this one are going to be because of complaining about downvotes!