Hacker News new | ask | show | jobs
by _dain_ 1422 days ago
Can you give a more detailed description of what you were trying? I can't know for sure, but it sounds like Excel can easily handle what you've described, if you use it right.

>the entire paradigm revolves around knowing the shape of your data in advance.

How exactly do you program without knowing the shape of your data in advance? You need to know your database columns, or your JSON schema, etc.

>(I was using Google Sheets, but I don't think Excel would have been much different).

It would have been very different, because Excel has tables and Powerquery and Google Sheets doesn't.

>since it's a variable number of rows returned, it is difficult to then operate on that data without filling your formulas down for some indeterminate number of rows.

Were you using dynamic array formulae? They can handle the old problem of needing to fill down formulae to an arbitrary depth. Or again, tables.

Programmers routinely underestimate Excel. Unlike most Microsoft products, it has improved year on year over the past few decades. There are heaps of great power-user features they keep introducing. The skill ceiling is very high .. not as high as proper software engineering, but still damned high.

It also really annoys me when I see Linux/FOSS partisans tell Windows normies "oh you can do everything you can do in Excel in LibreOffice Calc" -- no you fucking well cannot. (And I use Linux on my personal computers full time).

2 comments

> It also really annoys me when I see Linux/FOSS partisans tell Windows normies "oh you can do everything you can do in Excel in LibreOffice Calc" -- no you fucking well cannot. (And I use Linux on my personal computers full time).

It seems like your argument is that alternatives don't have PowerQuery. That might be true (I don't even know what it is), but isn't that like saying Linux can't compete with Windows, because it doesn't have Internet Explorer? I mean, it doesn't, but there are excellent alternatives that can accomplish exactly the same task.

Unless you can come up with an example task that can't be completed, then it seems like it's just a matter of opinion which is the better solution.

As far as I know both Google Sheets and LibreOffice have SQL and Pivot Tables, and -- believe it or not -- Lotus 1-2-3 had "/Data queries" in 1989. Naturally, the queries possible in 1-2-3 were limited, but you really could query large tables for things like "[Date] <= #date(2017,6,1)", which is the first result I got from typing "Power Query example statement" into Google.

Do Sheets and LibreOffice have Solver? No. They don’t.
Lotus 1-2-3 had a Solver (two actually, one called "Solver" and one called "What-If") in 1989. Yes, LibreOffice has one too.

Does Microsoft tell their customers they invented these features? Just a few days ago I saw a Windows developer who thought Microsoft invented conditional breakpoints.

LibreOffice does have an solver. but i have no idea how it compares to others.
>It seems like your argument is that alternatives don't have PowerQuery. That might be true (I don't even know what it is), but isn't that like saying Linux can't compete with Windows, because it doesn't have Internet Explorer? I mean, it doesn't, but there are excellent alternatives that can accomplish exactly the same task.

It's true that Google sheets and LibreOffice don't have Powerquery, and that's a big pain. But the worse thing is that they don't have tables. As in, the "format as table" button in Excel. As in, the bread and butter of anyone who gets serious work done in Excel.

Maybe it's a problem of naming -- "format" makes people think it's just about aesthetics, but actually it imparts real semantic structure onto a rectangular grid of data. It also isn't the same thing as pivot tables, with which they are often confused. It gives the grid a name that you can refer to in formulae, and the columns are named too, with their names living inside the table namespace ("structured references" is what Microsoft calls it). The table automatically expands its boundaries when you start typing a column header to the right of the current columns, and likewise it expands to comprise the row beneath it if you type values into that row. And it has smart indexing: there's special syntax to refer to "this table" and "this row" in formulae.

So you can have say, a table named "ExpensesTable", labelled "Date", "Type of Expense" and "Amount" in columns A:C. Then you can type "Tax" at the top of column D, it will expand the table to include a new blank column for Tax. Then in D2, type

    =[@[Amount]] * 0.2
and it will automatically fill down the Tax column with 20% of the value of the Amount columns. Then in a cell outside the table, do

    =sum(ExpensesTable[Amount])
to get the total amount of expenses. These are both simple examples; you can do more complex and interesting things involving multiple columns, ranges of columns, joins, etc. The point is the semantic structure that makes your spreadsheet more than just a rectangular soup of cells, so you don't have to claw through endless cryptic "G70:$K100" cell references. If we add a new row or column, we don't have to alter any formulae at all; the bounds are automatically resized on the cell arrays that the column names refer to. Think of it like a mutable resizable dataframe. It's the core data structure of an efficient, scalable, maintainable Excel document.

More about structured references: https://support.microsoft.com/en-us/office/using-structured-...

Also the "You Suck At Excel" talk by Joel Spolsky: https://www.youtube.com/watch?v=0nbkaYsR94c

And no, I have no idea why the eggheads at Google don't implement this for Sheets. Maybe Microsoft has a patent on it? Wouldn't surprise me. But this is why you'll have to pry Excel out of spreadsheet jockeys' cold dead hands -- the alternatives don't have this basic thing.

You need to come up with a task that cannot be accomplished, remember you said "no you fucking well cannot [do everything]", but so far I've seen no examples.

I mean, isn't this just a button that adds some named ranges for you?

You can replicate the exact example you gave with named ranges. If there is something it can do that named ranges can't, then please use that example instead. Similarly, if you think there is something that "Power Query" can do that SQL cannot, then please show that.

I literally use Lotus 1-2-3 for UNIX (I'm not kidding! http://123r3.net).

So far, all of the examples I've seen you give could have been done in 1989 on a VT100 terminal connected to SystemV. You could even write a quick macro in that generates the named ranges from column headers with one keystroke, it would be really trivial.

>I mean, isn't this just a button that adds some named ranges for you?

No, named ranges don't automatically expand when you add new rows, and they aren't automatically created when you add new columns. And they don't remain in groups, e.g. you can't make a reference like Namedrange1:Namedrange3, but in a table you can do Column1:Column3. Named ranges exist in a global namespace; column references exist in a per-table namespace. The table syntax makes columnwise operations clearer to express in formulae. Let's say you want to refer to the cell in the same row as the current cell, but in a different column: how do you do that if everything is just a named range? You need to do some kind of juggling with indexing and lookups, or else fall back to alphabet soup A1/R1C1 style referencing, because a named range is only good if you want to do an operation on every cell in the range. But that's often not what you want! In tables it's as simple as [@[other column]].

You would know this if you actually read the documentation or watched the video I posted. Or I could just repeat myself again (maybe I will write a macro to automate such tedium).

>Similarly, if you think there is something that "Power Query" can do that SQL cannot, then please show that.

Grab data from a csv file, a JSON file, a SQL database, and an Excel sheet, and combine them all together using a normie-friendly GUI.

Your question doesn't even make sense, it's like a type error. SQL and PowerQuery are not competing technologies, they're complementary.

>You could even write a quick macro in that generates the named ranges from column headers with one keystroke, it would be really trivial.

Yeah and you can also make Dropbox by getting an FTP account, mounting it locally with curlftpfs, and then using SVN or CVS on the mounted filesystem.

Spreadsheets are the only remaining programming system that people not inducted into the Programming Cult use.

I'm happy that you like this syntax, but the claim you made was "no you fucking well cannot [do everything]", not "excel syntax is more fucking beautiful".

I appreciate your advice, but I don't need to watch a video on R1C1 syntax, I literally maintain a spreadsheet :)

It seems like your real claim is that you really like the way Excel does it, nobody can argue with that.

I have to agree with the OP (_dain_) here. Excel has evolved a lot in the last few years, first the whole Power Query and Power Pivot revolution and now all the functional stuff brought on by Simon Peyton-Jones and his crew like LET expressions and the functional constructs like LAMBDA, MAP, FILTER, ...

There's very little you can't do neatly and efficiently in Excel anymore. Yes you can in principle do those same things in Google Shets, but at what cost of readability?

I don't think it's worth spending much time getting into these arguments because the people arguing against Excel clearly don't know modern Excel very well.

Your comment about FOSS is spot on. While I'm very aware that Google Sheets is not OSS, it felt much more amenable to me than Excel (and I'm sure Excel's online free version isn't particularly fantastic anyway, though it may be better than Sheets from what people are saying here).

> How exactly do you program without knowing the shape of your data in advance? You need to know your database columns, or your JSON schema, etc.

This was a bit overloaded in my opinion, as in spreadsheets world, "shape" includes the number of rows, hence my comments. I know that the column layout needs to be known.

> Were you using dynamic array formulae

I looked into it, but couldn't figure out how to handle them without introducing a massive amount of formula duplication. The best I could figure out how to do was to do a single large FILTER (which is dynamic array) and doing a fill down on my other transformation formulas from there. I blacked out the rows past the end of the FILTER using conditional formatting rules (which felt very stupid to do, but I couldn't find anything better).

> The skill ceiling is very high

I don't doubt you, but if you can't discover the functionality, it might as well not exist. Admittedly I was clearly using the inferior tool, but in my searching for solutions I much more readily found Google's documentation over Excel's.

I also realize I'm not in the position of being forced into a corner; as most of us on this forum could, I just wave my magic wand and write the software to solve my problems. I imagine those who don't have that ability available to them will do "crazier and crazier" things to figure out how to accomplish their work in Excel, and therefore will learn much better ways than I have in my little experience with it.

----

I was building a tool to track the completion of finding parts for a given Lego set. You enter the set ID, it pulls the parts list for that set (Rebrickable nicely offers their database as a set of CSVs https://rebrickable.com/downloads/) and formats it nicely for consumption.

I wrote another comment that also answers yours for the most part: https://news.ycombinator.com/item?id=32365128

tldr your problem with Excel is that you don't grok tables. They eliminate the need to know the number of rows when writing your formulae.

I don't actually have Excel installed on the machine I'm using to type this, so I can't put my money where my mouth is like the vim guy did[1]. But I'm fairly sure you can achieve your goal with table references and liberal use of the XLOOKUP and FILTER functions. It'll get a little hairy since you have to go from Set -> Inventories -> Inventory Parts -> Parts, so maybe a bit of nesting. But I think doable. The LET function also helps to reduce formula complexity, it lets you make lexically-scoped variables inside your formulae. Use "data validation" to make a dropdown menu for the set names.

[1] https://stackoverflow.com/a/1220118