by diarmuid_glynn 1166 days ago
We don't have a formal grammar yet. We'll put one together and add it to the GitHub repo tomorrow.

Informally: each row in the sheet is a new line, and each cell is separated with a pipe (|). Cells can contain either values (various number formats supported) or formulas. Example:

    ```equalto
    **Item**       | **Cost**
    Rent           | $1500
    Utilities      | $200
    Groceries      | $360
    Transportation | $450
    Entertainment  | $120
    **Total**      | =SUM(B2:B6)
    ```
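
To make the informal description concrete, here is a minimal Python sketch of a reader for the example above. It is not the real Sheet Markup implementation: the function names (`parse_sheet`, `cell_number`, `eval_sum`) are made up for illustration, it assumes cells never contain a literal pipe, and it only understands a single-column `=SUM(...)` range.

```python
import re

def parse_sheet(text):
    """Split sheet text into rows of stripped cell strings (one row per line)."""
    return [[cell.strip() for cell in line.split("|")]
            for line in text.splitlines() if line.strip()]

def cell_number(cell):
    """Best-effort numeric value of a cell such as '$1500'; 0.0 for other text."""
    cleaned = cell.replace("$", "").replace(",", "").strip()
    try:
        return float(cleaned)
    except ValueError:
        return 0.0

def eval_sum(formula, rows):
    """Evaluate only =SUM(<col><row>:<col><row>) where both ends share a column."""
    m = re.fullmatch(r"=SUM\(([A-Z])(\d+):([A-Z])(\d+)\)", formula)
    if m is None or m.group(1) != m.group(3):
        raise ValueError(f"unsupported formula: {formula}")
    col = ord(m.group(1)) - ord("A")          # 'B' -> column index 1
    first, last = int(m.group(2)), int(m.group(4))
    # Spreadsheet rows are 1-based, Python lists are 0-based
    return sum(cell_number(rows[r - 1][col]) for r in range(first, last + 1))

SHEET = """\
**Item**       | **Cost**
Rent           | $1500
Utilities      | $200
Groceries      | $360
Transportation | $450
Entertainment  | $120
**Total**      | =SUM(B2:B6)
"""

rows = parse_sheet(SHEET)
total = eval_sum(rows[-1][1], rows)
print(total)  # 2630.0
```

A real parser would also need rules for escaped pipes, other number formats, and the rest of the formula language, none of which are specified here.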

Why not just use GFM?

    Item           | Cost
    -------------- | -------------
    Rent           | $1500
    Utilities      | $200
    Groceries      | $360
    Transportation | $450
    Entertainment  | $120
    **Total**      | `=SUM(B2:B6)`

https://github.github.com/gfm/#tables-extension-

Getting rid of the vertical divider is nice but I'd rather think of it as a tiny modification to GFM than a distinct language.

Putting the formula in backquotes could make it look more like a formula, and backquotes could also be required for formulas to prevent invoking one accidentally.

It really is nice to have the horizontal bar gone, though. I think I might make my own format based on it. I tried to get rid of the bar but saw that you can't. In fact the only way you can have everything on one side of the bar is to only have a header (thead) when it would often be useful to only have a body (tbody).

FWIW original markdown requires pipes at the start and end of each row but not GFM.

I hadn't reviewed GFM's table extension previously, thanks for sharing.

At first sight, I think GFM's table extension and Sheet Markup have different goals. While the table extension is intended for displaying a single table of data, Sheet Markup is for defining an interactive spreadsheet, including things like formulas. Such a spreadsheet might not really be a single "table" as such; it might be multiple separate logical tables. Also, I suspect that in the future we will want to extend Sheet Markup with additional features that would be "even further" from what GFM's table extension supports.

But thanks, certainly food for thought!

I'm working on internal DSLs for Markdown. I think it's a pretty powerful language, and often it can be used in a different way rather than changed. For instance, a badge with a link could. A nice thing about a true internal DSL is that existing tools already support it, because it's still the same language.

External DSLs of course give you full flexibility, as you are no longer constrained by the language. https://javieracero.com/blog/internal-vs-external-dsl/

My past work and this thread gave me the idea for something that sits between an internal DSL of Markdown and an external DSL: allow tables without row dividers, but put them in fenced code blocks with a different language name, so existing Markdown tools display them as code instead of rendering them wrongly. And because that is the only difference, an empty header could be added to make it display as a table in existing GFM tools, since it's normally not desirable to have the whole thing as a header.

Here's an empty header that at least on https://loilo.github.io/gfm-preview/ shows up shorter than a normal line:

    []()|||
    -|
    Rent | $1500 | paid
    Utilities | $200 | unpaid

Though it isn't md, I think I will have md in the name of the extension, much like jsonl has l in the name even though a jsonl file with two or more lines of data isn't a single valid JSON document.

Edit: here's one that displays on GitHub:

    []()|[]()|[]()
    -|-|-
    Rent | $1500 | paid
    Utilities | $200 | unpaid
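
The empty-header trick can be automated. Below is a rough Python sketch that takes header-less pipe rows and prepends the `[]()` header and `-` delimiter row shown above. The function name `to_gfm_with_empty_header` is hypothetical, and the sketch assumes at least two columns and no escaped pipes inside cells.

```python
def to_gfm_with_empty_header(body):
    """Prepend an empty GFM header and delimiter row to header-less pipe rows."""
    rows = [[cell.strip() for cell in line.split("|")]
            for line in body.splitlines() if line.strip()]
    if not rows:
        return ""
    width = max(len(row) for row in rows)
    lines = [
        "|".join(["[]()"] * width),   # empty-link header cells, as in the example
        "|".join(["-"] * width),      # one delimiter cell per column
    ]
    for row in rows:
        padded = row + [""] * (width - len(row))  # pad short rows to full width
        lines.append(" | ".join(padded))
    return "\n".join(lines)

BODY = """\
Rent | $1500 | paid
Utilities | $200 | unpaid
"""

gfm = to_gfm_with_empty_header(BODY)
print(gfm)
```

For the two rows above this reproduces the four-line table from the edit. Whether a given renderer shows the empty header as a short row is something to check per tool.
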

I see. I'm a big fan of DSLs, but the internal vs. external distinction is not something I've seen articulated before.

For now, I'm treating Sheet Markup as an external DSL, which can be embedded in a Markdown document using a fenced code block. But there are certainly benefits (and costs) to developing an internal DSL for spreadsheets along the lines of what you're suggesting.

B2:B6 is a bit out of context. Perhaps also add SUM(*Cost*)?

Seeing it in a context without a UI to show you the row and column numbers felt a bit awkward to me too...

... and it made me wonder if some other syntax would work better, which made me think that maybe something like this would work?

    =sum(column | 2+ | above)
         targets whole column "stream"
                  second item and later
                       above this cell
         pipes manipulate the target "stream"

I'm waffling between rx-like and unix-like for terms though. Or something else. But a much more relative-and-whole-sheet-focused language seems like it could be a lot nicer than pinning cell IDs everywhere.
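
One possible reading of that pipe syntax, sketched in Python. Everything here is an assumption for illustration: the function `run_selector`, the rule that `N+` counts positions within the current stream, and the rule that `above` compares row indices against the cell holding the formula. None of it is part of Sheet Markup.

```python
def run_selector(selector, column, cell_row):
    """Apply pipe stages to one column.

    column   -- list of the column's cell values, top to bottom
    cell_row -- 0-based row index of the cell that holds the formula
    """
    stages = [stage.strip() for stage in selector.split("|")]
    if not stages or stages[0] != "column":
        raise ValueError("this sketch only supports selectors starting with 'column'")
    # Keep each value paired with its row index so later stages can filter by position
    stream = list(enumerate(column))
    for stage in stages[1:]:
        if stage == "above":
            # Keep only cells located above the formula cell
            stream = [(i, v) for i, v in stream if i < cell_row]
        elif stage.endswith("+") and stage[:-1].isdigit() and int(stage[:-1]) >= 1:
            # "2+" keeps the second item of the current stream and everything after it
            stream = stream[int(stage[:-1]) - 1:]
        else:
            raise ValueError(f"unknown stage: {stage!r}")
    return [v for _, v in stream]

costs = ["Cost", 1500, 200, 360, 450, 120, None]  # None marks the formula cell
picked = run_selector("column | 2+ | above", costs, cell_row=6)
print(sum(picked))  # 2630
```

Because each stage only narrows the stream, the same selector keeps working when rows are inserted above the total, which is the appeal over a pinned range like B2:B6.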

Note that you can represent much more complex spreadsheets using Sheet Markup, and while in the above example it might be clear what SUM(COST) would mean, in a more complex spreadsheet, with multiple different tables, it might be ambiguous.

As for why one would possibly ever want to use Sheet Markup for more complex spreadsheets, one use is as a way to interact with an LLM. We've started to see some interesting results using GPT-4 to analyze various kinds of spreadsheets that have been encoded in Sheet Markup.
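
As a rough illustration of what "encoded in Sheet Markup" could look like in practice, here is a small Python sketch that serializes a grid of cells into aligned, pipe-separated lines and wraps it in a text prompt. The function `to_sheet_markup` and the prompt wording are hypothetical; this is not a description of how the results above were obtained, and no model API is called.

```python
def to_sheet_markup(rows):
    """Serialize a list of rows into pipe-separated lines with aligned columns."""
    cells = [[str(cell) for cell in row] for row in rows]
    if not cells:
        return ""
    width = max(len(row) for row in cells)
    cells = [row + [""] * (width - len(row)) for row in cells]  # pad short rows
    col_widths = [max(len(row[i]) for row in cells) for i in range(width)]
    lines = []
    for row in cells:
        padded = [cell.ljust(w) for cell, w in zip(row, col_widths)]
        lines.append(" | ".join(padded).rstrip())
    return "\n".join(lines)

grid = [
    ["**Item**", "**Cost**"],
    ["Rent", "$1500"],
    ["Utilities", "$200"],
    ["**Total**", "=SUM(B2:B3)"],
]

markup = to_sheet_markup(grid)
# Plain-text prompt; sending it to a model is left out of this sketch
prompt = "Here is a spreadsheet in Sheet Markup:\n\n" + markup
print(prompt)
```

Keeping formulas as text, rather than pre-computing their values, lets the model see how cells relate to each other.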

Pandoc also supports[1] pretty featureful ways to declare tables, may be worth looking into.

[1] https://pandoc.org/MANUAL.html#tables

Pandoc tables remind me of the reStructuredText tables, which I used back in the day: https://docutils.sourceforge.io/docs/user/rst/quickref.html#...

Very powerful, but I found it challenging to remember the syntax since I was only using them intermittently. Still, it could indeed form the basis of a more advanced spreadsheet markup syntax, supporting things like merged cells (which Sheet Markup does not, and probably never will, support).

Thanks, that would be great.