by psacawa 1510 days ago
Since no one seems to know about it, jq is described in great detail on the github wiki page [0]. That flattens the learning curve a lot. It's not as arcane as it seems.

The touted claim that jq is fundamentally stateless is not true. jq is also stateful in the sense that it has variables. If you want, you can write regular procedural code this way. Some examples [1]

The real problem with jq is that it currently lacks a maintainer to assess the PRs that have accumulated since 2018.

[0] https://github.com/stedolan/jq/wiki/jq-Language-Description

[1] https://github.com/fadado/JBOL/blob/master/fadado.github.io/...

6 comments

> It's not as arcane as it seems.

The issue with jq is that I use it maybe once a month, or even less. The syntax is "arcane enough" that I keep forgetting how to use it because I use it so sporadically.

In comparison awk – which I also don't use that often – has a much easier syntax that I can mostly remember.

Not entirely convinced by the zq syntax either though; it also seems "arcane enough" that I would keep forgetting it.

Bingo.

There are at least a dozen tools and languages and syntaxes that I've used sporadically over the years - awk, sed, bash, Mongo, perl, etc. I don't use them often enough to remember exactly how they work, and so I always have to spend a few hours reviewing manuals or old code repos or an O'Reilly book.

But if I do end up using it for a few days in a row, it starts to make sense, and I improve each time I use it.

But not with jq.

It just does not make sense to my brain, no matter how many times I've had to use it. Every single time I need to use it, it requires finding some Stack Exchange or blog and just copying and pasting. Even after seeing the solution, rarely do I then really understand why or how it works. Nor can I often take that knowledge and apply it to similar problems.

About the only other syntax or language that gives me such problems is Elastic Search DSL.

Same for me... every time I have to look up the basics... and I love awk, perl and xpath/xslt.
I wonder if someone tried to use plain JS as a filtering language? It would be more verbose but it would be easy to remember. For example:

   [1,2,3] | js "out = 0; for (const n of this) out += n"
That would print "6". `out` would be a special variable you write to to print the result, and `this` would be the input.
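A minimal sketch of that idea, written in Python rather than JS to keep it short. `run_filter` is a made-up name, and `this`/`out` simply mirror the convention proposed above; nothing here is an existing tool.

```python
import json

def run_filter(code: str, json_text: str):
    """Run `code` with `this` bound to the parsed input; return whatever it assigns to `out`."""
    scope = {"this": json.loads(json_text), "out": None}
    # exec is fine for a personal one-liner tool, but unsafe on untrusted code.
    exec(code, scope)
    return scope["out"]

# Same computation as the example above: sum the input array.
print(run_filter("out = 0\nfor n in this:\n    out += n", "[1, 2, 3]"))
```

A real CLI would read the JSON from stdin and the snippet from the first argument; that wiring is left out here.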
Not quite that, but ramda-cli [1], which I created, solves this problem, at least for me, by offering the familiar set of functions from Ramda; you can build pipelines with those to operate on your data.

[1]: https://github.com/raine/ramda-cli

I've used trentm's json (formerly known as jsontool) package from npm as my default tool for command-line manipulation of JSON for many years now. It provides CLI arguments for passing JavaScript code for filtering and executing on input. I have resisted investing the time into becoming fluent in jq because I've found that many of the common use cases I have are readily handled by jsontool.

https://www.npmjs.com/package/json

Edit: added more information

A few of the tools listed here seem to work like that, or roughly similar: https://ilya-sher.org/2018/04/10/list-of-json-tools-for-comm...

I didn't check any of them out though.

My hope was to one day add JS eval support to https://github.com/SuperpowersCorp/refactorio but as you can tell by the timestamps I haven't found any time to work on it in the last 4 years.
That's a really interesting suggestion, similar to how AWK uses $0, $1 etc.
Interesting, for me it's the exact opposite.

I've tried a couple of times to get into awk, but still find the syntax arcane.

I don't know; I wouldn't presume to tell you what you do or don't find arcane, but once I understood the somewhat unusual flow of awk ("for every line, check if the line matches this condition, and if it does run this block of code") I found it's quite easy to work with. It's "arcane" in the sense that it has an implicit loop and that it's a specialized language for a very limited class of problems, but I found that for this limited class of problem it's surprisingly effective.

  > an implicit loop
As an occasional awk user, I'd love it if you could expand on this. Maybe it will help clear things up for me. You're not referring to the fact that awk operates on every line independently, are you?
My mental image of awk has always been something along these lines:

    for line in readfile()
        for block in script:
            if block.match(line)
                run_block(block)
            end
        endfor
    endfor
Where the "for line in readfile()" is the "implicit loop", and the blocks are the "condition { .. }" blocks.

The actual flow is a little more complex and has some exceptions (e.g. BEGIN/END), but this is the gist of it.
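For anyone who wants to poke at that mental model, here is a runnable Python sketch of it. It is only an approximation: `run_awk` is a hypothetical helper, patterns are limited to regexes (real awk patterns can be arbitrary expressions), and `begin`/`end` stand in for BEGIN/END blocks.

```python
import re

def run_awk(rules, lines, begin=None, end=None):
    """rules is a list of (pattern, action); a pattern of None matches every line."""
    if begin:
        begin()                              # BEGIN runs once, before any input
    for line in lines:                       # the implicit loop over input lines
        for pattern, action in rules:        # every rule is tried, in script order
            if pattern is None or re.search(pattern, line):
                action(line)
    if end:
        end()                                # END runs once, after all input

matched = []
run_awk([(r"error", matched.append)], ["ok", "disk error", "error: x"])
print(matched)  # ['disk error', 'error: x']
```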

Thanks. Yes, I agree that my mental image is pretty much the same but it's nice to see it expressed in Python modulo end keywords ))
To expand on the other reply, there are a couple more implicit loops. There's a loop over all of the command line arguments/files, then a loop for every line in each of those files, then there is kind of a loop over the whitespace delimited fields of each of those lines. The main thing that helped me understand AWK was that every block in a script is just a pattern/action pair. When I saw snippets like

  ... | awk '{print $2}' 
I thought there was all this confusing syntax, but something like

  awk '/pattern/ {print}'
was more clear to me. In the first case, the empty pattern matches every line of the input, and the action is simply to print the second field of each line. Patterns can vary in complexity from the empty pattern to long chains of logical operators and regular expressions, such as /pattern/ in the second example. The outer quotes are just to prevent the shell from eating your dollar signs or other special characters. In a standalone AWK script you can write it like

  /pattern/ {
    print
  }
which also makes it look more like another language.
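To make the nested loops concrete, here is a rough Python equivalent of `awk '{print $2}' file1 file2`. It is a sketch, not a faithful reimplementation: `awk_print_field` is a made-up name, each "file" is just a list of lines, and only the default whitespace field splitting is modelled.

```python
def awk_print_field(files, n):
    """Collect field n (1-based, like $n) from every line of every file."""
    out = []
    for lines in files:                  # loop 1: each input file, in order
        for line in lines:               # loop 2: each line of that file
            fields = line.split()        # loop 3, implicitly: whitespace-delimited fields
            # awk yields an empty string when the line has fewer than n fields
            out.append(fields[n - 1] if len(fields) >= n else "")
    return out

print(awk_print_field([["a b c", "d e"], ["f"]], 2))  # ['b', 'e', '']
```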

If you can get your hands on a copy of The AWK Programming Language, it's a pretty quick and pleasant read that helped everything make more sense to me. I do most of my data analysis for my research using AWK and really enjoy working with it.

  > The AWK Programming Language
I see it's public domain and discussed here on HN: https://news.ycombinator.com/item?id=13451454

I'll go over it, thank you very much for the suggestion.

Same issue. However, I do successfully rely on ctrl-r a lot to search previously invoked commands, and I have a few core aliases that I've cobbled together....
Here because.... I didn't know of ctrl-R. What a life changer (although I had an alias for "hg" to "history | grep" :) )
Please check FZF [1] and its integration with ctrl-r. It’s a huge productivity boost and I cannot live without it.

[1] https://github.com/junegunn/fzf

I think I actually HAVE fzf installed, thus my first experience w/ Ctrl-R was even better! I really haven't fully grok'ed fzf yet though, probably should look up some guides.
There's also McFly[1] that does interactive history search. [1]: https://github.com/cantino/mcfly
This sounds quite awesome; it's a pity that the implementation is quite 'alien' to the Unix way of doing things. IMO shell command history should be a database by default, not just a single file that gets auto-appended with both awesome commands and crap you don't really want stored for later.

Thanks for linking this project, will try it, it may be a game changer.
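A rough sketch of what "history as a database" could look like, using SQLite from Python. The schema and the function names (`open_history`, `record`, `search`) are assumptions made up for illustration; this does not describe how any existing shell or tool stores its history.

```python
import sqlite3
import time

def open_history(path=":memory:"):
    """Open (or create) a history database; the table layout is hypothetical."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS history "
        "(ts REAL, cwd TEXT, exit_code INTEGER, command TEXT)"
    )
    return db

def record(db, command, cwd, exit_code):
    """Store one command along with where it ran and whether it succeeded."""
    db.execute(
        "INSERT INTO history VALUES (?, ?, ?, ?)",
        (time.time(), cwd, exit_code, command),
    )
    db.commit()

def search(db, needle):
    """Newest-first substring search that skips commands that failed."""
    rows = db.execute(
        "SELECT command FROM history "
        "WHERE exit_code = 0 AND command LIKE ? ORDER BY rowid DESC",
        (f"%{needle}%",),
    )
    return [row[0] for row in rows]

db = open_history()
record(db, "jq . file.json", "/tmp", 0)
record(db, "jq .foo[", "/tmp", 2)   # a typo that failed; search() filters it out
print(search(db, "jq"))  # ['jq . file.json']
```

Keeping the exit code is what lets the search leave out the "crap you don't want stored": the failed command is still recorded, it just never shows up in results.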

I use both awk and jq infrequently enough that I tend to struggle with anything non-trivial. I think zq would be the same.

> Not entirely convinced by the zq syntax either though; it also seems "arcane enough" that I would keep forgetting it.

I think this is the main thing. I’d prefer a streamlined CLI tool where you passed in some JS code and it’d just run it on the input (with the same slurp/raw args as jq). Could just be npm with underscore.js autoimported.

This is ironic - I use `awk` so infrequently, I have no idea how to use it without reading its man page or using Google. But I use `jq` often and find it simple.
Sadly, very few authors seem to acknowledge or even know that GitHub wiki pages are not indexed by search engines. If it weren't for third-party sites like github-wiki-see.page (which could stop working at any time), their contents would be undiscoverable by the very people they are usually intended for...
What? That's crazy! Does Github block indexing?
There are more details at https://github-wiki-see.page/ and https://github.com/github/feedback/discussions/4992#discussi...

> we have also introduced an x-robots-tag: none in the http response header of Wiki pages

> Abusive behavior in Wikis had a negative impact on our search engine ranking

> GitHub is currently permitting a select criteria of GitHub Wikis to be indexed

https://github.com/robots.txt

I don't see anything here about wiki specifically but maybe one of the rules hits wiki pages?

They've moved from robots.txt to blocking by headers.
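That is also why robots.txt shows nothing: the directive travels in the HTTP response instead. A small Python sketch of how a crawler-side check might read it. `blocks_indexing` is a made-up helper, and it ignores the user-agent-prefixed form of the header (e.g. `googlebot: noindex`). The value `none` is shorthand for `noindex, nofollow`.

```python
def blocks_indexing(headers: dict) -> bool:
    """True if an X-Robots-Tag response header tells crawlers not to index the page."""
    for name, value in headers.items():
        if name.lower() != "x-robots-tag":      # header names are case-insensitive
            continue
        directives = {d.strip().lower() for d in value.split(",")}
        if directives & {"none", "noindex"}:
            return True
    return False

print(blocks_indexing({"X-Robots-Tag": "none"}))       # True
print(blocks_indexing({"Content-Type": "text/html"}))  # False
```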
Here's a podcast interview with the creator of jq about what he's been working on at Jane Street: https://signalsandthreads.com/memory-management/
I didn't realize jq was missing a maintainer, it's one of my most used CLI tools.
It really is a fundamental problem: lots of these important projects aren't maintained simply because the maintainers can't beat the economics of a lot of rich freeloaders having no real short-term incentive to compensate them.
> can't beat the economics

This makes it sound like this is some antagonistic relationship where the OSS maintainer loses. But the idealistic scenario that you are alluding to[1] is about a developer who develops free OSS in their free time. And then, yes, very few end up paying or donating anything. But how is a predictable chain of events a loss? What is the “economics” of it?

[1] Some OSS developers do it as their day job.

This is unrelated to the argument but using references that aren't references made that really confusing to read.

In any case, what I meant by the "economics" of it is that in general a person can only afford to work for free for so long before they need to pay bills, eat, have and/or acquire a standard of living that isn't poverty. If they have a day job where they are writing this software in their free time, how long can they do this before burning out?

You say that this is unrelated yet your follow-up reinforces my initial impression.

How does one afford to work for free? One has a day job. How does someone who volunteers for search-and-rescue afford it? That’s obviously a ridiculous question—they are volunteers so they necessarily must do something from nine to five. Or be independently wealthy.

But how does one avoid burnout as a double-worked programmer? I think we have ourselves to blame on that point since we have put the double-worked programmer on a pedestal. So we can either:

1. Not work on things both professionally and in our free time; or

2. Force ourselves to do just that because we gain something extrinsic from it that we might need, like simply keeping up with the Joneses (having an answer for “where’s your private GitHub” in interviews…)

When I said "this is unrelated" I was commenting SOLELY on your writing style.

> How does one afford to work for free?

This is exactly my point. The work isn't done for free, the person is spending their own money and time which takes away from a limited pool of resources they own. If they're insanely rich, they could probably "afford" to do this work until they die.

But you're making a mistake in your reasoning relating "volunteers", "free work", and "day jobs." Here is what I think you are missing in this assessment: A worker for a company/day job works with an obligation via contract for compensation for their time from their employer. A volunteer works without a contractual obligation of compensation for their time from the community that benefits from their work. In this latter case while there is no contractual obligation for a society/community to compensate the volunteer, it does not forbid it. Does someone who works as a volunteer search-and-rescue deserve to be compensated? I'd say yes, in fact, they do. They are providing a service.

Now I'll get ahead of the next possible argument. "But there's not enough work or compensation for them to make a living!" This is two parts:

1. Not enough work - This is only true because of the example chosen and our human tendency to draw broad analogies. There can definitely be enough work in multiple domains (and especially in software), but also, what about volunteer firefighters?

2. Not enough compensation - This is because people with the means to compensate the work simply are not doing that. And it's not a good-faith argument to tell me that in the original case enough people with enough money aren't using the project to compensate its continued development and maintenance.

To sum up all of the above: Yes, work like this is volunteer work, and it says a lot that societies and communities do not compensate this work. Just because they don't compensate that work doesn't mean it can't be compensated. And there are key differences in this relationship between work and compensation that make it different from the colloquial "work" of a day job, where an entity reserves your time under contract.

Now for burnout as a double-worked programmer. I think you're right on these two points. Obviously the second situation is not ideal. If someone wants to do it, let them. There are plenty of open source projects still maintained by stretched-thin developers. Is this a tenable solution long term? No, in the vast majority of cases, and that's my point!

Pity he quit before Github opened up sponsorships.
How would GitHub sponsorship help pay for the time of someone at a finance company earning who knows what, somewhere between 200k and 500k?
My bad. I guess earning ~$100k is insufficient.

https://changelog.com/posts/i-just-hit-100000-per-year-on-gi...

Cool, I think I read that also! But, yeah it is on the low-end for a skilled engineer working for US companies, and also I'm thinking it's on the high-end / an outlier for github sponsors.
It’s not though, because in this case the (ex-)maintainer works at a Wall St firm.
This is exactly my point. Will he quit the paying job to work for free? How long could he maintain this for free with no other job, or with a job and additional free hours without running out of money or burning out?
But everyone can't realistically live a wealthy life off FLOSS tools either. The people who write these things are usually very talented and will make killer pay anywhere they work. Usually a cool thing is when companies sponsor them to work X% for them and Y% on the FLOSS tool.
In this case it doesn't seem too critical? It means jq remains stable, which is probably what should happen once a tool like this gets a lot of users.
I found it hard to approach at first, but I think it was just the lack of material that worked through simple examples step by step.

I ended up writing my own guide to it, which in my unbiased opinion makes it easier to get to the point where in-depth examples and language descriptions are easier to understand.

Edit: Oh, wow, it's even mentioned in this article. Maybe I should read before commenting.

https://earthly.dev/blog/jq-select/

I discovered jq after I wrote my own (extremely limited) version of it. I need it quite often, and yet I've never managed to get up the activation energy to learn enough for it to be useful. I need to have some notion of the computation model before anything is going to make sense to me. I hate learning things in completely disparate pieces that I need to memorize in hopes that someday it will just click together and I'll derive the underlying principles.

Your guide was great for this. It stepped me through enough of the bare basics in a way that the underlying model was obvious. It didn't get me nearly far enough for many of the tasks that I need jq for, but it got me started and that's all I really needed. Everything additional that I need to learn becomes obvious in retrospect—"of course there's an operator for this, there kind of has to be!".

Thank you!

From that page:

> The jq documentation is written in a style that hides a lot of important detail because the hope is that the language feels intuitive.

Yeah, not so much boys! Also, that disclaimer should really be at the top of the manual, with a link to the wiki, rather than vice-versa, as it is now.

The wiki is like secret information -- "oh, hey, here's the page that actually tells you how it works!"