Hacker News new | ask | show | jobs
by randusername 82 days ago
I'm not GP, I use jq all the time, but I each time I use it I feel like I'm still a beginner because I don't get where I want to go on the first several attempts. Great tool, but IMO it is more intuitive to JSON people that want a CLI tool than CLI people that want a JSON tool. In other words, I have my own preconceptions about how piping should work on the whole thing, not iterating, and it always trips me up.

Here's an example of my white whale, converting JSON arrays to TSV.

cat input.json | jq -S '(first|keys | map({key: ., value: .}) | from_entries), (.[])' | jq -r '[.[]] | @tsv' > out.tsv

4 comments

    <input.json  jq -S  -r '(first | keys) , (.[]| [.[]]) | @tsv'
    <input.json  # redir
    jq
    -S           # sort
    -r           # raw string out
    '
    (first | keys) # header
    ,              # comma is generator
    (.[] |           # loop input array and bind to .
    [                # construct array
     .[]             # with items being the array of values of the bound object
     ])           
     | @tsv'        # generator binds the above array to . and renders to tsv
oh my god how could I have been doing this for so long and not realize that you can redirect before your binary.

I knew cat was an anti-pattern, but I always thought it was so unreadable to redirect at the end

it seems smart until you accidently type >input.json and nuke the file
That sounds like a mistake which would be easily to make at the end of the line, unless you are contrasting input stream redirect against cat regardless where it's written on the line?
You can use sponge for that.
Here's an easier to understand query for what you're trying to do (at least it's easier to understand for me):

    cat input.json | jq -r '(first | keys) as $cols | $cols, (.[] | [.[$cols[]]]) | @tsv'
That whole map and from entries throws it off. It's not a good use for what you're doing. tsv expects a bunch of arrays, whereas you're getting a bunch of objects (with the header also being one) and then converting them to arrays. That is an unnecessary step and makes it a little harder to understand.
Thanks for sharing, this is much better, though I actually think it is the perfect example to explain something that is brain-slippery about jq

look at $cols | $cols

my brain says hmm that's a typo, clearly they meant ; instead of | because nothing is getting piped, we just have two separate statements. Surely the assignment "exhausts the pipeline" and we're only passing null downstream

the pipelining has some implicit contextual stuff going on that I have to arrive at by trial and error each time since it doesn't fit in my worldview while I'm doing other shell stuff

I totally agree, it did take me a while to come to terms with the syntax of assigning variables specifically due to that pipe at the end. I guess sometimes we just have to know the quirks of the relevant tooling we use. I used to use PHP heavily in the 4 and 5 days, and kinda got used to all the quirks it had. So during reviews, I would pick up a lot of issues some of my colleagues did not.

Interestingly some things do use a semicolon in jq, specifically while, until, reduce and some others I can't remember right now.

Honestly both of those make me do the confused-dog-head-tilt thing. I'd go for something sexp based, perhaps with infix composition, map, and flatmap operators as sugar.
I find it much harder to remember / use each time then awk
Trying to make a generic pipeline for json arrays because you don’t know the field names?
Why the f would they want to hardcode the field names?
Because usually I am dealing with data I know not anonymous data
Doesn't have to be anonymous to be variable
You know what I meant