by spinningslate 1320 days ago
Didn't know about this, the HN dividend pays out again!

When wrestling with sed/awk to parse the output of a shell command, I've often thought that a shell-standard, structured output would be very handy. PowerShell[0] has this, but it's a binary format, so not human-readable. I want something in the middle: both human- and machine-readable, without having to do parsing gymnastics.

jc isn't quite that shell standard, but looks like it goes a long way towards it. And, of course, when JSON falls out of fashion and is replaced by <whatever>, `*c` can emerge to fill the gap. Nice.

[0]: https://learn.microsoft.com/en-us/powershell/


> Powershell has this, but it's a binary format

Well, yes, PowerShell passes binary objects, but you can always:

1) access their properties
2) pass them downstream
3) serialize them to JSON/CSV
4) instantiate them from JSON/CSV

I think this is both human- and machine-readable enough (even though the internal format is binary, when working with PowerShell you are never really exposed to it).

How do you think it can be improved?

In my opinion, object I/O IS the best part of PowerShell: it lets us ditch wrangling results with sed/awk/grep entirely. I'm super interested to hear if there's an even better way forward.

Shell would benefit from Content-Type/Accept headers. Like you can specify that cat accepts text and jq accepts JSON. Then `ip a` would output the corresponding type automatically.

> Shell would benefit from Content-Type/Accept headers. Like you can specify that cat accepts text and jq accepts JSON. Then `ip a` would output the corresponding type automatically.

That seems unnecessary. Traditionally, shells have always used text streams. JSON is just text that follows a given convention. Couldn't what you are describing be implemented by setting environment variables or using command line flags?

For example:

    PREFERRED_OUTPUT_FORMAT="JSON"

    --output-format="JSON"

    --input-format="JSON"

Tools that can generate and consume structured text formats are a good idea, but they should be flexible enough that they can even work with other tools that have not been written yet.
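
A minimal sketch of the environment-variable-plus-flag approach, written as a POSIX shell function. The function name `resolve_output_format` and the precedence order (flag over environment variable over a plain-text default) are assumptions for illustration; `PREFERRED_OUTPUT_FORMAT` and `--output-format` are the hypothetical names used above, not a convention that existing tools are known to follow.

```shell
#!/bin/sh
# Hypothetical helper: decide which output format a CLI tool should use.
# Assumed precedence: --output-format=... flag, then the
# PREFERRED_OUTPUT_FORMAT environment variable, then plain text.
resolve_output_format() {
    # Start from the environment variable, falling back to "text"
    # when it is unset or empty.
    format="${PREFERRED_OUTPUT_FORMAT:-text}"

    # A flag on the command line overrides the environment.
    for arg in "$@"; do
        case "$arg" in
            --output-format=*) format="${arg#--output-format=}" ;;
        esac
    done

    # Normalize case so "JSON" and "json" are treated alike.
    printf '%s\n' "$format" | tr '[:upper:]' '[:lower:]'
}
```

A tool would call `resolve_output_format "$@"` once at startup and branch on the result, so its default stays human-readable while scripts can opt in to structured output.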

"This is the Unix philosophy: Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface." --Doug McIlroy

I don't follow. JSON is not really human-readable. I never want to see JSON output except when debugging scripts. I want to see well-formatted output. But at the same time I want to be able to write something like:

    ip a | filter "[].address like 192.*"

So when I'm typing `ip a` I expect to get output for humans, and when I'm piping it to a `filter` program, I expect those programs to exchange JSON (and ideally `filter` should use some tabular formatting as its human-readable output).

You are suggesting that I should write `PREFERRED_OUTPUT_FORMAT=JSON ip a | filter "[].address like 192.*"`, but that's really verbose and error-prone. It might work for scripts, but for ad-hoc shell use I don't like this approach. Ideally, programs should be able to negotiate their preferred formats across pipes.
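
One common approximation of this, short of real negotiation between the two ends of a pipe, is for a tool to check whether its stdout is a terminal. Below is a minimal POSIX shell sketch under that assumption. The function `emit_address` and its field names are hypothetical, and the check only tells the tool that it is being piped, not what the downstream program accepts (so piping into a pager would get JSON too).

```shell
#!/bin/sh
# Hypothetical output routine for a tool that lists addresses.
# Usage: emit_address IFNAME ADDRESS
emit_address() {
    if [ -t 1 ]; then
        # stdout is a terminal: print an aligned, human-readable row.
        printf '%-10s %s\n' "$1" "$2"
    else
        # stdout is a pipe or a file: print one JSON object per line.
        # No JSON escaping is done here; this assumes simple values.
        printf '{"ifname":"%s","address":"%s"}\n' "$1" "$2"
    fi
}
```

With this in place, `emit_address eth0 192.168.1.10` typed at a prompt prints a plain row, while the same call inside a pipeline emits JSON for the next program to consume.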

I was saying that Accept Headers or "format negotiation" concepts that are typically used in client-server communications are a bit overkill for command line tools and shell pipelines.

I agree that human-readable text formats should be the default output formats for command line tools, but that easy-to-parse structured text output formats should be easy to specify with either environment variables or command line flags.

If I am writing a script using tools that support a given structured output format, and those tools use environment variables or command line flags for output configuration, it could work something like this:

    #!/usr/bin/env script-interpreter
    export PREFERRED_OUTPUT_FORMAT="JSON"
    query-cli-tool | filter-cli-tool --output-format=json | combinator-cli-tool --input-format=json | pretty-formatter-tool > output_file

This would mean that the command line tools default to human-readable formats, but can still generate JSON or some other structured text format when configured to do so.

FWIW ip already has JSON support:

    ip -j a s | jq 'map(select((.addr_info | .[].local)|startswith("192.168."))) | map(.ifname)'

“ip” has the -json option; i.e. “ip -json a” gives you straight JSON; no need for jc.

Someone on this site suggested that programs open another file handle alongside stdout and stderr (stdjson) for their JSON output, which struck me as a way to make this work in a backwards-compatible fashion.

Even just a generic stdmeta would go a long way to defining _what_ is being output. Curl is the worst about this.

https://unix.stackexchange.com/questions/197809/propose-addi...
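
The extra-descriptor idea can be tried in today's shells without any new standard. A minimal sketch, assuming file descriptor 3 plays the role of the proposed stdjson (the number, the `report` function, and its sample data are all hypothetical): the tool always writes human-readable text to stdout and writes JSON to descriptor 3 only when the caller has opened it.

```shell
#!/bin/sh
# Hypothetical tool: text on stdout (fd 1), JSON on "stdjson" (fd 3, assumed).
report() {
    # Human-readable output always goes to stdout.
    printf '%s %s\n' eth0 192.168.1.10

    # Probe fd 3 inside a subshell so that a closed descriptor
    # cannot abort the calling shell; errors are discarded.
    if ( : >&3 ) 2>/dev/null; then
        printf '{"ifname":"eth0","address":"192.168.1.10"}\n' >&3
    fi
}
```

Calling `report 3>addresses.json` would show the text on the terminal and save the JSON to the file. A plain `report` behaves like a traditional tool, provided descriptor 3 is not already open in the calling shell.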

Nushell has this too. I've tried it as a daily driver for a while. It's not there yet, but almost. After it hits 1.0 I'm going to switch for good and leave the duct tape solutions behind.

Shameless plug, but maybe give murex a look. It's been stable for a while now and does the same thing too.

https://github.com/lmorg/murex