by bigtrakzapzap 2494 days ago
No, that's your opinion, not mine. It's a flaw because it forces escaping magic and in-band delimiters (spaces), and every program has to know how to parse the output of every other program, wasting human time and processing time. Structured communication, say an open standard for a schema-less, self-describing, flexible de/serialization format like protobufs/msgpack, would be far superior: more usable, more efficient, and simple but not too simple, able to process streams of data with structure and programmability already there.

Being able to dump structured information out of an exception directly into a log, and then from a log into a database, without any loss of information or extraneous log parsing, is a clear win. The same goes for getting the output of a command as simple as listing files ("ls") into a database or into any other tool or program. Outputting only line-oriented strings just throws away type information and creates more work for everyone else, even more so than continuing to stay with line-processing tools.
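To make the "ls into a database" idea concrete, here is a minimal Python sketch. It is only an illustration under assumptions: the names list_files and load_into_db and the table layout are made up for this example, and no existing tool works this way. The point is that typed records go straight into a table with no text parsing in between.

```python
import os
import sqlite3

def list_files(directory):
    """Yield one typed record (a dict) per directory entry instead of a text line."""
    for entry in sorted(os.scandir(directory), key=lambda e: e.name):
        info = entry.stat(follow_symlinks=False)
        yield {
            "filename": entry.name,        # str, spaces and all
            "size": info.st_size,          # int, not a string to re-parse
            "mode": info.st_mode & 0o777,  # numeric permission bits
        }

def load_into_db(records, conn):
    """Insert the records into a hypothetical 'files' table, keeping their types."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files (filename TEXT, size INTEGER, mode INTEGER)"
    )
    conn.executemany("INSERT INTO files VALUES (:filename, :size, :mode)", records)
    conn.commit()

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load_into_db(list_files("."), conn)
    for row in conn.execute("SELECT filename, size FROM files ORDER BY size DESC"):
        print(row)
```

A filename containing spaces or newlines needs no escaping here, because the field boundaries are never encoded in-band.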

4 comments

Sounds like the CLI tools, GNU and otherwise, could benefit from some kind of "--format" switch to define the output in detail. I mean something like ls --format "{ \"filename\": \"%{filename}\", \"size\": %{size} }" for a JSON-style output (or whatever the user wanted).

Or even something like --format-rfc4627 filename,size,permissions-numeric to get only those fields out in a valid JSON format.

This wouldn't remove the "every program has to know how to parse the output of every other program" problem, but I am not convinced that removing it is needed. For instance, how would e.g. grep know which field you want to process? And does e.g. "size" carry the same meaning and semantics in every program that exists or ever could? Ditto for "permissions". And what about "blacklist"?

As a completely fictitious toy example:

  (ls --format-rfc4627 filename,size,permissions-numeric | json-shunt -p "filename" --cmd grep "MegaCorp") | json-print -p filename,size
The fictitious "json-shunt" (for lack of a better name) would pass only the named field to its --cmd as input, in this case to the grep command. The rest of the pipeline would then run only on the records for which grep matched, with all their other fields intact. So it would print the filenames and sizes of files whose names contain the case-sensitive string "MegaCorp", and output them as JSON.
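A rough Python sketch of what such a "json-shunt" and "json-print" pair might do. Everything here is hypothetical: the function names shunt and project are invented, and Python's re module stands in for spawning the real grep, so this only illustrates the filtering idea rather than any actual tool.

```python
import json
import re

def shunt(records, field, pattern):
    """Test only `field` against `pattern`, but pass the whole record through."""
    regex = re.compile(pattern)  # case-sensitive, like plain grep
    for record in records:
        if regex.search(str(record.get(field, ""))):
            yield record

def project(records, fields):
    """Keep only the named fields of each record (the 'json-print -p' step)."""
    for record in records:
        yield {name: record[name] for name in fields if name in record}

# Made-up sample input standing in for the structured "ls" output.
listing = [
    {"filename": "MegaCorp-invoice.pdf", "size": 48213, "permissions": 644},
    {"filename": "notes.txt", "size": 120, "permissions": 600},
    {"filename": "megacorp-draft.txt", "size": 77, "permissions": 644},
]

for record in project(shunt(listing, "filename", "MegaCorp"), ["filename", "size"]):
    print(json.dumps(record))
```

Only the first sample record matches, since the lowercase "megacorp-draft.txt" fails the case-sensitive test.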

Yes, I know there are more concise and elegant ways to express the same thing of printing out file sizes and filenames of matching filenames... Also, when scripting pipelines the verbosity doesn't matter, IMO it'd actually be a benefit to be able to read the script.

Edit: fix pre-caffeine wonky redirection and brain fart

So, if you're not a fan of the UNIX philosophy, maybe check out Powershell. Or take a look at WMI and DCOM in Windows. Eschew shell scripts in favor of strongly-typed programs that process strongly-typed binary formats, or XML, or whatever. The alternatives are out there.

"Worse is better" isn't for everyone.

Nushell does not seem like a violation of the unix philosophy, or at least the version of it that I like best.

"Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface."

Perhaps I'm wrong, isn't nushell simply adding one more dimension?

Instead of a one-dimensional array of lines, the standard is now... a two-dimensional array of tabular data. Perhaps it is not strictly a "text stream" but this does not seem to me to be a violation of the spirit of the message.

Simple line-based text streams are clearly inadequate IMO. 99.99% of all Unix programs output multiple data fields on one line, and to do anything useful at all with them in a programmatic fashion you wind up needing to hack together some way of parsing out those multiple bits of data on every line.
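A small Python illustration of the kind of hack meant here. The sample line is made up to resemble "ls -l" output; the exact column layout is an assumption and varies between systems.

```python
# One made-up line in the style of "ls -l"; the filename contains a space.
line = "-rw-r--r-- 1 alice staff 1024 Jan  5 10:00 my report.txt"

# Naive approach: split on whitespace and pick the 9th column.
naive = line.split()
print(naive[8])    # "my" -- the filename has been cut in half

# The usual workaround: cap the number of splits so the last field keeps its spaces.
careful = line.split(None, 8)
print(careful[8])  # "my report.txt"

# The size still arrives as text and has to be converted by hand.
size = int(careful[4])
```

The workaround only holds as long as the filename is the last column and the column count never changes, which is exactly the per-program knowledge every consumer has to carry around.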

    maybe check out Powershell
I left Windows right around the time PS became popular, so I never really worked with it.

It seems like overkill for most/all things I'd ever want to do. Powershell objects seem very very powerful, but they seemed like too much.

Nushell seems like a nice compromise. Avoids the overkill functionality of Powershell.

Semantically, a one-dimensional list/array of “things” can be trivially transposed into a series of lines with a linebreak separator, but I don’t think the same holds true for a “list of lists” (2-dimensional data) or a “list of lists of lists” (3D data) etc. At least without a standard begin-end list delimiter that allows nesting.
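A quick Python sketch of that point, using made-up data. A flat list survives a round trip through newline-separated lines, a nested list does not, and some begin/end delimiter is needed to get the nesting back. JSON brackets are used below purely as one example of such a delimiter.

```python
import json

flat = ["alpha", "beta", "gamma"]
nested = [["alpha", "beta"], ["gamma"]]

# 1-D: join with newlines, split again, nothing is lost.
assert "\n".join(flat).splitlines() == flat

# 2-D: flattening to lines throws away where each inner list began and ended.
as_lines = "\n".join(item for row in nested for item in row)
assert as_lines.splitlines() == flat  # same lines as the flat list
assert as_lines.splitlines() != nested

# With explicit begin/end delimiters (here: one JSON array per line) it round-trips.
delimited = "\n".join(json.dumps(row) for row in nested)
restored = [json.loads(row) for row in delimited.splitlines()]
assert restored == nested
```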

Just thinking about a way that perhaps an old tool can be “wrapped” to be tricked into working with 2+-dimensional data by somehow divvying up the 2+ dimension input data into concurrent 1-dimensional streams, but this seems to require a way to represent more than 1 dimension of data without breaking existing utilities (unless there was, like, a wrapper/unwrapper layer that handled this...)

It’s worth noting that Powershell is available on Linux. Objects are pretty cool. https://docs.microsoft.com/en-us/powershell/scripting/instal...
It works pretty well on the Mac now, too.
"Worse is better" is not some absolute truth or philosophy and *NIX has won: Android, MacOS, WSL.

There's no real alternative for many professionals, if they want to be employable.

Maybe we should accept this fact and try to make everyone's life easier rather than being stuck up and pushing people away.

There's no law that says that Unix tools can't be extended.

And heck, basic POSIX tools violate so many Unix principles...

When perl came along it killed the 'Unix one tool philosophy' dead as a doornail. And since then people have just kinda ignored the smell coming off the rotting corpse.

I don't write complex scripts in shell anymore because it's insanity. But ad-hoc loops and crap like that... hell yeah. At least a few a day. Sometimes dozens.

People need to be reminded, I think, that shell isn't a programming language first. It's a user interface. And when I look at Powershell scripts and other things of that nature and think about living in Powershell day in and day out I don't see the big pay-off over something like fish.

'Is this going to make it so I can get my work done faster?'

'Is this going to be more pleasant to use as a primary interface for an OS?'

When I go into fish or zsh and use curl to grab json and 'jq' to get an idea of how to use the API in python or node...

versus running 'curl' in powershell against some random API I have never used before..

I get the distinct impression that 'This is fucking garbage', in that it would take me a lot longer to figure out how to use powershell in a useful way than the time I would save by doing so in the long run.

The irony is that the very attempt to be one tool for everything caused Perl's own destruction. Perl 5 is still used by some veterans for small scripts but who wants to use Perl 6?

Unix follows the KISS principle, and that is key for success. Albert Einstein said: "Keep things as simple as possible but not too simple". In that sense Unix and Posix are well done. However, that doesn't mean that good ideas like Nushell are not welcome.

I think the failure of Perl 6 was caused by a lack of community building and implementation leadership, not by trying to be too many things at once.
Yeah, I tried using Powershell as my shell and that's when I found out Powershell is more about the scripting language than an optimized shell used for everyday use. I was confronted with this almost immediately because one of the things I rely on most in Bash is 'dirs -v', pushd and popd. I have dirs aliased to dirs -v and I usually have 15-20 paths on my stack. I'll leave implementing the same functionality in Powershell as a user exercise.
I'm confused... Why don't Push-Location (aliased to pushd by default), Pop-Location (aliased to popd by default), and Get-Location -Stack (trivially aliasable to dirs) work? You can even have multiple location stacks and, if you care about the registry, you can use the same commands to handle registry navigation as well.
‘dirs -v’ shows an enumerated list of directories. If I follow that up with pushd +11 I will be taken to the 11th path listed in the prior dirs command. As far as I know this isn’t implemented out of the box in PS
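For readers who have not used it, here is a tiny Python model of the bash behaviour being described. It only models the semantics as documented for bash (pushd +N rotates the stack so that entry N of the listing, counting from zero, ends up on top); the function names are invented, the listing layout is approximate, and this is not a PowerShell implementation.

```python
def dirs_v(stack):
    """Return the stack as a numbered listing, roughly like `dirs -v` (top entry is 0)."""
    return [f"{index:2d}  {path}" for index, path in enumerate(stack)]

def pushd_plus(stack, n):
    """Model `pushd +N`: rotate the stack so that entry N becomes the new top."""
    if not 0 <= n < len(stack):
        raise IndexError(f"directory stack index out of range: +{n}")
    return stack[n:] + stack[:n]

stack = ["~", "/etc", "/var/log", "/tmp"]  # made-up stack, index 0 is the current directory
print("\n".join(dirs_v(stack)))

stack = pushd_plus(stack, 2)               # jump to the entry numbered 2
print(stack[0])                            # /var/log
```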
What about JSON vs protobufs? Is there a schemaless system that you use?
If GNU decided tomorrow that all utilities need to have a --json output argument then that would make me a very happy person.
Better than nothing, but the problem with that is that you can't construct a deep, lazily evaluated pipeline. JSON can't be outputted until all the data is collected.
There's a streaming format which is used for this: JSON Lines [1] AKA newline-delimited JSON [2].

[1] http://jsonlines.org/

[2] http://ndjson.org/
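A minimal Python sketch of why one JSON object per line allows a lazy pipeline. The producer, the field names and the filter are all made up for illustration. Each stage is a generator, so a record flows through as soon as its line is complete, and the consumer can stop early even though the producer never finishes.

```python
import itertools
import json

def produce():
    """Hypothetical endless producer: emits one JSON object per line, forever."""
    for n in itertools.count():
        yield json.dumps({"seq": n, "size": n * 10}) + "\n"

def parse(lines):
    """Decode each non-empty line on its own; no need to see the whole stream."""
    for line in lines:
        if line.strip():
            yield json.loads(line)

def at_least(records, min_size):
    """Lazily keep only the records whose size is large enough."""
    return (record for record in records if record["size"] >= min_size)

# Take the first three matches from an infinite stream and stop.
first_three = list(itertools.islice(at_least(parse(produce()), 20), 3))
print(first_three)  # the records with seq 2, 3 and 4
```

A single top-level JSON array could not be handled this way with a plain json.loads call, since the closing bracket only arrives at the end of the data.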

That still limits us too much due to the rather severely limited available data types (although arbitrary types could be serialized into binary/string data, I guess...)
Funny that you should mention this. I just hit that problem yesterday. The lack of a binary type is a problem. This is the same thing that hit XML. Unfortunately (or fortunately), the best solution is still the MIME stuff.
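A short Python sketch of the usual workaround, which borrows Base64 (one of the transfer encodings MIME uses) to carry bytes inside a JSON string. The field name data_b64 and the helper names are assumptions made up for this example; JSON itself has no convention for this, so both ends have to agree on it out of band.

```python
import base64
import json

def encode_record(name, payload):
    """Serialize a record whose payload is raw bytes, Base64-encoded into a string."""
    return json.dumps({
        "name": name,
        "data_b64": base64.b64encode(payload).decode("ascii"),
    })

def decode_record(line):
    """Reverse of encode_record; the reader must already know data_b64 is Base64."""
    record = json.loads(line)
    return record["name"], base64.b64decode(record["data_b64"])

raw = b"\x00\xff\n binary payload"

# JSON has no bytes type, so serializing the raw payload directly fails.
try:
    json.dumps({"data": raw})
except TypeError as error:
    print("not serializable:", error)

line = encode_record("blob", raw)
assert decode_record(line) == ("blob", raw)
```

The type information is still lost on the wire: a consumer sees an ordinary string and cannot tell it holds binary data unless it knows the field convention.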
AFAIK protobufs are not schema-less.
Yeah, those sentences are too close together.

I'm trying to elicit more of a response from the comment author. I think they make some good points. I would like to learn more about their ideal system.

I was just looking at MessagePack yesterday (researching best ways to encode an event stream) and was very impressed.