Hacker News
by jstimpfle 3094 days ago
"cat" could never work of course, or the general idea of a terminal that has only a pair of input/output streams must be given up. I don't think people are willing to do that.

A sensible protocol to draw raster graphics would be nice, though. (One might already exist; I don't know.) I don't care whether I need to type "cat" or the name of a terminal-based image viewer.

3 comments

> "cat" could never work of course

We have an idea in that direction, where each program in a pipeline can optionally specify a file type, and the last file type hint in the pipeline determines how stdout is rendered. So if you have "cat image.jpg", then "cat" can indicate a file type of image/jpeg, causing stdout to be interpreted as an image. But if you have "cat image.jpg | grep foo" (just making something up here), then grep can indicate a file type of application/octet-stream because it cannot guarantee what the file type is.
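
To make the "last hint wins" rule concrete, here is a minimal sketch in Python. The function name `effective_type`, the list-of-hints representation, and the choice to skip stages that declare nothing are all assumptions for illustration; the comment above does not say how hints would actually be transported to the terminal.

```python
from typing import Optional, Sequence

# Fallback when no stage in the pipeline declares a type (assumption).
FALLBACK = "application/octet-stream"


def effective_type(hints: Sequence[Optional[str]]) -> str:
    """Return the last declared hint in pipeline order.

    `hints` holds one entry per pipeline stage, left to right.
    None means the stage declared nothing; such stages are skipped
    (an assumption, the original idea does not specify this case).
    """
    for hint in reversed(hints):
        if hint is not None:
            return hint
    return FALLBACK


# "cat image.jpg": cat declares an image type.
print(effective_type(["image/jpeg"]))
# "cat image.jpg | grep foo": grep declares that it cannot guarantee a type.
print(effective_type(["image/jpeg", "application/octet-stream"]))
```

The hard part named in the next paragraph, getting the stage order to the terminal at all, is exactly what this sketch takes for granted.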

The tricky part, from what I can see, is explaining the pipeline topology to the terminal, so that it knows that "grep" comes after "cat", even though command parsing happens in the shell only. That's still a big unknown. A previous project in this area (TermKit) mandated that applications specify a "Content-Type" header (like in HTTP) in their stdout, but that would break most existing programs, so it's not an option for us.

I would not recommend that. It would be a source of endless bikeshedding and could never satisfy everyone. (See the mess that HTTP / REST are.)

It's not a problem to have this handled in the shell such that the output is rendered in such a way as to please the terminal. You can do this manually (e.g. append a "| display-jpeg-in-terminal" command to the pipeline), or use a magic bytes based file viewer.

Look at "run-mailcap" or its shortcuts like "see". It does basically that, and can also be fed through standard input (which covers the piping situation).
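
For readers who have not met magic bytes: many file formats start with a fixed signature, so a viewer at the end of a pipeline can guess the type from the first few bytes alone. A rough sketch in Python follows; the `sniff` function and the short signature table are illustrative only, and this is not how "run-mailcap" itself is implemented.

```python
# Leading signatures ("magic bytes") of a few common image formats.
SIGNATURES = [
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
]


def sniff(data: bytes) -> str:
    """Guess a MIME type from the first bytes of `data`."""
    for magic, mime in SIGNATURES:
        if data.startswith(magic):
            return mime
    # Unknown content: fall back to a generic binary type.
    return "application/octet-stream"


print(sniff(b"\x89PNG\r\n\x1a\n" + b"...rest of file..."))
print(sniff(b"hello world\n"))
```

In a pipeline, such a viewer would read `sys.stdin.buffer` and dispatch on the result, which is why the piping case needs no help from the terminal.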

if shells had some more types than a stream of bytes, it'd make sense for 'image' type data to be displayed as images. the pipes you mention are then simply type casts or conversion functions.
I don't know what you think is missing. "Displaying 'image' type data as images" totally works. Are you aware of magic bytes (which I mentioned above)?
never used them, but i'm thinking more in terms of powershell (pretty sure displaying images isn't something it does though). if images in particular work ok then that's one thing that i can scratch off my list of annoyances with ssh.
If you want a few tips on how to type less, feel free to write me an email describing your problem.
There are already terminals and cat implementations that can display 24-bit images just fine in the terminal, by using special escape sequences that encode the pixel data. No changes to the IO model needed.

https://github.com/saitoha/PySixel/blob/master/README.rst
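
To give an idea of what such an escape sequence looks like, below is a minimal Sixel encoder sketch in Python for a monochrome bitmap. In Sixel, each data character covers a column of six vertical pixels. The function `encode_sixel` and its input format (rows of 0/1) are made up for this example; it is not taken from PySixel, and the output only renders in a terminal with Sixel support.

```python
from typing import List


def encode_sixel(rows: List[List[int]]) -> str:
    """Encode a monochrome bitmap (rows of 0/1 values) as a Sixel sequence."""
    height = len(rows)
    width = len(rows[0]) if rows else 0
    out = ["\x1bPq"]                 # DCS + "q": start of Sixel data
    out.append("#1;2;100;100;100")   # define colour 1 as RGB white (0-100 scale)
    for top in range(0, height, 6):  # one band of six pixel rows at a time
        out.append("#1")             # select colour 1 for this band
        for x in range(width):
            bits = 0
            for dy in range(6):
                y = top + dy
                if y < height and rows[y][x]:
                    bits |= 1 << dy  # lowest bit is the topmost pixel
            out.append(chr(63 + bits))  # data characters run from "?" to "~"
        out.append("-")              # move to the next band
    out.append("\x1b\\")             # ST: end of Sixel data
    return "".join(out)


# A single column of six lit pixels becomes the data character "~".
print(repr(encode_sixel([[1]] * 6)))
```

A file that already contains such a sequence can be written to the terminal byte for byte, with no conversion step in between.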

Sure - this is what I mean by "protocol". In the case of the image format you linked even cat does work, because the image format contains pre-rendered escape sequences. This does not work in general (e.g. for JPEG or other formats) - where you need a program that does the transformation into escape sequences.
While plain `cat` does not work, escape sequences plus support in the terminal emulator already make it possible to display images inline. See the iTerm2 implementation here: https://www.iterm2.com/documentation-images.html
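
As a rough illustration of that approach, the sketch below wraps raw image bytes in the escape sequence described on the linked iTerm2 page (OSC 1337 with a `File=` argument list and a base64 payload). The helper name `iterm2_inline_image` is made up here, and only the `name`, `size` and `inline` arguments are used; check the linked documentation before relying on the details.

```python
import base64


def iterm2_inline_image(data: bytes, name: str = "image") -> bytes:
    """Wrap raw image bytes in an iTerm2-style inline image escape sequence."""
    args = [
        b"name=" + base64.b64encode(name.encode("utf-8")),  # file name, base64
        b"size=" + str(len(data)).encode("ascii"),          # payload size in bytes
        b"inline=1",                                        # display, do not download
    ]
    return (
        b"\x1b]1337;File=" + b";".join(args)
        + b":" + base64.b64encode(data)
        + b"\x07"                                           # BEL terminates the sequence
    )


# The payload here is a dummy, not a real image.
print(iterm2_inline_image(b"abc", "x"))
```

Writing the result to `sys.stdout.buffer` is all a viewer program has to do; the decoding and drawing happen in the terminal emulator.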

Making plain `cat` work too should not be too big of a problem with some small support from the running shell. iTerm2 also knows when a command starts running and when it exits. Parsing the output it received from STDOUT in between, by matching it against some magic numbers, and displaying the content accordingly is not too difficult.
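
A sketch of what that could look like on the terminal side, in Python. The marker bytes are modeled on the OSC 133 marks that shell-integration scripts emit around a command's output; treat the exact bytes, the function names, and the PNG-only check as assumptions for illustration.

```python
from typing import Optional

# Assumed shell-integration marks around a command's output.
OUTPUT_START = b"\x1b]133;C\x07"        # emitted just before the command runs
COMMAND_END_PREFIX = b"\x1b]133;D"      # emitted after the command exits
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"


def command_output(stream: bytes) -> Optional[bytes]:
    """Return the bytes between the start and end marks, or None if absent."""
    start = stream.find(OUTPUT_START)
    if start == -1:
        return None
    start += len(OUTPUT_START)
    end = stream.find(COMMAND_END_PREFIX, start)
    if end == -1:
        return None
    return stream[start:end]


def should_render_as_image(stream: bytes) -> bool:
    """True if the captured command output starts with the PNG signature."""
    output = command_output(stream)
    return output is not None and output.startswith(PNG_MAGIC)


stream = b"$ " + OUTPUT_START + PNG_MAGIC + b"data" + b"\x1b]133;D;0\x07"
print(should_render_as_image(stream))
```

Note that arbitrary binary output can itself contain the end mark, and that nothing here tells the terminal which program wrote the bytes.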

Then it's not cat anymore. The purpose of cat is to output the inputs unchanged and concatenated. If you need magic, that's a task for another program.
`cat` is doing exactly that. It is up to another program, namely the terminal emulator to interpret its output.
The terminal emulator does not (and, in my strong opinion, should not) interpret the output of different programs differently.

In fact it's not currently technically possible for a terminal emulator to know which program is writing into it. It may be many programs simultaneously, or it may be the kernel asynchronously, etc. If you want to change that you would have to make the architecture tremendously more complicated and less flexible.

And the same goes for shells. You don't want to complicate the architecture just so that a lawyer could attest that "it was cat" who drew the image under some weird interpretation. Nothing would be gained, and nobody would be able to understand what's happening anymore.