Hacker News
by koterpillar 1128 days ago
Does Sixel have a spec? I.e. how large are the pixels, whether there's a newline after the image (couldn't find how to turn that _off_)?

Recently discovered Kitty's graphics protocol (https://sw.kovidgoyal.net/kitty/graphics-protocol/) which has more features or at least more documented ones :)

3 comments

> Does Sixel have a spec?

DEC invented Sixels back in the ’80s, and they were serious about their docs, so the corresponding chapter of the VT3xx manual[1] is probably as good as it gets.

> I.e. how large are the pixels [...].

Historical implementations likely assume the relation between pixels and character cells that’s implied by the geometry of the DEC fonts. I’ve seen a lot of arguing about adapting this to the modern world, but I don’t know if a consensus has emerged.

[1] https://vt100.net/docs/vt3xx-gp/chapter14.html

No, there is an escape code that queries the window size in pixels:

    "\x1b[14t"
Combined with the escape code that queries the window size in character cells ("\x1b[18t"), you can calculate the number of pixels per character cell (the "pixel size").
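
To make the arithmetic concrete, here is a rough shell sketch. It assumes the replies have the form xterm documents for these two queries (`ESC [ 4 ; height ; width t` for pixels, `ESC [ 8 ; rows ; columns t` for cells). The function names `parse_size_report` and `cell_size` are made up for illustration, and the sample numbers in the usage note are hypothetical.

```shell
# Parse a size report such as ESC [ 4 ; HEIGHT ; WIDTH t and print
# "HEIGHT WIDTH". Works for both the pixel (4) and the cell (8) report.
parse_size_report() {
  printf '%s\n' "$1" | sed -n 's/^.*\[[0-9]*;\([0-9]*\);\([0-9]*\)t.*$/\1 \2/p'
}

# usage: cell_size PIXEL_REPORT CELL_REPORT
# Prints "CELL_HEIGHT CELL_WIDTH" in pixels (integer division).
cell_size() {
  # After this, $1 $2 are the pixel height/width and $3 $4 the rows/columns.
  set -- $(parse_size_report "$1") $(parse_size_report "$2")
  # Fail if either report did not parse or the cell counts are zero.
  [ $# -eq 4 ] && [ "$3" -gt 0 ] && [ "$4" -gt 0 ] || return 1
  echo "$(( $1 / $3 )) $(( $2 / $4 ))"
}
```

For example, with a hypothetical 800x480 pixel window of 80x24 cells, `cell_size "$(printf '\033[4;480;800t')" "$(printf '\033[8;24;80t')"` works out to a cell 20 pixels high and 10 wide. A terminal that reports zero pixels will yield zeros here, so check for that before dividing anything by the result.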
Are these escape codes actually implemented in the average terminal? I recently tried to get e.g. alacritty to tell me this stuff but I don't even know how you're supposed to read back the response.
Yes, every xterm-compatible terminal should have them.

You just send the particular query (e.g. ‘CSI 14 t’) and the terminal sends back a response in the defined form¹. Of course you'll want raw mode, echo off, etc. Normally a library like curses does this for you. If you want to see, https://gist.github.com/kpschoedel/6a87ec2157ce2140be69193d1... (I just whipped this up to answer the question; don't expect production quality)

¹ https://invisible-island.net/xterm/ctlseqs/ctlseqs.html#h3-F...
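
For anyone who wants to try the round trip without a library, here is a minimal shell sketch of the raw-mode, echo-off dance. It is not the code from the gist. The function name `term_query` is made up, and it assumes a POSIX `stty` and `dd` and a controlling terminal at /dev/tty. It does not restore the terminal settings if interrupted halfway, so treat it as a toy.

```shell
# usage: term_query QUERY FINAL
# Send QUERY to the controlling terminal and print the reply, reading one
# byte at a time until FINAL is seen or the terminal stays silent for 0.5 s.
# The body runs in a subshell, so its variables do not leak to the caller.
term_query() (
  saved=$(stty -g < /dev/tty) || exit 1    # remember the current settings
  stty raw -echo min 0 time 5 < /dev/tty   # raw mode, no echo, 0.5 s read timeout
  printf '%s' "$1" > /dev/tty              # send the query
  reply=
  while c=$(dd bs=1 count=1 2>/dev/null < /dev/tty) && [ -n "$c" ]; do
    reply=$reply$c
    [ "$c" = "$2" ] && break               # stop at the final byte of the reply
  done
  stty "$saved" < /dev/tty                 # restore the settings
  printf '%s' "$reply"
)
```

Something like `term_query "$(printf '\033[14t')" t | od -c` should then show the reply byte by byte, or nothing at all if the terminal does not answer that query.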

Thank you so much! I will incorporate this to my sixel feature branch of a tui matrix client. Memes in the terminal!
Which matrix console client? I only know gomuks and I really miss proper image support there.
Most implementations I've seen use an ioctl to query those particular bits. That's implemented quite reliably, since the same ioctl is used for character size as window size. Some implementations just set the character size to zero though.
Ioctl doesn't work over a serial port. The escape code queries are more general.
‘CSI 16 t’ reports the character size in pixels directly.
How about the cursor position? The spec talks about "Sixel Scrolling Mode" but I couldn't find any way to display a sixel image inline: https://stackoverflow.com/questions/70647549/displaying-a-si...
What the hell, ouch. This looks to be very buggy across the board. Here’s a test:

  # control sequences, written as printf escapes
  LF='\n'; ESC='\e'; CSI=$ESC'['; DCS=$ESC'P'; ST=$ESC'\\'
  CUF=$CSI'%dC' # cursor forward
  SIXEL=$DCS'q%s'$ST # sixel image
  printf "before $SIXEL$CUF after$LF" \
    '#0;2;0;0;0#1;2;100;100;0#2;2;0;100;0#1~~@@vv@@~~@@~~$#2??}}GG}}??}}??-#1!14@' \
    2
With a freshly launched terminal on my machine, I get:

- in XTerm (xterm -xrm "XTerm*decTerminalID: vt340"), "before ", "HI", " after", that is to say exactly what you want, out of the box;

- in Foot, "before ", "HI", newline, some spaces, "after";

- in Contour, "before ", "HI", enough newlines to clear the screen (?..), no spaces (?!..), "after".

OK, sez I, let’s just save the cursor position (DECSC, ESC 7) before the image and restore it (DECRC, ESC 8) afterwards, then skip over it; that is,

  DECSC=$ESC'7'; DECRC=$ESC'8' # add to definitions
  printf "before $DECSC$SIXEL$DECRC$CUF after$LF" # change format string
In XTerm, this (rightly) makes no difference. In Foot and Contour however, you still end up a line resp. a screen below where you started, if now with the correct horizontal position.

So it seems to me like what you want should work by default, except it doesn’t.

It should be possible to instead just treat the whole thing as a framebuffer overlay (by computing or directly asking for the character cell size, as Kirill Panov rightly admonishes me is possible with XTWINOPS) without touching the cursor; that’s what the “sixel scrolling” setting (DECSDM) is supposed to do. Then you can just manually move the cursor forward however many positions after you’re done drawing.
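
As a sketch of that last step: once you know the cell width in pixels, the number of positions to skip is the image width divided by the cell width, rounded up. The helper names `cells_for_pixels` and `skip_image` are made up for illustration, and the 10-pixel cell width in the usage note is an assumption, not something any given terminal guarantees.

```shell
# usage: cells_for_pixels PIXELS CELL_SIZE
# Number of character cells needed to cover PIXELS, rounding up.
cells_for_pixels() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

# usage: skip_image IMAGE_WIDTH CELL_WIDTH
# Emit a cursor-forward (CUF) sequence that steps over an image
# IMAGE_WIDTH pixels wide on a terminal with CELL_WIDTH-pixel cells.
skip_image() {
  printf '\033[%dC' "$(cells_for_pixels "$1" "$2")"
}
```

The test image above is 14 pixels wide by my count, so with a 10-pixel cell `cells_for_pixels 14 10` gives 2, which matches the hand-picked `2` passed to CUF in the test.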

Except apparently the DEC manual (the VT330/340 one above) and DEC hardware contradict each other as to which setting of DECSDM (set or reset) corresponds to which scrolling state (enabled or disabled). XTerm first implemented it according to the manual, not the VT3xx hardware[1,2,3]; then most other emulators followed suit[4]; then XTerm switched to following the hardware[5,6] (unless you [...]), and that’s what I’m seeing on my machine right now. So now you need to check whether you’re on XTerm ≥ 369 or not[7]. And likewise for other terminals’ versions, because apparently that’s a thing now[8,9].
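
A rough sketch of what such a check could look like in shell, for XTerm only. Two assumptions of mine are baked in. First, the patch number is taken from a string shaped like the `XTERM_VERSION` environment variable that xterm sets ("XTerm(369)"), which will not survive ssh. Second, the hardware sense is taken to be that setting private mode 80 turns sixel scrolling off, so older XTerm needs the opposite. The function names `xterm_patch` and `sixel_scrolling_off` are made up, and other terminals need their own version checks.

```shell
# Extract the patch number from a string such as "XTerm(369)".
# Prints nothing if the string does not look like that.
xterm_patch() {
  printf '%s\n' "$1" | sed -n 's/^XTerm(\([0-9][0-9]*\)).*$/\1/p'
}

# usage: sixel_scrolling_off PATCH
# Print the DECSDM (private mode 80) sequence intended to turn sixel
# scrolling OFF for the given XTerm patch number.
sixel_scrolling_off() {
  if [ "$1" -ge 369 ]; then
    printf '\033[?80h'   # assumed hardware sense: set = scrolling disabled
  else
    printf '\033[?80l'   # assumed pre-369 sense: reset = scrolling disabled
  fi
}
```

Usage would be something like `sixel_scrolling_off "$(xterm_patch "$XTERM_VERSION")"` before drawing the image. Verify the polarity against your own terminal before relying on it.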

Again, ouch.

P.S. DEC had an internal doc for how their terminals should operate (DEC STD 070) [10]. It does not document DECSDM at all.

[1] https://github.com/wez/wezterm/issues/217#issuecomment-86449...

[2] https://github.com/hackerb9/lsix/issues/41

[3] https://github.com/dankamongmen/notcurses/issues/1782

[4] https://github.com/arakiken/mlterm/pull/23

[5] https://invisible-island.net/xterm/xterm.log.html#xterm_369

[6] https://invisible-island.net/xterm/ctlseqs/ctlseqs.html#h3-T...

[7] https://github.com/dankamongmen/notcurses/commit/0918fa251e2... (the correct version cutoff is 369 not 359, the patch contains a now-fixed bug)

[8] https://github.com/dankamongmen/notcurses/issues/2204

[9] https://github.com/dankamongmen/notcurses/blob/master/src/li... (look for mentions of invertsixel or invert80)

[10] http://www.bitsavers.org/pdf/dec/standards/EL-SM070-00_DEC_S...

> [10] http://www.bitsavers.org/pdf/dec/standards/EL-SM070-00_DEC_S...

Nice. I wish I'd had that years ago when the maintainer of a then-popular virtual terminal got very angry at me for suggesting that DECCOLM (set 80/132 columns) should not change the number of lines.

It's interesting to read the discussion about Sixel support in Kitty [1], where the pros and cons of Sixel are weighed in relation to Kitty. I find this comment [2] by the maintainer of libsixel particularly intriguing:

> After I took over the maintainership of libsixel I unfortunately decided it cannot support the security demands of Kitty, it is too insecure internally. I need to write a Rust library or something.

[1] https://github.com/kovidgoyal/kitty/issues/2511

[2] https://github.com/kovidgoyal/kitty/issues/2511#issuecomment...

Kitty is the epitome of NIH. They don't do modifyOtherKeys either.
> Kitty is the epitome of NIH.

Sorry to be that guy, but what is a "NIH"? All I know is the https://www.nih.gov/ :)

My apologies — I dislike seeing unexplained acronyms myself. As detaro answered before me, it's ‘not invented here’, the tendency to reject existing solutions for a sense of control.
Check out the xterm author's comments about the history of MOK (modifyOtherKeys): some people tried to present his work as theirs.