Hacker News
by delta_p_delta_x 801 days ago
Sadly, like most 'hello world' deep dives, the author stops at the `write` syscall and glosses over the rest. Everything before the syscall essentially boils down to `printf` calling `puts` calling `write`—it's one function call after another forwarding the `char const*` through, with some book-keeping. In my opinion, not the most interesting.
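
For anyone who wants to poke at that forwarding chain, here is a rough Python analogue. It is only a sketch: CPython's `print` goes through its own `io` layers rather than libc's `printf`/`puts`, but the shape (text layer, buffer, raw `write`) is similar. The function name `hello_through_layers` is made up for illustration, and a pipe stands in for `stdout` so the bytes can be read back.

```python
import io
import os


def hello_through_layers() -> bytes:
    """Push 'hello world' through text -> buffer -> raw layers into a pipe."""
    r_fd, w_fd = os.pipe()  # the pipe stands in for stdout (an assumption for the demo)

    raw = io.FileIO(w_fd, "w")          # thin wrapper: its write() issues the write syscall
    buffered = io.BufferedWriter(raw)   # book-keeping: a userspace buffer
    text = io.TextIOWrapper(buffered, encoding="utf-8", newline="\n")  # str -> bytes

    print("hello world", file=text)     # lands in the text layer first
    text.flush()                        # forwarded down: text -> buffered -> raw -> write
    text.close()                        # also closes w_fd

    data = os.read(r_fd, 1024)          # what actually crossed the syscall boundary
    os.close(r_fd)
    return data


if __name__ == "__main__":
    print(hello_through_layers())
```

Each layer only forwards the payload with some extra state, which is why this part of the trip is the less interesting one.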

What comes after the syscall is where everything gets very interesting and very very complicated. Of course, it also becomes much harder to debug or reverse-engineer because things get very close to the hardware.

Here's a quick summary, roughly in order (I'm still glossing over a lot; each of these steps probably has an entire battalion of software and hardware engineers from tens of different companies working on it, but I daresay it's still more detailed than other 'tours through hello world'):

  - The kernel performs some setup to pipe the `stdout` of the hello world process into some input (not necessarily `stdin`; could be a function call too) of the terminal emulator process.
  - The terminal emulator calls into some typeface rendering library and the GPU driver to set up a framebuffer for the new output. 
  - The above-mentioned typeface rendering library also interfaces with the GPU driver to convert what was so far just a one-dimensional byte buffer into a full-fledged two-dimensional raster image:
    - the corresponding font outlines for each character byte are loaded from disk;
    - each outline is aligned into a viewport; 
    - these outlines are resized, and kerning and font metrics are applied from the font files set by the terminal emulator;
    - the GPU rasterises and anti-aliases the viewport (there are entire papers and textbooks written on these two topics alone). Rasterisation of font outlines may be done directly in hardware without shaders because TrueType outlines are quadratic Bézier splines (CFF/PostScript-flavoured outlines are cubic).
  - This is a new framebuffer for the terminal emulator's window, a 2D grid containing (usually) RGB bytes.
  - The window manager takes this framebuffer result and *composites* it with the window frame (minimise/maximise/close buttons, window title, etc.) and the rest of the desktop; all of this is usually done on the GPU as well.
    - If the terminal emulator window in question has fancy transparency or 'frosted glass' effects, the compositor applies those effects with shaders here.
  - The resultant framebuffer is now at the full resolution and colour depth of the monitor, which is then packetised into an HDMI or DisplayPort signal by the GPU's display-out hardware, depending on which is connected.
  - This is converted into an electrical signal by a DAC, and the result piped into the cable connecting the monitor/internal display, at the frequency specified by the monitor refresh rate.
    - This is muddied by adaptive sync, which has to signal the monitor for a refresh instead of blindly pumping signals down the wire.
  - The monitor's input hardware has an ADC which re-converts the electrical signal from the cable into RGB bytes (or maybe not, and directly unwraps the HDMI/DP packets for processing into the pixel-addressing signal, I'm not a monitor hardware engineer).
  - The electrical signal representing the framebuffer is converted into signals for the pixel-addressing hardware, which differs depending on the exact display type: LCD, OLED, plasma, or even CRT. OLED might be the most complicated since each *subpixel* needs to be *individually* addressed. For a 3840 × 2400 WRGB OLED as seen on LG OLED TVs, this is 3840 × 2400 × 4 = 36 864 000, i.e. nearly 37 million subpixels.
  - The display hardware refreshes with the new signal (again, this refresh could be scan-line, like CRT, or whole-framebuffer, like LCDs, OLEDs, and plasmas), and you finally see the result. 
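
The rasterisation step in the list above starts from quadratic Bézier outlines. As a minimal sketch of what "an outline" means numerically, the snippet below evaluates one quadratic segment and flattens it into points a rasteriser could fill between. The names `quadratic_bezier` and `flatten`, the fixed segment count, and the sample control points are all assumptions for illustration; real font rasterisers are considerably more involved.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def quadratic_bezier(p0: Point, p1: Point, p2: Point, t: float) -> Point:
    """Evaluate B(t) = (1-t)^2 p0 + 2(1-t)t p1 + t^2 p2 for t in [0, 1]."""
    u = 1.0 - t
    x = u * u * p0[0] + 2.0 * u * t * p1[0] + t * t * p2[0]
    y = u * u * p0[1] + 2.0 * u * t * p1[1] + t * t * p2[1]
    return (x, y)


def flatten(p0: Point, p1: Point, p2: Point, segments: int = 8) -> List[Point]:
    """Approximate the curve with segments + 1 evenly spaced sample points."""
    return [quadratic_bezier(p0, p1, p2, i / segments) for i in range(segments + 1)]


if __name__ == "__main__":
    # Hypothetical control points: on-curve, off-curve, on-curve.
    for point in flatten((0.0, 0.0), (1.0, 2.0), (2.0, 0.0)):
        print(point)
```

The curve passes through the first and last control points and is only pulled towards the middle one, which is why `t = 0.5` here gives `(1.0, 1.0)` rather than the off-curve point `(1.0, 2.0)`.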

Note that all of this happens within at most one frame time of the monitor's refresh, which is 16.67 ms at 60 Hz.
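
The two figures quoted above (frame time and subpixel count) are easy to check. This is just the arithmetic, with hypothetical helper names `frame_time_ms` and `subpixel_count`:

```python
def frame_time_ms(refresh_hz: float) -> float:
    """Time budget for one refresh, in milliseconds."""
    return 1000.0 / refresh_hz


def subpixel_count(width: int, height: int, subpixels_per_pixel: int) -> int:
    """Number of individually addressed subpixels on a panel."""
    return width * height * subpixels_per_pixel


if __name__ == "__main__":
    print(round(frame_time_ms(60), 2))        # 60 Hz refresh
    print(subpixel_count(3840, 2400, 4))      # WRGB: four subpixels per pixel
```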
5 comments

> The display hardware refreshes with the new signal (again, this refresh could be scan-line, like CRT, or whole-framebuffer, like LCDs, OLEDs, and plasmas), and you finally see the result.

Nice explanation but you stopped at the human's visual system which is where everything gets very interesting and very very complicated. :)

[1] https://en.wikipedia.org/wiki/Visual_system

Assuming there's a vacuum between the screen and your eyes (perfectly spherical), of course.

Otherwise, it gets very interesting and very very complicated. :)

Thank you for that. I startled the dog with that laugh :)
If you're interested in dives that deep, you might like Gynvael Coldwind's hello world in Python on Windows dive [1]. Goes through CPython internals, Windows conhost, font rasterization, and GPU rendering, among others.

[1]: https://gynvael.coldwind.pl/?id=754

Most of this stuff is irrelevant to the program itself, e.g. you could've piped the output to /dev/null and none of this would happen.
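
A small sketch of that point, assuming a Unix-like system: from the program's side the write looks the same whatever sits behind the file descriptor, and the call reports all 12 bytes as written even when the sink is the null device. `write_hello` is a made-up helper name.

```python
import os


def write_hello(path: str) -> int:
    """Write 'hello world\\n' to the given path and return the byte count reported."""
    fd = os.open(path, os.O_WRONLY)
    try:
        return os.write(fd, b"hello world\n")
    finally:
        os.close(fd)


if __name__ == "__main__":
    # os.devnull is "/dev/null" on Unix-like systems.
    print(write_hello(os.devnull))
```
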
Fair enough.

However, the point of a hello world program is to introduce programming to beginners, and make a computer do something one can visibly see. I daresay this is made moot if you pipe it into /dev/null. I could then replace 'hello world' in the title with 'any program x that logs to `stdout`', and it wouldn't reach HN's front page.

In the same vein is this idea of 'a trip of abstractions'—I don't know about you, but I always found most of them very unsatisfying as they always stop at the system call, whether Linux or Windows. It really is afterwards where things get interesting, you can't deny that.

It also skips what happens before _start, for example how a process is born (execve on Linux is pretty weird), how the program is loaded into memory (including binfmt_* and the almighty binfmt_misc), relocations, exception handling frames, sections, and the ELF loader in general, allocation of OS resources (including the necessary malloc) and probably much more.
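
As a rough illustration of the "how a process is born" part, here is a minimal fork-then-`execve` sketch for a Unix-like system. It shows only the userspace side; the loader, relocations and the rest happen inside the `execve` call and are not visible here. The function name `run_via_fork_exec` is hypothetical, and using the current Python interpreter as the program to exec is an assumption made to keep the example self-contained.

```python
import os
import sys


def run_via_fork_exec() -> bytes:
    """Fork, replace the child with a new program via execve, capture its stdout."""
    r_fd, w_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: point fd 1 (stdout) at the pipe, then replace this process image.
        try:
            os.close(r_fd)
            os.dup2(w_fd, 1)
            os.close(w_fd)
            os.execve(
                sys.executable,
                [sys.executable, "-c", "print('hello world')"],
                dict(os.environ),
            )
        finally:
            # Only reached if execve failed; never return into the parent's code.
            os._exit(127)

    # Parent: read until the child closes its end of the pipe, then reap it.
    os.close(w_fd)
    chunks = []
    while True:
        chunk = os.read(r_fd, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(r_fd)
    os.waitpid(pid, 0)
    return b"".join(chunks)


if __name__ == "__main__":
    print(run_via_fork_exec())
```
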
I'm surprised this didn't get more votes. I thought it was fantastic!