


	by tptacek 164 days ago
	This is a tangential point (this post is not really about TUIs; sort of the opposite) and I think lots of people know it already but I only figured it out last week and so can't resist sharing it: agents are good at driving tmux, and with tmux as a "browser", can verify TUI layouts. So you can draw layouts like this and prompt Claude or Gemini with them, and get back working versions, which to me is space alien technology.

5 comments

frumplestlatz 163 days ago

I’ve actually got an MCP server that makes it really easy for Claude to generate key events, wait for changes / wait for stable output / etc, and then take PNG screenshots of the terminal state (including all colors/styling) — which it “views” directly as part of the MCP tool response.

Wish I could open source it; it’s a game changer for TUI development.
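For anyone curious what "wait for stable output" could look like, here is a minimal Python sketch. It is not the MCP server described above (which isn't public); `wait_for_stable`, its parameters, and the `capture` callable are all hypothetical names. The idea is to poll some screen-capture function until its result stops changing for a short quiet period.

```python
import time


def wait_for_stable(capture, quiet_period=0.3, timeout=5.0, interval=0.05):
    """Poll `capture()` until its result is unchanged for `quiet_period` seconds.

    `capture` is any zero-argument callable returning the current screen
    contents (hypothetical; e.g. a wrapper around a terminal dump).
    Returns the stable contents, or raises TimeoutError.
    """
    last = capture()
    stable_since = time.monotonic()
    deadline = stable_since + timeout
    while time.monotonic() < deadline:
        if time.monotonic() - stable_since >= quiet_period:
            return last
        time.sleep(interval)
        current = capture()
        if current != last:
            # The screen changed, so restart the quiet-period clock.
            last = current
            stable_since = time.monotonic()
    raise TimeoutError("screen did not settle within the timeout")
```

The quiet period is a heuristic: a TUI with a blinking clock or spinner never settles, so a real tool would probably also want a "wait for this text to appear" mode.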


frumplestlatz 163 days ago

If anyone wants to do this at home, this is a great base to work from:

https://github.com/memextech/ht-mcp


Kerrick 163 days ago

I've had a lot of good luck with AI agents also being able to read and meaningfully interpret .txt and .ansi dumps from RatatuiRuby's TestHelper module and its `assert_snapshots` [0].

For example, it was able to read the ANSI and figure out that a snapshot had changed because an upstream bug fix made bold and underline render properly when they hadn't before.

[0]: https://git.sr.ht/~kerrick/ratatui_ruby/tree/trunk/item/lib/...
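To illustrate why an .ansi dump carries enough information for that kind of diagnosis, here is a rough Python sketch (not RatatuiRuby's API; `plain_text` and `sgr_codes` are hypothetical helpers). It separates the visible text from the SGR styling codes, where 1 is bold, 4 is underline, and 0 is reset, so two snapshots can have identical text but different styling.

```python
import re

# SGR ("Select Graphic Rendition") sequences look like ESC [ 1 ; 4 m
SGR = re.compile(r"\x1b\[([0-9;]*)m")


def plain_text(ansi):
    """Return the dump with SGR styling sequences stripped out."""
    return SGR.sub("", ansi)


def sgr_codes(ansi):
    """Return every SGR code in order; an empty parameter means 0 (reset)."""
    return [
        int(code) if code else 0
        for match in SGR.finditer(ansi)
        for code in match.group(1).split(";")
    ]


old_snapshot = "Title"
new_snapshot = "\x1b[1;4mTitle\x1b[0m"  # same text, now bold + underlined

same_text = plain_text(old_snapshot) == plain_text(new_snapshot)
style_changed = sgr_codes(old_snapshot) != sgr_codes(new_snapshot)
```

Here `same_text` and `style_changed` are both true, which is the "only the styling changed" case described above.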


heliumtera 163 days ago

Yeah, text was king yesterday, will be tomorrow


agavra 163 days ago

This is spot on. I understand very little about how terminal rendering works, yet I was able to build github.com/agavra/tuicr (Terminal UI for Code Review) in an evening. The initial TUI design was done via Claude.


eterps 163 days ago

Would love to hear more about this approach.


tptacek 163 days ago

It's actually really easy in Claude Code. Get a TUI to the point where it renders something, and get Claude to the point where it knows what you want to render (draw it in ASCII like this post proposes, for instance).

Then just prompt Claude to "use tmux to interact with and test the TUI rendering", and prompt it through anything it gets hung up on (for instance, you might remind Claude that it can create a tmux pane with a fixed size, or that tmux has a capture-pane feature to dump the contents of a pane). Claude already knows a bunch about tmux.

Once it gets anything useful done, ask it to "write a subagent definition for a TUI tester that uses tmux to exercise a TUI and test its rendering, layout, and interaction behavior".

Save that subagent definition, and now Claude can do closed-loop visual and interactive testing of its own TUI development.
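For reference, the tmux pieces mentioned above (a fixed-size pane, key input, capture-pane) amount to roughly the following Python sketch. This is not what Claude actually generates; the helper names, the session name `tui-test`, and the fixed `settle` sleep are all assumptions, and it needs tmux installed to run `snapshot`.

```python
import subprocess
import time


def new_session_cmd(session, command, width=80, height=24):
    """Argv for a detached tmux session with a fixed-size window running `command`."""
    return ["tmux", "new-session", "-d", "-s", session,
            "-x", str(width), "-y", str(height), command]


def send_keys_cmd(session, *keys):
    """Argv that types keys (e.g. "Down", "Enter", "q") into the session."""
    return ["tmux", "send-keys", "-t", session, *keys]


def capture_pane_cmd(session):
    """Argv that dumps the rendered pane as text; -p prints to stdout."""
    return ["tmux", "capture-pane", "-p", "-t", session]


def kill_session_cmd(session):
    return ["tmux", "kill-session", "-t", session]


def run(argv):
    return subprocess.run(argv, check=True, capture_output=True, text=True).stdout


def snapshot(command, keys=(), session="tui-test", settle=0.5):
    """Start `command` under tmux, optionally send keys, return the screen text."""
    run(new_session_cmd(session, command))
    try:
        time.sleep(settle)  # crude: give the TUI time to draw
        if keys:
            run(send_keys_cmd(session, *keys))
            time.sleep(settle)
        return run(capture_pane_cmd(session))
    finally:
        # Best-effort cleanup; the session may already be gone.
        subprocess.run(kill_session_cmd(session), check=False, capture_output=True)
```

Something like `print(snapshot("htop", keys=("Down",)))` would print the screen after one keypress. The fixed sleeps are the weak point; polling capture-pane until the output stops changing would be more robust.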


electroly 163 days ago

Can you explain tmux's contribution here? I'm confused why this process wouldn't work just the same if CC directly executed the program rather than involving tmux. Are you just using tmux to trick the program under test into running its TUI instead of operating in a dumb-stdout mode?


tptacek 163 days ago

It allows Claude to take screenshots and generate keyboard inputs. It's like TUI Playwright.


mrstackdump 163 days ago

Maybe I'm not understanding it (totally possible!) but could Claude just do that by reading standard out and writing to standard in?


tptacek 163 days ago

I had a really hard time getting anything like that to work (you can't just read stdout and write stdin, because you're driving a terminal in raw mode), but it took like 3 sentences worth of Claude prompt to get Claude to use tmux to do this reliably.


rsanheim 163 days ago

Also many CLIs act differently when invoked connected to a terminal (TUI/interactive) vs not. So you’d run into issues there where Claude could only test the non-interactive things.
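A quick way to see this difference is the Python sketch below (Unix only, since it uses the standard `pty` module; `run_with_pipe` and `run_with_pty` are made-up names). It runs the same child program twice, once with stdout attached to a pipe and once attached to a pseudo-terminal, and the child reports which one it sees.

```python
import os
import pty
import subprocess
import sys

# Child program: reports whether its stdout is a terminal.
CHILD = "import sys; print('tty' if sys.stdout.isatty() else 'pipe')"


def run_with_pipe():
    """Run the child with stdout captured through a pipe."""
    result = subprocess.run([sys.executable, "-c", CHILD],
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()


def run_with_pty():
    """Run the child with stdin/stdout/stderr attached to a pseudo-terminal."""
    master, slave = pty.openpty()
    try:
        proc = subprocess.Popen([sys.executable, "-c", CHILD],
                                stdin=slave, stdout=slave, stderr=slave)
        os.close(slave)  # parent keeps only the master side
        chunks = []
        while True:
            try:
                data = os.read(master, 1024)
            except OSError:
                break  # Linux raises EIO once the child side is closed
            if not data:
                break  # other platforms signal EOF with an empty read
            chunks.append(data)
        proc.wait()
    finally:
        os.close(master)
    return b"".join(chunks).decode().strip()
```

`run_with_pipe()` returns `"pipe"` and `run_with_pty()` returns `"tty"`. Running the program inside tmux gives it a real terminal in the same way, which is why the interactive code paths get exercised.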


alehlopeh 163 days ago

So by screenshots you mean tmux capture-pane, not actual screenshots. So in essence it is using stdout, just not Claude’s own.


wakawaka28 163 days ago

"In essence", but terminals do stuff to render stdout that you do not want an LLM to have to replicate, I think. If your TUI does stuff in fullscreen or otherwise with a bunch of control codes, that is simple work for a terminal but potentially intractable for an LLM.
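To make that concrete, here is a toy Python sketch of the bookkeeping a terminal does. It is nowhere near a real emulator: the hypothetical `render` function only understands cursor positioning (`ESC[row;colH`) and clear screen (`ESC[2J`), ignores every other escape sequence, and does no line wrapping. Even so, the raw byte stream and the final screen already look quite different.

```python
import re

# CSI sequences: ESC [ <params> <final letter>
CSI = re.compile(r"\x1b\[([0-9;]*)([A-Za-z])")


def render(stream, rows=3, cols=12):
    """Replay a tiny subset of terminal output onto a rows x cols grid."""
    screen = [[" "] * cols for _ in range(rows)]
    r = c = 0
    i = 0
    while i < len(stream):
        m = CSI.match(stream, i)
        if m:
            params, final = m.groups()
            if final == "H":  # cursor position, 1-based "row;col"
                parts = (params.split(";") + ["", ""])[:2]
                r = (int(parts[0]) if parts[0] else 1) - 1
                c = (int(parts[1]) if parts[1] else 1) - 1
            elif final == "J" and params == "2":  # clear the whole screen
                screen = [[" "] * cols for _ in range(rows)]
            # Anything else (colors, etc.) is skipped in this sketch.
            i = m.end()
            continue
        ch = stream[i]
        if ch == "\r":
            c = 0
        elif ch == "\n":
            r += 1
        elif 0 <= r < rows and 0 <= c < cols:
            screen[r][c] = ch
            c += 1
        i += 1
    return ["".join(row).rstrip() for row in screen]


# The program draws two lines, then jumps back and overwrites one field.
raw = "\x1b[2J\x1b[1;1HCPU  12%\x1b[2;1HMEM  40%\x1b[1;6H97%"
lines = render(raw)
```

Reading `raw` directly, "12%" and "97%" both appear and nothing says which one is on screen; `lines` comes out as `["CPU  97%", "MEM  40%", ""]`. capture-pane hands the agent the equivalent of `lines` for free.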
