| Hey HN! I built Termitty because I was tired of the hacky solutions I kept writing for SSH automation.

While building my workflow automation SaaS, I had this seemingly simple requirement: workflows needed to execute tasks on remote servers, not just in browsers. Selenium made the browser part elegant - you could wait for elements, handle dynamic content, maintain session state. But for SSH? I was stuck in the stone age.

I ended up writing this monster of a custom driver that:

- Maintained persistent SSH connections in a pool
- Kept track of working directories between commands
- Tried to parse ANSI escape codes with regex (don't do this)
- Had a "wait_for_prompt" function that was just `time.sleep(2)` with extra steps

The breaking point was when I needed to automate a workflow that involved:

1. SSH into a production server
2. Start a database backup
3. Navigate an ncurses configuration menu
4. Detect when the backup finished (from a progress bar!)
5. Verify the backup file

My "solution" was embarrassing - screen scraping terminal output with regex, hoping the prompt detection worked, and praying the timing lined up. It was brittle, unmaintainable, and I knew there had to be a better way.

So with Claude Opus 4's help, I built what I wished existed - Termitty brings Selenium's patterns to terminal automation.

The key insight was that terminals are stateful UIs, just like web pages. So Termitty maintains a virtual terminal buffer - it actually understands what's on screen, tracks cursor position, and handles ANSI codes properly. You can find text on the terminal screen just like finding DOM elements.

Some cool stuff that emerged:

- Full terminal emulation - It maintains a complete VT100/ANSI terminal state
- Session recording - Records everything to JSON/asciinema format with a slick replay UI
- Interactive shell - Persistent shell sessions that can control vim, top, installers
- Real wait conditions - No more sleep()! Wait for specific output, prompts, or patterns

Once I had structured terminal state, AI integration became obvious - now LLMs can actually "see" the terminal and make intelligent decisions about what to do next.

What SSH automation pain have you encountered? I'm especially curious about edge cases I haven't hit yet!

P.S. - Check out the terminal player in the docs. You can record a session and share it with a beautiful playback UI - it's like Loom for terminals! |
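
For anyone wondering what "real wait conditions" means compared to `time.sleep(2)`, here is a minimal Python sketch of the general idea. It is not Termitty's actual API: `wait_for_text`, `read_screen`, and the fake screen frames are hypothetical names made up for illustration. The assumption is only that something can hand back the current screen contents as a string.

```python
import re
import time
from typing import Callable, Optional


def wait_for_text(
    read_screen: Callable[[], str],   # hypothetical: returns current screen text
    pattern: re.Pattern,
    timeout: float = 10.0,
    poll_interval: float = 0.1,
) -> Optional[re.Match]:
    """Poll the screen until `pattern` matches or `timeout` seconds pass.

    Returns the match object on success, or None on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        match = pattern.search(read_screen())
        if match:
            return match
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll_interval)


# Stand-in for a real terminal: each call returns the next "screen",
# then keeps returning the last one once the frames run out.
frames = iter(["Backup: 10%", "Backup: 80%", "Backup: 100% done\n$ "])
last = [""]


def fake_screen() -> str:
    last[0] = next(frames, last[0])
    return last[0]


result = wait_for_text(
    fake_screen, re.compile(r"100% done"), timeout=2.0, poll_interval=0.01
)
print(result is not None)
```

The point of the pattern is that the caller returns as soon as the condition holds and gets an explicit `None` on timeout, instead of guessing a fixed delay and hoping the output has arrived.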