by yuntian 336 days ago
A generative operating system that directly predicts screen images based on mouse and keyboard inputs, powered by an RNN for state modeling and a diffusion model for image generation.
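
To make the two-part design concrete, here is a toy sketch of the loop described above: a recurrent state is updated from each mouse/keyboard event, and a frame is then produced by iteratively refining noise conditioned on that state. It is not the project's actual code. Every name, size, and weight below (`simulate`, `rnn_step`, `render_frame`, `STATE_SIZE`, `FRAME_PIXELS`) is a hypothetical stand-in, the weights are random and untrained, and the refinement loop only imitates the shape of diffusion sampling rather than implementing a real denoiser.

```python
import math
import random

STATE_SIZE = 8      # hypothetical size of the recurrent state
FRAME_PIXELS = 16   # hypothetical tiny 4x4 grayscale "screen"


def encode_input(mouse_x, mouse_y, clicked, key_code):
    """Pack one mouse/keyboard event into a small feature vector."""
    return [mouse_x, mouse_y, 1.0 if clicked else 0.0, key_code / 255.0]


def make_weights(rows, cols, rng):
    """Random, untrained weights; a real model would learn these."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def rnn_step(state, event, w_state, w_event):
    """Plain tanh RNN update: new_state = tanh(W_s @ state + W_e @ event)."""
    pre = [a + b for a, b in zip(matvec(w_state, state), matvec(w_event, event))]
    return [math.tanh(x) for x in pre]


def render_frame(state, w_out, rng, steps=4):
    """Stand-in for diffusion sampling: start from noise, then move
    halfway toward a state-conditioned target on each step."""
    target = [math.tanh(x) for x in matvec(w_out, state)]
    frame = [rng.gauss(0.0, 1.0) for _ in range(FRAME_PIXELS)]
    for _ in range(steps):
        frame = [f + 0.5 * (t - f) for f, t in zip(frame, target)]
    return frame


def simulate(events, seed=0):
    """Run the input -> state -> frame loop over a list of events."""
    rng = random.Random(seed)
    w_state = make_weights(STATE_SIZE, STATE_SIZE, rng)
    w_event = make_weights(STATE_SIZE, 4, rng)
    w_out = make_weights(FRAME_PIXELS, STATE_SIZE, rng)
    state = [0.0] * STATE_SIZE
    frames = []
    for event in events:
        state = rnn_step(state, encode_input(*event), w_state, w_event)
        frames.append(render_frame(state, w_out, rng))
    return frames
```

Calling `simulate([(0.1, 0.2, False, 0), (0.5, 0.5, True, 65)])` returns one 16-value frame per event. The point of the split is that the recurrent state carries what has happened so far (open windows, cursor position), while the image model only has to draw the current screen given that state.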

See my tweet for more details: https://x.com/yuntiandeng/status/1944802154314916331

1 comment

I like how most of your demo video is clicking through various Firefox and Google popups.
Pretty realistic, actually.