Hacker News
by haliyat 755 days ago
Also, FWIW, I saw a bunch of demos of “token sequence learning” that did a lot of the applications that people have been so excited about with LLMs: producing text descriptions of video and images, text summarization with question answering, etc. Those demos were a little janky and limited, and obviously only at the academic-paper-with-impressive-video-demo stage, which is a far cry from fast and reliable enough to be useful in production. But they weren’t categorically different from what we’ve seen with transformers and LLMs.

This is one of the reasons I’m more skeptical about claims that transformers + more data and compute is all we need for AGI. After a decade plus of not just MASSIVE compute and data scaling but also some fairly clever new techniques, I would describe the progress beyond those older results as incremental rather than transformational.

Honestly, people have forgotten this now, but the biggest change that ignited the LLM hype was the UX decision to present interactions with these models in the framework of a conversation with an agent. This is a trick that goes back at least as far as Eliza, and its effect is mainly in how it primes the user to think about and relate to the tech.

That is also an area where more work can be done (conversational interfaces are not the One Solution to all computing). I recommend googling Interactive Machine Learning, which is its own sub-discipline that specifically studies this problem of how to build UX that is native to, and takes best advantage of, ML/AI techniques to produce software that people can use to accomplish real tasks.