Hacker News
by CephalopodMD 624 days ago
Totally agree. It took me a full week before I realized that the Strawberry/o1 model was the mysterious Q* that Sam Altman has been hyping up for almost a full year, ever since the OpenAI coup, which... is pretty underwhelming tbh. It's an impressive incremental advancement for sure! But it's really not the paradigm-shifting, GPT-5-worthy launch we were promised.

Personal opinion: I think this means we've probably exhausted all the low-hanging fruit in LLM land. This was the last thing I was reserving judgement for. When the most hyped-up big idea OpenAI has rn is basically "we're just gonna have the model dump out a wall of semi-optimized chain of thought every time and not send it over the wire", we're officially out of big ideas. Like, I mean, it obviously works... but that's more or less what we've _been_ doing for years now! Barring a total rethinking of LLM architecture, I think all improvements going forward will be baby steps for a while, basically moving at the same pace we've been going since GPT-4 launched. I don't think this is the path to AGI in the near term, but there's still plenty of headroom for minor incremental change.

By analogy, I feel like GPT-4 was basically the same quantum leap we got with the iPhone 4: all the basic functionality and peripherals were there by the time we got the iPhone 4 (multitasking, FaceTime, the App Store, various sensors, etc.), and everything since then has just been minor improvements. The current iPhone 16 is obviously faster, bigger, thinner, and "better" than the 4, but for the most part it doesn't really do anything extra that the 4 wasn't already capable of at some level with the right app. Similarly, I think GPT-4 was pretty much "good enough". LLMs are about as good as they're gonna get for the next little while, though they might get a little cheaper, faster, and more "aligned" (however we wanna define that). They might get slightly less stupid, but I don't think they're gonna get a whole lot smarter any time soon. Whatever we see in the next few years is probably not going to be much better than using GPT-4 with the right prompt, tool use, RAG, etc. on top of it. We'll only see improvements at the margins.

2 comments

I think LLMs are more like iPhone version one, still missing a lot of "all the basic functionality".

Re "all the basic functionality": if you're talking about intelligence, then you're really talking about the things humans can do. LLMs kind of do basic language, but they're not much good at common sense reasoning, spatial awareness and shape rotation, motor skills and running around, emotional intelligence, figuring out what people are thinking, planning, and probably some other stuff. It'll be iPhone 4 when those are covered.

chat has become a limiting factor.

it's both too linear and too hard to revise. it's hard to undo parts of the conversation that poison it. it's too hard to save important bits that shouldn't be forgotten or drowned out. it's not word-processor-like enough.

i envision the next generation of these products being multipane by default.

on the left I have a chat, in the center I have a whiteboard, and on the right I have a rendered document. throwing a clip of something onto the whiteboard makes it modifiable by chat. "store this, categorize it, summarize it, place it in the document."

whatever comes next needs to function more like OneNote or Obsidian. just to use an example from current events today, let's say I want to make a parody dossier on Walz, similar to today's Vance leak. I should be able to describe the project to chat. It builds a document structure. I tell it we are going to scrape all of the internet's jokes about Walz at a BBQ or other non-scandals. I should be able to quickly click through a table of contents, and "chat" with each paragraph. "This one needs fleshing out, this one needs summarization." As we scrape a Reddit post, we want to incorporate not only the original post, but all the best comments. I should be able to "chat with the document editor" and put together a 200-page document in the amount of time it took me to write this post, just by describing what is and isn't working, and dragging and dropping.
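
a rough sketch of the data model that "chat with each paragraph" implies, in Python. every name here (`Paragraph`, `Document`, `chat`, `undo`, `pinned`) is made up for illustration, and there is no model call at all: `chat()` only records the instruction and builds the paragraph-scoped prompt that would be sent. treat it as one possible shape under those assumptions, not as how any real product works.

```python
from dataclasses import dataclass, field


@dataclass
class Paragraph:
    text: str
    # Instructions given to this paragraph only, oldest first.
    notes: list = field(default_factory=list)
    # Hypothetical flag: "important bits that shouldn't be forgotten".
    pinned: bool = False


@dataclass
class Document:
    title: str
    # Section name -> paragraphs; dicts keep insertion order,
    # so this doubles as the table of contents.
    sections: dict = field(default_factory=dict)

    def add(self, section, text, pinned=False):
        """Append a paragraph to a section, creating the section if needed."""
        para = Paragraph(text=text, pinned=pinned)
        self.sections.setdefault(section, []).append(para)
        return para

    def toc(self):
        """Section names in the order they were created."""
        return list(self.sections)

    def chat(self, section, index, instruction):
        """Record an instruction on one paragraph and build a prompt scoped to it."""
        para = self.sections[section][index]
        para.notes.append(instruction)
        return f"Instruction: {instruction}\nParagraph: {para.text}"

    def undo(self, section, index):
        """Drop the latest instruction on one paragraph without touching the rest."""
        notes = self.sections[section][index].notes
        return notes.pop() if notes else None


doc = Document("Parody Dossier")
doc.add("BBQ jokes", "Placeholder paragraph scraped from a post.")
doc.add("Best comments", "Placeholder comment worth keeping.", pinned=True)
prompt = doc.chat("BBQ jokes", 0, "This one needs fleshing out.")
```

the point of the design is that history hangs off each paragraph instead of one linear transcript, so a bad instruction can be undone locally without poisoning everything else.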

chat, with its simplicity, understanding of complex sentences, and multi-sentence conversations, was a UI paradigm leap. it went well past keyword search and the new command line of the internet. it's a great first step, and a nice reset after a decade of interface stagnation, but now its ubiquity and simplicity, like the search box, is clouding people's imagination and ability to dream up the next new interactive interface, which I expect to involve more mouse and visual relationships.

tl;dr: the LLM is a component of the next generation interface, not the entire interface itself.