Hacker News
by brunorsini 1089 days ago
"There is nothing, I repeat nothing, more intimidating than an empty text box staring you in the face."

Talk about a hyperbolic opening line.

Is it really that intimidating to have an empty text box on Whatsapp or your favorite SMS app? No, as you expect to have an appropriate response coming from the other side, pretty much regardless of what your input is.

As a frequent user of ChatGPT, I've come to expect the same in there. And it works great, without me having to study any "prompt engineering". In fact, as it gets updated, I get frustrated less and less often — unlike my experience using Bard, which can be better for a few tasks but often returns opaque errors that do feel frustrating. The solution here is clearly for the model to improve, and one doesn't even need a leap of faith — just look at what OpenAI is already delivering!

Talking to a competent LLM is nothing like talking to bash or dos. I also get frustrated when I sometimes have to ask for the same thing in a slightly different way... but that's still almost always faster than searching for the right button or submenu in most creation-oriented software. Whoever is waiting for Word or Google Docs to add a "write this in business-formal email tone" dropdown menu to the UI clearly hasn't grokked the true shift we're about to go through in computing.

Incidentally, I am often using ChatGPT to help me do more advanced / rarely used tasks in software from Avid Pro Tools to Adobe Premiere. And I can't remember a single time when doing this was slower or more frustrating than reaching out to either Google or the software's own "help" section.

Of course we'll have more input options. It makes tons of sense for things like image or video generation. I bet the models will also soon be outputting more and more "interactive elements" that will aid in refining results. But I have a feeling the opening text box (or, better yet, the open ears of a friendly audio assistant) is here to stay.

3 comments

> or, better yet, the open ears of a friendly audio assistant

It’s interesting you mention this. I’ve been wondering about this for a while now - there have been major leaps recently in LLMs, speech synthesis and speech recognition. There are sophisticated language models, computer voices that are hard to distinguish from real humans, and software that can reliably understand even the worst recording of someone speaking. Yet those three components still have not been integrated into a next-generation Alexa. But why? It doesn’t even sound particularly complicated (on the scale of all the prior art necessary).

It makes it seem creepy and hides that it’s a robot. The limitations are easier to understand when someone knows it isn’t a human.
Bing Chat already takes voice input and gives voice replies, but it still requires the push of a button in the UI to start; it can't yet run as a voice assistant in the background.
Principal–agent problem! Previous generation assistants have been frozen in time by managerial capitalism. This is evident in literally all the incumbents that matter in the western world: Google, Amazon, Apple, Microsoft and Samsung.

It took founder-led OpenAI to kick everyone in the butt. Thankfully the wheels are moving again to get to what you're describing, an inevitability in the very near future.

Who are the principal and the agent here?
Sam Altman is the principal. The agents are MAMAA middle managers, who are often also smart (though I'm clearly biased here, having been one of those in my previous life) but highly incentivized to be obedient and risk-averse.
> Is it really that intimidating to have an empty text box on Whatsapp or your favorite SMS app? No, as you expect to have an appropriate response coming from the other side, pretty much regardless of what your input is.

Yes, very much yes - sure, I can expect to get an appropriate response, but that doesn't change the fact that I don't know what to write to start the conversation. An empty text box does indeed scare me - if I want to write something, but have no good idea what to write (or more than a couple of competing ideas that feel equivalent), my mind simply goes blank.

Yes, this applies to ChatGPT too. There are a million things I want to bounce off GPT-4. But when I have the time, none of those things come to mind.

Surely the author and I are not the only ones who feel this way. There's a reason the "fear of the empty page" is a term amongst writers. There's a reason you may occasionally hear of the "fear of the empty text editor" in the context of programming.

Of course, if I know what I want, then it's all fine - except, I find myself constantly constrained by my own typing speed. Doubly so now, with the recently improved response time of OpenAI's GPT endpoints.

Do you go to google.com when you have nothing to search for?

When you do, does the current opening UI feel that inadequate?

There's a simple reason they haven't added tons of UI elements for things like advanced search operators: the vast majority of the queries and the vast majority of users simply don't need them to get what they want from the tool.

FWIW, they have no reliable way of measuring what "vast majority of users" "want from the tool", or when they get it. Users navigating to one of the search results and abandoning the search may mean they found what they want - or it may just mean they gave up.
"There is nothing, I repeat nothing, more intimidating than your blind date across the table staring you in the face."