Hacker News
by acrooks 602 days ago
I've built a couple of experiments using it so far and it has been really interesting.

On one hand, it has really helped me with prototyping incredibly fast.

On the other, it is prohibitively expensive today. Essentially you pay per click, in some cases per keystroke. I tried to get it to find a flight for me. So it opened the browser, navigated to Google Flights, entered the origin, destination etc. etc. By the time it saw a price, there had already been more than a dozen LLM calls. And then it crashed due to a rate limit issue. By the time I got a list of flight recommendations it was already $5.
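
To make the "pay per click" point concrete, here is a rough back-of-the-envelope sketch. It assumes (my assumption, not something the product documents here) that every step resends the whole history of screenshots and actions, so input tokens grow with each call. The function name, token counts, and prices are all hypothetical placeholders; plug in real numbers from your own usage logs.

```python
def estimate_session_cost(steps, tokens_per_step, price_per_million_input,
                          output_tokens_per_step=0, price_per_million_output=0.0):
    """Estimate the cost of an agent session in dollars.

    Assumption: call number k carries the context of all k steps so far,
    so input tokens grow linearly per call and quadratically overall.
    """
    total_input = 0
    for step in range(1, steps + 1):
        # Each call resends everything accumulated up to this step.
        total_input += step * tokens_per_step
    total_output = steps * output_tokens_per_step
    return (total_input * price_per_million_input
            + total_output * price_per_million_output) / 1_000_000


# Hypothetical numbers: 3 steps, 1,000 input tokens per step, $1 per million tokens.
cost = estimate_session_cost(steps=3, tokens_per_step=1000, price_per_million_input=1.0)
```

Under that assumption, doubling the number of clicks roughly quadruples the input cost, which is why a long click-through session gets expensive so quickly.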

But I think this is intended to be an early demo of what will be possible in the future. And they were very explicit that it's a beta: all of this feedback above will help them make it better. Very quickly it will get more efficient, less expensive, more reliable.

So overall I'm optimistic to see where this goes. There are SO many applications for this once it's working really well.

4 comments

I guess I'm confused there's even a use case there. It's like "let me google that for you". I mean Siri can return me search results for flights.

A real killer app would be something that is adaptive and smart enough to deal with all the SEO/walled gardens in the travel search space, actually understanding the airlines available and searching directly there as well as at aggregators. It could also be integrated with your airline miles accounts and suggest options for using miles, miles & cash, cash, etc.

All of that is far more complex than ... clicking around Google Flights on your behalf and crashing.

Further, the real killer app is one bulletproof enough that you would entrust it to book said best flight for you. This requires getting the product to 99.99% reliability rather than the perpetual 70-80% we are seeing all these LLM use cases hit.

The airline booking + awards redemption use case is a mostly solved problem. Hardcore mileage redemption enthusiasts use paid tools like ExpertFlyer that present a UI and API for peeking into airline reservation backends. It has a steep learning curve, for sure.

ThePointsGuy blog tried to implement something that directly tied into airline accounts to track mileage/points and redemption options, but I believe they got slapped down by several airlines for unauthorized scraping. Airlines do NOT like third parties having access to frequent flier accounts.

While the strategy to find good deals / award space is a solved problem, the search tools to do so aren't. Tools like ExpertFlyer are super inefficient: they permit at most one origin + one destination + one airline per search. What if you're happy to go anywhere in Western Europe? Or if you want to check several different airlines? Then all of a sudden your one EF search might turn into dozens. And as you say, pretty much all of the aggregator tools are getting slapped down by airlines, so they increasingly have more limited availability and some are shutting down completely.

And then add the complexity that you might be willing to pay cash if the price is right ... so then you add dozens more searches to that on potentially many websites.
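
A quick sketch of how fast that fans out, assuming a tool that only accepts one origin + one destination + one airline per query. The function name, the dict layout, and the airline labels are made up for illustration; the airport codes are just examples.

```python
from itertools import product


def enumerate_searches(origins, destinations, airlines, payment_modes=("miles",)):
    """List every single-origin, single-destination, single-airline query
    needed to cover all the combinations you care about."""
    return [
        {"origin": o, "destination": d, "airline": a, "payment": p}
        for o, d, a, p in product(origins, destinations, airlines, payment_modes)
        if o != d  # skip nonsensical same-airport routes
    ]


# Hypothetical trip: one home airport, "anywhere in Western Europe" narrowed
# to six airports, four candidate airlines (placeholder names).
origins = ["SFO"]
destinations = ["LHR", "CDG", "AMS", "FRA", "MAD", "FCO"]
airlines = ["Airline A", "Airline B", "Airline C", "Airline D"]

miles_only = enumerate_searches(origins, destinations, airlines)
with_cash = enumerate_searches(origins, destinations, airlines, ("miles", "cash"))
print(len(miles_only), len(with_cash))
```

That is 24 manual searches for miles alone and 48 once cash is in the mix, before you even vary the dates.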

All of this is "easy" and a solved problem but it's incredibly monotonous. And almost none of these services offer an API, so it's difficult to automate without a browser-based approach. And a travel agent won't work this hard for you. So how amazing would it be instead to tell an AI agent what you want, have it pretend to be you for a few minutes, and get a CSV file in your inbox at the end.

Whether this could be commercialised is a different question but I'm certainly going to continue building out my prototype to save myself some time (I mean, to be fair, it will probably take more time to build something to do this on my behalf but I think it's time well spent).

Yes, I think this points to the need for adaptiveness, which remains humans' edge.

We don't need PBs of training data, millions of dollars of compute, and hours upon hours of training.

You could sit down a moderately intelligent intern as a mechanical turk to perform this workflow with only a few minutes of instruction and get a reasonably good result.

Ah, but I think you're overlooking one major factor. Convenience. A lot of the spontaneous stuff we do ("hey why don't we pop down to x tomorrow?", or "do you fancy a quick curry?") are things you're not going to book with a Turk. BUT you definitely would fire up a quick agent on your way to the shower and have it do all the work for you while you're waxing your armpits. :) Agentic work is starting super slow, but once the wrinkles are worked out, we'll see a world where they're doing a huge amount of heavy lifting for the drudge stuff. For an example see Her - sorry! :)
Underrated comment.
Yes, that seems to be the larger challenge. The search tools I have used will work for a while until they don't. Real cat & mouse game.

Hence the "adaptive" part of my comment.

It really needs to be a client side agent.

Haiku 3.5 will be here soon, and will before long support tool use and vision, so that should help a lot with cost.
It’s running in the browser but connected to a VM, right? When you say crashed, what did it do?
time is also a huge factor on this one, should be a nice metric

god the future is here haha