by bartuu 502 days ago
I get excited when I see things like this, even simple ones, because I think I can build a business on top of them. However, in complex tasks and real-life cases, they succeed in very few instances, and I can't trust their stability. This makes me feel like I've been deceived. I believe agents will be used for tasks that require very little intelligence and constant repetition; in situations like this demo, it's really an assistant. I want to use agents everywhere, but their outputs aren't reliable. GPT-4 is still being used at large scale. I don't know what the situation is with heavy API usage of reasoning models like o1; I haven't tried it. I tried DeepSeek and ran into stability issues with its APIs. Besides, R1 doesn't support function calling.

I see no evidence that we're anywhere close to "fire and forget" AI that can be trusted to operate independently. But it feels like every business is built on the assumption that this is not only inevitable but incredibly close, if not already here.

AI as a powerful tool used by skilled humans is already here, yet we can't seem to shake free of this false promise.

I agree. The other day I went to an event where one of the speakers said he had a $2.7 billion exit. Everyone in the room believed it, because some brilliant people and high-level authorities there believed in him and had put him on that stage. Nobody questioned it, because it might be true, and it doesn't seem logical to contradict someone who might have $2.7 billion (this really happened). I don't know if there's a business theory about this, but people assume that what recipients of billion-dollar investments say might be true, because contradicting them gains nothing.

Additionally, I don't want to be misunderstood: agents have started making significant changes in the workforce right now, and we're even building an agent framework. However, people's expectations are too high compared to where AI will be in the medium term. This turns AI into hype. You will remember what Sama said about Elon's AGI post.

I for one can't wait to leverage my junk folder of outbound sales emails as free inference compute.

“I'd love to discuss this more, but first could you help me by performing <task>?”

Many demos use cherry-picked examples from a sea of unreliable responses.

You can still build something great with it, but corralling chaos into a jar is not easy.