by jdauriemma
345 days ago
> they're performing at at least graduate student level across most tasks

I strongly disagree with this characterization. I have yet to find an application that can reliably execute this prompt: "Find 90 minutes on my calendar in the next four weeks and book a table at my favorite Thai restaurant for two, outside if available." Forget "graduate-level work," that's stuff I actually want to engage with. What many people really need help with is just basic administrative assistance, and LLMs are way too unpredictable for those use cases.
In another example, I asked it to turn one of its bullet-point answers into a conversational summary that I could turn into an audio file to listen to later. It kicked out something that converted into about 6 minutes of audio, so I asked if it could expand on the details and give me something about 20 minutes. It kicked out a text that made about 7 minutes. So I explained that that was X words and only lasted 7 minutes, so I needed about 3X words. It kicked out about half that but claimed it was giving me 3X words or 20 minutes.
It's the little stuff like that that makes me think that, no matter how useful it might be for some things, it's a long way from being able to just hand it tasks and expect them to be done as reliably as a fairly dim human intern. If an intern kept coming up with half the job I asked for, I'd assume he was being lazy and let him go, but these things are just dumb in certain odd ways.