The subjects in the study you are referencing also believed that they were more productive with it. What metrics do you have to convince yourself you aren't under the same illusionary bias they were?
LLM's still give subpar results with ffmpeg. For example when I asked Sonnet to trim a long video with ffmpeg, it put the input file parameter before the start time parameter, which triggers an unnecessary decode of the video file. [1]
Sure, use the LLM to get over the initial hump. But ffmpeg's no exception to the rule that LLM's produce subpar code. It's worth spending a couple minutes reading the docs to understand what it did so you can do it better, and unassisted, next time.
That says more about suboptimal design on ffmpeg's part than it does about the LLM. Most humans can't deal with ffmpeg command lines, so it's not surprising that the LLM misses a few tricks.
You don't have to leave LLM code alone, it's fine to change it — unless, I guess, you're doing some kind of LLM vibe-code-golfing?
But this does remind me of a previous co-worker. Wrote something to convert from a custom data store to a database, his version took 20 minutes on some inputs. Swore it couldn't possibly be improved. Obviously ridiculous because it didn't take 20 minutes to load from the old data store, nor to load from the new database. Over the next few hours of looking at very mediocre code, I realised it was doing an unnecessary O(n^2) check, confirmed with the CTO it wasn't business-critical, got rid of it, and the same conversion on the same data ran in something like 200ms.
I remember when I was a kid, people asking a teacher how to spell a word, and the answer was generally "look it up in a dictionary"… which you can only do if you already have shortlist of possible spellings.
*nix man pages are the same: if you already know which tool can solve your problem, they're easy to use. But you have to already have a shortlist of tools that can solve your problem, before you even know which man pages to read.
(I am very confident I am not the only person who has been deterred by ffmpeg's legendarily complex command-line interface. I feel no shame about this at all.)
To be a little more fair... that example is tidily slotted into the EXAMPLES section, under the heading "You can extract images from a video, or create a video from many images".
I don't think most people read the man pages top to bottom. And even if they did, then for as much grief as you're giving ffmpeg, llm has an even larger burden... no man page and the docs weigh in at over 8k lines.
I get the general point that ffmpeg is a powerful, complex tool... but this is a weird fight to pick.
Ffmpeg is genuinely complicated! And the CLI is convoluted (in justifiable, and unfortunate ways).
But if you approach ffmpeg from the perspective of "I know this is possible", you are always correct, and can almost always reach the "how" in a handful of minutes.
You wouldn't have just typed "extract frame at timestamp as jpeg ffmpeg" into Google and used the StackExchange result that comes up first that gives you a command to do exactly that?
My general point is that people say things like "yeah, but this one study showed that programmers over-estimate the productivity gain they get from LLMs so how can you really be sure?"
Meanwhile I've spent the past two years constantly building and implementing things I never would have done because of the reduction in friction LLM assistance gives me.
I wrote about this first two years ago - AI-enhanced development makes me more ambitious with my projects - https://simonwillison.net/2023/Mar/27/ai-enhanced-developmen... - when I realized I was hacking on things with tech like AppleScript and jq that I'd previously avoided.
It's hard to measure the productivity boost you get from "wouldn't have built that thing" to "actually built that thing".
Translated a vague natural language query ("cli, extract frame 13s into video") into something immediately actionable with specific examples and explanations, surfacing information that I would otherwise not know how to search for.
That's what I've done with my ffmpeg LLM queries, anyway - can't speak for simonw!
Search engines have been able to translate "vague natural language queries" into search results for a decade, now. This pre-existing infrastructure accounts for the vast majority of ChatGPT's apparent ability to find answers.
This is the rub for me… people are so quick to forget the original source for a lot of the data these models were trained on, and how easy and useful these platforms were. Now Google will summarize this question for you in an AI overview before you even land on Stack Overflow. It’s killing the network effect of the open web and destroying our crowd sourced platforms in favor of a lossy compression algorithm that will eventually be regurgitating its own entrails.
If I didn't have an LLM to figure that out for me I wouldn't have done it at all.