The conclusions are very optimistic given the results. The LLMs:

* Failed to properly understand and respond to the requirements for component selection, which were already pretty generic.
* Succeeded in parsing the pinout for an IC but produced an incomplete footprint with incorrect dimensions.
* Added extra components to a parsed reference schematic.
* Produced very basic errors in a description of filter topologies and chose the wrong one given the requirements.
* Generated utterly broken schematics for several simple circuits, with missing connections and aggressively incorrect placement of decoupling capacitors.

Any one of these failures, individually, would break the entire design.

The article's conclusion for this section buries the lede slightly:

> The AI generated circuit was three times the cost and size of the design created by that expert engineer at TI. It is also missing many of the necessary connections.

Cost and size are irrelevant if the design doesn't work. LLMs aren't a third as good as a human at this task, they just fail.

The LLMs do much better converting high-level requirements into (very) high-level source code. This makes sense (it's fundamentally a language task), but also isn't very useful. Turning "I need an inverting amplifier with a gain of 20" into "amp = inverting_amplifier('amp1', gain=-20.0)" is pretty trivial.

The fact that LLMs apparently perform better if you literally offer them a cookie is, uh... something.
But the bottom line is that it's a task that a novice could have solved with a Google search or two, and the LLM fumbled it in ways that'd be difficult for a non-expert to spot and rectify. LLMs are generally pretty good at information retrieval, so it's quite disappointing.
The cookie thing... well, they learn statistical patterns. People on the internet often try harder if there is a quid pro quo, so the LLMs copy that, and it slips past RLHF because "performs as well with or without a cookie" is probably not one of the things they optimize for.