by londons_explore 1093 days ago
I added a test[1] to OpenAI Evals that showed that GPT-4 doesn't have an awful lot of understanding of most component datasheets in the training set. It only got ~16% of my questions correct - and all questions could be answered by a domain expert with access to the datasheet.

Example question: You are to answer each given question with a single number, with no extra characters or units. Round every answer to one significant figure. A IRF540N transistor has an R_DS(ON) of what, in Ohms, at V_GS=10v?

Answer: 0.04

[1]: https://github.com/Hello1024/evals/blob/4e9bbce1390ce427acb1...
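
To make the answer format in the example question concrete, here is a minimal Python sketch of rounding a datasheet value to one significant figure and checking a model's reply against it. It is not taken from the linked eval: the names `round_sig1` and `is_correct` are hypothetical, the numeric comparison is an assumption about how one might grade, and 0.044 is only an illustrative input value.

```python
from decimal import Decimal, InvalidOperation, ROUND_HALF_UP


def round_sig1(value: str) -> Decimal:
    """Round a decimal string to one significant figure (hypothetical helper)."""
    d = Decimal(value)
    if d == 0:
        return Decimal(0)
    # adjusted() is the exponent of the most significant digit,
    # so 10**adjusted() is the place value we round to.
    quantum = Decimal(1).scaleb(d.adjusted())
    return d.quantize(quantum, rounding=ROUND_HALF_UP)


def is_correct(model_answer: str, datasheet_value: str) -> bool:
    """Accept only a bare number equal to the rounded datasheet value.

    Assumption: answers are compared numerically, so "0.040" also passes.
    Anything with units or extra characters fails to parse and is wrong.
    """
    try:
        answer = Decimal(model_answer.strip())
    except InvalidOperation:
        return False
    return answer == round_sig1(datasheet_value)


print(is_correct("0.04", "0.044"))       # bare number, rounded as requested
print(is_correct("0.04 ohms", "0.044"))  # units are not allowed by the prompt
```

Comparing numerically rather than by exact string is a design choice here; a stricter grader could compare the raw strings instead.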


hey, I'm the lead engineer on this project! I love that you did this :)

our experience thus far is that GPT-4 definitely needs help when it comes to datasheet-like data. it's great at broad "functional" strokes, but details, and particularly numerical details, are not its forte.

we're working on it!

I think the real solution is to manually convert component datasheets to SPICE models (I think there are already commercial databases of these), simulate the circuit, and then get a language model like GPT-4 to propose changes to the circuit to make it perform better.

The real question becomes how to interface the SPICE model and the language model. For example, do you let it connect a virtual oscilloscope to any node, and give the language model the results back as plain numbers?

Our thinking has been the same: give the LLM the ability to run the simulator and optimize the solution based on the outputs.

Lots of amazing opportunities ahead
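
As a rough illustration of the simulate-and-propose loop discussed in this thread, here is a small Python sketch. Everything in it is a stand-in: `simulate` is a toy resistive-divider formula rather than a real SPICE run, `probe_report` is one hypothetical way to hand node voltages back as plain numbers, and `propose_change` is a hard-coded rule sitting where a language model call would go. None of these names come from the project or from any real library.

```python
from typing import Callable, Dict

Circuit = Dict[str, float]  # component name -> value (volts or ohms)


def simulate(circuit: Circuit) -> Dict[str, float]:
    """Toy stand-in for a SPICE run: node voltages of a two-resistor divider."""
    vin, r1, r2 = circuit["VIN"], circuit["R1"], circuit["R2"]
    return {"in": vin, "out": vin * r2 / (r1 + r2)}


def probe_report(voltages: Dict[str, float]) -> str:
    """'Virtual oscilloscope': render node voltages as plain-number text."""
    return "\n".join(f"V({n}) = {v:.4g}" for n, v in sorted(voltages.items()))


def propose_change(report: str, target: float, circuit: Circuit) -> Circuit:
    """Placeholder for the language model: it only sees the text report.

    Assumption: a fixed rule that scales R2 by target / V(out).
    """
    readings = {}
    for line in report.splitlines():
        name, value = line.split(" = ")
        readings[name] = float(value)
    updated = dict(circuit)
    updated["R2"] = circuit["R2"] * target / readings["V(out)"]
    return updated


def optimize(circuit: Circuit, target: float,
             propose: Callable[[str, float, Circuit], Circuit],
             max_iters: int = 50, tol: float = 0.01) -> Circuit:
    """Simulate, report, ask for a change, and repeat until V(out) is in tolerance."""
    for _ in range(max_iters):
        voltages = simulate(circuit)
        if abs(voltages["out"] - target) <= tol:
            break
        circuit = propose(probe_report(voltages), target, circuit)
    return circuit


start = {"VIN": 5.0, "R1": 1000.0, "R2": 1000.0}
final = optimize(start, target=3.3, propose=propose_change)
print(probe_report(simulate(final)))
```

Keeping the proposer behind a plain function signature means the hard-coded rule could later be swapped for a model call without touching the loop, which is the interface question raised above.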