| HN Mirror


by londons_explore 1093 days ago

I added a test[1] to OpenAI Evals that showed that GPT-4 doesn't have an awful lot of understanding of most component datasheets in the training set. It only got ~16% of my questions correct - and all questions could be answered by a domain expert with access to the datasheet.

Example question: You are to answer each given question with a single number, with no extra characters or units. Round every answer to one significant figure. A IRF540N transistor has an R_DS(ON) of what, in Ohms, at V_GS=10v?

Answer: 0.04

[1]: https://github.com/Hello1024/evals/blob/4e9bbce1390ce427acb1...
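A minimal sketch of how an answer to a question like the one above could be checked, assuming the grader parses the reply as a bare number and compares it at one significant figure. The function names `round_sig` and `is_correct` are hypothetical and are not taken from the linked eval, whose actual grading logic may differ.

```python
import math


def round_sig(value: float, sig: int = 1) -> float:
    """Round value to `sig` significant figures."""
    if value == 0:
        return 0.0
    exponent = math.floor(math.log10(abs(value)))
    return round(value, sig - 1 - exponent)


def is_correct(model_output: str, expected: float, sig: int = 1) -> bool:
    """Return True if the reply is a bare number matching `expected` at `sig` figures."""
    try:
        answer = float(model_output.strip())
    except ValueError:
        # Extra characters or units (e.g. "0.04 ohms") count as wrong.
        return False
    if not math.isfinite(answer):
        return False
    return math.isclose(round_sig(answer, sig), round_sig(expected, sig), rel_tol=1e-9)


print(is_correct("0.04", 0.04))       # bare number, matches
print(is_correct("0.04 ohms", 0.04))  # units present, rejected
print(is_correct("0.4", 0.04))        # wrong order of magnitude, rejected
```

Note that this checker is lenient about rounding: a reply of `0.044` is accepted because it rounds to the expected value. A stricter grader could require an exact string match instead.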


jptlnk 1093 days ago

hey, I'm the lead engineer on this project! I love that you did this :)

our experience thus far is that GPT-4 definitely needs help when it comes to datasheet-like data. it's great at broad "functional" strokes, but details, and particularly numerical details, are not its forte.

we're working on it!


londons_explore 1093 days ago

I think the real solution is to manually convert component datasheets to SPICE models (I think there are already commercial databases of these), simulate the circuit, and then have a language model like GPT-4 propose changes to the circuit to make it perform better.

The real question becomes how to interface the SPICE model and the language model. Do you, for example, let it connect a virtual oscilloscope to any node, and give the language model the results back as plain numbers?
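One way the "virtual oscilloscope" idea could look, as a rough sketch only. No real SPICE engine is called here: `simulate_rc_step` is a hypothetical stand-in that uses the analytic step response of an RC low-pass filter, and `probe` is a hypothetical helper that turns one node's waveform into plain text a language model could read.

```python
import math


def simulate_rc_step(r_ohms, c_farads, v_in, times):
    """Stand-in for a SPICE run (assumption): analytic RC low-pass step response."""
    tau = r_ohms * c_farads
    return {
        "in": [v_in for _ in times],
        "out": [v_in * (1.0 - math.exp(-t / tau)) for t in times],
    }


def probe(waveforms, node, times):
    """'Virtual oscilloscope': format one node's samples as plain time,voltage rows."""
    lines = [f"node={node}", "t_s,v_volts"]
    for t, v in zip(times, waveforms[node]):
        lines.append(f"{t:.3e},{v:.4f}")
    return "\n".join(lines)


# Sample the output node of a 1 kOhm / 1 uF filter driven by a 5 V step.
times = [i * 1e-3 for i in range(6)]
waves = simulate_rc_step(1e3, 1e-6, 5.0, times)
report = probe(waves, "out", times)
print(report)
```

The resulting text could be pasted into a prompt as-is. Whether raw samples or derived measurements (rise time, overshoot, gain) work better for a model is an open design choice.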


rock_hard 1093 days ago

Our thinking has been the same

Give the LLM the ability to run the simulator and optimize the solution based on the outputs

Lots of amazing opportunities ahead
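The simulate-then-revise loop described in this thread might be sketched as below. Everything here is an assumption for illustration: `rise_time_10_90` stands in for a real simulator, and `scripted_proposer` is a placeholder for the language model, not an actual model call.

```python
import math
from typing import Callable, List, Tuple


def rise_time_10_90(r_ohms: float, c_farads: float) -> float:
    """Stand-in simulator (assumption): 10%-90% rise time of an RC low-pass, tau * ln(9)."""
    return r_ohms * c_farads * math.log(9.0)


def optimize(
    r_ohms: float,
    c_farads: float,
    target_s: float,
    propose: Callable[[float, float, float], float],
    max_rounds: int = 20,
) -> Tuple[float, List[Tuple[float, float]]]:
    """Simulate, show the measurement to the proposer, apply its change, repeat."""
    history = []
    for _ in range(max_rounds):
        measured = rise_time_10_90(r_ohms, c_farads)
        history.append((r_ohms, measured))
        # Stop once the measurement is within 1% of the target.
        if abs(measured - target_s) / target_s < 0.01:
            break
        r_ohms = propose(r_ohms, measured, target_s)
    return r_ohms, history


def scripted_proposer(r_ohms: float, measured: float, target_s: float) -> float:
    """Placeholder for the language model: scale R by target / measured."""
    return r_ohms * target_s / measured


final_r, history = optimize(1e3, 1e-6, 1e-3, scripted_proposer)
print(final_r, len(history))
```

In a real setup the proposer would receive the simulator output as text and return a circuit change, and the loop would need guards against invalid or unsafe component values.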
