by spmurrayzzz 749 days ago
> In fact you deliberately asked for something impossible and hold up undefined behavior as undefined like it's impugning something.

Correct, I did. This is a direct indictment of a given model's ability to plan/reason in this particular context. There are plenty of situations where models will respond with "Sorry, that's not possible". Ask GPT-4 "Tell me how to grow biological wings on a human" and it will respond with something along the lines of "this isn't currently possible, but here's a theoretical exploration of the idea".

GPT-4 gets very close on the node.js question on its own, with a response breakdown similar to the one above, provided the prompt is clear and detailed enough. But I test the open-weight models in the same way to see if they have the capacity to exhibit similar reasoning or a chain-of-thought process on their own. They usually don't without excessive prompt engineering or few-shot examples.

I said that I don't expect models to get this right, not because I don't _want_ them to, but because I think it's an important milestone when they do. Autoregressive token prediction is unlikely to produce the real outcome I'm testing for here, but if it ever does, that's an interesting finding.