FYI, the upcoming version of gpt4 does considerably worse emulating function calls / generating code-like strings, but gets better again if you switch to the function call API: https://twitter.com/reissbaker/status/1671361372092010497
(My guess is the same is true of gpt-3.5, although I haven't tested it.)
Are there any useful alternative models though? Most I've found weren't particularly good at following instructions or using tools in the way langchain provides them.