I'm still waiting for this. Here are some recent tweets from people trying variations of the river crossing puzzle. All LLMs seem to fail pretty badly.
LLMs are impressive enough as they are (compression systems with a human-language interface); you don't need to hype them up into something they're not.
https://twitter.com/jeremyphoward/status/1783712611126964627
https://twitter.com/WaltonStevenj/status/1785145923771011215
https://twitter.com/colin_fraser/status/1785132544482226679