Is Vulkan the response to the fact that single cores aren't getting faster and games need to move towards multithreading? Is there an analogous solution for other systems like physics, collisions or pathfinding?
The TL;DR reason for those new 3D APIs is to drastically reduce the CPU work that has to happen in the graphics driver layer of the "old" 3D APIs. That also means that if your application isn't spending much CPU time in the graphics driver, for instance because it is fill-rate bound, it won't benefit much from moving to the new APIs.
OpenGL's original design was a "fine grained state machine" which doesn't map well to modern GPU architectures. Every time a "micro state" in that big state machine is changed, the GL driver needs to translate that change into the much coarser state that GPUs accept. But it turns out that many 3D applications don't even need to change individual states one by one, so each frame your code translates mostly static and coarse "application rendering state" into GL's fine grained state, only to have the GL driver translate that fine grained state back into coarse GPU state.
That's just one piece of the puzzle, but I think it explains the motivation behind the modern 3D APIs best.
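To make the double translation a bit more concrete, here is a toy C++ model, not real GL or Vulkan code. Every name in it (`CoarseState`, `FineGrainedContext`, `PipelineContext`, the two counting functions) is made up for illustration, and the assumption that a driver re-translates state on every draw that follows a state change is a simplification; real drivers are smarter about redundant changes.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical coarse state block, roughly what the GPU wants to consume.
struct CoarseState {
    bool blend;
    bool depth_test;
    int cull_mode;
};

// GL-style model: states are set one by one, and the "driver" has to
// re-translate them into a coarse block at draw time if anything was touched.
class FineGrainedContext {
public:
    void set_blend(bool v) { pending_.blend = v; dirty_ = true; }
    void set_depth_test(bool v) { pending_.depth_test = v; dirty_ = true; }
    void set_cull_mode(int v) { pending_.cull_mode = v; dirty_ = true; }
    void draw() {
        if (dirty_) {  // assumed cost: one translation per dirty draw
            ++translations_;
            dirty_ = false;
        }
    }
    std::size_t translations() const { return translations_; }

private:
    CoarseState pending_{false, false, 0};
    bool dirty_ = true;
    std::size_t translations_ = 0;
};

// Pipeline-object model: the coarse block is translated once at creation
// time, and binding it later is just a handle switch.
class PipelineContext {
public:
    std::size_t create_pipeline(const CoarseState& state) {
        pipelines_.push_back(state);
        ++translations_;
        return pipelines_.size() - 1;
    }
    void bind(std::size_t handle) { bound_ = handle; }
    void draw() {}  // nothing left to translate at draw time
    std::size_t translations() const { return translations_; }

private:
    std::vector<CoarseState> pipelines_;
    std::size_t bound_ = 0;
    std::size_t translations_ = 0;
};

// Draw an opaque and a transparent material every frame, GL style.
std::size_t fine_grained_translations(std::size_t frames) {
    FineGrainedContext ctx;
    for (std::size_t f = 0; f < frames; ++f) {
        ctx.set_blend(false); ctx.set_depth_test(true); ctx.set_cull_mode(1);
        ctx.draw();
        ctx.set_blend(true); ctx.set_depth_test(true); ctx.set_cull_mode(0);
        ctx.draw();
    }
    return ctx.translations();
}

// Same two materials, but baked into pipeline objects up front.
std::size_t pipeline_translations(std::size_t frames) {
    PipelineContext ctx;
    const std::size_t opaque = ctx.create_pipeline(CoarseState{false, true, 1});
    const std::size_t transparent = ctx.create_pipeline(CoarseState{true, true, 0});
    for (std::size_t f = 0; f < frames; ++f) {
        ctx.bind(opaque);
        ctx.draw();
        ctx.bind(transparent);
        ctx.draw();
    }
    return ctx.translations();
}
```

In this model the fine grained path pays for a translation on every draw, while the pipeline path pays once per material no matter how many frames are rendered. That is the shape of the saving, not a measurement of any real driver.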
The "other" 3D API, Direct3D, had already taken steps to group fine grained state into coarser state (starting with D3D10 and D3D11). The problem there was that they didn't come up with a good solution for threaded rendering, i.e. generating rendering work on different CPU threads. That's the other big thing the modern 3D APIs solve properly: you essentially build render command lists on multiple CPU threads, and then enqueue those command lists on the main thread to be processed by the GPU.
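The "record on many threads, submit on one" pattern can be sketched in plain C++ without any graphics API. This is only an illustration: `CommandList`, `record_range` and `render_frame` are hypothetical names, and the commands are just strings standing in for real draw calls.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for an API command list: just the recorded commands.
struct CommandList {
    std::vector<std::string> commands;
};

// Record draw commands for objects [first, last) into a thread-private list.
CommandList record_range(std::size_t first, std::size_t last) {
    CommandList list;
    for (std::size_t i = first; i < last; ++i) {
        list.commands.push_back("draw object " + std::to_string(i));
    }
    return list;
}

// Split the objects across worker threads, then "submit" the finished
// lists in order on the calling thread.
std::vector<std::string> render_frame(std::size_t object_count,
                                      std::size_t worker_count) {
    if (worker_count == 0) worker_count = 1;
    std::vector<CommandList> lists(worker_count);
    std::vector<std::thread> workers;
    const std::size_t chunk = (object_count + worker_count - 1) / worker_count;

    for (std::size_t w = 0; w < worker_count; ++w) {
        const std::size_t first = std::min(w * chunk, object_count);
        const std::size_t last = std::min(first + chunk, object_count);
        // Each worker writes only to its own slot, so no locking is needed.
        workers.emplace_back([&lists, w, first, last] {
            lists[w] = record_range(first, last);
        });
    }
    for (auto& worker : workers) worker.join();

    // Single-threaded submission: the order of the lists defines GPU order.
    std::vector<std::string> queue;
    for (const auto& list : lists) {
        queue.insert(queue.end(), list.commands.begin(), list.commands.end());
    }
    return queue;
}
```

Calling `render_frame(10, 4)` records ten draws on four threads and returns them in object order, because submission order is decided by the calling thread rather than by whichever worker finishes first.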
That's part of it, but there are other reasons too. Promit[1] had a nice post that described four goals of the new generation of APIs: improving validation, reducing the complexity of the driver, allowing useful multi-threading, and giving developers more control over how the available hardware is used (e.g. multi-GPU). It's not an exhaustive list, but he filled in some of the backstory quite well.