The other important bit here is garbage collection.
Local and anonymous functions that capture lexical environments really, really work much better in languages built around GCs.
Without garbage collection, even a trivial closure (as in JavaScript or the Lisps) suddenly has to make a lot of decisions about how it references data that may live either on the stack or on the heap.
Yes, C++ is a great example of having to make decisions that don't have good solutions without a GC or something like it. See the mentions of undefined behaviour in the relevant sections of the standard, e.g. when a lambda captures by reference something with a limited lifetime and then outlives it.
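To make that concrete, here is a rough sketch of the choices a C++ programmer has to spell out for something as small as a counter closure. The function names are made up for illustration; only `std::function` and `std::make_shared` come from the standard library. The by-reference version is left commented out because calling its result would be undefined behaviour.

```cpp
#include <functional>
#include <memory>

// Option 1: copy the captured variable into the closure.
// Safe to return, but every copy of the closure gets its own private state.
std::function<int()> make_counter_by_value() {
    int count = 0;
    return [count]() mutable { return ++count; };
}

// Option 2: move the state to the heap and share ownership explicitly.
// This is roughly what a garbage-collected language does for you implicitly.
std::function<int()> make_counter_shared() {
    auto count = std::make_shared<int>(0);
    return [count]() { return ++*count; };
}

// Option 3: capture by reference. This compiles, but `count` is destroyed
// when the function returns, so calling the result is undefined behaviour.
// std::function<int()> make_counter_dangling() {
//     int count = 0;
//     return [&count]() { return ++count; };
// }
```

In JavaScript or a Lisp there is only one way to write this, and it behaves like option 2: copies of the closure keep sharing the same `count`. In C++ you pick between copy semantics, explicit shared ownership, or a reference that the compiler will not stop you from outliving.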
Are you saying that Haskell doesn't have lexical environments? It very much does, just as all major languages of the ML language family do.