AFAIK the first somewhat widely-known language to do this was Haskell (through libraries). I'm not 100% clear on the entire history, but I think it goes something like:
1. Initially there was no way to do effects in Haskell, everything was pure.
2. Then it was realized that IO can be modeled with monads, so the IO type and do notation were added.
3. Gradual realization that monads can also be used to constrain effects, i.e. you can construct a type of "stateful computations" that can read and write to a specific state, but not touch other states or write to disk or something.
4. Monad transformers were invented, which allow stacking monads on top of each other to support multiple effects. Together with type classes, this gets us pretty close to extensible effects (the approach used in Flix, if I understand it correctly). So for example you can express that your function needs to write to a log and may exit early with an error message with the constraints `(MonadWriter w m, MonadError e m) => ... -> m resultType`, and you can then use monad transformers to build a stack that provides both of these effects.
5. Monad transformers have some issues though: they affect performance significantly, and the interaction between effects is tricky to reason about. So an alternative was sought and found in extensible effects. The initial proposals were, iirc, based on free monads, but those aren't great for performance either. Ever since, there has been a whole zoo of effects-and-handlers implementations that all make different trade-offs and compromises. I think the `effectful` library is now the de facto default, and what it offers seems quite similar to the Flix language's effect system (I'm not sure on which finer points it differs).
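To make points 4 and 5 a bit more concrete, here is a minimal sketch using the `mtl` package. The function `safeDiv` and the runner names `runKeepLog` and `runDropLog` are made up for illustration. The same constrained function is run against two different transformer stacks, and the order of the stack decides whether the log survives an error, which is one small instance of the "interaction between effects" problem.

```haskell
{-# LANGUAGE FlexibleContexts #-}
module Main where

import Control.Monad (when)
import Control.Monad.Except (ExceptT, MonadError, runExceptT, throwError)
import Control.Monad.Writer (MonadWriter, Writer, WriterT, runWriter, runWriterT, tell)

-- Hypothetical example: the constraints only say *which* effects are needed
-- (logging and early exit), not which concrete monad stack provides them.
safeDiv :: (MonadWriter [String] m, MonadError String m) => Int -> Int -> m Int
safeDiv x y = do
  tell ["dividing " ++ show x ++ " by " ++ show y]
  when (y == 0) $ throwError "division by zero"
  pure (x `div` y)

-- Stack 1: errors layered on top of the writer, so the log is kept on failure.
runKeepLog :: ExceptT String (Writer [String]) a -> (Either String a, [String])
runKeepLog = runWriter . runExceptT

-- Stack 2: writer layered on top of Either, so a failure discards the log.
runDropLog :: WriterT [String] (Either String) a -> Either String (a, [String])
runDropLog = runWriterT

main :: IO ()
main = do
  print (runKeepLog (safeDiv 10 2)) -- (Right 5,["dividing 10 by 2"])
  print (runKeepLog (safeDiv 1 0))  -- (Left "division by zero",["dividing 1 by 0"])
  print (runDropLog (safeDiv 10 2)) -- Right (5,["dividing 10 by 2"])
  print (runDropLog (safeDiv 1 0))  -- Left "division by zero"
```

Note that `safeDiv` itself never changes; only the stack chosen at the call site does. That flexibility is the nice part, and the differing results for `safeDiv 1 0` are the tricky part.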
> I'm not 100% clear on the entire history, but I think it goes something like:
You can see my talk "A History of Effect Systems" for a synopsis of the history. I gave it at ZuriHac this year. It's very close to the history you gave (though I think point 1 is not right: Haskell always had a way to do IO).
Oh I hadn't heard of Ante before. This looks very close to the language I wanted out of Rust. Haskell's module system, row polymorphism, linear types, no sepples glyph soup. That's an instant bookmark save. Will be watching that space very closely.
Koka[1] jumps to mind as a language built around effect types. Without having used Flix or Koka, I'm not sure how their effect types actually differ from the normal monad gauntlet you get in pure languages.