I really think the GIL is saving a bunch of poorly written multi-threaded C++ wrappers/libraries out there. If they remove it, a bunch of bugs will appear in other libraries that might not be Python's fault.
They're not "poorly written". The fact that you don't need to do any locking in C/C++ code is part of the existing Python API. Right now, when Python code calls into C/C++ code, the entire call is treated as if it were a single atomic bytecode instruction. Adding extra locking would just make the code slower and accomplish absolutely nothing, which is why people don't do it.
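To make that concrete, here is a minimal sketch of the guarantee as seen from the Python side, assuming CPython with the GIL and an extension call that does not release it. `list.append` is implemented in C and stands in for a third-party extension function; `append_from_threads` and its parameters are names made up for this illustration.

```python
import threading


def append_from_threads(n_threads=8, per_thread=10_000):
    """Append to one shared list from several threads with no lock at all."""
    shared = []  # deliberately unprotected

    def worker(tid):
        for i in range(per_thread):
            # A single call into C-implemented list code. Other Python
            # threads never observe it half-done.
            shared.append((tid, i))

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared


if __name__ == "__main__":
    items = append_from_threads()
    # No append was lost or duplicated, even though nothing was locked.
    print(len(items), len(set(items)))
```

The C code behind `append` takes no lock of its own for this; it just relies on the call not being interleaved with other threads, which is the contract extension authors have been writing against.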
For the call into C to appear atomic to a multithreaded interpreter, all other threads in the interpreter would need to be blocked during the call. That's possible to do, but you've just re-introduced the GIL whenever any thread is within a C extension.
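A toy model of that trade-off, as a sketch only: `World`, `bytecode_step`, `c_call` and `demo` are hypothetical names, and a `threading.Condition` stands in for whatever mechanism a real interpreter would use. "Bytecode" steps may overlap with each other, but a "C call" waits until nobody else is running and keeps everyone else out until it returns.

```python
import threading
import time


class World:
    """Hypothetical stop-the-world gate, not a real interpreter API."""

    def __init__(self):
        self._cond = threading.Condition()
        self._running = 0        # threads currently inside a bytecode step
        self._in_c_call = False  # True while some thread is "in C"

    def bytecode_step(self, fn):
        with self._cond:
            while self._in_c_call:
                self._cond.wait()
            self._running += 1
        try:
            return fn()
        finally:
            with self._cond:
                self._running -= 1
                self._cond.notify_all()

    def c_call(self, fn):
        with self._cond:
            # Wait until every other thread is parked, then shut the gate.
            while self._in_c_call or self._running:
                self._cond.wait()
            self._in_c_call = True
        try:
            return fn()
        finally:
            with self._cond:
                self._in_c_call = False
                self._cond.notify_all()


def demo(n_threads=4, calls_per_thread=50):
    world = World()
    state = {"counter": 0, "inside": 0, "max_inside": 0}

    def unlocked_extension():
        # Non-atomic read-modify-write with no locking of its own.
        state["inside"] += 1
        state["max_inside"] = max(state["max_inside"], state["inside"])
        value = state["counter"]
        time.sleep(0)  # invite a thread switch in the middle
        state["counter"] = value + 1
        state["inside"] -= 1

    def worker():
        for _ in range(calls_per_thread):
            world.bytecode_step(lambda: None)
            world.c_call(unlocked_extension)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state
```

The unlocked extension stays correct, but only because `c_call` serializes everything while it runs, which is the GIL again in all but name.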
In the unlocked case, one could borrow the low-overhead tricks that some interpreters use for GC safepoints. One such technique is a dedicated memory page from which a single byte is read at the beginning of every opcode dispatch; you mark that page non-readable when you need to freeze all the threads executing in the interpreter. You'd then have the SIGSEGV handler block each faulting thread until the one thread returns from C. That's fairly heavy when it's used, but pretty lightweight when it isn't.
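The real trick needs `mprotect` and a SIGSEGV handler, which can't be expressed faithfully in pure Python, so what follows is only a simulation of the control flow. A plain boolean plays the role of "page is readable" and a `threading.Condition` plays the role of the signal handler that parks the faulting thread. `Safepoint`, `poll`, `freeze`, `resume` and `demo` are all made-up names.

```python
import threading
import time


class Safepoint:
    """Simulated safepoint poll; a flag stands in for the protected page."""

    def __init__(self):
        self._cond = threading.Condition()
        self._page_readable = True
        self._parked = 0

    def poll(self):
        # Fast path: stands in for the one-byte read that succeeds.
        if self._page_readable:
            return
        # Slow path: stands in for the SIGSEGV handler blocking the thread.
        with self._cond:
            self._parked += 1
            self._cond.notify_all()
            while not self._page_readable:
                self._cond.wait()
            self._parked -= 1

    def freeze(self, n_others):
        # "Mark the page non-readable", then wait until everyone has trapped.
        with self._cond:
            self._page_readable = False
            while self._parked < n_others:
                self._cond.wait()

    def resume(self):
        # "Make the page readable again" and release the parked threads.
        with self._cond:
            self._page_readable = True
            self._cond.notify_all()


def demo(n_workers=3):
    sp = Safepoint()
    stop = threading.Event()
    progress = [0] * n_workers

    def interpreter_loop(i):
        while not stop.is_set():
            sp.poll()         # start of each "opcode dispatch"
            progress[i] += 1  # the "opcode" itself

    threads = [threading.Thread(target=interpreter_loop, args=(i,))
               for i in range(n_workers)]
    for t in threads:
        t.start()
    sp.freeze(n_workers)
    before = list(progress)
    time.sleep(0.05)          # the "C call" runs while everyone is parked
    after = list(progress)
    sp.resume()
    stop.set()
    for t in threads:
        t.join()
    return before, after
```

While frozen, no worker makes progress, so `before` and `after` come out equal. In the simulation the fast path is an explicit flag check; the point of the page trick is that the real fast path is just a load with no branch.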
Nevertheless, this is still a concern for the wider ecosystem if Python libraries suddenly start to break due to underlying issues. I don't think this can be neglected.