Making troubleshooting skills a profession in itself makes reliability a property of a specific person or team and not a property of the system. The former doesn't scale.
You’ll never be able to build a (large, complex) system that is consistently, inherently reliable over time and in response to change. You want to aim for such reliability but you still need troubleshooting ability.