I know Google has general-purpose sound classifiers like YAMNet, trained on YouTube data (AudioSet), but they are not very good for specific use cases. So you would have to train a custom model for your use case.
Just thinking out loud: I talked with a mechanic, and he told me that nowadays, when they connect the car to a diagnostic computer, it almost always finds whatever is wrong with the car. Between that and their experience, they almost always know what the problem is.
I think sound + location could be really interesting: if you know where the mic is, you can rule out parts of the car that make similar noises but sit somewhere else.
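To make the sound + location idea a bit more concrete, here is a rough sketch in Python. Everything in it is made up for illustration: the part names, the zones in `PART_ZONES`, the `filter_by_location` helper, and the classifier scores. It only shows the filtering step, assuming you already have some model that outputs a score per part.

```python
# Hypothetical layout: which zone of the car each part sits in.
PART_ZONES = {
    "alternator": "engine_bay",
    "serpentine_belt": "engine_bay",
    "front_wheel_bearing": "front_axle",
    "brake_pads_front": "front_axle",
    "rear_wheel_bearing": "rear_axle",
    "exhaust_hanger": "rear_axle",
}


def filter_by_location(scores, mic_zone, part_zones=PART_ZONES):
    """Keep only the parts located in the mic's zone and renormalise their scores."""
    nearby = {p: s for p, s in scores.items() if part_zones.get(p) == mic_zone}
    total = sum(nearby.values())
    if total == 0:
        return {}  # nothing plausible in this zone
    return {p: s / total for p, s in nearby.items()}


# Made-up classifier output: the two wheel bearings sound almost the same.
scores = {
    "front_wheel_bearing": 0.40,
    "rear_wheel_bearing": 0.38,
    "alternator": 0.12,
    "exhaust_hanger": 0.10,
}

# The mic is mounted near the rear axle, so front-axle and engine-bay parts drop out.
filtered = filter_by_location(scores, "rear_axle")
best = max(filtered, key=filtered.get)
print(best, round(filtered[best], 2))  # rear_wheel_bearing 0.79
```

On sound alone the front bearing would narrowly win (0.40 vs 0.38); knowing the mic position flips the answer. A hard zone filter like this is the simplest option. A softer version could weight scores by distance to the mic instead of dropping parts outright.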
- https://www.tensorflow.org/hub/tutorials/yamnet