The other source of data mentioned in the paper is the NTT Multi-Lingual Speech Database for Telephonometry, which seems to be commercial, so it is presumably under a proprietary license.
No, exactly none of that data was used for training. The training was done before the demo that asked for noise contributions. The contributions are CC0, but they were never used, so the quality of that dataset is totally unknown.
http://www-mmsp.ece.mcgill.ca/Documents/Data/