Hacker News
by woah 1289 days ago
You're probably talking about the artifacts of converting a low resolution spectrogram to audio.

Can the spectrogram image be AI upscaled before transforming back to the time domain?
Yes, it exists: https://ccrma.stanford.edu/~juhan/super_spec.html

But the issue is not that the spectrogram is low quality.

The issue is that the spectrogram contains only amplitude information. You also need phase information to generate audio from the spectrogram.
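To make that concrete, here is a rough sketch in plain Python of what happens to a single spectrogram column when the phase is thrown away. The 8-sample frame and the helper names (dft, idft, max_error) are made up for illustration. A real pipeline would use a windowed, overlapping STFT and a phase estimator rather than zero phase, so treat this as a toy and not as how any particular model does it.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Naive inverse DFT; returns only the real part of each sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]

def max_error(a, b):
    """Largest absolute sample-wise difference between two sequences."""
    return max(abs(u - v) for u, v in zip(a, b))

# Hypothetical audio frame: a short click followed by silence.
frame = [1.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

spectrum = dft(frame)
magnitudes = [abs(c) for c in spectrum]      # what a spectrogram column keeps
phases = [cmath.phase(c) for c in spectrum]  # what it discards

# Rebuilding with the true phase recovers the frame (up to rounding).
with_phase = idft([cmath.rect(m, p) for m, p in zip(magnitudes, phases)])

# Rebuilding with every phase set to zero gives a different waveform,
# even though its magnitude spectrum is identical to the original's.
zero_phase = idft([complex(m, 0.0) for m in magnitudes])

print("error with true phase:", max_error(with_phase, frame))
print("error with zero phase:", max_error(zero_phase, frame))
```

Both reconstructions produce exactly the same spectrogram column, so upscaling the image cannot tell them apart. The missing piece is the phase, not the resolution.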

Interesting, can't you quantize and snap to a phase that makes sense to create the most musical resonance?
What happens if you run one of the spectrogram pictures through an image upscaler like ESRGAN?