by mrob
2236 days ago
This isn't bad, but the note decays sound noticeably different. My guess is that the NN doesn't know that human ears have a non-linear response that makes them more sensitive to errors in the decay than in the attack, so it treats them equivalently. If this is the case then it might be fixable by using logarithmic-scale audio samples instead of linear ones. The non-linearity of the ear is frequency dependent[0], but in practice I suspect it would be sufficient to pre-process the linear PCM data with x=sign(x)*sqrt(|x|) (keeping the sign, since PCM samples are signed) and undo it before playback with x=sign(x)*x^2.

[0] https://en.wikipedia.org/wiki/Equal-loudness_contour
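A rough sketch of the square-root pre-processing idea, in case it helps anyone try it. This is only my reading of the suggestion, not something tested on the model in question: it assumes samples are already normalized floats in [-1.0, 1.0], it keeps the sign so negative samples survive the square root, and the names `compress` and `expand` are made up for illustration.

```python
import math

def compress(samples):
    """Sign-preserving square root: boosts quiet samples relative to loud ones.

    Assumes samples are floats normalized to [-1.0, 1.0].
    """
    return [math.copysign(math.sqrt(abs(x)), x) for x in samples]

def expand(samples):
    """Inverse of compress: sign-preserving square, applied before playback."""
    return [math.copysign(x * x, x) for x in samples]

# Hypothetical normalized PCM samples: silence, a quiet decay tail, a mid-level
# negative sample, and full scale.
pcm = [0.0, 0.01, -0.25, 1.0]
encoded = compress(pcm)   # what the network would be trained on
decoded = expand(encoded) # what would be sent to playback
```

The quiet sample at 0.01 is mapped to roughly 0.1, so an absolute error of a given size in the encoded domain corresponds to a much smaller error on the quiet decay tail after expansion, which is the effect being aimed for.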