by heycosmo 962 days ago
The time stretch algorithm is implemented in https://github.com/audacity/audacity/blob/master/libraries/l... particularly the functions _time_stretch and _process_hop. It looks to me like a classic phase vocoder with vertical phase coherence (cf. https://en.wikipedia.org/wiki/Phase_vocoder).

The basic idea is this. For a time-stretch factor of, say, 2x, the frequency spectrum of the stretched output at 2 sec should be the same as the frequency spectrum of the unstretched input at 1 sec. The naive algorithm therefore takes a short section of signal at 1 sec, translates it to 2 sec and adds it to the result. Unfortunately, this method generates unwanted artifacts, because the overlapping sections generally do not line up in phase.
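To make the naive method concrete, here is a rough sketch of it as a windowed overlap-add. This is not the Audacity code; the function names, the Hann window, the frame size and the hop choices are all my own assumptions for illustration.

```python
import math

def hann(n):
    # Periodic Hann window; copies offset by n/2 sum to exactly 1.
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]

def naive_stretch(signal, factor=2, frame=512):
    """Hypothetical naive stretch: copy windowed frames to later positions."""
    synth_hop = frame // 2                # spacing of frames in the output
    ana_hop = int(synth_hop / factor)     # spacing of frames in the input
    window = hann(frame)
    n_frames = (len(signal) - frame) // ana_hop + 1
    out = [0.0] * ((n_frames - 1) * synth_hop + frame)
    for k in range(n_frames):
        src, dst = k * ana_hop, k * synth_hop
        for i in range(frame):
            # The section read at `src` is added at `dst`, with no phase correction.
            out[dst + i] += window[i] * signal[src + i]
    return out

sample_rate = 8000
tone = [math.sin(2 * math.pi * 312.5 * n / sample_rate) for n in range(4000)]
stretched = naive_stretch(tone, factor=2)
print(len(tone), len(stretched))
```

With these particular numbers, 312.5 Hz happens to be a frequency at which the shifted frames stay aligned, so the tone comes out clean. At most other frequencies the frames do not align and the output amplitude wobbles.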

Imagine a pure sine wave. Now take 2 short sections of the wave from 2 random times, overlap them, and add them together. What happens? Well, it depends on the phase of each section. If the sections are out of phase, they cancel on the overlap; if in phase, they constructively interfere.
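A quick way to see this numerically (a toy sketch; the 100 Hz tone, the 8000 Hz sample rate and the helper names are arbitrary choices of mine):

```python
import math

def sine_section(freq_hz, start_s, n, sample_rate=8000):
    """Return n samples of a unit sine wave, starting at time start_s."""
    return [math.sin(2 * math.pi * freq_hz * (start_s + i / sample_rate))
            for i in range(n)]

def peak_of_sum(a, b):
    """Largest absolute sample value after adding two sections."""
    return max(abs(x + y) for x, y in zip(a, b))

freq = 100.0                                       # period is 10 ms
base = sine_section(freq, 0.0, 160)
in_phase = sine_section(freq, 0.010, 160)          # taken one full period later
out_of_phase = sine_section(freq, 0.005, 160)      # taken half a period later

reinforced = peak_of_sum(base, in_phase)           # close to 2: constructive
cancelled = peak_of_sum(base, out_of_phase)        # close to 0: destructive
print(reinforced, cancelled)
```

Sections taken at arbitrary times land somewhere between these two extremes, which is why the naive sum has an uneven amplitude.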

The phase vocoder is all about overlapping and adding sections together so that the phases of all the different sine waves in the sections line up. Thus, in any phase vocoder algorithm, you will see code that searches for peaks in the spectrum (see the _time_stretch code). Each peak is assumed to be a sine wave, and corresponding peaks in adjacent frames should have matching phases.
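For illustration, here is a minimal sketch of the textbook version of that step: find a spectral peak, estimate its frequency from the phase advance between two analysis frames, then advance the output phase by that frequency times the output hop. This is my own toy code, not what Audacity does line for line. All names are hypothetical, and it uses a slow direct DFT only to avoid dependencies.

```python
import cmath
import math

def windowed_dft(frame):
    """Hann-window a frame and return bins 0..n/2 of its DFT (slow, direct form)."""
    n = len(frame)
    w = [0.5 - 0.5 * math.cos(2 * math.pi * t / n) for t in range(n)]
    return [sum(w[t] * frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n))
            for k in range(n // 2 + 1)]

def find_peaks(mags, floor=0.1):
    """Bins that are local maxima and at least `floor` times the largest magnitude."""
    limit = floor * max(mags)
    return [k for k in range(1, len(mags) - 1)
            if mags[k] >= limit and mags[k] > mags[k - 1] and mags[k] >= mags[k + 1]]

def wrap(phase):
    """Wrap a phase into [-pi, pi)."""
    return (phase + math.pi) % (2 * math.pi) - math.pi

def peak_frequency(prev_bin, cur_bin, k, n, hop):
    """Estimate the frequency (rad/sample) of the sine behind peak bin k."""
    bin_freq = 2 * math.pi * k / n
    deviation = wrap(cmath.phase(cur_bin) - cmath.phase(prev_bin) - bin_freq * hop)
    return bin_freq + deviation / hop

n, analysis_hop, synthesis_hop = 64, 16, 32        # a 2x stretch
omega = 2 * math.pi * 10.3 / n                     # test tone between bins 10 and 11
signal = [math.sin(omega * t) for t in range(n + analysis_hop)]
prev = windowed_dft(signal[:n])
cur = windowed_dft(signal[analysis_hop:analysis_hop + n])
peaks = find_peaks([abs(c) for c in cur])
k = peaks[0]
estimate = peak_frequency(prev[k], cur[k], k, n, analysis_hop)
# Output phase for this peak: previous output phase plus frequency times output hop.
out_phase = wrap(cmath.phase(prev[k]) + estimate * synthesis_hop)
print(k, estimate, omega)
```

Because the output phase is advanced by the estimated frequency rather than copied from the input frame, consecutive output frames of this sine wave stay in phase where they overlap.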

1 comment

The old "Change Tempo" effect still works much better for voice.