by jschafer 404 days ago
Yes, exactly. I was really excited when I found out that you do not need an FFT to do speech processing.

If you look at the code of phone/voice codecs such as GSM, Speex, or Opus, you can see that the spectral envelope (or, equivalently, the configuration of a physical tube model of the vocal tract) can be estimated in the time domain with linear prediction coefficients (LPC).

And it is simple: the widely used Levinson-Durbin algorithm, for example, is just 22 lines of C code. It is an interesting exercise to build your own vocoder from scratch that fits on a single screen.
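
To give a rough idea of what that looks like, here is a minimal sketch of the autocorrelation method followed by the Levinson-Durbin recursion. It is not taken from GSM, Speex, Opus, or the Markel and Gray book; the names autocorr, levinson_durbin and MAX_ORDER are made up for illustration, and it assumes double precision rather than the fixed-point arithmetic real codecs tend to use. The sign convention assumed here is x[n] ~ a[1]*x[n-1] + ... + a[p]*x[n-p].

```c
#include <stddef.h>

#define MAX_ORDER 32  /* hypothetical upper bound on the predictor order */

/* Autocorrelation r[0..order] of one frame x[0..n-1] (no windowing here). */
static void autocorr(const double *x, size_t n, double *r, int order)
{
    for (int k = 0; k <= order; k++) {
        double sum = 0.0;
        for (size_t i = (size_t)k; i < n; i++)
            sum += x[i] * x[i - (size_t)k];
        r[k] = sum;
    }
}

/* Levinson-Durbin recursion on autocorrelation values r[0..order].
 * Writes predictor coefficients to a[1..order] and reflection
 * coefficients to k[1..order]; index 0 of both arrays is not used.
 * Returns the remaining prediction error energy, or -1.0 on bad input. */
static double levinson_durbin(const double *r, int order, double *a, double *k)
{
    double tmp[MAX_ORDER + 1];
    double err = r[0];

    if (order < 1 || order > MAX_ORDER || err <= 0.0)
        return -1.0;

    for (int i = 1; i <= order; i++) {
        /* Correlation not yet explained by the order i-1 predictor. */
        double acc = r[i];
        for (int j = 1; j < i; j++)
            acc -= a[j] * r[i - j];

        double ki = acc / err;   /* i-th reflection coefficient */
        k[i] = ki;

        /* Update the lower-order coefficients via a scratch copy. */
        for (int j = 1; j < i; j++)
            tmp[j] = a[j] - ki * a[i - j];
        for (int j = 1; j < i; j++)
            a[j] = tmp[j];
        a[i] = ki;

        err *= 1.0 - ki * ki;    /* error energy shrinks at each order */
        if (err <= 0.0)
            return -1.0;         /* degenerate input, stop early */
    }
    return err;
}
```

You would call autocorr on a (usually windowed) frame of samples and pass the result to levinson_durbin. The k[] values are the reflection coefficients, which is where the tube-model interpretation comes from.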

Many of the code snippets I have seen (which likely have already processed your voice) are just translations of the Fortran code of the book "Linear Prediction of Speech" by Markel and Gray (1976).

1 comment

Ah yes, ladder or lattice filters. If you don't mind old-fashioned mailing lists, there are still a few of us hanging around on MUSIC-DSP@LISTS.COLUMBIA.EDU, where code gets shared.