by colinnordin 3001 days ago
I also work with non-speech audio and I'm curious: do you feed raw DFTs into your models, or do you use mel energies or MFCCs? What kind of models do you use? Since the sound of a chainsaw doesn't vary that much, I'd guess either a regular fully connected network or a convolutional neural network?
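
For anyone unsure what the three input types in my question look like, here is a rough NumPy-only sketch of how they relate for a single audio frame. The function names, the 40 mel bands, the 13 coefficients and the HTK-style mel formula are all my own assumptions for illustration, not anything I know about your pipeline.

```python
import numpy as np

def hz_to_mel(f):
    # HTK-style mel scale (an assumed choice; other variants exist)
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    # Triangular filters with centers spaced evenly on the mel scale
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    hz_pts = mel_to_hz(mel_pts)
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    fb = np.zeros((n_mels, freqs.size))
    for i in range(n_mels):
        lo, mid, hi = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def frame_features(frame, sr, n_mels=40, n_mfcc=13):
    # Hypothetical helper: returns all three representations of one frame
    n_fft = frame.size
    windowed = frame * np.hanning(n_fft)
    power = np.abs(np.fft.rfft(windowed)) ** 2            # raw DFT power spectrum
    mel_energy = mel_filterbank(n_mels, n_fft, sr) @ power
    log_mel = np.log(mel_energy + 1e-10)                  # log mel energies
    # MFCCs: unnormalized DCT-II of the log mel energies, first n_mfcc terms
    n = np.arange(n_mels)
    k = np.arange(n_mfcc)[:, None]
    dct = np.cos(np.pi * k * (2 * n + 1) / (2 * n_mels))
    mfcc = dct @ log_mel
    return power, log_mel, mfcc

# Example: one 512-sample frame of a 1 kHz tone sampled at 16 kHz
sr = 16000
t = np.arange(512) / sr
power, log_mel, mfcc = frame_features(np.sin(2 * np.pi * 1000.0 * t), sr)
```

Each step throws away detail: 257 DFT bins become 40 mel bands and then 13 cepstral coefficients, which is why I'm wondering which level works best for something like chainsaw detection.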

Love what you are doing and I would love to see a technical blog post about how you work with audio!