by nunja 2221 days ago
I experimented with the Spleeter architecture quite a bit, and I would say it is not suitable for real-time audio processing. The reason is that the model needs at least 512 frames of audio to produce output usable for source separation, which adds a ton of latency. I tried smaller windows, but the results were very bad.
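
To put a rough number on that latency, here is a back-of-the-envelope sketch. It assumes "frames" means STFT frames, and it assumes a 4096-sample window, a 1024-sample hop, and a 44.1 kHz sample rate. None of those values are stated in the comment, so treat them as illustrative. The function name is made up for this example. For comparison, it also shows the case where "frames" simply means raw samples.

```python
def buffering_latency_seconds(n_frames, hop_length, frame_length, sample_rate):
    """Seconds of audio that must be buffered before n_frames frames are available.

    This counts input buffering only; model inference time comes on top.
    """
    # The first frame needs frame_length samples; each later frame adds one hop.
    n_samples = (n_frames - 1) * hop_length + frame_length
    return n_samples / sample_rate


# Assumed (not from the comment): 4096-sample window, 1024-sample hop, 44.1 kHz.
stft_latency = buffering_latency_seconds(
    n_frames=512, hop_length=1024, frame_length=4096, sample_rate=44100
)

# Alternative reading: "512 frames" means 512 raw samples (hop 1, length 1).
raw_latency = buffering_latency_seconds(
    n_frames=512, hop_length=1, frame_length=1, sample_rate=44100
)

print(f"512 STFT frames: {stft_latency:.2f} s of buffered audio")
print(f"512 raw samples: {raw_latency * 1000:.1f} ms of buffered audio")
```

Under the STFT reading the buffer alone is roughly 12 seconds, which would fit the "ton of latency" description. Under the raw-sample reading it is closer to 12 milliseconds.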