by priansh
2474 days ago
> wav2letter is pretty fast

We haven't been able to break 1.1x on a t2.medium in any of our benchmarks -- what's your setup here? I definitely think it's a big step in the right direction; it's easily 100x faster than DeepSpeech for us. If I could have anything I wanted for xmas, I'd ask for a speech-to-text system that is fast enough to work in the browser through wasm or something.
|
Is there an optimized convolution / GEMM equivalent for SIMD.js / WASM? That's pretty much all we'd need to port this to web... well, that and maybe a language model that isn't 1GB. The wav2letter acoustic model I'm using is based on the LibriSpeech conv_glu, which is almost entirely served by conv1d layers.
I've honestly already been considering a demo for my main project (which is mixed English / command decoding) that runs entirely in a web page. If you have engineering time to throw at your Christmas wish, we should talk :P
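
To make the "conv1d is basically GEMM" point concrete, here's a rough sketch in plain Python (no dependencies) of a stride-1, unpadded conv1d done as an unfold (im2col) followed by a single matrix multiply, plus a gated linear unit on top. The function names, the channels-by-time layout, and gating over the channel dimension are my assumptions for illustration; this is not the wav2letter code, and a real port would swap the naive `matmul` for an optimized kernel.

```python
import math


def matmul(a, b):
    """Naive GEMM on lists of lists: (m x k) @ (k x n) -> (m x n)."""
    k, n = len(b), len(b[0])
    return [[sum(row[p] * b[p][j] for p in range(k)) for j in range(n)]
            for row in a]


def im2col_1d(x, kernel_width):
    """Unfold x (channels x time) into (channels * kernel_width) x out_time."""
    channels, time = len(x), len(x[0])
    out_time = time - kernel_width + 1
    cols = []
    for c in range(channels):
        for k in range(kernel_width):
            # Row (c, k) holds input channel c shifted by k time steps.
            cols.append([x[c][t + k] for t in range(out_time)])
    return cols


def conv1d(x, weight):
    """Stride-1, unpadded conv1d (cross-correlation, as in most DL frameworks).

    weight has shape out_channels x in_channels x kernel_width.
    """
    kernel_width = len(weight[0][0])
    # Flatten each filter in the same (channel, tap) order im2col_1d uses.
    w_flat = [[w for in_ch in out_ch for w in in_ch] for out_ch in weight]
    return matmul(w_flat, im2col_1d(x, kernel_width))


def glu(y):
    """Gated linear unit over channels: first half * sigmoid(second half)."""
    half = len(y) // 2
    return [[a / (1.0 + math.exp(-g)) for a, g in zip(y[i], y[i + half])]
            for i in range(half)]
```

Something like `glu(conv1d(x, weight))` is then one conv + gate layer, so all the heavy lifting sits in `matmul`, which is the part that would need a SIMD / WASM implementation.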