Hacker News
by Rust, 5129 days ago
One way might be to run the audio stream through a speech-to-text engine and parse the resulting transcript.
A video recognition system could also be used to identify faces, landmarks and common objects.
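For the transcript half of that idea, here is a rough sketch of what "parse the resulting transcript" could look like. It assumes a speech-to-text engine has already produced timestamped text segments; the `TRANSCRIPT` data, the stopword list, and the helper names `tokenize`, `build_index`, and `top_keywords` are all made up for illustration and are not tied to any particular engine.

```python
import re
from collections import Counter, defaultdict

# Hypothetical transcript: (start_seconds, text) pairs, in the shape a
# speech-to-text engine might plausibly emit. The content is invented.
TRANSCRIPT = [
    (0.0, "Welcome to the tour of the Golden Gate Bridge"),
    (4.5, "The bridge opened in 1937"),
    (9.0, "Next we visit Alcatraz island"),
]

# Tiny illustrative stopword list (an assumption, not a standard one).
STOPWORDS = {"the", "to", "of", "in", "we", "a", "an", "and"}


def tokenize(text):
    """Lowercase the text and return its words, minus stopwords."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return [w for w in words if w not in STOPWORDS]


def build_index(segments):
    """Map each keyword to the start times of the segments that mention it."""
    index = defaultdict(list)
    for start, text in segments:
        for word in set(tokenize(text)):
            index[word].append(start)
    return index


def top_keywords(segments, n=3):
    """Return the n most frequent keywords across all segments."""
    counts = Counter(w for _, text in segments for w in tokenize(text))
    return counts.most_common(n)


index = build_index(TRANSCRIPT)
print(index["bridge"])             # start times of segments mentioning "bridge"
print(top_keywords(TRANSCRIPT, 1))
```

Keeping the timestamps next to the keywords means a search hit can jump straight to the matching point in the video. The face, landmark, and object recognition half would need a trained vision model and is not sketched here.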