by nmiodice 2243 days ago
I made a tool that enables you to run full text search against audio content and explore the results using an embedded media player.

As of now (mostly for cost reasons) I have ingested a limited set of podcasts.

Please let me know what you think!

Note: It is not yet mobile optimized!

5 comments

Great idea! Did you deploy a speech-to-text pipeline to achieve this? I always thought it would be relatively expensive to do podcast-to-text transcription at scale (compared to the gains) but maybe I just didn't optimize it well enough :)
Not OP, but I've looked into AWS Transcribe [1] and at least their solution would rack up quite a bill. From what I've seen, there isn't a great open source STT solution yet, although there do seem to be quite a few promising ones [2]. STT is one of the technologies I'm looking forward to most in the open source realm.

[1] https://aws.amazon.com/transcribe/pricing/?nc=sn&loc=3
[2] https://github.com/mozilla/DeepSpeech

Seems like it has plenty of potential.

Not sure if I've been unlucky, but the terms I searched for and found (e.g. "economic") did not come up in the audio until quite some time after I played the sample. Is it meant to start the audio near the search term occurrence? (which seems like the natural thing you'd expect)

Your intuition about how it should work is indeed correct. Some podcasts don't seem to produce a high quality timestamp in the underlying STT engine.

I'll be experimenting with word-level timestamps. Right now I am just getting the timestamp of larger chunks (1-3 sentences).
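
To illustrate the difference between the two timestamp granularities mentioned above, here is a minimal sketch in Python. It is not the tool's actual code: the `Word` and `Chunk` structures, the function names, and the one-second `lead_in` are all hypothetical, and it assumes the STT engine can return a start time per word.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the episode (assumed STT output)


@dataclass
class Chunk:
    start: float        # timestamp of the whole chunk (1-3 sentences)
    words: List[Word]   # words in the chunk, each with its own timestamp


def _normalize(token: str) -> str:
    # Lowercase and strip surrounding punctuation so "Economic," matches "economic".
    return token.lower().strip('.,!?";:()')


def chunk_level_seek(chunks: List[Chunk], term: str) -> Optional[float]:
    """Return the start of the first chunk containing the term.

    Playback may begin well before the term is actually spoken.
    """
    target = _normalize(term)
    for chunk in chunks:
        if any(_normalize(w.text) == target for w in chunk.words):
            return chunk.start
    return None


def word_level_seek(chunks: List[Chunk], term: str,
                    lead_in: float = 1.0) -> Optional[float]:
    """Return a position shortly before the first occurrence of the term."""
    target = _normalize(term)
    for chunk in chunks:
        for w in chunk.words:
            if _normalize(w.text) == target:
                # Back up slightly so the listener hears a bit of context.
                return max(0.0, w.start - lead_in)
    return None
```

With chunk-level timestamps the player can only jump to the beginning of the matching chunk, which would explain a noticeable delay before the search term is heard when a chunk is long or its timestamp is imprecise.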

Thanks for the feedback!

this is awesome, I have previously been looking for something like this - where I wanted to learn more about a topic/discover podcasts on this sort of thing.

when I was looking at the cost to transcribe it just didn't make sense to do it for myself.

hope this works out well!

I've wanted this product in the past and found the economics to be similarly challenging. [1] Some podcasts already put out professionally created transcriptions which is great, but I'd need to compile them in one place to figure out who said that one direct quote I remember.

Having people transcribe 5 minutes of audio/month in lieu of a subscription cost equal to 3 minutes of professional transcription was one model I had in mind.

I'm also hoping this works out well.

[1] https://news.ycombinator.com/item?id=15826604

This is very useful! For the UI, using Algolia or similar would be pretty neat.
Are there plans to allow users to pick their own podcast episodes?
Very strong second on this! Use case for me: Like most ppl, I often listen to podcasts while doing some other primary task (cooking, driving, etc.), and am unable to "note" the interesting snippets. When I go back to find those snippets, I am often unsure of which exact episode I heard it on, and if I do remember that, using the audio scrubber to find it is still a disaster. Would love to give it a try when you roll this out.
Yep :)