by conradbez
160 days ago
Thanks for checking it out! A couple of tips on the audio front:

1. Gemini has native audio understanding, so I'd recommend uploading the audio there and playing with the prompt until its output matches what you're after.
2. For audio over 1 hour, I found that chunking it into 45-minute segments made it easier for Gemini to give back reliable timestamps.
3. You do need to check the LLM outputs for valid timestamps. It can go off the rails.

I'll add search (using the existing vector embeddings from the recommendation system) and audio waves to the feature list. Great idea!
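For tips 2 and 3, here's a rough sketch of the bookkeeping in Python. It isn't the project's actual code: the function names, the `MM:SS` / `H:MM:SS` timestamp format, and the "drop anything out of range or going backwards" rule are all my assumptions, and the Gemini call itself is left out.

```python
import re

SEGMENT_SECONDS = 45 * 60  # 45-minute segments, as in tip 2


def segment_bounds(total_seconds, segment_seconds=SEGMENT_SECONDS):
    """Split a recording into consecutive (start, end) second offsets."""
    bounds = []
    start = 0
    while start < total_seconds:
        end = min(start + segment_seconds, total_seconds)
        bounds.append((start, end))
        start = end
    return bounds


# Assumed format: "MM:SS" or "H:MM:SS" (hypothetical; match it to your prompt).
_TIMESTAMP = re.compile(r"^(?:(\d+):)?([0-5]?\d):([0-5]\d)$")


def parse_timestamp(text):
    """Return the timestamp in seconds, or None if it does not parse."""
    match = _TIMESTAMP.match(text.strip())
    if match is None:
        return None
    hours, minutes, seconds = match.groups()
    return int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds)


def valid_timestamps(raw, segment_length):
    """Keep timestamps that parse, fit in the segment, and never go backwards."""
    kept = []
    last = -1
    for text in raw:
        seconds = parse_timestamp(text)
        if seconds is None or seconds > segment_length or seconds < last:
            continue  # the model went off the rails here; drop it
        kept.append(seconds)
        last = seconds
    return kept
```

Timestamps from each segment are relative to that segment, so you'd add the segment's start offset back on before merging results.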