by vorticalbox 1085 days ago
Feed the scene's audio into Whisper to transcribe it, and then feed that transcript into GPT-3.5/4 for context?
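Roughly what I have in mind, as a minimal sketch rather than a tested pipeline. The names `build_prompt` and `scene_context` and the prompt wording are made up for illustration, and the speech-to-text and LLM steps are passed in as plain callables so nothing here depends on a particular library version:

```python
from typing import Callable


def build_prompt(transcript: str) -> str:
    """Wrap a scene transcript in a request for context (hypothetical wording)."""
    return (
        "Here is the dialogue transcribed from a scene:\n\n"
        f"{transcript}\n\n"
        "Describe what is happening in this scene and any relevant context."
    )


def scene_context(
    audio_path: str,
    transcribe: Callable[[str], str],
    complete: Callable[[str], str],
) -> str:
    """Transcribe the scene's audio, then ask a language model for context.

    `transcribe` maps an audio file path to text (for example a thin wrapper
    around a Whisper model); `complete` maps a prompt to the model's reply
    (for example a wrapper around a GPT-3.5/4 chat call). Both are assumptions
    supplied by the caller.
    """
    transcript = transcribe(audio_path).strip()
    if not transcript:
        # No speech found, so there is nothing to ask the model about.
        return ""
    return complete(build_prompt(transcript))
```

With the open-source `whisper` package, `transcribe` could be something like `lambda p: model.transcribe(p)["text"]` after `model = whisper.load_model("base")`; `complete` would wrap whichever chat API you use. Note this only captures the dialogue, so anything purely visual in the scene would be lost.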