by neverokay
753 days ago
I’d add that I had better luck with accuracy using smaller chunks (about 20 seconds) per WAV file. Whisper seems to go berserk if you pump in lengthy audio (30+ seconds). I’d be tempted to at least try breaking the notes down into one-line images (about a sentence each) and give it a go with Gemini. I haven’t tested their OCR, but even if it has errors, I bet you could just ask Gemini again to fix up the sentence.
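
For anyone who wants to try the chunking part, here is a rough sketch of splitting a WAV file into pieces of about 20 seconds using only Python's standard `wave` module. The function name `split_wav`, the output file naming, and the 20-second default are my own assumptions for illustration, not anything Whisper requires.

```python
import wave
from pathlib import Path


def split_wav(src, out_dir, chunk_seconds=20.0):
    """Split a PCM WAV file into consecutive chunks of at most chunk_seconds.

    Hypothetical helper: the name and output naming scheme are illustrative.
    Returns the list of chunk paths in order.
    """
    src = Path(src)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []

    with wave.open(str(src), "rb") as reader:
        frames_per_chunk = int(reader.getframerate() * chunk_seconds)
        if frames_per_chunk <= 0:
            raise ValueError("chunk_seconds is too small for this sample rate")

        index = 0
        while True:
            # readframes returns fewer frames (or b"") near the end of the file
            frames = reader.readframes(frames_per_chunk)
            if not frames:
                break
            path = out_dir / f"{src.stem}_{index:04d}.wav"
            with wave.open(str(path), "wb") as writer:
                # Copy the audio format so each chunk is a valid WAV on its own
                writer.setnchannels(reader.getnchannels())
                writer.setsampwidth(reader.getsampwidth())
                writer.setframerate(reader.getframerate())
                writer.writeframes(frames)
            paths.append(path)
            index += 1

    return paths
```

Each chunk can then be transcribed separately. Note that this cuts at fixed intervals, so it can split a word in half at a boundary; cutting at pauses instead would probably help, but I haven't tried that here.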