Interesting project, but why not cut/edit segments via a UI?
Another option is to display the transcribed text and let users delete text (words or a whole sentence), then work out the cut times from the deleted text.
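To make that idea concrete, here is a rough sketch of how deleted words could be mapped back to cut times. It assumes the transcriber gives word-level timestamps as `(word, start, end)` tuples and that the UI reports which word indices were deleted; the `cut_ranges` name and the data layout are made up for illustration and are not from the project.

```python
from typing import List, Set, Tuple

# Hypothetical word-level transcript entry: (word, start_seconds, end_seconds).
Word = Tuple[str, float, float]


def cut_ranges(words: List[Word], deleted: Set[int]) -> List[Tuple[float, float]]:
    """Turn the indices of deleted words into merged (start, end) cut ranges."""
    ranges: List[Tuple[float, float]] = []
    prev_index = None
    for i in sorted(deleted):
        _, start, end = words[i]
        if ranges and prev_index == i - 1:
            # Neighbouring deleted word: extend the previous range.
            ranges[-1] = (ranges[-1][0], end)
        else:
            # Start a new range for a non-adjacent deletion.
            ranges.append((start, end))
        prev_index = i
    return ranges


# Assumed sample data, not real transcript output.
words = [
    ("so", 0.0, 0.4), ("um", 0.4, 0.9), ("uh", 0.9, 1.3),
    ("hello", 1.3, 1.8), ("like", 1.8, 2.1), ("world", 2.1, 2.6),
]
print(cut_ranges(words, {1, 2, 4}))  # [(0.4, 1.3), (1.8, 2.1)]
```

Merging adjacent deletions means a deleted sentence becomes one cut rather than one cut per word.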
I have a version 2 with a GUI that I am just putting the finishing touches on, if that is helpful?
It just uses Tkinter, so nothing earth-shattering, but you can then turn some options on/off, select files, etc. more easily.