Hacker News new | ask | show | jobs
by KMnO4 812 days ago
Maybe not exactly what you’re asking, but I started doing talk therapy last year. It’s done virtually and I record the session with OBS. As soon as the recording finishes, the following happens:

- The audio is preprocessed (chunked) and sent to Whisper to generate a transcript

- The transcript is sent to GPT-4 to generate a summary, action items, concepts introduced with additional information

- The next meeting’s date/time is added to my calendar

- A chatbot is created that allows me to chat with each session, including playing the role as the therapist and continuing the conversation (with the entire context of what I actually talked about)

It’s been exceedingly helpful to be able to review all my therapy sessions this way.

7 comments

I'm sincerely happy you're finding value in this and it's a very impressive workflow. The idea of sending my therapy sessions to OpenAI sounds terrifying though.
I hope you're the patient in this scenario, otherwise this is an egregious HIPAA violation.
Might still be a violation if they're the patient? Unless therapist and their employer's consent is given and ofc dependent on the relevant jurisdictions (IANAL).
I don’t know about the legality of it, but as a comical skit it’d be hilarious: a patient gets in deep shit with their doctor for violating patient-doctor confidentiality. Sounds straight out of Curb Your Enthusiasm!

Edit: It seems it's straight out of Curb, because it is! https://youtu.be/YH55dFlF_Rg?si=kOLC5rGq5fi8tke2

This is really interesting, are your comfortable with OpenAI having your personal details in this case?
This is where having our own LLMs and stacks running locally will save and empower us IMHO
OpenAI's privacy claims are fine. I wouldn't worry about this any more than I worry about my email provider.
wow - really cool.

I'm actually the founder of an AI Meeting Bot company - and we're thinking of open-sourcing so you could run exactly this set-up locally with perfect diarization / recording while also maintaining privacy [1].

I'm currently creating code examples, and just finished the "chat with each session". Would love to know how you implemented it.

[1] https://aimeetingbot.com

Did you discover anything interesting by being able to review all the therapy sessions?
Curious on the code. ( a friend is a psychiatrist and she noticed difficulty with multiple languages and device translations).

This flow could help me improve fluency in her sessions ( eg. she has a hardware translation device (expensive) which has significant issues auto translating ), since it's missing context a lot.

Eg. When grieving is incorrectly translated between dutch-polish, it defeats a bit of the purpose of being fluent in your native language.

Reducing the error rate would help a lot.

I’d love to replicate your workflow. Any luck with speaker diarization using whisper? I’ve tried WhisperX several but it didn’t work.
I've created an AI Meeting Bot API to do just that [1].

At the moment it runs on AWS, and we're thinking of open-sourcing so you could also run it locally to maintain 100% privacy of such conversations.

You'd get speaker diarization, names on top of the recording [2].

[1] https://aimeetingbot.com [2] https://spoke-1.gitbook.io/ai-meeting-bot

Happy to get in touch and have you run it