Hacker News
by bombcar 1108 days ago
Google flourished because it could find forums (and blogs) and mine those, but much of that content has disappeared into Facebook and Discord (and YouTube - we must not discount how many things that would have been easily parseable blogs are now buried in livestreams and videos).
3 comments

Discord is probably the worst of all. I'm not a gamer, and I hate that so much tech content is now locked behind private Discord channels. Even Facebook is more discoverable than that.
Even when you are already on Discord, searching and reading old conversations is awful, because that's not at all what Discord was made to do.
So I've been working on a side project to make the content of a YouTube channel I watch more discoverable through text. I've had great results scraping the YouTube transcription and running it through a few passes of GPT-3.5 with prompts that essentially make it act as an editor. The original transcription was often terrible in spots, with whole phrases or runs of words mistranscribed throughout. For almost all of them, GPT-3.5 was able to clean them up and restore the original meaning by understanding the context of the monologue and fixing obviously incorrect words or phrases.

I've watched a sample of about 20 of the 3,000 videos I'm working through, and the correction passes did an amazing job of restoring the original meaning of the spoken words, which was hard to make out from the original machine transcription.
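For anyone wanting to try something similar, here is a rough Python sketch of what a multi-pass cleanup loop like this could look like. It is not the actual project code: `chunk_transcript`, `edit_transcript`, and the `edit_fn` callback are hypothetical names, and `edit_fn` is only a stand-in for whatever call sends a prompt plus a chunk of text to the model. Fetching the transcript itself is left out.

```python
from typing import Callable, List


def chunk_transcript(text: str, max_chars: int = 2000) -> List[str]:
    """Split a transcript into chunks of about max_chars, breaking on whitespace.

    A single word longer than max_chars is kept whole in its own chunk.
    """
    chunks: List[str] = []
    current: List[str] = []
    length = 0
    for word in text.split():
        # +1 accounts for the joining space when the chunk is non-empty.
        added = len(word) + (1 if current else 0)
        if current and length + added > max_chars:
            chunks.append(" ".join(current))
            current, length = [word], len(word)
        else:
            current.append(word)
            length += added
    if current:
        chunks.append(" ".join(current))
    return chunks


def edit_transcript(
    text: str,
    prompts: List[str],
    edit_fn: Callable[[str, str], str],
    max_chars: int = 2000,
) -> str:
    """Run every chunk through one editing pass per prompt, in order.

    edit_fn(prompt, chunk) is an assumption: it should return the corrected
    chunk, for example by calling a language model with that prompt.
    """
    chunks = chunk_transcript(text, max_chars)
    for prompt in prompts:
        # Each pass sees the output of the previous pass.
        chunks = [edit_fn(prompt, chunk) for chunk in chunks]
    return " ".join(chunks)
```

With something like `prompts = ["Fix obviously mistranscribed words.", "Fix punctuation and sentence breaks."]`, the second pass works on the output of the first. Chunking on word boundaries keeps each request small, at the cost of the editor losing some context at the chunk edges.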

That is exactly where LLMs are useful. (Thinking of them as "AI", meaning AGI, is just so wrong. Writing legal briefs??) Using them to clean up transcripts after the fact, so that the content becomes available and searchable, is great.
>we must not discount how many things that would have been easily parseable blogs are now buried in livestreams and videos

On the flip side, would that content have been created at all if streaming and video didn't give creators a financial motive to produce it?

There's a lot of discussion here about internet communities, but this comment raises the question of why blogs started to die down to begin with. At least with Reddit you get clout if you share stuff (useless clout, but sometimes you just want a pat on the back).