by echelon 713 days ago
The music, film, and game industries are about to be completely disrupted.

LLMs and AGI might be hogwash, but processing multimedia is where Gen AI and especially diffusion models shine.

Furthermore, text-to-{whatever} models might produce slop, but Gen AI "exoskeletons" (spatial domain, temporal domain editors) are Photoshop and Blender from the next century. These turbocharge creatives.

Hearing and vision are simple operations relative to reasoning. They work on naturally occurring physical signals that the animal kingdom has evolved, on several different occasions, to process. This is likely why they're such low-hanging fruit to replicate with Gen AI.

2 comments

I'd agree with you on the creative industry, but disagree that generative AIs aren't going to do the same to just about every other industry. We were at a point where we had extremely specialized models that were useful, and now we have general models that are EXTREMELY useful in almost all contexts: text, audio, video, data processing, etc. At least in my eyes, we are at the same point with LLMs as we were with computing when a large part of the population was just "not into them", as if it were like choosing any other hobby. I'm sure tons of people aren't getting much utility out of the space now, but it's not because the utility isn't there.
Ordinary surveillance applications with some fine or billing attached, pure marketing where public-facing materials have to be consistent but not much more than that, and, famously, anything in journalism from video creation to writing to narration are all ground zero in a "vocation crisis" too.