HN Mirror



	by nonhaver 369 days ago
	this is great. i think extensions that detect generated music, speech, video, or text will become really important. i'm curious how light and performant these detection models can get. maybe a single extension could handle multiple media types. one concern (speaking as someone who doesn't know what these internal pipelines look like) is that suno/udio could tweak their model weights just enough to change the fingerprint, making a detector obsolete with each new release. or even simpler: maybe just apply post-processing? i'd imagine a small reverb could diffuse the content enough to make the fingerprint difficult to detect. that turns it into a cat-and-mouse game. if it's cheaper for them to mutate models or tweak post-processing than for others to train new detectors, they could spin up a new fingerprint every day.

1 comment

qosmo 366 days ago

What kind of tweak has enough of an impact is still an open question. According to the paper, the detector does generalize a bit between different models, but different architectures, at least, require retraining for coverage.
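
For anyone who wants to poke at the reverb question from the parent comment, here is a minimal sketch of the perturbation side only. Everything in it is an assumption for illustration: the function names are hypothetical, the "reverb" is just a convolution with an exponentially decaying noise tail, and the test signal is a toy sine tone. It does not touch any real detector or anything from the paper.

```python
import math
import random

def synthetic_impulse_response(length, decay, seed=0):
    """Exponentially decaying noise tail: a crude stand-in for a room reverb."""
    rng = random.Random(seed)
    tail = [rng.uniform(-1.0, 1.0) * math.exp(-decay * n) for n in range(length)]
    tail[0] = 1.0  # the direct (dry) sound arrives first at full level
    return tail

def apply_reverb(signal, impulse_response, wet=0.2):
    """Convolve the signal with the impulse response, then mix dry and wet."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    mixed = []
    for n, wet_sample in enumerate(out):
        dry = signal[n] if n < len(signal) else 0.0
        mixed.append((1.0 - wet) * dry + wet * wet_sample)
    return mixed

def relative_change(original, processed):
    """RMS of the difference divided by RMS of the original, over the original's length."""
    diff = sum((p - o) ** 2 for o, p in zip(original, processed))
    ref = sum(o ** 2 for o in original)
    return math.sqrt(diff / ref)

# Toy input: 0.1 s of a 440 Hz tone at 8 kHz (an assumption, not real generated audio).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(800)]
ir = synthetic_impulse_response(length=200, decay=0.03)
processed = apply_reverb(tone, ir, wet=0.2)
print(round(relative_change(tone, processed), 3))
```

To actually answer the question, you would run a detector on the original and on `processed` at several `wet` levels and see where its score drops. How much perturbation is enough is exactly the open part.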