by kat_rebelo 1065 days ago
i have always augmented my more traditional music projects with avant-garde and experimental stuff. particularly focusing on found sounds, noise, and free improvisation. i record everything, but i treat it more as a journal of "free writing" than actual music output. the recordings are often formless, noisy, and chaotic and frequently contain experiments in polyrhythms or free time. over the years i've amassed an enormous amount of audio data that i treat more as an intellectual curiosity for myself than music i would show to other people.

however, with all of the debates around attribution and ownership for human creators in the age of AI art, combined with the apparently legally and ethically dubious means by which these megacorps obtain their training data, i have begun to think about avenues by which the general public could protest the actions of these megacorps by discreetly poisoning the well of their training data.

with the gigabytes and gigabytes of data i have generated of mostly incoherent audio, i have considered releasing this music for the first time by innocuously labeling it as a music audio dataset with the intention of trying to make it appear extremely attractive to megacorps scouring the internet for free data. my individual contribution probably couldn't amount to much, but if a concentrated mass of people did this in their respective fields, perhaps this could be a way of at least obstructing these corporations from freely capitalizing on the hard work of real artists.

1 comment

The sad reality is that, one way or another, free and open exchange of information is going to trend down. As people realize that anything they publish will get repackaged, stripped of attribution, and sold by some commercial AI, they will switch to e2ee closed groups. More active opposition will take the form of poisoning datasets, which would introduce more noise and contribute to the trend.

But I agree with the sentiment—if the blatant disregard for IP is not curbed, this sort of thing would have to be done…