| HN Mirror



	by zw123456 566 days ago
	The Library of Babel (the Borges story) contained every possible combination of letters that could form a 400-page book. Or something like that. It made me wonder: what if you made a content honeypot full of just random text and a chatbot vacuumed that up? Does its data vacuum have a garbage detector?

2 comments

dTal 565 days ago

The very worst that would happen is that you make someone's training run slightly less efficient. If your data is truly random garbage, the model won't be able to make any predictions about it and thus it will not distort performance. All training data is noisy to an extent, and you've just fed it pure noise.

However, it has become clear that effective LLM training is in large part a matter of careful curation of high-quality training data. Random gibberish is trivially detectable, by LLMs themselves if nothing else, so it's unlikely that your "honeypot" will ever make it into someone's training run.
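To make the "trivially detectable" point concrete, here is a toy heuristic in Python. It is only a sketch: the word list, the `threshold` value, and both function names are illustrative assumptions, and real data-cleaning pipelines rely on much stronger signals (language-model perplexity, trained quality classifiers).

```python
import re

# Tiny illustrative set of very common English words.
# This is an assumption for the sketch, not a standard stopword list.
COMMON_WORDS = {
    "the", "of", "and", "to", "a", "in", "is", "it", "that", "for",
    "be", "not", "will", "on", "with", "as", "this", "any", "about",
    "you", "are", "was",
}


def common_word_ratio(text: str) -> float:
    """Return the share of tokens that are very common English words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in COMMON_WORDS)
    return hits / len(tokens)


def looks_like_gibberish(text: str, threshold: float = 0.1) -> bool:
    """Flag text whose common-word share falls below a hypothetical threshold."""
    return common_word_ratio(text) < threshold
```

Ordinary English prose is dense with words like "the" and "to", while random letter strings almost never hit the list, so even this crude check separates the two.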

Even if you carefully crafted some more subtle poison data, it would still form only a small amount of the training set. The worst case scenario is most likely that the LLM learns to recognize your particular style of poison, and will happily recreate it if prompted appropriately (while otherwise remaining unaffected); more likely, your poison data is simply swamped.


A4ET8a8uTh0 565 days ago

So.. I think it has already been happening ( people attempting to poison some sources for a variety of reasons ). I was doing a fun mini project on HN aliases ( attempting to guess the user's age based on nothing but the alias ) and I came across a number of profiles with bios clearly intended to mess with bots one way or another. Some have fun instructions. Some have contradictory information. Some are the length of a small night story. I am not judging. I just find it interesting. Has vibes of a certain book about a rainbow.


Loughla 565 days ago

Tell me about that side project. How does that work? What does it say about me? I find that very interesting.


A4ET8a8uTh0 565 days ago

The idea itself is kinda simple, but kinda hard, because it relies on how the language we use gives us away.

For example, the references we drop ( simpsons, star trek, you name it ) and the language we use ( gee whiz, yeet, gyatt ) to build an online persona tend to matter to our image of self - one can determine, to some extent, the likely generation from those.

The reference itself may not automatically mean much, but it is likely that if it is present in an alias, it had an impact on a younger person ( how many of the new generation jump on an old show? so mr robot would have an exposure range of 2015 to 2019 ). If that hypothesis is true, then one can attempt to guess the age of the individual given that work, because 1) we know what year it is now and 2) we know when it was made, which allows for some minor inference there.
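The arithmetic behind that inference can be sketched in a few lines of Python. This is a rough illustration, not the project's actual code: the function name and the assumed age band at exposure (12 to 25) are made up for the example; only the 2015 to 2019 window comes from the comment above.

```python
from datetime import date
from typing import Optional, Tuple


def estimate_age_range(
    exposure_start: int,
    exposure_end: int,
    current_year: Optional[int] = None,
    min_age_at_exposure: int = 12,  # assumed lower bound, purely illustrative
    max_age_at_exposure: int = 25,  # assumed upper bound, purely illustrative
) -> Tuple[int, int]:
    """Guess a (youngest, oldest) age range from a reference's exposure window."""
    if current_year is None:
        current_year = date.today().year
    # Youngest case: the person was at the minimum age when the window closed.
    youngest = (current_year - exposure_end) + min_age_at_exposure
    # Oldest case: the person was at the maximum age when the window opened.
    oldest = (current_year - exposure_start) + max_age_at_exposure
    return youngest, oldest


# Example: a "mr robot" style reference with a 2015-2019 exposure window.
print(estimate_age_range(2015, 2019, current_year=2024))
```

The resulting range is wide, which matches the "minor inference" framing: one reference narrows things down a little, and combining several signals would be needed to do better.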

Naturally, some aliases are more elaborate than others. Some are written backwards and/or reference a popular show or popular sci-fi author. Some are anagrams ( and - I discovered today - require additional datasets to tag properly so that is another thing I will need to dig up from somewhere ). And to complicate things further, some aliases use references that are ambiguous and/or belong in more than one category ( Tesla being one of them ).

The original approach was to just throw everything into an LLM and see what it comes up with, but the results were somewhat uneven so I decided to start from scratch and do normal analysis ( language, references, how digits are used and so on - it is still amazing how well that one seems to work ).

Sadly, it is still a work in progress ( I was hoping for a quick project, but I am kinda getting into it ) and I probably won't touch it until next weekend since the coming week promises to be challenging.

Unfortunately, this means your particular alias ended up as:

Alias    category    is_random  length  is_anagram  generic_signal
Loughla  Mixed Case  0          7       FALSE       FALSE

( remaining fields were empty, basically couldn't put a finger on you:D). If you can provide me with an approximate age, it would help with my testing though:D
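For illustration, a row like the one above could come from a small feature extractor along these lines. This is a guess at the general shape, not the actual project code: the category names other than "Mixed Case", the `possible_year` field, and the trailing-year rule are all assumptions, and the harder fields (`is_random`, `is_anagram`, `generic_signal`) are left out because the thread does not say how they are computed.

```python
import re
from typing import Any, Dict


def alias_features(alias: str) -> Dict[str, Any]:
    """Extract a few surface features from an alias (hypothetical schema)."""
    letters = [c for c in alias if c.isalpha()]
    digits = "".join(c for c in alias if c.isdigit())

    # Coarse category based on character classes; labels are illustrative.
    if letters and digits:
        category = "Alphanumeric"
    elif digits:
        category = "Numeric"
    elif any(c.isupper() for c in letters) and any(c.islower() for c in letters):
        category = "Mixed Case"
    elif letters and all(c.isupper() for c in letters):
        category = "Upper Case"
    elif letters:
        category = "Lower Case"
    else:
        category = "Other"

    # A trailing four-digit number in 1940-2029 might be a birth or signup year.
    match = re.search(r"(?<!\d)(19[4-9]\d|20[0-2]\d)$", alias)
    possible_year = int(match.group(1)) if match else None

    return {
        "alias": alias,
        "category": category,
        "length": len(alias),
        "digits": digits,
        "possible_year": possible_year,
    }


print(alias_features("Loughla"))
```

For "Loughla" this yields category "Mixed Case" and length 7, in line with the row above; an alias ending in something like "1985" would additionally populate `possible_year`.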

edit: This being HN, the vast majority of references are technology related.


raywu 564 days ago

That is very cool…and your alias is hard for me to decipher


A4ET8a8uTh0 564 days ago

I have a separate - not fully implemented - section for more semi-random aliases, but it revolves around our tendency to use default settings and commonly used tools for generating them. Thus far the only thing I was able to show with it is that this is not uncommon, but it is no clear proxy for age.. so it seems like a dead end.
