

by shwouchk 428 days ago

I played a lot with the recent wave of tools. It was extremely easy to extract system prompts and all the internal tokens from all providers.

I also experimented with letting the LLM run wild in a codespace; there is a simple setting that lets it auto-accept an unlimited number of actions. I have no sensitive private repos, and I rotated my tokens afterward.

Observations:

1. I was fairly consistently able to get it to make and push git commits on my behalf.
2. I got it to add a GitHub Action on my behalf that runs on every commit.
3. I've seen it pull random niche libraries into projects.
4. I've seen it make calls to URLs that were obviously planted; e.g., instead of making a request to “example.com” it would call “example.lol”, despite explicit instructions. (I changed the domains to avoid giving publicity to bad actors.)
5. I've seen some surprisingly clever and resourceful debugging from some of the assistants, e.g. running strace and correctly diagnosing its output, or piping output to a file and then reading the file when it couldn't otherwise get the output from the tool call.
6. I've had instances of generated code with convincingly real-looking API keys. I did not check whether they worked.
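To make the planted-URL observation concrete, here is a minimal sketch of an exact-match host check that a wrapper around the assistant's network calls could apply. The `ALLOWED_HOSTS` set and the `is_allowed` helper are hypothetical names for illustration, not part of any tool mentioned here, and the hosts are the same placeholder domains used above.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; placeholder hosts, not taken from any real setup.
ALLOWED_HOSTS = {"example.com", "api.example.com"}


def is_allowed(url: str) -> bool:
    """Return True only for http(s) URLs whose host is exactly on the allowlist."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False
    # hostname is already lowercased by urlsplit and excludes userinfo and port.
    host = parts.hostname
    return host is not None and host in ALLOWED_HOSTS


if __name__ == "__main__":
    for candidate in ("https://example.com/v1", "https://example.lol/v1"):
        print(candidate, is_allowed(candidate))
```

Exact matching is deliberate: suffix or substring checks would let through look-alikes such as "example.com.evil.lol". A check like this only catches the lookalike-domain case; it says nothing about what gets sent to an allowed host.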

Combine this with the recent GitLab leak [0]. Welcome to XSS 3.0: we are at the dawn of a new age of hacker heaven, if we weren’t in one before.

No amount of double ratcheting à la [1] will save us. For an assistant to be useful, it needs to make decisions based on actual data. But once it has scanned that data, you can’t trust it anymore.
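The trade-off can be shown with a toy model. This is only a sketch under my own assumptions: the `Session` class and the action names in `PRIVILEGED` are made up for illustration and do not correspond to any real assistant's API. The session is marked tainted the moment it reads anything untrusted, and from then on every privileged action is refused.

```python
from dataclasses import dataclass, field

# Hypothetical action names, chosen only for illustration.
PRIVILEGED = {"git_push", "http_request", "write_workflow"}


@dataclass
class Session:
    tainted: bool = False
    reads: list = field(default_factory=list)

    def read(self, source: str, trusted: bool) -> None:
        """Record a read; any untrusted read taints the session permanently."""
        self.reads.append(source)
        if not trusted:
            self.tainted = True

    def may_run(self, action: str) -> bool:
        """Privileged actions are allowed only while the session is untainted."""
        return not (self.tainted and action in PRIVILEGED)


if __name__ == "__main__":
    s = Session()
    print(s.may_run("git_push"))   # allowed before any untrusted read
    s.read("issue comment from a stranger", trusted=False)
    print(s.may_run("git_push"))   # refused afterward
```

Run strictly, a rule like this blocks almost everything an assistant is useful for, since nearly any real task starts by reading data nobody has vetted.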

[0] https://news.ycombinator.com/item?id=44070626

[1] https://news.ycombinator.com/item?id=43733683