by playingalong
598 days ago
> the team tested it on 20 million prompts given to Gemini. Half of those prompts were routed to the SynthID-Text system and got a watermarked response, while the other half got the standard Gemini response. Judging by the “thumbs up” and “thumbs down” feedback from users, the watermarked responses were just as satisfactory to users as the standard ones.

Three comments here:

1. I wonder how many of the 20M prompts got a thumbs up or down. I don't think people click that a lot, unless the UI enforces it. I haven't used Gemini, so I might be unaware.

2. Judging a single response might not be enough to tell whether watermarking is acceptable. For instance, imagine the watermarking adds "However," to the start of each paragraph. In a single GPT interaction you might not notice it. Once you get 3 or 4 responses it might stand out.

3. Since when is Google happy with measuring by self-declared satisfaction? Aren't they the kings of A/B testing and high-volume analysis of usage behavior?

I sometimes do, but I almost always give the wrong answer, or the opposite answer where possible.