by krackers
172 days ago
That social-rv is really interesting. Apparently the target is randomly assigned _after_ submission, so it's not just remote viewing but also precognition? The popular ones under "explore sessions" are a very close match, but if you look at other predictions by the same accounts, the matches are less convincing. It's very easy to form a connection between any two images if you allow abstracted forms of similarity, and fundamentally there are only a few themes when it comes to images (natural vs. man-made things, smooth vs. sharp). A good control test might be to have LLMs produce the session output instead, and score that the same way.
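A rough sketch of how that control comparison could be evaluated, assuming you can collect the rank the judge gave the true target (1 = best) for a set of human sessions and for a set of LLM-generated control sessions. The function name and the rank lists are hypothetical, and this is a generic permutation test, not anything taken from the site:

```python
import random
from statistics import mean

def permutation_p_value(human_ranks, control_ranks, n_permutations=10_000, seed=0):
    """One-sided p-value that human sessions get better (lower) ranks for the
    true target than control sessions, estimated by shuffling group labels."""
    rng = random.Random(seed)
    # Positive difference means the human group ranked the true target better.
    observed = mean(control_ranks) - mean(human_ranks)
    pooled = list(human_ranks) + list(control_ranks)
    n_human = len(human_ranks)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = mean(pooled[n_human:]) - mean(pooled[:n_human])
        if diff >= observed:
            at_least_as_extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)
```

If the LLM-written controls score about as well as the human sessions, the p-value stays large, which would suggest the scoring is rewarding generic image themes rather than anything specific to the target.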
The easy-match problem is made worse by the fact that the LLM-based auto scoring explicitly uses your last 10 targets as the decoy pool:
>When you submit a session, the system collects your last 10 targets (including the current target) to create a pool of possible matches. A multimodal AI agent is presented with your complete session (including all drawings, text, and data) along with all 10 targets from the pool. The agent is instructed to analyze and rank the targets based on how well they match the session content.
The protocol otherwise seems good, but the specific carveouts here would seem to bias results.
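For reference, here is a minimal sketch of the chance baseline a 10-target pool implies, assuming the true target is equally likely to land at any rank when there is no real signal. That exchangeability assumption is exactly what reusing your own recent targets as decoys might break, so treat this as the idealized case. The function names are made up for illustration:

```python
from math import comb

def expected_rank_under_chance(pool_size=10):
    """Mean rank of the true target if the ranking were pure chance."""
    return (pool_size + 1) / 2

def top1_tail_probability(n_sessions, n_hits, pool_size=10):
    """Probability of at least n_hits first-place matches in n_sessions
    sessions, if each session hits with probability 1 / pool_size."""
    p = 1 / pool_size
    return sum(
        comb(n_sessions, k) * p**k * (1 - p) ** (n_sessions - k)
        for k in range(n_hits, n_sessions + 1)
    )
```

Comparing an account's actual first-place count against `top1_tail_probability` over all of its sessions, not just the popular ones, would show whether the close matches are more than cherry-picking.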
The source for the judging is at https://github.com/Social-RV/comparative-judging, which is the part that would need to be studied carefully. At first glance, it exposes raw filenames to the LLM, which might bias things. The ranking logic also seems a bit sketchy: it does some tournament-style elimination that I haven't analyzed thoroughly, but if decoys are eliminated in an earlier round, it could bias things compared to just asking the LLM to order the 10 images by similarity in a single pass, which at least has no elimination order to worry about.
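On the filename point, a minimal sketch of one way to hide them, assuming the judge only needs opaque labels and a shuffled presentation order. This is not how the repo does it; the helper names and label format are hypothetical:

```python
import random

def anonymize_pool(target_paths, seed=None):
    """Shuffle the pool and map neutral labels to the original paths, so the
    judge never sees filenames or the original ordering."""
    rng = random.Random(seed)
    shuffled = list(target_paths)
    rng.shuffle(shuffled)
    return {f"candidate_{i:02d}": path for i, path in enumerate(shuffled, start=1)}

def deanonymize_ranking(ranked_labels, label_to_path):
    """Translate the judge's ranked labels back into the original paths."""
    return [label_to_path[label] for label in ranked_labels]
```

The judge would be shown only `candidate_01` through `candidate_10` and asked for one full ordering, and the mapping stays on the scoring side until the ranking comes back.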