Hacker News
by hiddencost 1199 days ago
Nope.

If you train a model where the input is an integer between 1 and 10, and the output is a specific image from a set of ten, the model will be able to get zero loss on the task. That is what's happening here.
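To make the memorization point concrete, here is a minimal sketch, assuming one-hot inputs for the ten integers and random 64-pixel arrays standing in for the ten images (both are illustrative choices, not anything from the paper). A plain linear map fit by least squares reproduces every stored image, so the training loss is zero up to floating-point error, and that says nothing about inputs outside the set.

```python
import numpy as np

rng = np.random.default_rng(0)
n_items, n_pixels = 10, 64  # hypothetical sizes, chosen for illustration

# Ten random "images", flattened to vectors (stand-ins for real pictures).
images = rng.random((n_items, n_pixels))

# Inputs 1..10 encoded as one-hot rows, so the input matrix is the identity.
inputs = np.eye(n_items)

# Least-squares fit of a linear map from input to image.
# With one-hot inputs the solution simply stores each image as a row.
weights, *_ = np.linalg.lstsq(inputs, images, rcond=None)

predictions = inputs @ weights
train_loss = float(np.mean((predictions - images) ** 2))
print(f"training MSE: {train_loss:.3e}")
```

The model here is a lookup table in disguise: each row of `weights` is one of the stored images.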

5 comments

Yes but the input isn't an integer from 1 to 10 right? It's MRI data.

Although it seems they're only able to extract the subject of the brain activity, not any actual "pictures".

Are you saying the demonstrated results are all in-sample? Because this is definitely not true for out-of-sample data. And the GP comment implies that there is in fact a validation/holdout set.
I'm also confused by this. If everything was done properly, test results on the holdout set would've been shown. Wasn't that the case?
It's still a legitimate direction to pursue. Once you get to large enough training sets, it's basically the same way our own brains work. We don't perceive or remember all the details of a building - just "building, style 19B", plus a few extra generic parameters like distance, angle, color and so on. Totally manageable for deep learning to recognize, and perhaps even combine.
We performed visual reconstruction from fMRI signals using LDM in three simple steps as follows (Figure 2, middle). The only training required in our method is to construct linear models that map fMRI signals to each LDM component, and no training or fine-tuning of deep-learning models is needed. We used the default parameters of image-to-image and text-to-image codes provided by the authors of LDM 2, including the parameters used for the DDIM sampler. See Appendix A for details.
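As a rough illustration of what "linear models that map fMRI signals to each LDM component" could look like, and of the holdout check discussed upthread, here is a sketch on synthetic data. Everything in it is an assumption made for the example: the array sizes, the random data, the choice of ridge regression, and the regularization strength `alpha`. It is not the paper's code or data.

```python
import numpy as np

rng = np.random.default_rng(0)
n_train, n_test, n_voxels, n_latent = 200, 50, 50, 8  # hypothetical sizes

# Synthetic stand-ins: rows are trials, columns are voxels or latent dimensions.
true_map = rng.normal(size=(n_voxels, n_latent))
fmri = rng.normal(size=(n_train + n_test, n_voxels))
latents = fmri @ true_map + 0.1 * rng.normal(size=(n_train + n_test, n_latent))

# Hold out the last trials; they are never used for fitting.
fmri_train, fmri_test = fmri[:n_train], fmri[n_train:]
lat_train, lat_test = latents[:n_train], latents[n_train:]


def fit_ridge(x, y, alpha):
    """Closed-form ridge regression: solve (X^T X + alpha I) W = X^T Y."""
    gram = x.T @ x + alpha * np.eye(x.shape[1])
    return np.linalg.solve(gram, x.T @ y)


weights = fit_ridge(fmri_train, lat_train, alpha=1.0)

# Error on held-out trials, compared with always predicting the training mean.
test_mse = float(np.mean((fmri_test @ weights - lat_test) ** 2))
baseline_mse = float(np.mean((lat_test - lat_train.mean(axis=0)) ** 2))
print(f"held-out MSE: {test_mse:.3f}  baseline MSE: {baseline_mse:.3f}")
```

The comparison against the mean-prediction baseline on held-out trials is what separates a mapping that generalizes from the lookup-table case described at the top of the thread.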
But unless they tested this on a single human being, doesn't this mean that we can read brains (it's just that this one particular reader is bad)?
From the paper, it's four people from the NSD:

« We analyzed data for four of the eight subjects who completed all imaging sessions (subj01, subj02, subj05, and subj07) »

P3 here: https://www.biorxiv.org/content/biorxiv/early/2022/11/21/202...

I am pretty sure that this is just per person, so all it does is categorize the complex brain patterns of one person into 10 category numbers and then do some hula hoop to display the numbers.