by jmde 3489 days ago

I started reading the essay not knowing what to think, and it turned out to be more relevant to my work than I thought.

The question the essay raises, how to interpret components such as these, has been a central issue in some areas of psychology and the behavioral sciences for some time.

One thought about your "coming into focus at a certain level of compression" comment: I've done some analyses of these vectors as applied to text samples, and one thing that struck me was how poorly some of them replicated across datasets that should ostensibly be similar (though not identical). Others, in contrast, reappeared across multiple corpora. To the extent that some of these components represent "real" features, they should reappear consistently across the different datasets where you'd expect them. That is, they should be robust to the idiosyncratic features of any one dataset.
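
To make the replication check concrete, here is a rough sketch of one way it could be done. It is not the analysis I actually ran: it assumes "components" means principal directions obtained by SVD, it uses synthetic data in place of real corpora, and the function names (top_components, best_match_similarity, make_dataset) are made up for illustration. The idea is to extract components from two datasets separately and then ask, for each component in the first, how closely any component in the second lines up with it.

```python
import numpy as np

def top_components(X, k):
    """Return the top-k principal directions of X as rows (PCA via SVD)."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are unit-length directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:k]

def best_match_similarity(A, B):
    """For each row of A, the largest |cosine| with any row of B."""
    # Rows are unit vectors, so dot products are cosines.
    # abs() ignores the arbitrary sign of each component.
    return np.abs(A @ B.T).max(axis=1)

def make_dataset(rng, n, shared, noise=1.0):
    """Synthetic stand-in for a corpus: one shared direction plus noise."""
    d = shared.shape[0]
    scores = rng.normal(scale=5.0, size=(n, 1))
    return scores * shared + rng.normal(scale=noise, size=(n, d))

rng = np.random.default_rng(0)
d = 20
shared = np.zeros(d)
shared[0] = 1.0  # the one "real" feature present in both datasets

X1 = make_dataset(rng, 500, shared)
X2 = make_dataset(rng, 500, shared)

sim = best_match_similarity(top_components(X1, 3), top_components(X2, 3))
print(sim)
```

In this toy setup the first component reflects the shared direction and should match closely across the two datasets, while the later components are fit to noise and have no particular reason to reappear. With real corpora you would want some threshold or null distribution for what counts as a match, which this sketch does not attempt.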