by refulgentis 757 days ago
> there's no reason for our mind maps to meaningfully differ here

Yes there is.

If you think all training runs converge to the same bits given the same output size, I would again stress that the visual dimensions analogy is poetics and extremely tortured.

If you're making the weaker claim that concepts generally sort themselves into a space and sort the same way given the same training data, or that rotational symmetry means any differences don't matter, or that location doesn't matter at all... we're in poetics.

Something that really sold me when I was in a similar mindset was word2vec's king - man + woman = queen wasn't actually real or in the model. Just a way of explaining it simply.

Another thought from my physics days: try visualizing 4D. Some people do claim to, after much effort, but in my experience they're unserious, i.e. I didn't see PhDs or masters students in my program claiming this. No one tries claiming they can see in 5D.

2 comments

Yes, I'm making the weaker claim that concepts would generally sort themselves into roughly equivalent structures that could be mapped to each other through some simple affine transformations (rotation, reflection, translation, etc.) applied to various parts of the structures.

Or, in other words, I think the absolute coordinates of any concept in the latent space are irrelevant, and it makes no sense to compare them between two models; what matters is the relative position of concepts with respect to other concepts, and I expect the structures to be similar here for large enough datasets of real text, even if those datasets are disjoint.

(More specific prediction: take a typical LLM dataset, say Books3 or Common Crawl, randomly select half of it as dataset A, and let the remainder be dataset B. I expect that two models of the same architecture, one trained on dataset A and the other on dataset B, should end up with structurally similar latent spaces.)
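One way to make "structurally similar" checkable, as a rough sketch: if both models give a vector for the same list of concepts, find the best rotation/reflection (orthogonal Procrustes) after centering, and look at how much residual is left. The function name and the synthetic data below are made up for illustration; no real model is involved, and a single global orthogonal map is a stronger assumption than "transformations applied to various parts of the structures".

```python
import numpy as np

def procrustes_alignment_error(X, Y):
    """Relative residual after the best orthogonal map from X onto Y.

    Rows of X and Y are assumed to be vectors for the same concepts,
    in the same order (an assumption of this sketch).
    """
    # Centering removes any translation between the two spaces.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthogonal Procrustes: R = U V^T minimises ||Xc R - Yc||_F.
    U, _, Vt = np.linalg.svd(Xc.T @ Yc)
    R = U @ Vt
    return np.linalg.norm(Xc @ R - Yc) / np.linalg.norm(Yc)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))                    # stand-in for "model A" vectors
Q, _ = np.linalg.qr(rng.normal(size=(8, 8)))    # random orthogonal matrix
Y = X @ Q + 3.0                                 # same structure, rotated and shifted
Z = rng.normal(size=(50, 8))                    # unrelated vectors

err_same = procrustes_alignment_error(X, Y)     # close to zero
err_diff = procrustes_alignment_error(X, Z)     # large
```

A small residual on real models would support the prediction; a residual near the unrelated baseline would count against it.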

> Something that really sold me when I was in a similar mindset was word2vec's king - man + woman = queen wasn't actually real or in the model. Just a way of explaining it simply.

Huh, it seems I took the opposite understanding from word2vec: I expect that "king - man + woman = queen" should hold in most models. What I mean by structural similarity could be described as such equations mostly holding across models for a significant number of concepts.
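For concreteness, here is what the analogy test looks like mechanically, on hand-made 2D toy vectors (not taken from word2vec or any real model). Note that the usual evaluation skips the three query words when searching for the nearest neighbour, which is part of why people argue about how "real" the equation is.

```python
import numpy as np

# Toy vectors invented for illustration: axis 0 ~ "royalty", axis 1 ~ "gender".
vectors = {
    "king":  np.array([1.0, 1.0]),
    "queen": np.array([1.0, -1.0]),
    "man":   np.array([0.0, 1.0]),
    "woman": np.array([0.0, -1.0]),
    "apple": np.array([-1.0, 0.2]),
}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def analogy(a, b, c, vectors):
    """Word closest (by cosine) to vec(a) - vec(b) + vec(c), skipping a, b, c."""
    target = vectors[a] - vectors[b] + vectors[c]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

result = analogy("king", "man", "woman", vectors)
```

"Such equations mostly holding across models" would then mean running the same list of analogies against each model's vectors and comparing the hit rates.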

What would be an appropriate test?

- Given 2 word embedding sets,

- For each pair (A,B) of embeddings in one set,

- There exists a corresponding pair (A’, B’) in the other set,

- Such that dist(A,B) ≈ dist(A’, B’),

Something like that, to start. But would need to look at longer chains of relations.
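A minimal sketch of that test, assuming the correspondence is already known (row i in both sets is the same word), which sidesteps the "there exists" part. It correlates the pairwise distances instead of demanding equality, so a difference in overall scale doesn't count against similarity. The function names and data are hypothetical.

```python
import numpy as np

def pairwise_distances(E):
    """Euclidean distance between every pair of rows of E."""
    diff = E[:, None, :] - E[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def distance_structure_correlation(E1, E2):
    """Pearson correlation between the two sets' pairwise distances."""
    iu = np.triu_indices(E1.shape[0], k=1)      # each unordered pair once
    d1 = pairwise_distances(E1)[iu]
    d2 = pairwise_distances(E2)[iu]
    return float(np.corrcoef(d1, d2)[0, 1])

rng = np.random.default_rng(1)
E1 = rng.normal(size=(30, 5))
Q, _ = np.linalg.qr(rng.normal(size=(5, 5)))    # random orthogonal matrix
E2 = 2.0 * (E1 @ Q) + 1.0                       # rotated, scaled, shifted copy
E3 = rng.normal(size=(30, 5))                   # unrelated set

score_related = distance_structure_correlation(E1, E2)    # essentially 1.0
score_unrelated = distance_structure_correlation(E1, E3)  # near 0
```

Longer chains of relations (triples, analogies) would need more than pairwise distances, but this is the cheapest first check.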

I think you are hung up on the visual representation.

Last week, the post about jailbreaking ChatGPT(?) talked about turning off a direction in possibility-space to disable the "I'm sorry, but I can't..." message.
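I don't know exactly what that post did, but one common reading of "turning off a direction" is projecting it out of the activation vectors. A sketch under that assumption, with random stand-in data and a made-up function name:

```python
import numpy as np

def ablate_direction(h, direction):
    """Remove the component of each row of h along `direction`.

    h is a 2D array (one activation vector per row); this is plain
    orthogonal projection, not any particular model's API.
    """
    unit = direction / np.linalg.norm(direction)
    return h - np.outer(h @ unit, unit)

rng = np.random.default_rng(2)
h = rng.normal(size=(4, 6))          # stand-in activations
direction = rng.normal(size=6)       # stand-in for the "refusal" direction
unit = direction / np.linalg.norm(direction)

h_off = ablate_direction(h, direction)
# After ablation, nothing is left along the direction: h_off @ unit is ~0.
```

Nothing here depends on where the direction sits in absolute coordinates, only on its relationship to the activations.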

In a regular program, it would be a boolean variable, or a single ASM instruction.

And you could ask the same thing. "How does my program have an off switch if there aren't enough values to store all possible meanings of 'off'? Does my off switch variable map to your off switch variable?"

And the answer would be yes, or no, or it doesn't matter. It's a tool/construct.