| What does each of those 200 dimensions represent/encode? This is a very interesting and pragmatic design. I tried a couple of queries and the results were great for most of them, but not so much for esoteric ones, or for queries involving names, which makes sense given the implementation specifics outlined in their blog posts. I had never heard of this company before, and I am very interested in IR technologies (https://github.com/phaistos-networks/Trinity). I am glad they are trying, and it looks like they have a chance at something great there. Ahrefs.com is also working on a large-scale web search engine, but other than their CEO announcing it on Twitter, there hasn't been any update since. |
So by default, each individual dimension means "nothing and everything". If there are specific factors that you know beforehand and want to measure, you can compute a transformation that projects all the vectors into a different vector space where #1 means thing A, #2 means thing B, etc. For example, there are some nice experiments with 'face' vectors that separate age/masculinity/hair length/happiness/etc. out of the initial vectors coming out of some image-analysis neural network, where the meaning of each separate dimension is unclear.
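
To make that a bit more concrete, here is a rough sketch in Python with NumPy. Everything in it is made up for illustration: the vectors are synthetic, the "attribute" is a random number standing in for something like age, and `fit_attribute_direction` is a hypothetical helper, not anything from the search engine or the face experiments. It uses plain least squares as one simple way to find such a projection; real experiments may well use something else.

```python
import numpy as np

def fit_attribute_direction(vectors, labels):
    """Least-squares fit of a direction w so that (vectors - mean) @ w approximates labels.

    Hypothetical helper for illustration only.
    """
    mean = vectors.mean(axis=0)
    w, *_ = np.linalg.lstsq(vectors - mean, labels - labels.mean(), rcond=None)
    return mean, w

rng = np.random.default_rng(0)
n, dim = 2000, 200  # 200 dimensions, matching the size discussed above

# Assumption: the attribute is smeared across all dimensions along one
# hidden direction, so no single coordinate "is" the attribute.
hidden_direction = rng.normal(size=dim)
hidden_direction /= np.linalg.norm(hidden_direction)
attribute = rng.normal(size=n)
vectors = rng.normal(size=(n, dim)) + 3.0 * attribute[:, None] * hidden_direction

# Fit on the first 1500 vectors, evaluate on the remaining 500.
train, test = slice(0, 1500), slice(1500, None)
mean, w = fit_attribute_direction(vectors[train], attribute[train])

# The new, interpretable coordinate: one number per vector.
projected = (vectors[test] - mean) @ w
corr_projected = np.corrcoef(projected, attribute[test])[0, 1]

# For comparison: how well does the single best raw dimension do?
best_single_dim = max(
    abs(np.corrcoef(vectors[test, i], attribute[test])[0, 1]) for i in range(dim)
)

print(f"projected axis vs attribute: {corr_projected:.2f}")
print(f"best single raw dimension:   {best_single_dim:.2f}")
```

In this toy setup the fitted axis tracks the attribute much more closely than any individual raw dimension does, which is the "nothing and everything" point: the information is there, just not aligned with the coordinate axes. Repeating the fit for several attributes and stacking the resulting directions gives the kind of transformation where #1 means thing A, #2 means thing B, and so on.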