Hacker News new | ask | show | jobs
by sharepower123 1153 days ago
An hypothesis is that in the latent space there is something like a branching factor that can be used in context learning to select the main tree for others layers. So a LLM is able to have the knowledge of many specialized smallers LLM and select the appropriate one by way of giving values to the branching factor. The branching factor could be some combination of attention layers operating in the latent space of previous layers.