by alok-g
2 days ago
Thanks. I read it several times, and along with another response I think I understand it better now, though I still don't have a complete grasp.
>> So sampling one point gives us the gold amount for cheese amount 1, 2, and 3. This is the 'function', and ...
I get this part: each point in this N-dimensional space yields a function f of the index, and this is the function.
>> Yes, the function changes shape as you get more data because the parameters governing that function
Getting more data should yield more such points (in the N-dimensional space), but with each such point being the 'function', how is it changing shape?
Nevertheless, I think I have much better glimpses after reading your and the other responses here than from the original article, which I still find confusing even on reading it again.
So how does it change shape? Well, this part is something I don't fully grasp myself yet, but I can sketch a crude Bayesian interpretation, which is how I think of it. It's not completely correct, but it works as a placeholder until I fully work out the math of updating the parameters.
Basically, from a Bayesian perspective we can treat the joint distribution of the function outputs as a likelihood conditioned on the kernel parameters theta: p(f(x1), f(x2), ... | theta).
Then we can derive the posterior distribution over theta, p(theta | f(x1), f(x2), ...), up to a normalizing constant:
p(theta | f(x1), f(x2), ...) ∝ p(f(x1), f(x2), ... | theta) p(theta).
So we fit the theta parameters based on how well they explain the observed data we feed our Bayesian model.
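To make that concrete, here is a rough sketch of the "posterior over theta" idea, not a definitive implementation. Everything specific in it is my own assumption for illustration: a zero-mean GP, an RBF kernel whose only parameter theta is the length-scale, noise-free observations, a flat prior, a grid approximation instead of proper inference, and made-up gold amounts for cheese amounts 1, 2, and 3.

```python
import numpy as np

def rbf_kernel(x, lengthscale, variance=1.0):
    # Squared-exponential covariance between every pair of inputs.
    d = x[:, None] - x[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def log_likelihood(x, f, lengthscale, jitter=1e-6):
    # log p(f(x1), f(x2), ... | theta) for a zero-mean Gaussian with covariance K(theta).
    K = rbf_kernel(x, lengthscale) + jitter * np.eye(len(x))  # jitter for numerical stability
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L, f)  # alpha @ alpha equals f^T K^{-1} f
    return (-0.5 * alpha @ alpha
            - np.sum(np.log(np.diag(L)))        # equals -0.5 * log det(K)
            - 0.5 * len(x) * np.log(2 * np.pi))

# Hypothetical data: cheese amounts 1, 2, 3 and made-up gold amounts.
x = np.array([1.0, 2.0, 3.0])
f = np.array([0.5, 0.9, 0.7])

# Grid of candidate length-scales (theta) and a flat prior over that grid (assumption).
thetas = np.linspace(0.2, 5.0, 50)
log_prior = np.zeros_like(thetas)

# Unnormalized log posterior: log likelihood + log prior.
log_post = np.array([log_likelihood(x, f, t) for t in thetas]) + log_prior

# Normalize over the grid; subtracting the max avoids overflow in exp.
post = np.exp(log_post - log_post.max())
post /= post.sum()

best_theta = thetas[np.argmax(post)]
print("most probable length-scale on the grid:", best_theta)
```

With more observations in x and f, the posterior over theta typically becomes more concentrated, which is one way to read "the parameters get updated as you get more data".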
FWIW, I recommend chapter 14 of Richard McElreath's Statistical Rethinking for a better introduction to GPs. This article kind of glosses over a lot of the intuition and introductory concepts that you need to really grok them.