by versteegen
1024 days ago
Note that "drawing samples from P(x)" means having training data drawn from P(x). You can form the 'empirical' probability distribution P'(x) from your n training samples {x_i}, with P'(x_i) = 1/n (assuming the samples are all distinct) and P'(x) = 0 for all other x. Then finding the θ which minimizes KL(P'(x) ∥ Q(x|θ)) is equivalent to finding the maximum likelihood estimate (MLE) given your training data, because that KL divergence is just minus the average log-likelihood of the samples plus a term that doesn't depend on θ. (Note: I don't know what's meant by "the min/max of some probability distribution P(x)" and suggest ignoring that.)
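To make the equivalence concrete, here is a rough sketch rather than anything definitive. It assumes Q(x|θ) is a Bernoulli distribution with parameter θ and uses a made-up list of ten coin flips; the function names and the grid search are illustrative choices of mine, not from the discussion above. Because the samples repeat, the empirical distribution puts mass count/n on each value instead of 1/n.

```python
import math
from collections import Counter

def empirical(samples):
    # Empirical distribution P'(x): fraction of samples equal to x.
    n = len(samples)
    return {x: c / n for x, c in Counter(samples).items()}

def bernoulli_pmf(x, theta):
    # Assumed model Q(x | theta): P(x = 1) = theta, P(x = 0) = 1 - theta.
    return theta if x == 1 else 1.0 - theta

def kl_to_model(p_emp, theta):
    # KL(P' || Q(.|theta)), summed over the support of P'.
    return sum(p * math.log(p / bernoulli_pmf(x, theta))
               for x, p in p_emp.items())

def avg_log_likelihood(samples, theta):
    # Average log-likelihood of the samples under Q(.|theta).
    return sum(math.log(bernoulli_pmf(x, theta)) for x in samples) / len(samples)

# Hypothetical training data: 7 ones and 3 zeros.
samples = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]
p_emp = empirical(samples)

# Crude grid search over theta in (0, 1); a real fit would use an optimizer.
grid = [i / 100 for i in range(1, 100)]
theta_kl = min(grid, key=lambda t: kl_to_model(p_emp, t))
theta_mle = max(grid, key=lambda t: avg_log_likelihood(samples, t))

print(theta_kl, theta_mle)
```

Both searches land on the same θ, which here is the fraction of ones in the data. The sum kl_to_model(p_emp, t) + avg_log_likelihood(samples, t) is the same for every t (it equals minus the entropy of P'), which is why minimizing one is the same as maximizing the other.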
Just writing hand-wavily :)