Hacker News
by ww520 2184 days ago
The main parameters in ML are the θ (theta) values, as in Y = θ0 + θ1 X1 + θ2 X2 + ..., which are learned from the training data. (X1, X2, ...) are the features. The main goal of ML is to determine these theta parameters for a model so that you can use them to predict results on new data.
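
To make "learning theta" concrete, here is a rough sketch for the one-feature case Y = θ0 + θ1 X1, using ordinary least squares. The function names (fit_line, predict) and the toy data are made up for illustration, and least squares is just one common way to learn theta.

```python
def fit_line(xs, ys):
    """Learn theta0 and theta1 for Y = theta0 + theta1 * X by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of X and Y, and variance of X (both left unnormalized).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    theta1 = cov / var
    theta0 = mean_y - theta1 * mean_x
    return theta0, theta1


def predict(theta0, theta1, x):
    """Use the learned parameters on a new data point."""
    return theta0 + theta1 * x


# Toy training data generated from Y = 1 + 2 * X.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

theta0, theta1 = fit_line(xs, ys)
print(theta0, theta1)               # 1.0 2.0
print(predict(theta0, theta1, 5.0))  # 11.0
```

The training data is only used to get theta0 and theta1. After that, prediction on new data needs nothing but those two numbers.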

Hyperparameters in ML are the tuning parameters that control the shape and structure of the model, such as the number of features in the linear regression above, the number of layers in a NN, or the number of neurons in each layer. I think basically any tuning parameter besides theta can be considered a hyperparameter. The difference is that the theta parameters are learned from the data while the hyperparameters are chosen by a human. But you can also run experiments with different hyperparameter values and compare the outcomes, so in a sense hyperparameters can be learned too.
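
A small sketch of "run experiments on different tuning parameters and compare the outcomes", using the number of features as the hyperparameter. The helper names (fit, mse), the toy numbers, and the choice of a held-out validation set scored by mean squared error are all assumptions for illustration.

```python
def fit(xs, ys, n_features):
    """Learn theta by least squares. n_features (0 or 1) is the hyperparameter."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    if n_features == 0:
        # No features: the model is just Y = theta0.
        return (mean_y, 0.0)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    theta1 = cov / var
    return (mean_y - theta1 * mean_x, theta1)


def mse(theta, xs, ys):
    """Mean squared error of the model Y = theta0 + theta1 * X."""
    theta0, theta1 = theta
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Toy data: theta is learned on the training set, and each
# hyperparameter value is judged on the separate validation set.
train_x, train_y = [1.0, 2.0, 3.0, 4.0], [3.1, 4.9, 7.2, 8.8]
valid_x, valid_y = [5.0, 6.0], [11.0, 13.1]

results = {}
for n_features in (0, 1):
    theta = fit(train_x, train_y, n_features)
    results[n_features] = mse(theta, valid_x, valid_y)

best = min(results, key=results.get)
print(results)
print("best n_features:", best)
```

The inner step (fit) learns theta. The outer loop is the experiment over the hyperparameter, and a human still decides which values go into that loop.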

Convolution, well, the article is trying to explain it. It's like sweeping a small filter across an image and combining the pixels under it at each position. E.g. blurring an image by averaging neighboring pixels. The main purpose is to find higher-level features of the image. E.g. applying a filter to find the edges of an object in the image.
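
Here is a rough sketch of the edge-finding idea on a tiny grayscale image. The function name convolve2d, the sample image, and the particular 3x3 vertical-edge filter are my own picks for illustration. Like most deep learning libraries, the sketch does not flip the filter, and it uses no padding.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image and take a weighted sum at each position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for a in range(kh):
                for b in range(kw):
                    total += image[i + a][j + b] * kernel[a][b]
            row.append(total)
        out.append(row)
    return out


# Dark on the left, bright on the right: one vertical edge in the middle.
image = [
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
    [0, 0, 0, 9, 9, 9],
]

# Responds to left-to-right changes in brightness.
edge_kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

edges = convolve2d(image, edge_kernel)
print(edges)  # [[0, 27, 27, 0]]
```

The output is zero where the image is flat and large where the filter straddles the edge, which is the "high level feature" here.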

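A kernel is a small NxN matrix of weights (3x3, 4x4, 16x16, etc.) used as the filter that converts the pixels of an image into a higher-level feature. E.g. a mean-color kernel takes a 4x4 block of pixels and computes the average of their colors. Now apply the mean-color kernel over all the 4x4 blocks of an image and you've got one convolution. (Strictly speaking, a convolution usually slides the kernel one pixel at a time; stepping block by block is the strided case.)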
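
A minimal sketch of the mean kernel applied block by block. To keep the example short it assumes a grayscale image (one number per pixel instead of a color) and 2x2 blocks rather than 4x4. The function name block_mean and the sample image are hypothetical.

```python
def block_mean(image, size):
    """Average each non-overlapping size x size block of the image."""
    out = []
    for i in range(0, len(image) - size + 1, size):
        row = []
        for j in range(0, len(image[0]) - size + 1, size):
            total = sum(
                image[i + a][j + b]
                for a in range(size)
                for b in range(size)
            )
            row.append(total / (size * size))
        out.append(row)
    return out


image = [
    [1, 3, 10, 10],
    [5, 7, 10, 10],
    [0, 0, 2, 4],
    [0, 0, 6, 8],
]

small = block_mean(image, 2)
print(small)  # [[4.0, 10.0], [0.0, 5.0]]
```

Each output value summarizes one block, so the result is a smaller, blurrier version of the input. Calling block_mean(image, 4) on a larger image would give the 4x4 version described above.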

1 comments

Thanks, very helpful.