by alimw
2356 days ago
Please try this exercise and report back :) Suppose that, when setting up your parameter space, you found yourself working with ξ and η instead of x and y, where the relationship is simply (ξ, η) = A (x, y) for an invertible linear map A (a 2×2 matrix). This could easily happen in practice. Is gradient descent in (ξ, η) the same procedure as gradient descent in (x, y)? What should we make of any difference?
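One way to try the exercise numerically is sketched below. Everything in it is an assumption made for illustration and not part of the original comment: the objective f(x, y) = x² + 10y², the two example matrices, the step size, and all function names. It takes one gradient step directly in (x, y), then one step in (ξ, η) using the chain rule (the gradient with respect to (ξ, η) is A⁻ᵀ ∇f), and maps the result back to (x, y) so the two can be compared.

```python
# Minimal sketch: compare one gradient-descent step taken in (x, y)
# with one taken in (xi, eta) = A (x, y) and mapped back to (x, y).
# The objective and the matrices are illustrative assumptions.

def grad_f(p):
    """Gradient of the assumed objective f(x, y) = x**2 + 10*y**2."""
    x, y = p
    return (2.0 * x, 20.0 * y)

def matvec(M, v):
    """Multiply a 2x2 matrix (tuple of rows) by a 2-vector."""
    return (M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1])

def transpose(M):
    return ((M[0][0], M[1][0]), (M[0][1], M[1][1]))

def inverse(M):
    """Inverse of a 2x2 matrix; assumes the determinant is nonzero."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def step_xy(p, lr):
    """One plain gradient-descent step in the (x, y) coordinates."""
    g = grad_f(p)
    return (p[0] - lr * g[0], p[1] - lr * g[1])

def step_xi_eta(p, A, lr):
    """One gradient-descent step in (xi, eta), reported in (x, y)."""
    A_inv = inverse(A)
    q = matvec(A, p)  # current point expressed as (xi, eta)
    # Chain rule: for g(q) = f(A^{-1} q), grad g = A^{-T} grad f.
    g_q = matvec(transpose(A_inv), grad_f(p))
    q_new = (q[0] - lr * g_q[0], q[1] - lr * g_q[1])
    return matvec(A_inv, q_new)  # map the updated point back to (x, y)

start = (1.0, 1.0)
lr = 0.01
rotation = ((0.0, -1.0), (1.0, 0.0))  # orthogonal example
stretch = ((2.0, 0.0), (0.0, 1.0))    # non-orthogonal example

print("step in (x, y):          ", step_xy(start, lr))
print("via rotated coordinates: ", step_xi_eta(start, rotation, lr))
print("via stretched coordinates:", step_xi_eta(start, stretch, lr))
```

Comparing the three printed points for a few different choices of A (rotations, rescalings, shears) is a reasonable starting point for answering the two questions above.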