by rohitarondekar
5020 days ago
You mean set x0 to 1, right? People who do linear regression at work don't add a x0 feature? During the lecture the professor only said that adding x0 = 1 for all m samples is a convention that helps simplify the computation. Unless I missed something during the lecture, that's the only explanation that was given.
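To make the "simplifies the computation" point concrete, here is a minimal sketch, assuming NumPy and toy data I made up (y = 2 + 3x). With a column of ones prepended, the intercept is just theta[0] and the whole hypothesis is a single matrix product:

```python
import numpy as np

# Toy data generated from y = 2 + 3*x (values chosen only for illustration).
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 + 3.0 * x

# Prepend x0 = 1 to every sample so the intercept becomes theta[0].
X = np.column_stack([np.ones_like(x), x])

# Least-squares fit; no separate handling of the intercept is needed,
# because the hypothesis is simply X @ theta.
theta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(theta)  # intercept first, then slope
```

Without the x0 column you would have to carry the intercept around as a separate term in every formula.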
> People who do linear regression at work don't add a x0 feature?
Sometimes they do that; sometimes the data already has a subset of features known to sum to 1 (e.g., if you have binary variables that encode "one of n choices", exactly one of which must be set). In that case adding x0=1 makes things worse (from a numerical perspective) for many algorithms, because the x0 column is then an exact linear combination of columns that are already there.
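A quick way to see the problem, as a sketch assuming NumPy and a made-up 3-way choice: the one-hot columns already sum to 1 in every row, so adding x0 gives a matrix with 4 columns but only rank 3.

```python
import numpy as np

# Hypothetical categorical feature with 3 possible values, 6 samples.
choice = np.array([0, 1, 2, 0, 1, 2])

# One-hot encoding: exactly one column is 1 in each row, so rows sum to 1.
onehot = np.eye(3)[choice]

# Adding x0 = 1 duplicates information: x0 equals the sum of the one-hot columns.
X_with_x0 = np.column_stack([np.ones(len(choice)), onehot])

rank_onehot = np.linalg.matrix_rank(onehot)
rank_with_x0 = np.linalg.matrix_rank(X_with_x0)
print(onehot.shape[1], rank_onehot)      # columns vs. rank without x0
print(X_with_x0.shape[1], rank_with_x0)  # columns vs. rank with x0
```

A rank-deficient design matrix means X^T X is singular, so the plain normal equations have no unique solution.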
Regardless, I've always seen regularization theory stated with lambda*identity matrices.
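For reference, the lambda*identity form I have in mind is the usual ridge closed form, theta = (X^T X + lambda*I)^-1 X^T y. A minimal sketch, assuming NumPy; `ridge_fit` and the toy data are my own names and numbers, not from the lecture:

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge: solve (X^T X + lam * I) theta = X^T y."""
    n_features = X.shape[1]
    A = X.T @ X + lam * np.eye(n_features)
    return np.linalg.solve(A, X.T @ y)

# x0 = 1 plus a one-hot pair: the columns are linearly dependent,
# so X^T X on its own is singular.
X = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0]])
y = np.array([1.0, 3.0, 1.0, 3.0])

# A small lambda (arbitrary here) makes the system solvable anyway.
theta = ridge_fit(X, y, lam=1e-3)
print(X @ theta)  # fitted values, close to y
```

Note that with a full identity matrix the penalty applies to every coefficient, including the one for x0.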