Hacker News
by kachnuv_ocasek 2457 days ago
In short, ANOVA is usually what you want to do: https://en.wikipedia.org/wiki/One-way_analysis_of_variance

In practice, if you have n countries, you'll add n-1 binary variables to your regression equation. The first country is the reference level (all zeros); for the second country, set the first new variable to one and the rest to zero, and so on for the remaining countries.
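
A rough sketch of that encoding in plain Python, in case it helps. The function name `dummy_encode` and the sample country codes are made up for illustration; the only assumption is that the "first country" means the first one that appears in the data, unless you pass a different reference level.

```python
def dummy_encode(countries, reference=None):
    """Encode a categorical column as n-1 binary indicator columns.

    The reference level gets all zeros; every other level gets a single 1
    in its own column. (Hypothetical helper, for illustration only.)
    """
    # Distinct levels in order of first appearance.
    levels = list(dict.fromkeys(countries))
    if reference is None:
        reference = levels[0]
    if reference not in levels:
        raise ValueError(f"unknown reference level: {reference!r}")

    # One column per non-reference level, so n levels give n-1 columns.
    columns = [level for level in levels if level != reference]
    rows = [[1 if c == level else 0 for level in columns] for c in countries]
    return columns, rows


countries = ["DE", "FR", "US", "FR", "DE"]  # made-up sample data
columns, rows = dummy_encode(countries)

print(columns)
for country, row in zip(countries, rows):
    print(country, row)
```

Here "DE" is the reference level, so its rows are all zeros, and the two remaining countries each get their own indicator column.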

1 comment

So one-hot encoding, plus one "none-hot" base case. Why not just one-hot for all? To save one input?