Hacker News new | ask | show | jobs
by liquorice 1399 days ago
They are probably not silly mistakes. Label encoding can be very useful for tree based models when the categories are ordinal, or when there are a high amount of categories.
1 comments

They are most of the times. You get a prediction with a meaningless float (unless the categories are ordinal, which isn’t so common), and categories can change their assigned number (happens in lots of analyses) at every run since they’re not properly sorted. Crawl a few notebooks, I spotted that error quite often.