Hacker News
by brudgers 3978 days ago
So if you have a free-form input, you're going to need to figure out how to analyze that data to produce the groupings that you really want.

That suggests the root of the problem. Presumably the model starts by identifying a particular feature of the world as worth measuring [e.g. gender]. But a boxed version of gender [e.g. masculine | feminine] is not a feature of the world, and the boxing means the model no longer corresponds to the world with regard to gender, even though that correspondence was the whole purpose of capturing gender in the model. The idea of "getting the groupings I want" means my methods are scientifically suspect: I am imposing my categories on the data rather than deriving them from it. The objective truths are in the data, not in my interpretation.
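
To make that concrete, here is a rough Python sketch of one way to keep the two things apart: the free-form answers are stored exactly as entered, and any grouping is a separate, revisable step applied only at analysis time. The sample answers, the `ANALYST_GROUPS` mapping and the function names are all made up for illustration; they are not from any real survey or library.

```python
from collections import Counter

# Hypothetical free-form answers, stored exactly as the respondents typed them.
raw_responses = ["Female", "male", "non-binary", "F", "Male ", "genderqueer", "female"]

# Hypothetical analyst-chosen mapping. It is an interpretation, not data,
# so it lives apart from the stored answers and can be revised at any time.
ANALYST_GROUPS = {"f": "female", "female": "female", "male": "male"}


def normalize(answer: str) -> str:
    """Trim surrounding whitespace and lowercase; nothing else is altered."""
    return answer.strip().lower()


def group(answer: str) -> str:
    """Apply the analyst's mapping; unmapped answers keep their own label."""
    key = normalize(answer)
    return ANALYST_GROUPS.get(key, key)


# Groupings are computed on demand; raw_responses is never overwritten.
counts = Counter(group(a) for a in raw_responses)
print(counts)
```

The point of the design is only that changing the mapping changes the report, not the record, so answers that fit none of my boxes are still there to be counted.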