Hacker News
by zulban 395 days ago
Neat project. How do you deal with idealism versus reality? For example, if we ask an LLM to write a "realistic short story about a CEO", we do not necessarily want the CEO to be a man or a woman with 50/50 probability, because that doesn't reflect reality. So we can go with idealism (50/50) or reality (most CEOs are men, so the story usually has a male CEO). It seems to me that a benchmark like this needs an official, declared position. Is it an idealistic or a realistic benchmark?

In this particular case, 50-50. This is an issue with many bias methodologies; my goal was to sidestep it by formulating the probes in a way where 50-50 is a reasonable expectation. For example, if you ask the model who is more likely to be a CEO, "men" is a completely adequate answer. But if you are using the model for creative writing, maybe you don't want the real-life gender distribution. The probe just measures how skewed the distribution is, and it is ultimately up to the user to decide whether they care about the skew. Different people might have different use cases for the model, and some harms might be irrelevant to them, or they might even be happy that they are there.
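
To make "how skewed the distribution is" concrete, here is a minimal sketch. It is not the benchmark's actual metric: the pronoun-counting heuristic, the function names `male_share` and `skew`, and the `reference` parameter are all assumptions made up for illustration. The idea is that the measurement stays the same, and only the reference point changes depending on whether you take the idealistic (0.5) or a realistic position.

```python
from collections import Counter

# Hypothetical heuristic: infer the character's gender from pronouns only.
MALE = {"he", "him", "his"}
FEMALE = {"she", "her", "hers"}


def male_share(texts):
    """Fraction of gendered pronouns that are male across all texts.

    Returns None when no gendered pronoun is found at all.
    """
    counts = Counter()
    for text in texts:
        for raw in text.lower().split():
            word = raw.strip(".,;:!?\"'()")
            if word in MALE:
                counts["male"] += 1
            elif word in FEMALE:
                counts["female"] += 1
    total = counts["male"] + counts["female"]
    return counts["male"] / total if total else None


def skew(share, reference=0.5):
    """Signed distance of the observed share from the chosen reference.

    reference=0.5 is the idealistic baseline; a realistic baseline would
    pass whatever real-world share the user considers appropriate.
    """
    return share - reference


# Toy generations, invented for this example only.
stories = [
    "He opened the meeting. His team listened.",
    "She signed the deal.",
]
share = male_share(stories)
print(share, skew(share))
```

Whether a given skew counts as a problem is then a decision made outside the measurement, which matches the point that the user decides if they care.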

What makes this particular harm interesting is that the probe measures how strongly the model associates occupations with genders. This might then be very important in use cases related to HR.

Each probe's metrics are defined in the documentation to some extent, although you are right that formulating the ethical framework more explicitly might be helpful.