Hacker News new | ask | show | jobs
by ajennings 3221 days ago
Good question.

Usually we consider "aggregation functions" with a fixed number of graders, N. It has been proven that if you want an aggregation function that is:

- anonymous: all graders treated equally

- unanimous: if all graders give the same grade, then that must be the output grade

- strategy-proof: a grader who submitted a grade higher (lower) than the output grade, if given the chance to change their grade, could do nothing to raise (lower) the output grade

- strictly monotone: if all graders raise (lower) their grade, then the output grade must rise (fall)

then your aggregation function must be an "order statistic": the median (if N is odd) or some other function which always chooses the Mth highest input grade.

If you relax the last criterion to:

- weakly monotone: if all graders raise (lower) their grade, then the output grade must rise (fall) or stay the same

then your aggregation function must be "the median of the input data and N-1 fixed values". As an example of this last type of function, let's take @panic's idea that each grader has an honest evaluation between 0 and 100 but has an agent that submits a fake grade (0 or 100 usually) to pull the average toward their honest evaluation.

As I say in a descendant comment, this system will converge to the unique number, X, such that X percent of the graders want the final grade to be X or above. You noted that this whole system (the average and the agents) is strategy-proof, so each grader should be honest with their agent. We might as well pull the agents into the system and say, "submit your honest evaluation and we'll calculate X, the unique number such that X percent of the graders want the final grade to be X or above."

This is an aggregation function. It is anonymous, unanimous, strategy-proof, and weakly monotone. I call it the "linear median" in my PhD thesis. Rob LeGrand called it "AAR DSV" in his thesis. We've been calling it the "chiastic median" more recently. It has some interesting properties. Considered in the context of "the median of the input data and N-1 fixed values", with 100 graders, the 99 fixed values are 1,2,...,99, and this function always returns the median of the input data with these 99 fixed values. (No matter how many graders there are, the fixed values will equally divide the number line between 0 and 100.)

You can see chapters 5-8 of my PhD dissertation for more info: http://ajennings.net/dissertation.pdf

Now, you're thinking about how the grade changes when a new vote is added, so we're really talking about a family of aggregation functions, one for each possible number of graders. We want each one to be strategy-proof in itself, but we also need to consider how they relate to each other.

Do you want strict monotonicity or weak? (I find strict monotonicity too restrictive, myself.) If you say "strict", then for each N you need to choose which order statistic you want. If you say "weak", then for each N you need to choose N-1 fixed values and you'll always take the median of the input data and the appropriate array of fixed values.

In my thesis (section 7.2) I talk about how you can create a "grading function" to unify a family of aggregation functions, but I don't think that's a perfect fit since we want to somehow "punish" subjects that don't have very many grades (that's what the OP is about). Do we want to pull them towards 0, or pull them towards some global neutral value (like 3 out of 5)?