It won't be very random, though. E.g. Apple keyboards will almost always be on devices with Apple's hardware & stack, cheap keyboards will tend to be paired with cheap computers/monitors, and expensive keyboards the opposite.
It'll still be interesting data (and I'll dump as many of my systems in as I can), but it will take a lot more than treating differences as noise to answer those kinds of questions in a meaningful way.
That is indeed an annoyance. Last time I looked at the data, the "Apple keyboard" and "Apple computer" submissions overlapped so much that I couldn't include them as separate terms in the model.
Similarly, I can't separate the effects of programmable and split keyboards because those two almost perfectly overlap in the data.
I hope getting more submissions will help at least a little bit with this.
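One quick way to see how badly two yes/no attributes overlap is to cross-tabulate them and compute the phi coefficient, which reaches +1 or -1 when one attribute fully determines the other. This is only a sketch: the field names `programmable` and `split` and the sample rows are made up for illustration and are not the real submission data.

```python
from collections import Counter

def crosstab(rows, a, b):
    """Count how often each (a, b) value pair occurs."""
    return Counter((row[a], row[b]) for row in rows)

def phi(rows, a, b):
    """Phi coefficient between two boolean columns.

    Values near +1 or -1 mean the two columns are almost perfectly
    confounded, so a model cannot tell their effects apart.
    """
    n = crosstab(rows, a, b)
    n11, n10 = n[(True, True)], n[(True, False)]
    n01, n00 = n[(False, True)], n[(False, False)]
    denom = ((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)) ** 0.5
    return (n11 * n00 - n10 * n01) / denom if denom else float("nan")

# Hypothetical submissions, not real data.
submissions = (
    [{"programmable": True, "split": True}] * 9
    + [{"programmable": True, "split": False}] * 1
    + [{"programmable": False, "split": False}] * 10
)

print(crosstab(submissions, "programmable", "split"))
print(round(phi(submissions, "programmable", "split"), 3))
```

More submissions only help if they land in the sparse cells (here, split but not programmable, or programmable but not split); more of the same combinations leave the coefficient where it is.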
It might help to pose the problem as a regression analysis with categorical independent variables, perhaps using a generalized linear model to account for the fact that the predicted variable is always positive.
Alternatively, if you have enough data, see if an orthogonal array design is feasible. It will not be very kosher because you would be selecting as opposed to assigning.
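A rough feasibility check for the orthogonal array idea: in a strength-2 array, every pair of factors must show every combination of levels equally often, so any pair with an empty cell rules out a balanced selection over those two factors. The sketch below only tests that necessary condition, not sufficiency. The factor names and rows are hypothetical.

```python
from collections import Counter
from itertools import combinations, product

def pairwise_min_counts(rows, factors):
    """For each pair of factors, return the smallest cell count
    over all combinations of their observed levels."""
    levels = {f: sorted({r[f] for r in rows}) for f in factors}
    result = {}
    for a, b in combinations(factors, 2):
        counts = Counter((r[a], r[b]) for r in rows)
        result[(a, b)] = min(counts[cell] for cell in product(levels[a], levels[b]))
    return result

def make(apple_kb, apple_pc, wireless, n):
    """Build n identical hypothetical submissions."""
    return [{"apple_kb": apple_kb, "apple_pc": apple_pc, "wireless": wireless}] * n

# Hypothetical submissions, not real data.
rows = (
    make(True, True, False, 3)
    + make(True, True, True, 2)
    + make(False, False, False, 4)
    + make(False, False, True, 3)
    + make(False, True, True, 1)
)

for pair, smallest in pairwise_min_counts(rows, ["apple_kb", "apple_pc", "wireless"]).items():
    print(pair, smallest)
```

A minimum of 0 for a pair means some combination never occurs in the data, so those two factors cannot both go into a balanced subset. A positive minimum is an upper bound on how many runs per cell could be selected for that pair.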
My current best attempt is a mixed-effects quantile regression, which captures the influence of the fixed effects on the tenth percentile while accounting for dependence between trials from the same person. It is so compute-hungry with this (apparently relatively large) data set, though, that I'm looking into Bayesian methods for accomplishing similar things.
But now that you say it, I could annotate the data with a guess about OS and browser from the user agent string. Cool idea. Thanks!
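A minimal sketch of that kind of annotation, using plain substring matching on the user agent string. The function names and token lists are assumptions for illustration, not what the data collection actually does. User agent strings can be spoofed or partially frozen by browsers, so the result is a guess, not a measurement.

```python
def guess_os(ua):
    """Guess the operating system from a user agent string."""
    # Order matters: Android UAs contain "Linux", iOS UAs contain "Mac OS X".
    tokens = (
        ("Android", "Android"),
        ("iPhone", "iOS"),
        ("iPad", "iOS"),
        ("Windows", "Windows"),
        ("CrOS", "ChromeOS"),
        ("Mac OS X", "macOS"),
        ("Linux", "Linux"),
    )
    for token, name in tokens:
        if token in ua:
            return name
    return "unknown"

def guess_browser(ua):
    """Guess the browser from a user agent string."""
    # Order matters: Edge and Opera UAs contain "Chrome/",
    # and Chrome UAs contain "Safari/".
    tokens = (
        ("Edg/", "Edge"),
        ("OPR/", "Opera"),
        ("Firefox/", "Firefox"),
        ("Chrome/", "Chrome"),
        ("Safari/", "Safari"),
    )
    for token, name in tokens:
        if token in ua:
            return name
    return "unknown"

ua = ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
      "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36")
print(guess_os(ua), guess_browser(ua))
```

The two guesses can then go into the model as extra categorical columns alongside the keyboard attributes.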