by leeny 3394 days ago
Author here. I appreciate the notes and am happy to revisit and make corrections when needed. To respond to your points:

1. As a sanity check, I ran a t-test of technical ability vs. number of endorsements before publishing. There is no statistically significant relationship between the two. (P < 0.335)

2. What do you mean by "language matters here" (re the histogram)?

3 comments

After reading the way in which your work is being criticized (regardless of the validity of the criticisms), I'm very impressed to see you here making lemonade of it all. I'd be crying in the corner if it were me. Good work.
Do you mean that you fit a simple linear model, of the form below?

ability = b0 + b1*endorsements + error

And when you say t-test, are you saying you did a t-test for the parameter b1?

Usually, when people refer to a t-test without more information, they mean a test of the difference in means between two groups (or of one mean against a fixed value).

See, for example, the Wikipedia article on t-tests: https://en.m.wikipedia.org/wiki/Student's_t-test
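
For concreteness, here is a minimal sketch of what a t-test on b1 in that model would compute. The data and the names (`slope_t_statistic`, `endorsements`, `ability`) are made up for illustration; this is not the author's data or confirmed method. It uses only the standard library and stops at the t statistic and degrees of freedom, since turning those into a p-value needs a t-distribution CDF from a stats package.

```python
import math

def slope_t_statistic(x, y):
    """Fit y = b0 + b1*x by least squares and return (b1, t, df) for H0: b1 = 0."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx                      # slope estimate
    b0 = mean_y - b1 * mean_x           # intercept estimate
    # Residual sum of squares around the fitted line
    rss = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    se_b1 = math.sqrt(rss / (n - 2) / sxx)  # standard error of the slope
    return b1, b1 / se_b1, n - 2

# Hypothetical data: endorsement counts and interview scores for 8 people
endorsements = [0, 2, 5, 9, 14, 20, 31, 40]
ability = [2.5, 3.0, 2.0, 3.5, 2.5, 3.0, 3.5, 2.0]

b1, t, df = slope_t_statistic(endorsements, ability)
print(f"b1 = {b1:.4f}, t = {t:.4f}, df = {df}")
```

With a single predictor, this t is the same statistic you get from testing the Pearson correlation r, i.e. t = r * sqrt(n - 2) / sqrt(1 - r^2), which is why "tested the relationship" and "t-test on b1" can describe the same thing.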

> Do you mean that you fit a simple linear model, of the form below?

That would be the form of the best-fit line in the scatterplot. (And it would make sense to assume that the t-test is testing whether b1 != 0, as there is only one group.)

Edit: on second thought, you're probably right. I think I was too off the cuff in responding. Left original response below.

If by best fit you mean minimizing the sum of squared errors, that's fair.

But to be sure, if someone said t-test, and they only had one group, I would first guess they were doing a one-sample t-test.

Even with two dependent variables and one group, I would consider whether they did a dependent (paired) t-test.
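
To illustrate the ambiguity, here is a rough sketch of the dependent (paired) reading: it is just a one-sample t-test on the per-person differences against zero. The function name `paired_t` and the numbers are hypothetical, chosen only to show the arithmetic, and say nothing about what was actually run.

```python
import math
from statistics import mean, stdev

def paired_t(first, second):
    """Paired t-test statistic: one-sample t on the differences vs. 0.

    Returns (t, df). Both lists must hold one value per subject, in the same order.
    """
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    # mean difference divided by its standard error
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1

# Hypothetical paired measurements for three people
t, df = paired_t([2, 4, 6], [1, 2, 3])
print(f"t = {t:.4f}, df = {df}")
```

Note that this only makes sense when both measurements are on the same scale, which is another reason a bare "t-test" of ability vs. endorsement count is hard to interpret.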

I figured it was a simple linear model (in this case, a correlation) because they mentioned that they tested the relationship, and that makes sense. Still, it seems important to sanity check the use of the term t-test, which can be highly ambiguous (and which I have seen used in very surprising ways).

Hope it was helpful.

It's unclear what you t-tested here. Ideally, you would test for a difference between groups, e.g. "Is there a difference in the number of endorsements between people who got a 'yes' to advance to the next round and those who didn't?" As a follow-up, is there a difference between those whose preferred language was their most endorsed one and those for whom it wasn't?
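
A minimal sketch of that between-groups comparison, assuming you have endorsement counts split by interview outcome. The group data and the name `welch_t` are hypothetical. I've used Welch's unequal-variance version here as one reasonable choice, not necessarily what the author ran, and it returns only the statistic and degrees of freedom (a p-value would need a t-distribution CDF from a stats package).

```python
import math
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Welch's two-sample t statistic and Welch-Satterthwaite degrees of freedom."""
    # Squared standard error of each group mean
    va = variance(group_a) / len(group_a)
    vb = variance(group_b) / len(group_b)
    t = (mean(group_a) - mean(group_b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (
        va ** 2 / (len(group_a) - 1) + vb ** 2 / (len(group_b) - 1)
    )
    return t, df

# Hypothetical endorsement counts by interview outcome
advanced = [3, 12, 7, 25, 9, 14]
rejected = [5, 10, 8, 30, 2, 11, 6]

t, df = welch_t(advanced, rejected)
print(f"t = {t:.4f}, df = {df:.2f}")
```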

I'm a bit stunned that you didn't recognize Language as programming language...... :(

As an example, people probably get endorsed for SQL or CSS far more than their programming language of choice that is tested in an interview.

What do you mean re not recognizing language as a programming language? We only counted actual programming languages, i.e. not SQL, HTML, CSS, etc.