by kestreloats 2618 days ago
Somehow the idea that this is all caused by people applying to more places doesn't cut it for me. It might account for some of it, but not all of it. A lot of this is probably fairly chaotic and complex. For example, the GI Bill probably contributed directly to some extent, but it also probably led to cultural shifts because of the veteran population, which in turn changed career and educational expectations. And the GI Bill itself reflects WWII, which in turn reflects cataclysmic societal changes (that is, WWII caused many significant things, but was also caused by many significant things).

Speaking as someone who has done research on selection and standardized testing, and whose first-hand experiences support what I see from the theory and research, this is kind of horrifying, because it inevitably leads to increases in bullshit.

The accuracy of any of the selection methods we use in education is very poor. It just is. Standardized tests, "holistic review", all of it. It's not useless, but it is poor. It's the sort of thing that works well for a certain level of selectivity, but not past it. Once you go past it, you're selecting on well-impressioned noise, and incentivizing bullshit.

Somehow this all seems related in my mind to the rise of concerns about replicability in science, college admissions scams, and the age of fraud in general (https://www.theatlantic.com/entertainment/archive/2019/04/th...). Something is broken. Probably many things.


> The accuracy of any of the selection methods we use in education is very poor. It just is.

The SAT when combined with the high school GPA (HSGPA) has an adjusted correlation coefficient of 0.56 with first-year GPA, meaning the combined measurement accurately predicts how a potential college applicant will perform in their first year of college 56% of the time. [1]

That's actually pretty good. What other proposed metrics can say their signals match outcomes with 56% validity? How much you liked their essay?

Students with lower SAT scores have about a 63% first-year retention rate, whereas students with high SAT scores have about a 95% retention rate [2]. That is, high schoolers with poor SATs drop out of college roughly 37% of the time in their first year.

Standardized tests have many problems -- obviously -- but no one has developed a less unfair system.

When colleges abandon standardized tests what else are they relying on? Random signals made up by admissions officers? That's worse than job interviewing.

I have no problem criticizing standardized testing, but I feel everyone who does should be obligated to propose a better alternative method with a higher validity rate than 56%.

[1] https://files.eric.ed.gov/fulltext/ED563202.pdf

[2] https://files.eric.ed.gov/fulltext/ED563471.pdf

> The SAT when combined with the high school GPA (HSGPA) has an adjusted correlation correlation coefficient of 0.56 with first-year GPA, meaning the combined measurement accurately predicts how a potential college applicant will perform in their first year of college 56% of the time.

Thaaat's not what "correlation" means.

I'm summarizing for a general audience. I could say r is "the strength of the linear relationship between two variables on a graph", but I'm not sure that helps the average person understand the connection.

If you have a better description, it's more helpful to chime in with that instead of "You're wrong!"

A better summary would be that those two quantities explain about half of the variation, not that they predict accurately half the time.

If you took a random sample of cases, it's not that half of them would exhibit a direct relationship b/w SAT and first-year GPA and the other half none (unless the data is _super_ weird). Instead, SAT would be instructive-ish in predicting first-year GPA for all those cases.

Explaining half the variation, and the other half?

The point was to draw a connection for the general audience, not present the most scientifically accurate description of a relationship between two variables -- that's what the links to the research are for.

It's good to communicate for a general audience, but your presentation misleads rather than simplifies.

> meaning the combined measurement accurately predicts how a potential college applicant will perform in their first year of college 56% of the time.

"accurately predicts...56% of the time" implies that half of predictions are 'accurate', which most readers would interpret as 'correct' i.e. knowing SAT + HSGPA allows you to state FYGPA _exactly_ for about half of cases. That's not what the research you cited says. Rather, the square of the multiple correlation R (which is exactly R^2, the coefficient of determination) indicates how much of the variance in the output variable is explained by the input variables. That quantity _must_ be communicated in terms of the strength of the relationship, not accuracy for a given or share of cases as it doesn't tell us anything about a given case. One could say it tells us about 30% (0.56^2, correction from my statement above) of the information we'd need to know to perfectly predict the outcome, or that the relationship is better than random, but doesn't predict perfectly, or ...

Additionally, Table 5 of the link you cited indicates the adjusted correlation coefficient b/w FYGPA and the combination of HSGPA and SAT is 0.62. None of the numbers in that table are 0.56, so I'm not sure where you pulled that exact number from. I've used 0.56/56% above to be clear which quantity I'm referring to.
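
To make the r vs. R^2 point concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the data is simulated (standard normal "predictor" and "outcome" values constructed to have a correlation of about 0.56), not taken from the cited report, and `simulate` and `fit_line` are made-up helper names. It fits a straight line and compares the share of outcome variance left unexplained with 1 - r^2.

```python
import math
import random


def simulate(rho, n, seed=0):
    """Draw n (predictor, outcome) pairs of standard normals with correlation ~rho."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        noise = rng.gauss(0.0, 1.0)
        xs.append(x)
        ys.append(rho * x + math.sqrt(1.0 - rho ** 2) * noise)
    return xs, ys


def fit_line(xs, ys):
    """Ordinary least squares fit y ~ a + b*x; returns (a, b, r)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    r = sxy / math.sqrt(sxx * syy)
    return a, b, r


# Simulated stand-ins for (SAT + HSGPA composite, first-year GPA); not real data.
xs, ys = simulate(rho=0.56, n=100_000)
a, b, r = fit_line(xs, ys)
r_squared = r ** 2

# Share of the outcome's variance that the fitted line does NOT account for.
mean_y = sum(ys) / len(ys)
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
sst = sum((y - mean_y) ** 2 for y in ys)
unexplained = sse / sst

print(f"r = {r:.3f}, r^2 = {r_squared:.3f}, unexplained share = {unexplained:.3f}")
```

In this setup the unexplained share equals 1 - r^2, so a correlation near 0.56 leaves roughly two thirds of the variance unaccounted for. Nothing in the calculation corresponds to "right 56% of the time".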

That's not summarizing. "It's the strength of the relationship" is summarizing. "The combined measurement accurately predicts how a potential college applicant will perform in their first year of college 56% of the time" is just wrong. See Anscombe's quartet for a great example of why it's just plain wrong.

https://en.wikipedia.org/wiki/Anscombe%27s_quartet

And your completely scientifically accurate but easy for the lay reader to understand description in a few simple words is...?
Isn't that the example I used?

"It's the strength of the relationship"

I happen to like:

"It's how perfectly you can fit a straight line to them."

You can be mathematically accurate without being mathematically precise. Better imprecise but correct than incorrect but precise.

If you're trying to give a quantitative lay picture of what exactly a 0.56 linear correlation means, you still need to be quantitatively right, whereas the descriptions above are qualitative. Pictures and examples can help. "For perspective, 0.56 is about the correlation between <example> and <example>"

>The SAT when combined with the high school GPA (HSGPA) has an adjusted correlation correlation coefficient of 0.56 with first-year GPA, meaning the combined measurement accurately predicts how a potential college applicant will perform in their first year of college 56% of the time.

This is a totally incorrect interpretation of what correlation is.

Again, I'm summarizing for a general audience. If you have a better way to describe it that doesn't devolve into polynomials and linear relationships between variables on a graph, it's more helpful to do so than to just say, "You're totally incorrect!"
But a coin flip on a large number of people would trend toward 50% predictions over time.
And a correlation of 0%
Right, but what I'm trying to get at is that 56% isn't great because random is 50, and there's no clarity on the correlation of the measure that gets to 56.
Correlation is not probability. You can't compare them at all. Flipping a coin for each student would produce a correlation of 0, far lower than the correlation of 0.56 cited above. Have a look at some plots of data [1] with different correlation coefficients to see how dramatic it can be. Note the difference between r = 0.00 and r = 0.60. That's about what we're dealing with here.

[1] http://www.bwgriffin.com/gsu/courses/edur7130/images/twelve_...
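
A quick way to check the coin-flip claim is to simulate it. The sketch below is illustrative only: the outcomes are simulated standard normal values standing in for first-year GPA, and `pearson_r` is a hand-rolled helper, not anything from the cited research. It computes the correlation between coin-flip "predictions" and outcomes, then compares how often a coin and a predictor with r of about 0.56 call "above or below average" correctly.

```python
import math
import random


def pearson_r(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(1)
n = 100_000
rho = 0.56  # target correlation for the simulated predictor (assumption)

# Simulated predictor scores and outcomes, both roughly standard normal.
scores = [rng.gauss(0.0, 1.0) for _ in range(n)]
outcomes = [rho * s + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0) for s in scores]

# Coin flips ignore the student entirely: 1 means "predict above average".
coin = [rng.randint(0, 1) for _ in range(n)]

r_coin = pearson_r(coin, outcomes)
r_pred = pearson_r(scores, outcomes)

# Share of students for whom the above/below-average call matches the outcome.
hit_coin = sum((c == 1) == (y > 0) for c, y in zip(coin, outcomes)) / n
hit_pred = sum((s > 0) == (y > 0) for s, y in zip(scores, outcomes)) / n

print(f"coin:      r = {r_coin:.3f}, correct side of average = {hit_coin:.3f}")
print(f"predictor: r = {r_pred:.3f}, correct side of average = {hit_pred:.3f}")
```

The coin's correlation comes out near 0 even though it lands on the correct side of average about half the time. The simulated predictor with r near 0.56 gets the side right noticeably more often than 56%, which is another sign that the two numbers are not interchangeable.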

Which is worse than 56%
Yeah, but if randomness gets you 50, 56 doesn't feel that useful.
The SAT reliably correlates with IQ, and of any psychometric variable we are able to measure, IQ has the strongest correlation with long-term socioeconomic success. Insofar as it is useful to funnel smart people into college, using the SAT is a good way to filter them.
Do you have a source on the first claim?

I'm curious, as someone who was slightly involved with a documentary [1] exposing pitfalls of standardized testing. Generally the SAT only has shown a weak correlation between test scores and first year (some studies I read showed only first semester, but I don't have them on hand right now) collegiate performance [2].

[1] https://m.imdb.com/title/tt3393042/ [2] https://www.insidehighered.com/news/2016/01/26/new-research-...

>Generally the SAT only has shown a weak correlation between test scores and first year [...] collegiate performance [2].

That's not what your cite says. The InsideHigherEd article actually shows a strong correlation between SAT and grades, but Aguinis identified a minority % of schools where it didn't. Please carefully read the 3 bullet points again and notice the minority percentages.

Your qualifier of "Generally" in your comment is misrepresenting Aguinis' findings.

https://www.psychologicalscience.org/pdf/ps/Frey.pdf?origin=...

Correlation was .82 between the SAT and the Armed Services Vocational Aptitude Battery (ASVAB), which is designed as an IQ test.

https://pumpkinperson.com/2016/12/14/how-well-does-the-sat-c...

Correlation was 0.48 between the SAT and Raven's Progressive Matrices.

Why Taleb?

I hate that piece because most of what he's referring to has been dealt with for decades. It's like "narcissist who knows nothing about a field doesn't bother to learn anything about it and as a result takes down a strawman to make himself look good."

Taleb should stay in his lane. He sounds like a "get off my lawn" old douche when he talks about anything other than financial markets. His central tenet in that article is that IQ doesn't correlate with wealth and is therefore useless. How enlightening.
Why not just administer an IQ test then?
There is a Supreme Court ruling that IQ tests are assumed to unfairly disadvantage minority candidates. This was in the context of employment rather than college admissions, but I don't think anyone has been bold enough to test whether it applies.

Broadly speaking, IQ tests tend to include a lot of cultural assumptions and therefore members of the in-group will test higher than members of out-groups of the same aptitude. IQ tests are therefore treated as discriminatory (disproportionate impact) "by default." The administration of a particular test for a specific purpose can usually be OK'd by either showing that particular test is not discriminatory, or by showing that particular test has a measurable correlation to the specific purpose for which it's used. My understanding is that "specific purpose" in the employment context means the individual job description, not just hiring in general.

> There is a Supreme Court ruling that IQ tests are assumed to unfairly disadvantage minority candidates

I wonder if there's an "IQ test" that unfairly disadvantages non-minority candidates. It would be interesting to see the results if the same pool of people (minority and non-minority) take both tests.

The SAT presumably tests whether you have learned certain things in high school and so are ready for university. E.g. you won’t show up to your first university math class needing to learn all of high school math, and thus be unable to complete the class.
IQ is a lot like GDP. It doesn't measure what we need as accurately as we need. If you keep that in mind, though, it can remain a broadly useful metric.
IQ and the old SAT are highly correlated, which is why Mensa uses the SAT for admissions purposes.
If we predominantly cared about intelligence, then shouldn’t we stop beating around the bush and test for it directly?
So I agree with you about SAT <-> IQ, and IQ <-> SES. However...

IQ isn't the only correlate of success. I think if you look at it all, there's a large dose of conscientiousness (which standardized testing companies are now going after), and attractiveness/charisma...

... but that's on the individual side of things. There's also a whole host of societal and random stuff that is outside the control of the individual, or maybe is significant to a person, but that studies tend to treat as irrelevant.

Also, saying that the SAT is useful as a selection device doesn't mean it's the only useful selection device, or that as a selection device it's very good. Offhand I don't remember the numbers, but standardized test score probably correlates .45 or so with first-year college grades? Think about that for a second. First, that's a ton of noise. That's not very predictive at all. Second, that's first-year college grades. Change the criterion but it's still the same: your best tool really is a pretty fuzzy predictor.

So now take this very fuzzy predictor, add some other fuzzy predictors that at best get things up to like maybe .6 correlation? Still fuzzy. Now you're going to be really selective on these things? What you're going to end up with is a lot of people who would have done as well but for whom the stars didn't align right at a particular period in their life. But now we as a society have this crazy income inequality, rent-seeking monopolies of all forms, and a general winner-take-all climate, so these small meaningless differences get amplified tenfold.
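
The "being really selective on a fuzzy predictor" argument can be sketched as a simulation. All of it is assumed for illustration: a combined predictor with a 0.6 correlation to some later outcome, both roughly normal, and a 1% admit rate. None of these numbers come from real admissions data. The question it asks is: of the people admitted on the predictor, what share are also in the top 1% on the outcome?

```python
import math
import random

rng = random.Random(2)
rho = 0.6          # assumed correlation between combined predictor and outcome
n = 100_000        # simulated applicants
top_share = 0.01   # assumed admit rate: top 1% by predictor

# Each applicant gets a (predictor score, later outcome) pair.
people = []
for _ in range(n):
    score = rng.gauss(0.0, 1.0)
    outcome = rho * score + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
    people.append((score, outcome))

k = int(n * top_share)

# Indices of those admitted on the predictor vs. those who actually do best.
admitted = sorted(range(n), key=lambda i: people[i][0], reverse=True)[:k]
best = set(sorted(range(n), key=lambda i: people[i][1], reverse=True)[:k])

overlap = sum(1 for i in admitted if i in best) / k
print(f"share of admitted who are also top {top_share:.0%} on the outcome: {overlap:.2f}")
```

Under these assumptions the overlap is far above the 1% you would get by chance but well short of a majority, which fits the point above: most of the eventual top performers were not the ones the predictor picked.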

The conversations about this too have this kind of all-or-nothing quality, like you're forced to choose between "standardized tests are meaningless" or "standardized tests are valid predictors for a large portion of people so we need to treat them as infallible predictors for everyone." The truth is really much greyer than these positions: yes they predict, but they predict pretty weakly, all things considered, and generally for people who fit into a certain box. This all might be fine, except now we've structured our society in part around these oversimplified assumptions, pretending it works when it doesn't really. It might be ok when there are lots of second chances, lots of opportunities for people to get back on their feet from the vagaries of life, and good opportunities in general for everyone, but when resources get hoarded by fewer and fewer people, there's more noise, which is compounded by people gaming the system, etc. etc. etc.

Just as a thought exercise: what do most human traits look like in terms of distribution? They're pretty normal, pretty Gaussian. What does income look like? Not that, not at all. The discrepancy between them should be shocking to everyone.

>IQ isn't the only correlate of success.

No one is saying it is. Clearly there are other factors that may be much more important, but IQ is clearly a factor.

The difficult solution is increasing the accuracy and “trustability” of K-12 assessments. As is, high school educators are not assessing their students in a manner that allows for real judgements of their students’ knowledge/capabilities/creativity/etc. by an outside party. We spend immense amounts of money, time, and effort creating assessments and assessing students in school, but the results are only useful within the context of progressing within those specific courses.

Trustable assessments would go a long way to accurately rewarding merit.

Currently standardized testing better predicts future academic performance than years of GPA, which in a way is the sum of teachers’ views on a student.

But teachers still prefer teacher tests over standardized tests, and they shun conversation with their researching peers. The classroom is their kingdom and that’s how people prefer it.

“Teacher tests” often have different objectives than solely measuring one student’s ability compared to another. Likewise, the investments made in the creation of standardized tests are not likely to be matched by classroom teachers due to logistical constraints even if they have the necessary knowledge to create an equally valid assessment.

The objective should be the creation of a system that allows teachers to retain the flexibility to modify curriculum in a manner appropriate to their specific student population while also maintaining the trust and reliability benefits found in standardized tests.

Maybe it is our expectation for everything to be fair, straightforward and honest that is broken?

These expectations are the root of the current outrage culture. It's as if now that we have these wonderful technological advances we assume every problem should have been solved by now, and get outraged when we realize we don't live in a utopia.