by andrecarini 5 hours ago
My understanding is you're equating `failing a test` to `lacking the relevant skills and knowledge to do a certain task competently`.

The reality is sometimes tests in academia are just not very well made and don't really test what they are supposed to be testing, and that's usually due to multiple reasons like misaligned incentives, staffing shortages and maybe lack of resources / funding.

I don't think the comparison to flight school is relevant enough in this context because it's too different a world from traditional academia.


I don't buy the notion that tests do not test relevant skills.

In my long career I've noticed a strong correlation between SAT scores and academic performance as well as job performance.

> I don't think the comparison to flight school is relevant enough in this context because it's too different a world from traditional academia.

My dad kept his flight school tests for flying all sorts of airplanes. They bear a lot of similarities with the SATs. There's a lot of math in there for things like fuel consumption, wind, maximum landing weight, glide distance, and so on.

For example, one day he was cruising along in his F-86 when the engine failed. He radioed the tower, and they told him to bail out. But he calculated his speed, altitude, distance, wind, sink rate, air temperature, etc., and figured he could make the field after configuring the airplane for maximum glide. He made a perfect landing, but still got reprimanded for risking his life bringing the airplane back. But he had worked the math and disagreed that it was more risky to bring it in than bail out.

> I don't buy the notion that tests do not test relevant skills.

> In my long career I've noticed a strong correlation between SAT scores and academic performance as well as job performance.

A test doesn't need to test the relevant skills for that, it just needs to test _something_ that correlates with academic performance and job success.

Do you also think LLM leaderboards accurately reflect the capabilities of the models being tested? If you do, then I can easily point you to numerous academic papers pointing out the various flaws in many leaderboards (from poorly designed benchmarks like bAbI and the original SQuAD, to data contamination, and more).

In that same way, any test, including the SAT and GRE, has flaws. They can be gamed in ways similar to LLM leaderboards: test prep makes you better at them. That's one of the main reasons universities moved away from the SAT; they were afraid that it disenfranchised lower socioeconomic status students (and it does to some degree). The issue is that the test is positively correlated with success in an undergraduate program, so they threw out the baby with the bathwater. The real issue is that the SAT is not able to distinguish the capabilities among students to the degree it purports to.

And if you want an anecdote to match all yours, the first time I took a GRE practice test, I got a 3 on the writing. Not because I'm poor at writing, but because I didn't really know what they were looking for. After reading a test prep book, I got a 4.5 on my next practice test and a 5 on my final practice test. When I finally took the actual GRE, I got a 6 on the analytical writing. Trust me, nothing changed in my writing ability over that time. In fact, I didn't even practice the skill except through those three practice tests. Clearly the test was not capable of determining my real ability to make an argument; it merely tested my ability to adapt my writing to what was supposedly being tested.

Interestingly, the vast majority of universities that got rid of the GRE requirements for PhD programs are not going back on that. Turns out that the students with the highest GRE scores are the ones most likely to drop out of their STEM PhD. [1]

[1]: https://journals.plos.org/plosone/article?id=10.1371/journal...

I took the GREs, I don't recall a writing section.

Anyhow, the questions were all about freshman engineering knowledge.

There are three major parts of the modern GRE: Verbal, Quantitative, and Analytical Writing. You could easily look that up, or ask if you didn't know.

Responding off the cuff without any reflection on the comment you're responding to doesn't move the conversation forward in any meaningful way. It just comes across as disrespectful.

Do you think that LLM leaderboards don’t? Do you think a Llama 3 is going to beat an Opus 4.7 on any leaderboard?

The real issue is that standardized tests disenfranchise lower SES students less than any other metric.

Everyone who takes the SAT has to sit in the same room for the same amount of time answering the same questions. You can’t just pay someone else to take it for you (like essays) or select which difficulty level you take (like going to a prep school with grade inflation), or luck out in who your parents know (like recommendation letters).

Some may have better opportunities to learn the material, but, at the end of the day, you have to actually learn the material. There’s no getting around that.

As your own GRE anecdote shows: A little studying with some inexpensive books makes all the difference. Unless things have radically changed, a couple SAT or GRE test prep books are significantly less expensive than just one college textbook.

Bluntly, SATs are better correlated with college performance than other measures for the reasons I mentioned. They strip away most of the privilege of coming from a high-SES family.

> I don't buy the notion that tests do not test relevant skills. In my long career I've noticed a strong correlation between SAT scores and academic performance as well as job performance.

The SAT tests intelligence (aptitude), not skills. That is why it correlates with job performance, where intelligence can, over time, matter as much as or more than a starting point of relevant skills.

I just checked, and the SAT math section covers algebra, trigonometry and statistics.

Look at this list:

  Quadratic equations and functions (vertex form, roots, discriminant)
  Polynomial operations and factoring
  Exponential functions and growth/decay
  Radical and rational expressions
  Function notation, composite and inverse functions
  Nonlinear graphs and their transformations

A genius student who had never been taught those subjects wouldn't even know what the symbols meant. A mediocre student who had studied SAT-style questions for weeks leading up to the test would likely outperform a high IQ student who last solved those types of problems over a year prior.

Standardized tests can be a great resource for assessing students, but they're not just testing for intelligence. Test-prep courses increase SAT scores by about 200 points on average. That's not because they're increasing the intelligence of the people taking them.

Somebody who goes to take a test on something they know that they know nothing about could be called many things, but genius is not one, even more so when they're paying for the privilege of taking that test. What is on the SAT is no secret, so people are free to prepare as little or as much as they might like. If somebody can't be assed to prepare for such a critical test, then they're probably going to be the sort of person who can't be assed to do much of anything in life. And the internet has also rendered the inequities in access to training largely obsolete. You can get high-quality training materials on everything for free.

You've never met a smart high school student that didn't study? You've also never met someone who became more disciplined later in life?

I guess the question is: would you rather hire someone with poor SATs and god-tier Leetcoding skills or vice-versa?

Nothing's perfect, but the SAT tests do an adequate job.

I think you’re talking out of your ass.

SAT and ACT have been shown to be useful predictors of college success, beyond what grades alone would predict.

Back in the dark ages of the 80s when I was taking the SAT and ACT, these tests were considered good predictors for the first semester's performance.

That's it.

I did well on both tests and did well on my first semester. It's the semesters after where my performance tanked because I didn't have some of the work habits that solid B students had. (I will also be clear and say that at least some of the problem is attributable to the university and how it handled advisors. My advisor was completely useless and let me schedule for _way_ too much hard stuff.)

There was also a really good predictor of how one would do on the SAT or ACT: NoBitH. The Number of Bathrooms in the House.

Yup. The SAT and ACT at the time were better measures of economic advantage than innate intelligence. I have no reason to believe that this isn't still the case, especially since they're more entrenched in the system than ever.

Well, sure, but first semester performance is also a good predictor of second semester performance. And second semester of third, and so on.

But more to the point: If you do poorly in your first semester and drop out, then it doesn’t really matter if the SAT would have done a good job of predicting your second semester performance.