| HN Mirror


by PathOfEclipse 1293 days ago

> Except 1) those tests weren't designed for that purpose

I don't know if that's true. And even if it were, why would they need to be designed for that purpose to be successfully and correctly used for that purpose? In fact, at one time the federal government required this data to be provided by schools. However, the teachers' unions lobbied hard, and the 2015 "Every Student Succeeds Act" barred the government from requiring this data: https://www.edweek.org/policy-politics/essa-loosens-reins-on...

"But the teachers’ unions see an opening to change policies their members have broadly rejected. They are also far more powerful among state legislatures than in Congress."

"The American Federation of Teachers plans to bring its political clout to bear on the issue, too."

On the other hand, strong research exists to show that SGPs are a valid and useful measurement: https://pubs.aeaweb.org/doi/pdfplus/10.1257/aer.104.9.2593

"The main lesson of this study is that value-added models which control for a student’s prior-year test scores provide unbiased forecasts of teachers’ causal impacts on student achievement. Because the dispersion in teacher effects is substantial, this result implies that improvements in teacher quality can raise students’ test scores significantly."

And the follow-up study: https://pubs.aeaweb.org/doi/pdfplus/10.1257/aer.104.9.2633

"This paper has shown that the same VA measures are also an informative proxy for teachers’ long-term impacts."

> 2) they are a worse measure of student preparedness than GPAs

GPAs are highly subjective, and more importantly, harder to compare across schools and even across classes. By using standardized scores, for instance, one could track successfully that a teacher's performance remains consistent as he or she moves across schools. Remember, this was about measuring teacher performance, not student performance. That said, if GPAs really were better for teacher evaluation, there is nothing stopping you from measuring student GPA improvement instead of student standardized test score improvement, so I'm not sure what you're really arguing against at this point.
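
For readers unfamiliar with the idea, a growth measure that controls for prior-year scores can be sketched in toy form as follows. This is only an illustration under simplifying assumptions (one prior score, one linear fit, no student or classroom covariates); it is not the model from the linked papers and not any vendor's system. The function name `toy_value_added` and the data are made up.

```python
from collections import defaultdict
from statistics import mean

def toy_value_added(records):
    """records: list of (teacher, prior_score, current_score) tuples.

    Fits current = intercept + slope * prior by ordinary least squares
    over all students, then averages each teacher's residuals.
    """
    priors = [p for _, p, _ in records]
    currents = [c for _, _, c in records]
    p_bar, c_bar = mean(priors), mean(currents)
    slope = (sum((p - p_bar) * (c - c_bar) for p, c in zip(priors, currents))
             / sum((p - p_bar) ** 2 for p in priors))
    intercept = c_bar - slope * p_bar

    residuals = defaultdict(list)
    for teacher, prior, current in records:
        residuals[teacher].append(current - (intercept + slope * prior))
    return {t: mean(r) for t, r in residuals.items()}

# Hypothetical data: both teachers start with the same prior scores,
# but teacher "A" students end up 4 points higher than teacher "B" students.
records = [
    ("A", 50, 56), ("A", 60, 66), ("A", 70, 76),
    ("B", 50, 52), ("B", 60, 62), ("B", 70, 72),
]
scores = toy_value_added(records)
print(scores)
```

In this made-up data the toy score is positive for "A" and negative for "B". Real models differ in which controls they include and how they handle noise, which is where most of the disagreement lies.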

> 3) they only test those topics which are easy to test in a standardized setting.

Many important topics taught in secondary school are well-understood and amenable to standardized testing, including: math, reading comprehension, grammar, some aspects of science and history, etc.

> Since you think peer review is important, why do you point to non-peer-reviewed sources?

These books cite peer-reviewed sources and are a great starting point before digging further.

> Just looking at the authors shows that I expect them to have a pro-standardized testing viewpoint.

Everyone is biased. The NEA spends millions convincing people to drop standardized tests through their advocacy group, FairTest, which serves as one of their front organizations. Much of education academia is biased against standardized testing. Biases are everywhere, and were fairly obviously present in your sources that I checked. At some point, you have to pick a bias you trust more, and I trust the bias that says standardized tests are useful over the bias that says they should be entirely done away with.


eesmith 1293 days ago

> "This paper has shown that the same VA measures are also an informative proxy for teachers’ long-term impacts."

Ah, as I figured, you are promoting VAM. I already mentioned how it's a difficult tool to use. And there are well-known problems with using VAM which aren't mentioned in that paper, which you don't seem to be aware of.

For example, a Texas court threw out EVAAS as a way to evaluate Houston teachers because of due process concerns, like how teachers are unable to have their score independently re-evaluated. The judge also points out the "house-of-cards" nature of VAM, and the ongoing academic debate about its applicability. https://www.courthousenews.com/wp-content/uploads/2017/05/Ho...

The VAM opponent expert witness presented their main arguments. Quoting http://vamboozled.com/houston-lawsuit-update-with-summary-of...

1) Large-scale standardized tests have never been validated for this use.

2) When tested against another VAM system, EVAAS produced wildly different results.

3) EVAAS scores are highly volatile from one year to the next.

4) EVAAS overstates the precision of teachers' estimated impacts on growth

5) Teachers of English Language Learners (ELLs) and “highly mobile” students are substantially less likely to demonstrate added value

6) The number of students each teacher teaches (i.e., class size) also biases teachers’ value-added scores.

7) Ceiling effects are certainly an issue.

8) There are major validity issues with “artificial conflation.” (This is the phenomenon in which administrators feel forced to make their observation scores "align" with VAAS scores.)

9) Teaching-to-the-test is of perpetual concern.

10) HISD is not adequately monitoring the EVAAS system. HISD was not even allowed to see or test the secret VAM sauce.

11) EVAAS lacks transparency.

12) Related, teachers lack opportunities to verify their own scores.

Here's one paper analyzing the specific details of the EVAAS numbers SAS generated for Houston - https://www.researchgate.net/publication/341532272_Methodolo... , with citations of its own about various issues with VAM. More below (via Google Scholar 'EVAAS houston effective').

> consistent as he or she moves across schools

Here's another paper: https://www.redalyc.org/pdf/2750/275022797012.pdf . "Almost half (46%) of a sample of HISD teachers who moved to different grade levels reported switching value-added ranks after the move, from “ineffective” to “effective” or vice versa, also across grade levels that were adjacent".

If it's not consistent when moving grade levels, why do you think it's consistent moving across schools?

Is it because "Dr. William L. Sanders, the developer of the SAS® EVAAS®, claims that teachers who move from one environment to another, even if radically different, continue to do just as well (LeClaire, 2011)"?

> GPAs are highly subjective, and more importantly, harder to compare across schools and even across classes.

And yet are a better predictor of future academic success than test scores. As I highlighted.

It appears you prefer to use a worse predictor, one which requires an artificially imposed "high-stakes" testing environment, because it lets you do fancier types of data science that appeal to your sense that numbers are objective.

> strong research exists to show that SGPs are a valid and useful measurement

Remember earlier how you implied these methods were objective?

Odd that the paper you linked to says the other VAM methods didn't factor for a "drift in teacher quality".

Almost as if there's no agreement on what the model should be.

Almost as if the choice of model to use was also "highly subjective."

If they aren't subjective, then different VAM models should make the same predictions for the same population, right?

Points #2 and #3 above should be very rare, right?

And if they are not rare, they should not be used to determine who to fire, right?

> Remember, this was about measuring teacher performance, not student performance.

And VAM has not proved useful at measuring teacher performance, because of the flaws I quoted above.

I believe you approve of the idea of firing teachers with low VAM scores, which Houston and other school districts have done. Yet, quoting now from "All sizzle and no steak: Value-added model doesn’t add value in Houston" at https://journals.sagepub.com/doi/full/10.1177/00317217177341...

"while EVAAS was in use for educational reform purposes in Houston (i.e. to increase student achievement), Houston students saw no improvements of the sort that had been promised in grades 3-8 in reading, grades 4 and 7 in writing, grades 5 and 8 in science, and grade 8 in social studies (Figure 1, blue trend lines). In those subject areas and grades, tests scores declined overall from 2012 to 2015, as compared to other similar students throughout the state (black trend lines)."

Almost as if VAM-based firing isn't a useful tool.

> and amenable to standardized testing

Yes, that's exactly my point. You highlighted the areas which are easy to test.

Composition is not easy to test, and it's also important. Being able to write an essay on Populism in the late-1800s US is not easy to test (not impossible - the AP American History tests do this, but it's expensive). But this is also a skill taught in school. My school required students to take a practical art course. Yet testing for drafting skills, or woodworking, or auto repair, isn't included in the high-stakes testing.

Why does it just happen to be that only those things which are easy/cheap to test are coincidentally the right topics to test?

> Everyone is biased

Film at 11. I don't listen only to Philip Morris scientists to judge if smoking tobacco has health problems.

> I trust the bias that says standardized tests are useful

So far it doesn't seem like you are aware of the evidence that VAM is not an effective method for deciding if a teacher should be fired. That would easily explain your comments.


PathOfEclipse 1292 days ago

> Ah, as I figured, you are promoting VAM.

I am promoting student growth measures in teacher evaluations, which is related to VAM, but not necessarily the same thing depending on who you are talking to: https://www.michigan.gov/mde/services/ed-serv/educator-reten...

> If it's not consistent when moving grade levels, why do you think it's consistent moving across schools?

Some models have been shown to be consistent across both grade levels and schools.

> And yet are a better predictor of future academic success than test scores. As I highlighted

And as I highlighted, GPAs are too subjective and too much in the control of the teacher. If GPAs were used for teacher evaluation, a teacher could write his or her own paycheck via grade inflation. Standardized tests also act as a counterbalance in general to slow down grade inflation.

> Almost as if VAM-based firing isn't a useful tool

Neither is trusting the teachers' unions to decide who to fire, because they fire virtually no one and even protect known incompetent teachers. We need a better solution than what we have now, and SGPs/SGMs have been shown to be far more effective than what we have now.

> So far it doesn't seem like you are aware of the evidence that VAM is not an effective method for deciding if a teacher should be fired

There is also plenty of evidence for using student growth measures as part of teacher evaluations. You seem happy, however, to ignore the evidence I've given you that goes against your own bias. A little self-awareness might be in order.


eesmith 1292 days ago

> which is related to VAM,

I noticed that one of those two methods you linked to is the VAM method used in Houston, EVAAS.

> Some models have been shown to be consistent across both grade levels and schools.

So said the creator of EVAAS. However, the EVAAS method has not been published, the algorithm is proprietary, and the results of the Texas trial showed the limitations of EVAAS.

Your citations in support of your view are either published by people trying to sell you their testing system, or were published during the hype phase of VAM, before there was evidence that they were not able to do things like identify poorly performing teachers for the purposes of firing them.

> If GPAs were used for teacher evaluation a teacher

Like, duh. That's why GPAs aren't used for teacher evaluation.

You can't seriously think that 50 years ago there were no effective ways to evaluate teachers.

> Standardized tests also act as a counterbalance in general to slow down grade inflation.

And yet grades are still more effective at predicting college success than standardized tests.

Huh.

And we've had a full generation of students required to do high-stakes testing, yet the decades of yammering about grade inflation are still going on. Are the two really coupled at all?

Huh.

And, umm, standardized tests are normed. If students or teachers across the US were getting better, norming would hide that improvement.

We know this because of the Flynn effect. Unnormed IQ scores have improved by about 15 points over the decades. Shouldn't this be reflected in improved overall school grades?
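
The norming point can be illustrated with a small sketch. It assumes a test that is re-normed against each cohort (not all standardized tests are) and uses an IQ-style scale of mean 100 and standard deviation 15; the helper name `norm_referenced` and the scores are hypothetical.

```python
from statistics import mean, pstdev

def norm_referenced(raw_scores):
    """Rescale raw scores to mean 100, SD 15 within their own cohort."""
    mu = mean(raw_scores)
    sigma = pstdev(raw_scores)
    return [100 + 15 * (x - mu) / sigma for x in raw_scores]

# Hypothetical raw scores for an earlier cohort.
cohort_then = [40, 50, 55, 60, 70]
# A later cohort in which every student gets 10 more raw points.
cohort_now = [x + 10 for x in cohort_then]

raw_gain = mean(cohort_now) - mean(cohort_then)
normed_then = norm_referenced(cohort_then)
normed_now = norm_referenced(cohort_now)

print(raw_gain)                    # improvement visible in raw scores
print(normed_then == normed_now)   # but not after per-cohort norming
```

Under these assumptions the raw averages differ while the normed scores are identical, so an across-the-board improvement would not show up in scores normed this way.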

> Neither is trusting the teacher's unions to decide who to fire

We don't trust teacher unions to decide who to fire.

School districts decide.

Unions can slow it down, or provide legal support to stop it. They are there to help the teachers.

It's not like principals and school districts always follow the law and employment contract requirements. And never pressure teachers to give better grades to the football team, or the kid of the head of the school board.

> We need a better solution than what we have now

Non-union charter schools haven't proved any better.

We now have, what, a full generation of students that have gone through high-stakes testing?

When do we decide it's not worthwhile?

In my view, all this noise about testing and teacher evaluation is meant to justify school privatization, so private companies can profit from all that public school money, and rich people can get tax-payer subsidized good private school education while poor people are left with the dregs.

In my view, if you want to improve grades and future success then don't look to high-stakes testing. Look to free breakfasts and lunches for everyone, low limits on class size, more teaching aides, school nurses who can provide basic health care support, and more.

But those are expensive. So instead we punish the teachers who some secretive black box says are under-performing.

> using student growth measures as part of teacher evaluations

That statement alone is meaningless. It could mean anything from "huh, your XYZ scores are a bit low, so we'll provide additional training for you" to "your XYZ scores are a bit low, we're going to fire you."

So when you say "as part of teacher evaluations", you need to clarify just what you mean.

VAM has been used to fire teachers - which is what its supporters often want - and in violation of their due process rights.

VAM has not been used to provide more funding to lower-scoring schools to help with staffing or facility issues. Yet it could also be used for that.


PathOfEclipse 1291 days ago

> I noticed that one of those two methods you linked to is the VAM method used in Houston, EVAAS.

Yes, and the other method mentioned on the same page was student growth percentiles, which is much closer to what I have been advocating. You seem to have a habit of focusing on what you want, regardless of what the person you are talking to is actually saying.

> And yet grades are still more effective at predicting college success than standardized tests.

I'm not sure why you say this like it's a mic drop. The point of saying "standardized tests help anchor GPAs" is to say that, without standardized tests, GPAs would become a worse predictor of future success. So, even if GPA alone is better, standardized tests help it be that way. More importantly, GPA+standardized test is better than either alone. You keep conveniently forgetting that last part when you advocate for the removal of standardized testing.

> And, umm, standardized tests are normed.

Only some of them are. Otherwise, you couldn't make claims like the following: https://www.usnews.com/news/education-news/articles/2019-10-...

The tests in my state are not normalized but are instead standards-based: https://www.texasassessment.gov/en/staar-about.

> We don't trust teacher unions to decide who to fire. School districts decide.

Now you're just getting into semantic games. Unions help define the regulations that make it impossible to fire teachers, and they also use their money, influence, and legal might to fight against many firings.

https://www.americanexperiment.org/teachers-agree-teachers-u...

https://www.nationalreview.com/corner/ineffective-public-sch...

"They argue that existing public schools are not beyond saving. They suggest that reformers “commit to taking the tenure process seriously, rather than rubber stamping every eligible teacher for approval” and explain how this has been done in some New York City schools, where teachers immediately granted tenure fell from 94 percent to 56 percent. Such reforms are welcome, but they usually run into roadblocks from unions and stubborn regulators."

> Non-union charter schools haven't proved any better.

The data in this book beg to differ: https://www.amazon.com/Charter-Schools-Enemies-Thomas-Sowell...

Some charter schools in New York produced 10x or more as many students proficient in math and reading relative to nearby public schools, including public schools that shared the same building. There was no socioeconomic or racial difference between the schools.

Not all charter schools succeed, but the point is to find the ones that do and replicate their success: https://www.aeaweb.org/articles?id=10.1257/pol.20190259

> We now have, what, a full generation of students that have gone through high-stakes testing? When do we decide it's not worthwhile?

When do we finally settle on the fact that it is in fact worthwhile?

> In my view, all this noise about testing and teacher evaluation is meant to justify school privatization, so private companies can profit from all that public school money, and rich people can get tax-payer subsidized good private school education while poor people are left with the dregs.

Your world view is corrupt. You think private organizations are greedy and evil and yet somehow public organizations are magically saintly. You apparently think standardized scores in the U.S. are getting worse over time, despite there being more welfare programs than ever before, because kids aren't getting enough food benefits. Even the left-leaning Brookings Institution can do better than that: https://www.brookings.edu/blog/social-mobility-memos/2016/12...

"Over the last 30 to 40 years, the United States has invested heavily in education, with little to show for it. The result is a society with more inequality and less economic growth; a high price."

One of my parents grew up in the 1950s before welfare reform. He got C's and D's in school because he spent most of his time working jobs to feed himself. Very few children are starving today at the level my parent did. Most of the "poor" today are richer than the middle class of 100 years ago. The reasons for lack of achievement go far beyond alleged malnutrition and far beyond an alleged lack of funding in schools.

I suggest you read "Life at the Bottom" by Dalrymple to get a better understanding of what many students are up against. The greatest predictor of academic success is whether or not the child is growing up in a stable 2-parent home. But that doesn't fit the progressive leftist narrative that seeks to destroy both religion and family.

https://www.bbc.com/news/education-47057787

https://link.springer.com/article/10.1007/S10680-017-9424-6

https://www.christianpost.com/news/2-parent-families-are-bes...

Some claim that socioeconomic status is the greatest predictor. But, guess what? 2-parent households do better socioeconomically. And, studies from countries with universal daycare like Sweden are showing that generations of students suffer from psychological problems due to not having a mother in the home: http://www.imfcanada.org/archive/1107/swedish-daycare-intern...

And, of course, religious practice is also highly correlated with better grades: https://www.marripedia.org/effects_of_religious_practice_on_...

As one person I admire has said, errant do-gooderism is akin to straightening the deck chairs on the Titanic. If you are a bad doctor, you see symptoms, diagnose the wrong illness, prescribe the wrong medication, and make your patients sicker. This allegory aptly describes the modern progressive movement, as well as most modern education fads.

The best thing we can do to improve education in the U.S. is school vouchers. This will allow more diversity in school makeup, better support religious schooling (and religion correlates with improved education), increase competition, which almost always benefits society, and improve the lot of socioeconomically disadvantaged parents who are currently stuck with poorly-performing public schools.

> That statement alone is meaningless.

Meaningless and high-level aren't the same thing. A statement doesn't need to go into specifics to provide meaning.
