Tests and Assessment
Getting Test Results
What Do They Mean?
Dumbing-Down the State Tests
"A law where the consequences mean
that Arkansas has zero failing schools and Michigan has 1,500
is bound to have unintended consequences -- every state strives
to be Arkansas."
-- Lisa Snell,
"If you're in Oklahoma right now, you're told that 95% or 96% of your
schools are doing fine ... And if you're in
Massachusetts, you're told that 40% to 45% of your schools are doing
fine. But if you look at the actual achievement data, it suggests
that kids in Massachusetts are doing far better than kids in
-- Wall Street Journal, editorial, March 6, 2007
"We've seen a race to the bottom. States are lying to children.
They are lying to parents.
They're ignoring failure, and that's unacceptable."
-- Arne Duncan, U.S. Secretary of Education, quoted by David Brooks, New York Times, March 12, 2009
The federal No Child Left Behind Act demands that states test the children
in public schools, and specifies penalties and remedies for schools
that are observed to be failing. Sounds great ... except that there is
a huge loophole. The easiest move a state education bureaucracy can make
to have more schools pass is to
dumb down the test. Make the test easier,
more kids will pass -- simple!
Example: In New York, the state education bureaucracy recently observed that
too many students couldn't achieve the stated passing score of
55% on their "Regents" math test. Their solution? As of August 2003, they lowered the required
passing grade to 39%.
Right here in Illinois, we've seen tweaks and rejiggerings to make the state test easier
while sending scores higher, without any real improvement taking place:
"State may ease test norms: Board targets scores for 8th-grade math"
Chicago Tribune, February 23, 2006
As Illinois ramps up state testing to record levels early next month,
with nearly 1 million grade school students set to take tests,
education officials are looking at changing state rules to help more
If approved, more schools would be able to meet strict state and
federal standards that penalize schools if too many children flunk
the tests. But critics say the changes would amount to lowering
In addition, state officials are considering changing the way
Illinois evaluates test results, including using a more liberal
statistical formula that would help schools meet passing requirements
on the tests even if student scores fall short.
A year later, the Tribune editorialized:
"Illinois is one of the worst offenders ... Illinois tests have some of the lowest 'cut scores' in the nation"
"Dumbing down the ISAT" Editorial, Chicago Tribune, October 4, 2007.
"Are the kids doing all right? A [new] study ... suggests
we have no idea. [It] bolsters the contention that many states
bob and weave their way around strict school-testing standards -- and
Illinois is one of the worst offenders.
A quick primer: A student is determined to be proficient in a subject
when he reaches or exceeds the 'cut score' on a standardized test.
Illinois tests have some of the lowest 'cut scores' in the nation,
particularly in math, the study finds."
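To make the primer concrete, here is a minimal Python sketch of how the cut score alone drives the proficiency rate. The scores and both cut scores below are hypothetical, invented purely for illustration; they are not ISAT data, and the function name is ours.

```python
def proficiency_rate(scores, cut_score):
    """Return the percentage of scores at or above the cut score."""
    passing = sum(1 for s in scores if s >= cut_score)
    return 100.0 * passing / len(scores)

# Hypothetical scores for ten students (not real test data).
scores = [35, 42, 48, 51, 55, 58, 63, 67, 72, 80]

# The same students, judged against two hypothetical cut scores.
rate_high_cut = proficiency_rate(scores, 60)
rate_low_cut = proficiency_rate(scores, 45)

print(f"cut score 60: {rate_high_cut:.0f}% proficient")
print(f"cut score 45: {rate_low_cut:.0f}% proficient")
```

Nothing about the students changes between the two calls; only the bar moves, and the reported rate moves with it.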
Paul E. Peterson, a senior fellow at the Hoover Institution, observes,
"[A]ccountability systems tend to soften over time.
They may be legislated like lions, but they get implemented like lambs."
And how does that happen?
The Politics and Practice of Accountability
Martin R. West and Paul Peterson explain,
"As popular as tough
accountability is when first announced, it encounters political opposition as
time goes by.
Tough accountability has vague, general support from broad constituencies, but,
as its coercive teeth begin to bite, the individuals and groups most directly
affected complain bitterly. To ease political opposition,
standards are lowered, exceptions granted, and penalties postponed."
On October 23, 2002, U.S. Secretary of Education Rod Paige warned against dumbing-down
state tests to make bureaucracies look good. Here are some of his comments,
from a letter to all state ed bureaucracy officials:
some states have lowered the bar of expectations to hide
the low performance of their schools. And a few others are discussing
how they can ratchet down their standards in order to remove
schools from their list of low performers. ...
" ... it is nothing less than shameful that some defenders of the status
quo are trying to hide the performance of underachieving schools in
order to shield parents from reality ...
"Those who play semantic games or try to tinker with numbers
to lock out parents and the public stand in the way of progress
and reform. They are the enemies of equal justice and equal opportunity.
They are apologists for failure. ... And they will not succeed."
"Once parents discover that children in their local schools are not learning
as well as they could, they will demand results - no matter how much
one state tries to buck accountability. ...
"As a former superintendent of the Houston Independent School District,
I understand the promise and the peril of improving schools. It takes
courage to confront the forces of bureaucracy, regulation, and special
- Has the bar been lowered in Illinois?
Click here to read more on the Illinois Learning Standards.
Second City Ruse: How States Like Illinois Rig School Tests to Hype Phony Achievement
Wall Street Journal, July 18, 2009.
"The new Chicago report explains that most of the improvement in
elementary test scores came after the Illinois Standards Achievement
Test was altered in 2006 to comply with NCLB. 'State and local school
officials knew that the new test and procedures made it easier for
students throughout the state -- and throughout Chicago -- to obtain
higher marks,' says the report.
"Chicago students fared much worse on national exams that weren't
designed by state officials. On the 2007 state test, for example, 71%
of Chicago's 8th graders met or exceeded state standards in math, up
from 32% in 2005. But results from the National Assessment of
Educational Progress exam, a federal standardized test sponsored by
the Department of Education, show that only 13% of the city's 8th
graders were proficient in math in 2007. While that was better than
11% in 2005, it wasn't close to the 39 percentage-point increase
reflected on the Illinois state exam."
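The comparison in this excerpt comes down to simple arithmetic on the four percentages it quotes. A short Python sketch using only those figures (the variable names are ours, and note that the state's "met or exceeded standards" and NAEP's "proficient" are separately defined bars):

```python
# Percent of Chicago 8th graders clearing the math bar, as quoted in the excerpt.
state_2005, state_2007 = 32, 71   # Illinois state test: met or exceeded standards
naep_2005, naep_2007 = 11, 13     # NAEP: rated proficient

state_gain = state_2007 - state_2005   # change in percentage points
naep_gain = naep_2007 - naep_2005      # change in percentage points
gap_2007 = state_2007 - naep_2007      # state rate minus NAEP rate in 2007

print(f"State test gain, 2005 to 2007: {state_gain} points")
print(f"NAEP gain, 2005 to 2007: {naep_gain} points")
print(f"Gap between the two measures in 2007: {gap_2007} points")
```

The first figure reproduces the 39 percentage-point increase cited in the excerpt; the other two follow from the same quoted numbers.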
- The 2005 NAEP results:
- Gains on State Reading Tests Evaporate on 2005 NAEP (Introduction), Education Gadfly,
Thomas B. Fordham Foundation, October 20, 2005.
"Many have suspected that states are beginning to game the No Child Left Behind
accountability system to show academic progress where none actually exists.
The newly released 2005 National Assessment of Educational Progress results
tend to confirm those suspicions.
While nineteen states reported gains from 2003 to 2005 in the percentage of
eighth-graders rated 'proficient' (or the equivalent) on state reading tests,
only three showed any progress at even the 'basic' level on the NAEP test.
'The long-feared 'race to the bottom' appears to have begun,' said Fordham
president Chester E. Finn, Jr."
Gains on State Reading Tests Evaporate on 2005 NAEP (Report), Education Gadfly,
Thomas B. Fordham Foundation, October 20, 2005. Subheadings:
- "Has a 'Race to the Bottom' Begun?"
- "Gains on State Reading Tests Evaporate on NAEP"
- "Decline in 8th Grade Scores Points to 'Middle School Slump'"
- The 2006 Illinois ISAT results:
Making Grade Just Got Easier:
More Illinois schools met U.S. standards because of changes
by Stephanie Banchero, Chicago Tribune, March 13, 2007:
"A record number of Illinois schools escaped federal No Child Left
Behind sanctions this school year, largely because of changes in how
schools are judged and alterations that made state achievement exams
easier for students to pass.
"Nearly 82 percent of the state's public schools met the federal goals
on the 2006 state math and reading tests, compared with 74 percent
the year before, according to a Tribune analysis of state data.
"But 450 of the nearly 3,100 elementary and high schools that met the
federal goals did so because state education officials changed the
way students' test scores were counted, not because students
necessarily did better on the tests, according to the state data.
"'There is clearly a race to the bottom going on,' said Kevin Carey, a
policy director at Education Sector, a think tank that studied state
testing changes. 'When states change rules under No Child Left
Behind, it's always changes that will make it easier for schools. One
state will come up with an 'innovative' way to give schools the
statistical benefit of the doubt and then every state will follow
- Chicago Sun-Times, March 6, 2007:
"The tests reflected a laundry list of changes. Some question whether
scores soared, in part, because of those changes. [Uh, gee, do ya THINK?]
- Ten extra minutes per reading and math section, for a total of 30
extra minutes per subject
- Kid-friendly test booklet, with new color illustrations and more
- New answer sheet, making it harder to put answers in the wrong place
- Pass score in eighth-grade math dropped from 67th to 38th
percentile to conform with other tests
- Reading and math exams expanded from third, fifth and eighth to
third through eighth
- New testing contractor and new scoring system
- Weight of open-ended "extended response" questions, among the
hardest questions on the ISAT, dropped from 15 to 10 percent
- In Chicago only: A second standardized test was dropped, making it
easier for teachers to focus on ISAT-tested skills. For the first
time, ISAT scores determined whether students had to attend summer
school and could apply to college prep high schools.
- In Chicago only: New reading tests given twice before ISAT to
pinpoint student weaknesses. The tests made by same company that
wrote the ISAT. Some schools also gave diagnostic math tests to
- The 2005 Illinois ISAT results:
Reprieve For More Schools: Relaxed Rules Mean Fewer U.S. Sanctions
by Diane Rado, Stephanie Banchero and Darnell Little, Chicago Tribune, September 15, 2005.
"Fewer Illinois public schools will face tough federal sanctions this
year for academic performance, in part because the state relaxed
rules on judging school progress. ...
This year, educators expected
the number of failing schools to rise in Illinois because the federal
reforms required more students to pass--47.5 percent, up from 40
percent in prior years.
"But Illinois, like three dozen other states, loosened requirements on
how schools are measured, making it easier to meet the new passing
Among the changes, the state used a new statistical approach that
allowed schools to meet the standards even if some of their test
results fell short. The state also relaxed rules on how some test
scores for special-education, minority and low-income students would
The changes are sure to cause confusion for parents and educators
trying to assess the strength of their schools, because comparisons
with prior years are virtually impossible.
"Supt. Donald Hendricks, in DuPage's Addison School District 4, said
he hopes the state doesn't make further changes, because 'then we'll
never have a sense of growth or honest achievement.'"
Narrowing The Grade-School Standards Gap,
CBS Evening News, May 30, 2007.
"Good grades always make teachers happy.
And in Georgia, there were plenty of smiles after statewide testing of
students ... 87% of the state's fourth-graders were rated
proficient in reading in 2005. But when Georgia's performance is
measured nationally, the numbers tell a different story: Only 26%
of the state's fourth-graders were rated proficient on a
national reading test ...
The problem, say experts, is one word: proficiency.
Each state can come up with its own definition. ...
That has some states crying foul, accusing other states of lowering the bar
to make their schools look more successful.
Just across the border from Georgia is South Carolina.
The two states score the same on that national test, but have very
different results on their state tests. Just 36% of South
Carolina fourth-graders were rated proficient in reading -- far below
Hot Air: How States Inflate Their Educational Progress Under NCLB
by Kevin Carey, Research and Policy Manager, Education Sector, May 2006.
complete PDF report, which includes the methodology and appendix.
From the introduction:
"No Child Left Behind (NCLB) gives states wide discretion to define
what students must learn, how that knowledge should be tested, and
what test scores constitute 'proficiency' -- the key elements of any
educational accountability system. States also set standards for high
school graduation rates, teacher qualifications, school safety and
many other aspects of school performance. As a result, states are
largely free to define the terms of their own educational success.
"Unfortunately, many states have taken advantage of this autonomy to
make their educational performance look much better than it really
is. In March 2006, they submitted the latest in a series of annual
reports to the U.S. Department of Education detailing their progress
under NCLB. The reports covered topics ranging from student
proficiency and school violence to school district performance and
teacher credentials. For every measure, the pattern was the same: a
significant number of states used their standard-setting flexibility
to inflate the progress that their schools are making and thus
minimize the number of schools facing scrutiny under the law."
Education Law Encourages Fuzzy Math
by Marie Gryphon, February 28, 2005. Excerpts:
"State and federal politicians ... are committed to a higher-stakes bluff: they must convince
us that the No Child Left Behind Act can work.
The Act wasn't designed to work. An accidental byproduct of
bipartisan compromise, it reflects no single idea of accountability.
Lauded as a great political achievement, it is unlikely to improve
student achievement. And like an emperor's nudity, no state or
federal functionary can afford to notice the fact.
States must generate the appearance of complying with the law,
including a display of 'annual yearly progress' necessary to stave
off the Act's more discombobulating remedies. They are doing so by
fudging their figures.
Some states play this game better than others. According to the RAND
Corporation, Texas boasted an 88 percent pass rate on its eighth
grade reading test last year while South Carolina turned in a
miserable 21 percent pass rate.
Texas children read far better than South Carolinians, one might
conclude. One would be wrong, though. On the standard National
Assessment of Educational Progress, scores from these two states are
nearly identical: South Carolina has a 24 percent "proficiency" rate
compared with only 26 percent among Texans.
Last year the state of Michigan reduced the number of 'failing'
schools under its care from 1,500 to 216. But this remarkable
achievement was merely a statistical sleight of hand. Michigan
lowered the minimum passing score on the state's assessment from 75
percent to a mere 42 percent, the Heartland Institute reports."
Report: Dishonest Education Reporting by States Is 'Widespread'
by Katie Farber, Human Events, June 2, 2005.
"Some of the education statistics sent by states to the federal
government in compliance with the No Child Left Behind Act simply
can't be trusted, according to a new Cato Institute study of the law.
'Sadly, dishonest reporting about graduation rates turns out to be
widespread,' writes Larry Uzzell in a Cato Institute policy brief
titled, 'NCLB: The Dangers of Centralized Education Policy.'
Uzzell, a former staff member of the U.S. Department of Education and
the U.S. House and Senate committees on education, cites the example
of California, which in late 2003 announced a graduation rate of
86.9%. However, California's own specialists admitted the true figure
was closer to 70%.
'Unless those data are honest and accurate and reliable, even when
the findings are threatening to the same people who are in charge of
finding and compiling it, then NCLB is not going to work,' Uzzell
said at a Tuesday debate on recent opposition to the act."
Defining Failure Down, Thomas B. Fordham Foundation,
November 18, 2004. Excerpt:
"According to the National Education Association, of the 41 states
that have reported their NCLB test results from spring 2004, 32
showed improvement in the number of schools meeting their adequate
yearly progress (AYP) goals. ... But
before anyone makes grand claims, take a careful look at what those
numbers mask. Specifically, while 32 states have reduced the number
of schools that 'need improvement,' according to the Center on
Education Policy, at least 35 states have amended the rules that
determine which schools pass and which schools fail.
As the Wall
Street Journal reports ...
the Education Department ... allowed Delaware,
among other states, to label a school district as failing 'only if
children at all three school levels--elementary, middle, and high
school--miss their learning goals.' (Previously, a district was
deemed 'in need of improvement' if children in one grade in any level
failed to meet AYP--a rule that last year resulted in 17 of 19
Delaware districts being labeled 'in need of improvement.') Nancy
Wilson, head of the Delaware school improvement office, insists that
'It's not about ducking accountability. It's about managing morale.
These labels are morale busters.' What, one wonders, is their current
morale based on? Failure?"
The Fight for High Standards
by Miriam Kurtzig Freedman, Hoover Digest, Summer 2004.
"An increasing number of states are requiring students to pass exit
exams in order to graduate from high school. Such tests simply
demonstrate what students have actually learned. So why do they make
some people so nervous? ... a strange thing is happening:
As we get closer to having the graduation tests 'count,'
many leaders have blinked, with the result that standards are
compromised and test results invalidated. But why is it happening -- why are some blinking?
Is it fear of litigation? Is it confusion about legal requirements?
Is it the excruciating pressure on that one-diploma option?
Word choice is telling. It used to be that a student 'earned' a
diploma. Now many speak of a student being 'denied' a diploma. The
first is about standards; the second, about rights and lawsuits."
Who Needs School Boards?
by Chester Finn, Thomas B. Fordham Foundation, October 23, 2003.
"Whatever one's view of No Child Left Behind, it's a valiant effort to
bring needed change to American education. Where is the National
School Boards Association on this? Singing the establishment anthem,
which piously declares its support for NCLB's intentions and then
proceeds to pick apart almost every significant aspect of the law as
unworkable. What message does that send to America's 15,000 local
school boards? It's akin to Dad saying, 'Your mother told you to eat
your spinach but you really don't have to unless it's sprinkled with
sugar and eaten in front of TV. With ice cream to follow.'"
- Bastiaan J. Braams, a math prof at New York University, has put together a web page
Content Reviews of Standardized Assessments which looks at national tests,
as well as state tests in Texas, California, Florida, New York and Massachusetts.
New School Of Thought: 'Superior' Is Adequate
by Bud Kennedy, Dallas-Ft. Worth Star-Telegram, Sept. 25, 2003. Excerpt:
"Texas is grading schools again. This time -- surprise! -- nearly everybody gets an "A."
... Almost every area school district won top awards for "superior
achievement" ... How wonderful. In Texas, 84 percent of all schools are 'superior.'
... Most of the less-than-'Superior' school districts, 130, still
ranked as 'Above Standard' -- a 'B.'
In other words, Texas tested every district, and set the curve so low
that 96 percent of the grades were 'A's' or 'B's.'
If our schools are all so superior -- why test?
It's so superintendents can call their districts 'superior.'"
The Knowledge Deficit by Diane Ravitch. Ravitch says that standardized
tests are becoming easier and easier, and require less and less background
The Tests We Need by E.D. Hirsch Jr. "Curriculum-based tests hold more
promise than skills-based tests to promote significant
gains in achievement and equity."
by Thomas Sowell, October 29, 2003. Sowell errs in the opening sentence, when he
ascribes dumbing down to a political bias. (Education reform is not a one-sided
political issue: for more, see our page on politics, left and right.)
But after that, he hits the nail on the head, referring to
"a headline in the San Francisco Chronicle of October 25th:
'California School Rankings Improve.'
According to education officials quoted in the story, an
'unprecedented rise' in test scores has been achieved by 'shifting
away from a nationally normed test and toward exams that measure what
children are being taught in the classroom.'
In other words, when school children in California were taking the
same tests as children in other states, their results were lousy.
But, now that we have our own test, results are much better.
If you or I or anyone else could make up his own test, wouldn't we
all turn out to be geniuses?
The idea of gearing the test toward what is being taught in California
schools is turning things upside down. The whole reason for giving tests
is to find out whether students and schools are up to standards.
Obviously, if California schools teach drivel and there is drivel
on the tests, everybody looks good."
Children Left Behind Despite Bush Education Act
by Phyllis Schlafly, October 27, 2003. Excerpts:
"Because the penalties for not complying with act requirements are
severe, states and school districts have devised ingenious methods to
avoid sanctions. The Texas State Board of Education reduced the
number of correct answers students must provide to the test's 36
questions from 24 to 20 out of 36.
Michigan officials lowered from 75 to 46 the percentage of students
who must pass statewide high school English tests in order to certify
a school as making adequate progress. Colorado restructured its
grading system, lumping 'partially proficient' with 'proficient'
"One sanction imposed on failing schools is to give
their students the option of transferring to another school. Los
Angeles and Chicago officials are meeting this challenge by approving
very few transfers, citing overcrowding concerns. New York City
schools approved [only] 8,000 transfer requests, but one-third of the
students have been moved from one 'failing' school to another."
Keeping an Eye on State Standards
by Frederick M. Hess, Paul E. Peterson, Education Next, Summer 2006.
"While No Child Left Behind (NCLB) requires all students to be
'proficient' in math and reading by 2014, the precedent-setting 2002
federal law also allows each state to determine its own level of
proficiency. It's an odd discordance at best. It has led to the
bizarre situation in which some states achieve handsome proficiency
results by grading their students against low standards, while other
states suffer poor proficiency ratings only because they have high
Lake Woebegone, Twenty Years Later (PDF)
by John Jacob Cannell, M.D., March 9, 2006.
"Almost twenty years ago, I wrote - and then privately published - the two 'Lake
Woebegone' reports, named after Garrison Keillor's mythical Minnesota town
where 'all the women are strong, all the men are good-looking, and all the
children are above average.' The first 'Lake Woebegone' report documented
that all fifty states were testing above the national average in elementary
achievement and concluded the testing infrastructure in America's public
schools was corrupt. The second report delineated the systematic and pervasive
ways that American educators cheat on standardized achievement tests. Both
reports received widespread national publicity, were extensively discussed in
academic journals, and helped spur the testing reform movement. ...
"This paper discusses how I learned about 'Lake Woebegone' testing, the reason
why I left the testing reform movement, and my observations on where testing is
today. Is No Child Left Behind (NCLB) testing much different from what was
occurring during the 'Lake Woebegone' years?"
Who Is Opposed to Testing -- and Why?
Putting "FairTest" to the Test by Dave Ziffer, December 15, 2004.
Who is this "FairTest" group that sets itself up as an authority on the
evils of school testing? Dave challenges the core premises of this group,
and argues instead for honest, objective evaluations.
Dave Ziffer is one of the founders of the Illinois Loop, and (for a while) the
operator of the I Can Read afterschool centers.
FairTest: Dumbing Down America
by Ken Blackwell, Townhall, February 22, 2009.
"The unfortunate trend is that too many schools are redefining merit
as it has traditionally been recognized. The main engine behind this effort to change the nature of academic
merit is a group called Fair Test, a Boston-based organization ...
The efforts and track record of this organization demonstrate that
simply administering a standardized test constitutes a misuse, while
the primary flaw of such tests is that they exist at all. ...
"In fact, the SAT and the ACT ... have long since addressed legitimate
claims of bias in testing. Both are scrupulously developed, reviewed
and updated by dedicated educators to ensure they reflect a student's
academic merit. They also are administered in a consistent manner ...
"The efforts of Fair Test and others who want to eliminate
standardized testing stand to put all of American higher education at
"Standards of academic excellence are critical to the future of
students and our economy. If we forsake such standards based on the
ill-conceived ideology of Fair Test and like-minded individuals, we
risk not only our children's future but that of our nation."
Why Testing Experts Hate Testing by Richard Phelps, Fordham Foundation, January 1999.
Phelps provides replies for the claims made by the ed school theoreticians
who oppose meaningful assessments of educational progress.
Test-Basher-Speak by Richard Phelps, January 26, 2001:
Dissects the vocabulary and obfuscation techniques used by test-bashers.
- Also see this related book:
"Kill the Messenger: The War on Standardized Testing"
by Richard P. Phelps with a foreword by Herbert J. Walberg
Putting the Fox in Charge of the Hen House; Or, Why School Reform Often Fails to Improve Education
by J. Martin Rochester, Curators Distinguished Teaching Professor at the
University of Missouri-St. Louis, and author of
Class Warfare: Besieged Schools, Betrayed Kids, Bewildered Parents, and the Attack on Excellence.
"Schools of education and other parts of The Blob are mounting a
strong counter-attack against the growing national pressure for
testing and accountability, offering all kinds of rationalizations
for why testing is bad. Given the questionable effectiveness of many
trendy reforms adopted as part of 'continuous improvement,' it is not
surprising that educators are resistant to the search for empirical
evidence that might invalidate their theories and new 'best
practices.' To the extent that educators are promoting testing today,
it is in the form of 'performance assessment' and 'portfolios,' which
are inherently subjective and unreliable as evaluation instruments
and, therefore, are unsuitable for purposes of high-stakes
accountability of the type school systems dread."
The General Patton of the Testing Wars
by Nicholas Stix, March 15, 2004. This is a meaty review of
Richard Phelps' book, "Kill the Messenger: The War on Standardized Testing."
"A week doesn't go by, without a mainstream media story on the
'horrors' of standardized testing, in which reporters tell of
widespread testing error, of how testing is causing students to drop
out of school, or of how testing is causing an epidemic of cheating.
The story behind the stories is that the relative prevalence of
testing error is infinitesimal, that journalists stressing the
dropout factor are mindlessly repeating a myth invented by radical
Boston College teacher education professor Walter Haney, and that
cheating is more easily prevented on standardized tests than with
For years, the American public has been force-fed a diet of
test-bashing by the establishment media, the teachers' unions,
professors of teacher education and well-financed anti-testing
organizations, in which test-bashers have twisted existing data,
ignored contrary data, and fabricated data outright. So reports
Richard Phelps in his brilliant new book,
Kill the Messenger: The
War on Standardized Testing."
- In this lively
review of a Time magazine article, education activist Elizabeth Carson
reveals the anti-testing bias that dominates much of the news coverage
on standardized tests.
Testing: Myths & Realities, WrightsLaw, August 21, 2003.
This article starts with a short preamble "Why Tests Are Necessary" and
then tackles these "9 Myths About Testing":
- Myth: Testing suppresses teaching and learning.
- Myth: Testing narrows the curriculum by rewarding test-taking skills.
- Myth: Testing promotes "teaching to the test."
- Myth: Testing does not measure what a student should know.
- Myth: Annual testing places too much emphasis on a single exam.
- Myth: Testing discriminates against different styles of test takers.
- Myth: Testing provides little helpful information and accomplishes nothing.
- Myth: Testing hurts the poor and people of color.
- Myth: Testing will increase dropout rates and create physical and emotional illness in children.
"Education After the Culture Wars" (PDF doc) by Diane Ravitch,
Daedalus, Journal of the American Academy of Arts & Sciences, Summer 2002
The Tests We Need by Herbert J. Walberg. Excerpts:
"Despite many reforms and substantially increased spending, our
schools are doing no better than they were in 1983. ... Few
professionals or other workers want to be held accountable; but, in
education, our nation's welfare and students' development are at
- In October 2003, a cover story in Time magazine promised
a look "Inside the New SAT." For its sources, much of the article
depended upon the usual ranks of the test-bashers.
Elizabeth Carson, of the very active and effective reform group
New York City HOLD, replies with
Soccer Moms vs. Standardized Tests
by Charles J. Sykes, December 6, 1999. Excerpts:
"After decades of endless gold stars, happy faces and inflated grades,
American parents apparently were not ready for a reality check about how
much our schools are really teaching our children. ...
It is not surprising that more rigorous state standards have come
under fire from the usual opposition coalition of civil rights groups,
progressive educators and teacher unions.
What is striking though, is the opposition from soccer moms.
In Wisconsin ... most of the opposition to [a proposed exam] came not from
troubled urban schools, but from affluent suburbs.
... For much of this century, the educational establishment has behaved
as if it were addicted to bad ideas, indulging its own wishful and
romantic thinking even in the face of mounting evidence of failure.
... The schools had been allowed to obscure the fact
that many children were not mastering basic subjects. The constant
positive reinforcement of unrealistic grading and easy tests was
meant not only for the children, whose self-esteem remained strong in
the face of shaky math and reading abilities, but for their parents,
For many of these parents, the new tests were a very rude shock.
Accustomed to thinking of educational difficulties as somebody else's
problem, they and their school districts suddenly faced the
possibility of failure."
For more on the threats and challenges in suburban schools,
see our page, Illinois Loop: Suburbs.
The Assessment Debate by Jeffrey M. Jones, M.D., Ph.D.
This is a quick intro to the essential issues in the types
of tests, and their merits. It also responds to anti-testing
groups that want to dumb-down, fuzzy-up, or eliminate tests entirely.
Test-Bashers Oppress Students, and Leave the Truth Behind
by Nicholas Stix, October 28, 2003. Some of this article's rhetoric
is a bit over-the-top, but it also provides some cogent arguments
against claims of the test-bashers.
- Alfie Kohn is a lecture tour speaker who rails against
testing and objective accountability. Prof. Ralph Raimi attended
one of Mr. Kohn's talks, and wrote
this account of what was said.
"Teaching to the Test"
GOOD Teachers Teach to the Test: That's because it's eminently sound pedagogy
by Walt Gardner, Christian Science Monitor, April 17, 2008.
"For the entire 28 years that I taught high school English, I taught to the test. ...
I know that fessing up to this perceived transgression will reflexively draw clamor from everyone with children in school.
... But stay with me here: This type of reaction is the result of a fundamental misunderstanding of both curriculum and instruction."
What's So Bad About Teaching to the Test?
by Lisa Rosenthal, GreatSchools.com, January 2008.
"If teaching content standards is considered 'teaching to the test,' it may not be such a bad thing. ...
Good test preparation focuses on making sure that students are meeting state standards ..."
Teaching to the Test:
Increasingly, Schools Are Finding It Just Makes Sense to Align Curriculum and Assessment
by Kevin Bushweller, American School Board Journal,
September 1997, National School Boards Association.
"Teaching to the test -- the very words have always been heresy to educators. ...
But today a new perspective (and a new education buzz phrase) is emerging.
It's called curriculum alignment, and it means teaching knowledge and skills
that are assessed by tests designed largely around academic standards set by the state.
In other words, teaching to the test."
State Tests Don't Make the Grade
by Linda Starr, Education World, March 26, 2002.
"A few years ago, one of the TV news magazines aired a segment on what was then
a fledgling movement to assess student performance using standardized testing.
Part of the show, as I recall, involved a panel of teachers describing how
standardized tests were forcing them to abandon 'real teaching' in favor of
'teaching to the test.' I was appalled at the idea ...
In the intervening years, however, I've gradually been converted to
the belief that standardized testing is, in fact, a necessary good, a
vital means of determining whether students are learning what they
are supposed to be learning and of determining what they still need
to learn. Recent surveys indicate that most other teachers also have come to
accept -- if not to embrace -- the idea that standardized tests are a
legitimate way to assess student performance."
Let's Teach to the Test
by Jay Mathews, Washington Post,
February 20, 2006.
"Teaching to the test, you may have heard, is bad, very bad ...
[yet] in 23 years of visiting classrooms I have yet to see any teacher preparing kids
for exams in ways that were not careful, sensible and likely to produce more learning."
Teaching to the Test
by Thomas Sowell, Washington Times, August 23, 2002.
"There is much wringing of hands and gnashing of teeth because so
much classroom time is spent 'teaching to the test' as our
'educators' put it. Unfortunately, most of the people who call themselves
educators have not been doing much educating over the past few decades ...
While our students spend about as much time in school as students in
Europe or Asia, a higher percentage of other students' time is spent
learning academic subjects, while our students' time is spent on all
sorts of nonacademic projects and activities. Those who want to keep
on indulging in popular educational fads that are failing to produce
academic competence fight bitterly against having to 'teach to the
test.' ... If there has actually been such 'genuinely great
teaching,' then why has there been no speck of evidence of it during
all these years of low test scores and employer complaints about
semiliterate young people applying for jobs? Why do American students
learn so much less math between the fourth and the eighth grade than
do students in other countries? Could it be because so much more time
has been wasted in American schools during those four years?
Evidence is the one thing that our so-called educators want no
part of. They want to be able to simply declare there is genuinely
great teaching, 'creative' learning, or 'critical thinking,' without
having to prove anything to anybody."
The Fallacy of "Teaching to the Test"
by Leanne Hoagland-Smith.
"From a performance improvement perspective, teaching to the test is 100% absolutely correct."
Iowa Test of Basic Skills
The NAEP on Illinois
The National Assessment of Educational Progress (NAEP),
also known as "the Nation's Report Card," is the most respected
national assessment of what
America's students know and can do in various subject areas.
Website of the NAEP
- To get reports for individual states, including Illinois, click
Illinois and the Nation, SchoolMatters, a Service of Standard & Poor's. Use this site to see how Illinois compares with the rest of the country,
on NAEP and state tests in reading and math.
Multiple Choice Questions
Theorists who are opposed to an emphasis on developing a factual basis for
later understanding regularly raise objections to the use of multiple
choice questions in standardized tests.
- A detailed defense of the use of multiple choice questions is presented
by Prof. E. D. Hirsch, in his book
The Schools We Need: And Why We Don't Have Them.
Prof. Hirsch lists the major charges made, and responds to each.
Do Standardized Multiple-Choice Tests Penalize Deep-Thinking or Creative Students? (PDF)
by Donald E. Powers and James C. Kaufman.
This paper reports on a study of the relationship of Graduate Record Examinations (GRE)
General Test scores to selected personality traits, including conscientiousness,
rationality, ingenuity, quickness, creativity, and depth.
Analyses revealed statistically significant, positive correlations of
GRE verbal, quantitative, and analytical scores with both creativity and "quickness."
(Quickness was defined here by, for instance, the ability to handle a lot of
information and the ability to understand things.) In other words, the
"deep-thinkers" did better on multiple choice questions, just what we would hope for.
Who Scores Open-End Questions?
(a.k.a. "Authentic Assessment" or "Performance Assessment")
It's no secret that many ivory tower ed theorists despise
the notion of knowledge as a key goal of learning.
Instead, they argue that testing should evaluate
concepts and "critical thinking," and that tests should therefore require
thoughtful essay responses. There may be something to that in a
classroom situation, but is this actually accomplished by
the big national manufacturers who are now promoting
fuzzy, open-ended tests?
- What does it cost to ask even a single open-ended question? $1.4 to $2.4 million!
Officials at the Illinois State Board of Education (ISBE) estimate that asking a single open-ended
question in the state ISAT math test triples or quintuples the cost of administering the
test. From the Chicago Tribune, January 8, 2004:
"State officials estimate that it would cost an extra $2 million to $3 million
to include one open question for every grade on these two tests. By contrast,
the state's social science test, which is all multiple-choice and is scored
automatically by computers, would cost about $600,000 to administer and score."
- Here is a vital question about open-ended
test questions: Who actually scores these tests? Read on...
Standardized Writing Assessments May Be Harmful To Children's Learning,
University of Chicago Chronicle, July 11, 2002, Volume 21, No. 18.
Excerpt: "George Hillocks Jr., Professor in English Language & Literature and
Director of the Master of Arts Program in Teaching [found that] in ...
Illinois ... the tests guide the curriculum and 'encourage the learning of vacuous thinking, thinking without substance.' ...
'As a result of the test conditions, writing teachers usually rely
on the formulaic five-paragraph structure. In Illinois, students have 40 minutes
to complete the task. ... Students then churn out essays with a 'first, next, last'
structure, but they are not taught how to discern real evidence or support for
their points. ... Evaluators reward students for following the structure,
but not for their choice of evidence. The result ... 'kids are passing the
tests by writing drivel.'
"Hillocks also researched the process of scoring
the Illinois test. A company
in North Carolina, which supervised the process, trained its judges to grade
each writing test on a 32-point scale within 60 seconds. Under this type of
time-pressure, judges simply looked for the formula. 'Any teacher who has ever
taught writing knows that 60 seconds is not enough time to grade a paper.'"
Who's Scoring Those High-Stakes Tests? Poorly Trained Temps
by Cameron Fortner, Christian Science Monitor
"Growing up in California's public schools ... I approached each test with
all the solemnity and effort a child can muster. ...
My summer as a test-scorer disabused me of that notion. ...
Instead of the professionals I'd envisioned painstakingly grading exams,
I found a room full of temporary employees who had little respect for --
and minimal investment in -- their jobs."
Temp Workers Score WASL Tests ... In Minutes:
The Sun, Bremerton, Washington, August 28, 2000. Excerpt:
"The state's new standardized tests -- used to determine how money is
allocated to schools, which courses are taken and, eventually, who
graduates from high school -- are graded in minutes by $10-an-hour temps."
- Temps Spend Just Minutes To Score State Education Test,
Seattle Times, August 27, 2000. Excerpts:
"In a matter of minutes, a $10-an-hour temp assigns a score to your child's test,
a grade that helps determine how money is spent in Washington schools,
which courses students take and, before long, who is denied a high-school diploma.
Such weighty decisions rely on the judgment of seasonal workers with 16 hours of
training who sift through dozens of exams each day.
Working at assembly-line pace, these college-educated moonlighters spend as
little as 20 seconds grading each math question ...
And scorers plow through as many as 180 writing essays a day,
at a rate of 2 1/2 minutes each. ...
Several other scorers confessed to skimming tests. ... A recent grad
said she looked for certain key numbers or phrases in math problems
so she wouldn't have to read a whole paper. She described herself as
an easier grader, giving students the benefit of the doubt."
- The above article also includes these quotes from some
of the $10/hour part-time "graders" of open-ended essay
questions for a national standardized test:
- "It's low-maintenance, low-cerebral work"
- "I kind of bop in and out according to their workload"
- "The scoring guide isn't always clear when it's between a 2 and a 3"
- "They started hounding us about the
pace. [Supervisors asked us to] pick up
the rate [and told us,] 'Don't pay as much attention to accuracy.'"
- "After doing this work, I know for sure that I don't want my own children
to take these kinds of tests"
- From the same article, here are comments by educators after learning
how open-ended standardized tests are actually graded:
- "I had the impression it was a little more thorough and scientific
than that. The [tiny] amount of time they spend [scoring] surprises me
a lot - I couldn't do it in that length of time." (a teacher)
- "The part that bothers me is there's no double-check ... [assigning
a grade based on one person's scoring] clearly would
be subjective." (a principal)
- "This is a very fallible process, and mistakes are going
to be made. ...
And yet people take those numbers as if they're written
in stone." (a college professor)
- "Graders bring their own preferences about writing to the job. ...
Some are drawn to grammar and spelling, while some are swayed by ideas,
and others give weight to vocabulary and expression." (a professor who has
researched open-ended test grading)
- "NCS to hire up to 500 to score K-12 tests; Office in Tucson to pay $10 an hour",
by RuthAnn Hogue, The Arizona Daily Star, November 10, 1999.
"National Computer Systems is looking for up to 500 Tucsonans who
can make the grade -- literally. ... It's opening a Tucson office to score the
tests which are administered by states around the country.
'We will be hiring as many employees as we can find to actually come in,'
said Jim O'Connor, a company spokesman ... Scorers ... will be paid
about $10 an hour ... [They] will not physically handle the tests they score.
Instead, they will do image-based scoring, meaning they will view images
of tests that have been scanned into a computer."
Right Answer, Wrong Score: Test Flaws Take Toll
by Diana B. Henriques and Jacques Steinberg, New York Times, May 20, 2001.
- "Jake Plumley was pulled out of the classroom ... and told to report to his
guidance counselor. ... The news was grim. Jake, a senior, had failed a
standardized test required for graduation. ... In fact, Jake should have been elated.
He actually had passed the test. But the company that scored it had made an
error, giving Jake and 47,000 other Minnesota students lower scores
than they deserved."
- "Despite the recent mistakes, the industry says, its error rate is infinitesimal
on the millions of multiple-choice tests scored by machine annually.
But that is only part of the picture. Today's tests rely more heavily on
essay-style questions, which are more difficult to score. ...
Testing companies turn the scoring of these writing samples over
to thousands of temporary workers earning as little as $9 an hour."
- "Several scorers, speaking publicly for the first time about problems they saw,
complained in interviews that they were pressed to score student essays without
adequate training and that they saw tests scored in an arbitrary and
inconsistent manner. 'Lots of people don't even read the whole test - the time
pressure and scoring pressure are just too great,' said Artur Golczewski, a doctoral
candidate, who said he has scored tests for NCS for two years, most recently in April."
- "The pressures reported by NCS executives are affecting the temporary workers
who score the essay questions in vogue today, said Mariah Steele, a
former NCS scorer and a graduate student in Iowa City.
But one evening in late July ... Ms. Steele said, she was asked by her supervisor
to stop grading math and switch to a reading test from another state,
without any training. 'He just handed me a scoring rubric and
said, 'Start scoring,' Ms. Steele said. Perhaps a dozen of her co-workers
were given similar instructions."
- "Renée Brochu of Iowa City recalled when a supervisor explained that a certain
response should be scored as a 2 on a two-point scale. 'And someone would gasp
and say, 'Oh, no, I've scored hundreds of those as a 1,' Ms. Brochu
said. 'There was never the suggestion that we go back and change the ones
already scored.' Another former scorer, Mr. Golczewski, accused supervisors of
trying to manipulate results to match expectations. 'One day you see an essay
that is a 3, and the next day those are to be 2's because they say we need
more 2's,' he said. He recalled that the pressure to produce worsened as
deadlines neared. 'We are actually told,' he said, 'to stop getting too involved
or thinking too long about the score - to just score it on our first impressions.'
One reading teacher said she was assigned to score eighth-grade
math tests. 'I said I hadn't been in eighth-grade math class since
I was in eighth grade,' she said. Another teacher, she said, arrived late
at the scoring session and was put right to work without any training."
We Hung The Most Dimwitted Essays On The Wall
by Amy Weivoda, June 6, 2002. "The biggest case against [open-ended questions on]
standardized testing might be the people who score the tests --
people like me, for instance. ... I ended up scoring public school students'
standardized tests. It paid [$6 an hour,] a dollar more per hour [than working at a Dairy Queen],
although I'd have to bring my own lunch. ...
The interview was a little like jury duty. I appeared at the required time.
I twitched in a room with 20 or 30 people until my name was called. An intake
person photocopied my diploma and assigned me to the Georgia Basic Skills Test.
I was asked to report to work the following Monday. That was it. I did not have
to interview, network or submit a résumé. I did not need a background
in K-12 education. I did not need to care about or understand children, although
it was obvious I'd been one, pretty recently."
Grading This Article? First, Take Time to Learn the Rules
by Tamar Lewin, New York Times, June 11, 2003.
"At my grading session, about 100 teachers from across the country
are being paid $22 an hour to grade the 33,000 essays produced at
the May 3 SAT II writing test. Each essay is read by at least two graders,
so over the five days, each one will be plowing through some 660 essays.
Our instructions don't help me much."
Fuzzy curricula and vague project-oriented schoolwork
create a dilemma
for school administrators: With such amorphous materials, how can anyone document
if children are actually accomplishing anything? The education industry's response
to this need is the portfolio: Put many examples of a child's productions into
a folder, so as to impress with bulk and variety, if not substance.
Charles Sykes, in his powerful and thorough book
"Dumbing Down Our Kids"
raises many disturbing questions about portfolios:
The idea of the portfolio is to gather together collections of
student work over a semester or a school year -- student essays,
reports, math assignments, post-projects -- to provide a more
meaningful measurement of their progress and skills. Portfolios are
popular in programs that claim to be outcome based or "performance
based," because theoretically they demonstrate what children can do.
As attractive as the idea might sound in theory, there is evidence
that it does not work nearly as well in reality. In some classes a
portfolio might consist of a handful of paragraphs, while in another
it might consist of detailed essays that require extensive research.
Such inconsistencies continue to plague so-called "authentic
assessments." A detailed study of Vermont's use of "portfolio
assessment," for example, found that teachers scored the portfolios
inconsistently. An evaluation of the pioneering effort for the
state's education department found the scoring of portfolios
"unreliable because in too many instances, two individual teachers
graded the same collection of work differently." The study found that
the teachers' scores on student writing portfolios agreed with one
another less than half the time. Even on portfolios of math work,
the Vermont study found that teachers agreed on their scores less
than 60 percent of the time. Such inconsistency would seem to
suggest that judging the portfolios is so subjective as to be
meaningless for purposes of comparison. One teacher's score of 90
might be another's score of 80, and another's 96. As a system of
reliable measurement and accountability, portfolios are essentially
useless in practice.
After raising such serious doubts about the supposed merit of portfolios,
Sykes goes on to point out some of the hidden costs:
Although enthusiastically backed by educationists, the
alternative assessments are also remarkably burdensome and
time-consuming for teachers. Schools in Great Britain have
experimented with "alternative" and "authentic" assessments that
involve portfolios and other required "performances." Researchers
there found that virtually every teacher surveyed "reported that
major disruptions had occurred to normal classroom practice, and half
of those surveyed felt that the [alternative assessments] were
totally unmanageable." By one estimate it took 82 to 90 hours "to
plan for the assessments, mark them, and record the marks." Another
estimate put the time allotted to the alternative assessments at two
to five weeks out of the British schools' year. Such a huge
investment of time translates into huge costs for the educational
system. If similar assessments were employed in American schools,
some experts have estimated the costs could run into the billions of
In one of the most important and thoughtful books on school reform,
"The Schools We Need: And Why We Don't Have Them" by E. D. Hirsch,
Prof. Hirsch defines "portfolio assessment" this way:
In portfolio assessment, students preserve in a portfolio all or some of their
productions during the course of the semester or year.
At the end of the period, students are graded for the totality of their
production. It is a device that has long been used for the teaching of writing
Beyond that, Prof. Hirsch sees problems:
But there its utility ceases. It has proved to be virtually
useless for large-scale, high-stakes testing.
Dr. Elaine McEwan says (in her book
"Angry Parents, Failing Schools") that this practice has justifiably alarmed many parents:
"...The current practice in some schools of refusing to send work home
with students because it needs to stay in the school portfolio has many parents
alarmed because they want to review their child's progress."
Dr. McEwan also notes a study conducted in 1995 by a panel of nationally known experts
looking into the use of portfolios in one statewide system. She reports, "They
concluded that the assessments based on portfolios were 'inappropriate' and
'too flawed' for use in their system of accountability."
In her excellent book on teacher education,
"The Feel-Good Curriculum: The Dumbing-Down of America's Kids in the Name of Self-Esteem"
Dr. Maureen Stout has this to say (pages 145-6) about portfolio assessment:
Standards and evaluation are complex but very necessary aspects of
schooling, without which there is no way to gauge student progress
and achievement. But by introducing gradeless report cards, or
report cards with numbers instead of letters ..., or dispensing with
them altogether, we have no way to determine whether Johnny has
learned what he was supposed to.
Educators have invented portfolios as a way around this problem, but
although they can be useful, they do not, by themselves, provide a
complete picture of progress or achievement. Portfolios are a
collection of a student's best work in a term or a year and thus are
a good measure of progress over time. They provide no information
however, on how one student compares to her peers or whether or not
she has reached a desired standard, so they have limited value.
To make matters worse, rather than use old-fashioned words like
"excellent", "good", "fair" or "poor" (or, heaven forbid, A, B, C, or D)
portfolios are typically judged using words like "distinctive",
"appropriate" and various other euphemisms that make it virtually
impossible to understand whether Johnny can write or not. Because
the words are so vague, each of us will likely interpret them
differently. Consequently, it is virtually impossible to obtain
either a consensus or a clear understanding on just what Johnny's
capabilities and achievements are.
... None of this should come as a surprise. Portfolios were part of
an enterprising and well-orchestrated attempt by educators and
self-esteem advocates to avoid anything like accurate, objective
assessment (things like tests, actual grading, comparing students to
a clear standard) and they have succeeded beyond anyone's wildest
Tom Burkard, an educator in England, comments:
"I've always had the suspicion that the main purpose of student portfolios is
to provide proof that subjects have been 'covered'. Unfortunately, they
don't always reflect what they have actually learnt. In England, they are
usually a reflection of what parents have done. There's nothing like a
good, old-fashioned test for finding out what students have learnt. Beats
the hell out of project work any day."
More on Portfolio Assessment
Portfolios: A Backward Step in School Accountability
by Robert Holland, Lexington Institute, September 2007.
"Portfolios are collections of student work, such as essays, artwork,
and research papers. Progressive educators long have advocated that
portfolios be substituted for paper-and-pencil tests because they are
more 'natural' and 'authentic.'
In the 1990s, Vermont and Kentucky implemented portfolio assessment
as an integral part of education reform plans. Separate studies by
nationally respected researchers showed that as a school
accountability tool, portfolio assessment was a huge flop in both
states, yielding results that were wildly unreliable and very
expensive to obtain.
Among the problems found:
- A failure to yield reliable comparative data.
- Large differences in the way teachers implemented portfolios.
- Major differences in the degree of difficulty of assignments, rendering comparisons among
students or groups of students highly misleading."
Comments on Portfolios and Accountability
by Richard Innes, August 27, 2007.
"... In Kentucky the writing portfolio program actually hampers the teaching of
writing because any piece a child creates can be selected later as a
portfolio item. As a result, teachers are constrained on the types of
correction comments they can make on student papers. The constraints
severely impact the effective teaching of punctuation and grammar.
Spelling, of course, has not been a key requirement in Kentucky ever since
our fad-laden reform was enacted in 1990. ..."
by Jay Mathews, Education Next, Summer 2004. Excerpts:
"Most critics of portfolio assessment say they like the emphasis
on demonstrated writing and oral skill, but have seen too many
instances in which a refusal to give traditional tests of factual
recall leads to charmingly written essays with little concrete
information to support their arguments."
Portfolio Assessment in the Therapeutic State by Dr. Martin Kozloff, November 2004.
Here's the start of this pull-no-punches essay:
"You can hardly take a step in Edland without tripping over a
portfolio. Little kids in fourth grade ... are busily selecting, cutting, pasting, magic
markering, stapling, and binding 'artifacts' and 'evidences' of their
'authorship' of 'literacy materials' for 'authentic assessment' of
portfolios. And when they bring these foul creations home -- covered
with glitter and half-dried Elmer's Glue dripping off the
sides -- their parents Oo and Ah and assume their kids have learned
something. It takes a cynical and hard-hearted parent to look at his kid's
portfolio and say, 'How's this different from toting home all your
junk in a sack?'
Well, the portfolio biz is no longer limited to kids. Having made the
little ones illiterate (with whole language) and unable to make
change (with fuzziest math), the ed establishment in some districts
now requires graduating high schoolers to present their portfolios to
a board of portfolioticians for evaluation."
- For a biting satire of portfolio assessment, see
Scoring Rubrics for Portfolio Assessment by Dr. Martin Kozloff.
In High Schools, a 'B' Is New 'C':
Higher Grades Not Matched By Higher Test Scores,
by Eleanor Chute, Pittsburgh Post-Gazette, June 3, 2007.
"At high schools across the country, more and more students are
graduating with grade-point averages of A, including some whose
averages are well above the traditional 4.0 for an A.
Grades ... are getting so high that a solid B is
becoming the new C, which years ago was considered average.
Consider: ... Seniors at Pine-Richland High School who have a weighted grade point
average below 3.3 -- a B -- are in the bottom half of the class."
High schools inflate grades, and parents are fooled,
USA Today, August 30, 2001, page 12A. Excerpt:
"Among a blizzard of charts released Monday to explain an uptick in scores on
the nation's favorite college-entrance exam, the SAT, one stood out:
Since 1991, the percentage of [SAT] test takers with averages ranging from
A-minus to A-plus soared from 28% to 41%. Taken at face value, that seems
like good news. ... But scratch a little deeper ...
The verbal and math SAT scores for those A-average students are declining --
a signal of grade inflation. That suggests students are receiving good
grades for material they didn't learn."
Your Kid's A+ Doesn't Mean Much:
Students' GPAs are skyrocketing, but their knowledge is plummeting
by Michael Skube, Los Angeles Times, March 4, 2007.
"High school students' grade-point averages keep going up, up, up --
and what students actually know stays where it's always been. If
anything, students seem to know even less, GPAs notwithstanding. ...
high-school GPAs are all but meaningless. For too many students, the
luster is lost once they arrive at college and are expected to know
certain rudimentary things -- an acquaintance, for example, with the
geography of the world, the contours of U.S. history, the parts of
speech. There is, in other words, little correlation between the GPA
and what a graduating high school student knows."
The Gentleman's "A", Education Next, Spring 2004.
This summary of the article is from the National Council on Teacher Quality:
A new study out of Florida backs up teachers who are tough graders. ... Are
elementary students more likely to make real gains when assigned to a
teacher who makes it harder to get an A than to squeeze blood out of a
turnip? The study concludes that, overall, tough grading policies
benefit all students academically and high ability students more so
than low ability students. The impact, however, differed dramatically
depending on the ability of the student and the average ability of the
class as a whole. High ability students did best with teachers with
high grading standards when the overall class performance was low. Low
ability students, on the other hand, reacted better to tougher grading
standards when the overall class performance was high.
- Item in Economic Trends by Gene Koretz, Business Week On-Line,
Monday, August 27, 2001: "... [a study from the National Bureau of Economic Research]
... indicates that shifting to a class with high grading standards significantly
improves learning. And shifting from a tough teacher to an easy grader retards
learning by a similar amount. The results hold up regardless of students'
relative achievement levels and racial or economic backgrounds."
Inflating grades simply deflates education by William Bainbridge,
as printed in Columbus Dispatch [Dispatch.com], Saturday, July 21, 2001.
Tough teachers improve students, comment by Judith Kleinfeld, professor of psychology at the University of Alaska,
Anchorage Daily News, August 31, 2001
High School Grade Inflation From 1991 to 2003 (PDF) by David J. Woodruff and Robert L. Ziomek, ACT, Inc.
Also see the sections of this web site regarding:
Inside Higher Ed: A good source for recent articles about SAT issues, methods and controversies.
- The latest changes to the SAT involve a number of elements that raise concerns
about highly subjective assessments and constructivist approaches. One area that is
sure to generate controversy is the proposed new "essay" requirements
of the "Writing Section" of the new SAT. The College Board itself,
on its webpage on the
"Writing Section", says,
Students will be asked to write a short essay that requires them to
take a position on an issue and use examples to support their
The concern here is over what the College Board means by "an issue."
In a further description, the College Board gives as an example
a very benign "issue":
Think carefully about the issue presented in the following excerpt ...
...each failure leads us closer to deeper
knowledge, to greater creativity in understanding old data, to new
lines of inquiry.
Assignment: What is your view on the idea that it takes failure to
achieve success? Plan and write an essay in which you develop your
point of view on this issue. Support your position with reasoning and
examples taken from your reading, studies, experience, or
The concern is over what could happen if the "issue" chosen involved a more
controversial or "politically correct" topic. The College Board intends to send
photocopies of all such essays to all colleges getting the test results, raising serious
questions about how a student is to answer if different colleges may be looking
for certain "correct" viewpoints. Stay tuned!
SAT Essay Test Rewards Length and Ignores Errors
by Michael Winerip, New York Times, May 4, 2005.
"Dr. [Les] Perelman is one of the directors of undergraduate writing at
Massachusetts Institute of Technology. He did doctoral work on testing and
develops writing assessments for entering M.I.T. freshmen. He fears that the new 25-minute SAT
essay test that started in March -- and will be given for the second time on Saturday --
is actually teaching high school students terrible writing habits.
'It appeared to me that regardless of what a student wrote, the
longer the essay, the higher the score,' Dr. Perelman said. ...
'I have never found a quantifiable predictor in 25 years of grading
that was anywhere near as strong as this one,' he said. 'If you just
graded them based on length without ever reading them, you'd be right
over 90 percent of the time.' The shortest essays, typically 100
words, got the lowest grade of one. The longest, about 400 words, got
the top grade of six. In between, there was virtually a direct match
between length and grade.
He was also struck by all the factual errors in even the top essays.
An essay on the Civil War, given a perfect six, describes the nation
being changed forever by the 'firing of two shots at Fort Sumter in
late 1862.' (Actually, it was in early 1861, and, according to
'Battle Cry of Freedom' by James M. McPherson, it was '33 hours of
bombardment by 4,000 shot and shells.')
Dr. Perelman contacted the College Board and was surprised to learn
that on the new SAT essay, students are not penalized for incorrect
facts. The official guide for scorers explains: 'Writers may make
errors in facts or information that do not affect the quality of
their essays. For example, a writer may state 'The American
Revolution began in 1842' or 'Anna Karenina,' a play by the French
author Joseph Conrad, was a very upbeat literary work.' (Actually,
that's 1775; a novel by the Russian Leo Tolstoy; and poor Anna hurls
herself under a train.) No matter. 'You are scoring the writing, and
not the correctness of facts.'"
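As a rough illustration of Dr. Perelman's point, here is a minimal Python sketch of a "grader" that never reads the essay and scores it by word count alone. The only figures taken from the article are the two endpoints (about 100 words for a grade of one, about 400 words for a grade of six). The straight-line interpolation between them, the function name, and the rounding rule are assumptions made for illustration; this is not the College Board's scoring method, nor Dr. Perelman's actual analysis.

```python
def length_only_score(word_count: int) -> int:
    """Guess an SAT essay grade (1 to 6) from word count alone.

    Hypothetical illustration: a straight line between the two endpoints
    quoted in the article (about 100 words -> 1, about 400 words -> 6).
    """
    low_words, low_score = 100, 1
    high_words, high_score = 400, 6

    # Clamp anything outside the range quoted in the article.
    if word_count <= low_words:
        return low_score
    if word_count >= high_words:
        return high_score

    # Assumed linear interpolation between the two quoted endpoints.
    fraction = (word_count - low_words) / (high_words - low_words)
    return round(low_score + fraction * (high_score - low_score))


if __name__ == "__main__":
    for words in (80, 100, 200, 300, 400, 520):
        print(words, "words ->", length_only_score(words))
```

If Dr. Perelman's observation holds, a rule this crude would match the human graders most of the time, which is exactly what makes it troubling.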
- No SAT required for admission:
Dumbing Down America's Colleges
by Alan Caruba, July 7, 2008.
"[Now] we have the specter of university and college presidents
eliminating one of the most respected tools for measuring a
prospective student's ability to qualify for admission. The venerable
SAT, the gold standard for measuring readiness for college for nearly
80 years, is slowly being eviscerated by colleges and universities.
... Disparaging the SATs for helping set high academic standards
ignores the fact that more than two million students take the SAT
every year and that more than 88 percent of America's colleges
require it for admission.
Those that don't require the SAT for admission often use it for
course placement and scholarship consideration. The overwhelming
majority of colleges use the SAT because it has acquired a
well-deserved reputation for its ability to aid the evaluation
The best way to prepare for college and the SAT is to work hard in
high school and take a well rounded curriculum. Cheating qualified
students who have taken the time and effort to prepare for this by
devaluing and eliminating the SAT is just wrong."
Predictions of Freshman Grade-Point Average From the Revised and
Recentered SAT I: Reasoning Test (PDF doc) by
Brent Bridgeman, Laura McCamley-Jenkins,
and Nancy Ervin, College Entrance Examination Board, New York, 2000.
This technical report examines how well the SAT predicts the freshman
grades of students entering college. The answer, in short, is the SAT does
an excellent job of predicting college success. One disturbing finding is
that African-American and "Hispanic/Latino" men actually performed less
well in their freshman college grades than the SAT predicted.
- A look at historical SAT scores shows an ominous trend: "Between
1963 and 1994, there was a nationwide drop in verbal and math scores on the SAT exam.
On the math test, the score declined from a postwar high of 502 in 1963 to a low
of 466 in 1980 ... On the verbal test, the score declined from 478 in 1963 to
424 in 1980 ... critics attributed the drop to weaker curricula" (from Martin Rochester,
Class Warfare, page 57).
- But the College Board found a quick fix in 1995:
"Recentering" the SAT made the dropping scores look better: For example,
your old SAT verbal score of 590 is equivalent to a score today of 660.
The College Board
thoughtfully provides tables to compare "old" SAT scores with the new
dumbed-down, oh, excuse me, I meant "recentered" SAT scores.
Use this table
to convert old and new scores for individuals,
and this table
to convert scores for groups.
- "Also starting in 1995, SAT test-takers were given 30 more minutes to answer fewer questions.
They were permitted to use calculators on the math section. By 2002, the notoriously
difficult analogies section of the verbal test had been eliminated. This is in
addition to special accommodations for 'disabilities' sufferers."
(Martin Rochester, Class Warfare, page 58)
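To put the numbers quoted above side by side, here is a small Python sketch that works out the size of the 1963-to-1980 score declines and the number of points that recentering added in the one example given. Every figure comes from the items above (the scores quoted from Class Warfare and the 590-to-660 verbal example). The variable and function names are ours, and the sketch does not assume that the same offset applies at other score levels; for those, the College Board's conversion tables are the authority.

```python
# Scores quoted above from Martin Rochester's "Class Warfare":
# (1963 score, 1980 score) for each section of the SAT.
declines = {
    "math": (502, 466),
    "verbal": (478, 424),
}


def point_drop(before: int, after: int) -> int:
    """Return how many points a score fell between two years."""
    return before - after


for section, (score_1963, score_1980) in declines.items():
    drop = point_drop(score_1963, score_1980)
    print(f"{section}: {score_1963} -> {score_1980}, a drop of {drop} points")

# The single old/new verbal pair given above as an example of recentering.
old_verbal, recentered_verbal = 590, 660
added = recentered_verbal - old_verbal
print(f"recentering: old verbal {old_verbal} = new verbal {recentered_verbal} (+{added} points)")
```

In this one example, recentering added more points to the verbal score than the entire 1963-to-1980 verbal decline had taken away.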
SAT Scores Up, Compared to What?
by Diane Ravitch, September 4, 2003.
"The famous SAT score decline began in 1964; scores hit bottom about
1980 and have slowly begun to come up since then, at least in math.
Tracking the SAT score trends became much harder after 1994, the year
the College Board decided to 'recenter' the scores. For reasons that
I have trouble remembering, the College Board decided to declare that
the 1994 average scores in both verbal and math were 500. This was an
immense boost for verbal scores, which had languished around 430 for
a whole generation. So, voila, 'average' scores were re-pegged at
whatever they were in 1994."
The Effects of SAT Scale Recentering on Percentiles (PDF),
College Board, Research Summary RS-05, May 1999.
Look here for hard data on the effects of recentering.
The table at the end is particularly interesting, because it shows
how many points (as compared to the pre-1995 tests) are being added
to various types of scores.
The Recentering of SAT Scales and Its Effects on
Score Distributions and Score Interpretations (PDF),
College Board, Research Report 2002-11.
This detailed analysis documents the long-term drop
in SAT scores throughout the latter half of the 20th century (leading
to the recognized "need" for recentering).
Of particular interest is section II, "Brief History of SAT Score
Scales," which discusses the 20th century SAT score declines in
- And then there's this, from the parody newspaper, The Onion,
for March 8, 2001:
"Guidance Counselor Prefaces SAT Results By Talking About Test's Flaws:
Mahwah, NJ -- In a preamble that boded poorly for the academic future
of Mahwah High School senior Kevin Stember, guidance counselor Elvin Cross
prefaced Stember's SAT scores by downplaying the test's reliability
and worth Monday. 'You know, the SAT is a flawed, inexact measure of
one's abilities,' a grim-faced Cross told Stember. 'It measures what
you know rather than what you're capable of doing.' Cross added that
the SAT fails to judge many essential real-life skills, like punching
in on time and maintaining a clean uniform."
"The SAT ... is biased.
It's biased against people who aren't well-educated."
- SAT tutor David S. Kahn, writing in the Wall Street Journal,
May 26, 2006:
"People complain that the SAT is biased and that the bias
explains why students don't do well. That's true -- it is biased.
It's biased against people who aren't well-educated. The test isn't
causing people to have bad education, it's merely reflecting the
reality. And if you don't like your reflection that doesn't mean that
you should smash the mirror."
Finding the Pony in the ACT: Just about every state can find some good news somewhere in the ACT results,
if they look hard enough!
Click here for the Illinois Loop's analysis of the 2006 ACT results
across the nation and in Illinois.
- Changes in ACT scores may reflect who is taking the test:
Jump in ACT Scores by Scott Jaschik, Inside Higher Ed, August 16, 2006.
"One of the changes noted by ACT officials is a substantial increase
in the number of people taking the ACT outside of the Midwest. Almost
all colleges accept both the ACT and the SAT, although in areas like
New England, colleges are far more likely to receive SAT scores, even
when the ACT is an option. In recent years, more students applying
from New England or applying to New England colleges have started
taking the ACT (frequently still taking the SAT) to go for the best
possible score to submit. ...
For whatever reason, more students are taking the ACT in parts of the
country where that was once unheard of. In New England, 19,721
students took the ACT this year, up 13 percent from last year's total,
which was also 13 percent higher than the previous year. Other states
where the numbers of ACT test-takers were up significantly this year
included Florida (+14%), New Jersey (+33%), and Oregon (+12%)."
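The New England figures in this excerpt can be unwound with a little arithmetic. The Python sketch below starts from the two numbers in the quote (19,721 test-takers this year, and growth of 13 percent in each of the last two years) and backs out the implied earlier totals. Because the 13 percent figures are presumably rounded, the earlier totals are estimates only, and the variable names are ours.

```python
this_year = 19_721   # New England ACT test-takers, from the quote
growth = 0.13        # "up 13 percent", reported for two years running

# Work backward: each earlier total is the later total divided by 1.13.
last_year = this_year / (1 + growth)
two_years_ago = last_year / (1 + growth)

# Two consecutive 13 percent increases compound to more than 26 percent.
two_year_growth = (1 + growth) ** 2 - 1

print(f"implied total last year: about {last_year:,.0f}")
print(f"implied total two years ago: about {two_years_ago:,.0f}")
print(f"growth over the two years: about {two_year_growth:.1%}")
```

In other words, the number of New England students taking the ACT grew by more than a quarter in just two years, which supports the article's caution that score changes may reflect who is taking the test.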
- Here are descriptions of the subject-area components of the ACT,
taken directly from the ACT website:
- The English Test:
The English Test focuses on the student's
understanding of the conventions of standard written English
(punctuation, usage, and sentence structure) and of rhetorical skills
(writing strategy, organization, and style). Spelling, vocabulary,
and rote recall of rules of grammar are not tested. Students are
asked to make a variety of decisions about revising or editing short
essays, which are written especially for the test in order to reflect
the interests and experiences of students.
- The Reading Test:
The Reading Test measures the student's reading
comprehension as a product of referring and reasoning skills. The
test questions require students to derive meaning from texts by
(1) referring to what is explicitly stated and (2) reasoning to
determine implicit meanings, draw conclusions, and make comparisons
and generalizations. The tests consist of copyrighted passages drawn
from appropriate published sources and cover the areas of prose
fiction, the humanities, the social sciences, and (in the ACT
Assessment only) the natural sciences.
- The Mathematics Test:
Emphasizing quantitative reasoning and the
ability to solve practical quantitative problems, the Mathematics
Test focuses on reasoning rather than memorization of formulas and
computational skills. The ACT also states:
"You may use a calculator on the Mathematics Test."
- The Science Test:
The Science Test measures interpretation,
analysis, and evaluation skills required in the science content areas
of biology, chemistry, physics, and Earth/space science.
A different area of the ACT website
advises, "The questions require you to:
recognize and understand the basic features of, and concepts related to, the provided information
examine critically the relationship between the information provided and the conclusions drawn or hypotheses developed
generalize from given information and draw conclusions, gain new information, or make predictions."
Note that both of these ACT sources suggest that knowledge of science content is NOT evaluated in this test!
- In 1989, the ACT changed its tabulation procedure. In comparing
pre-1989 with post-1989 scores, it is necessary and crucial
to take this into account. We are advised that appropriate adjustments
can be made using conversion recommendations from the ACT.
The ACT website
says, "The former ACT Assessment was revised
in the late 1980s, and the Enhanced ACT Assessment was first
administered in October 1989. This new version is currently in use,
but the word 'enhanced' is no longer included in its title."
ACT's "PLAN" Test
The ACT administers a test called "PLAN" which is taken by high school sophomores.
(The official PLAN website is here.)
The intended goals are to prepare for and to predict a student's most likely performance on the ACT test down the road,
and to provide information on strengths and weaknesses so as to plan coursework during the remainder
of high school. As such, it is sort of an ACT version of the PSAT, which is a precursor to the SAT test.
In addition, the PLAN test also attempts to gauge a student's interest and suitability for various careers,
displaying results in what it calls a "world of work" chart.
Whether PLAN achieves any of these goals is another matter. We have been unable to determine the
means used in PLAN to predict likely future ACT scores, and it's a little unsettling that this topic isn't given
more substance in official ACT documents such as the
PLAN_Program_Handbook. Similarly, it's hard not to be
skeptical about a career assessment that claims to distill its advice down to a simplistic two-dimensional
chart of job titles.
If you can help, we would welcome your insight
into the technical aspects and validation of the PLAN test,
and the degree of its value to high school students.
More About Testing
- Kimberly Swygert, a Ph.D. and psychometrician,
runs an interesting "blog" of commentary on testing
and assessment, and other education issues, named
Number 2 Pencil
Supreme Court Upholds School Practice Of Having One Student Grade Another's Work, February 19, 2002.
"The Supreme Court upheld [in a 9-0 decision] the common schoolroom practice of having
one student grade another's work, ruling today that such paper-swapping does not violate federal privacy law.
... Teachers nationwide commonly tell students to swap homework,
quizzes or other schoolwork and then correct one another's work as
the teacher goes over it aloud. Sometimes the teacher then has
students call out the results, and the teacher records them.
'Correcting a classmate's work can be as much a part of the
assignment as taking the test itself,' Justice Anthony M. Kennedy
wrote for himself and seven colleagues. ...
'It is a way to teach material again in a new context, and it helps
show students how to assist and respect fellow pupils,' wrote
Kennedy, a former law professor who still teaches several classes a
From our extensive listing of recommended books on education,
here are our selections on testing and assessment:
"Defending Standardized Testing"
by Richard P. Phelps (Editor)
Editorial reviews cited by Amazon:
Howard Wainer, Journal of Educational Measurement:
"Very much worth buying and reading."
"Easy to read ...provides a balanced approach ...help[s] the
reader understand current debates within the community of testing experts"
"Kill the Messenger: The War on Standardized Testing"
by Richard P. Phelps with a foreword by Herbert J. Walberg
This book, released in May 2003, has generated a tremendous positive response.
The author, Dr. Richard Phelps, provides details of the book on his website,
Kill the Messenger.
"Testing Student Learning, Evaluating Teaching Effectiveness" by Williamson M. Evers, Herbert J. Walberg
More than ever, parents want to know how their children are achieving
and how their children's school ranks compared to others. This book
shows how defective tests and standards and a lack of accountability
cause American students to fall behind those of other
countries -- despite our schools' receiving nearly the world's highest
levels of per-student spending. The book takes on common objections
to testing and reveals why they are false.
The book also presents several specific constructive uses for tests,
including diagnosing children's learning difficulties and procedures
for solving them, measuring the impact of curriculum on specific
aspects of achievement, and assessing teachers' strengths and