You have 121 highlighted passages
You have 52 notes
Last annotated on February 24, 2017
Chapter 1 If Only It Were So Simple
She assumed that because of what I do for a living, I ought to know this.
the strength of the school’s music or athletic programs, some special curricular emphasis, school size, social heterogeneity, and so on.
She was not pleased. She clearly wanted an answer that was uncomplicated
They wanted something simpler: the names of the schools with the highest test scores,
“If all you want is high average test scores, tell your realtor that you want to buy into the highest-income neighborhood you can manage. That will buy you the highest average score you can afford.”
that scores on a single test tell us all we need to know about student
A third common misconception is that testing is simple and straightforward.
“A reading comprehension test is a reading comprehension test. And a math test in the fourth grade—there’s not many ways you can foul up a test … It’s pretty easy to ‘norm’ the results.”
this claim was entirely wrong: it is all too easy to foul up the design of a test,
For many years, Parade magazine has featured a regular column by Marilyn vos Savant, who is declared by the magazine to have the highest IQ in the country. Rather than simply saying that Ms. vos Savant is one damned smart person, if indeed she is, the editors use the everyday vocabulary of “IQ”—just
it is just another way of saying that she is smart. But it does seem to give the assertion more weight, a patina of scientific credibility.
So what are some of the complications that make testing and the interpretation of scores so much less straightforward
test scores usually do not provide a direct and complete measure of educational achievement.
these tests can measure only a subset of the goals of education.
tests are generally very small samples of behavior that we use to make estimates of students’
different tests often provide somewhat inconsistent results.
For example, for more than three decades the federal government has funded a large-scale assessment of students nationwide called the National Assessment of Educational Progress, often simply labeled NAEP (pronounced “nape”), which is widely considered the best single barometer of the achievement of the nation’s youth. There are actually two NAEP assessments, one (the main NAEP) designed for detailed reporting in any given year, and a second designed to provide the most consistent estimates of long-term trends. Both show that mathematics achievement has been improving in both grade four and grade eight—particularly in the fourth grade, where the increase has been among the most rapid nationwide changes in performance, up or down, ever recorded. But the upward trend in the main NAEP has been markedly faster than the improvement in the long-term-trend NAEP. Why? Because the tests measure mathematics somewhat differently,
When scores have serious consequences, scores on the test that matters often go up far faster than scores on other tests.
The experience in Texas during George Bush’s tenure as governor provides a good illustration. At that time, the state used the Texas Assessment of Academic Skills (TAAS) to evaluate schools, and high-school students were required to pass this test in order to receive a diploma. Texas students showed dramatically more progress on the TAAS than they did on the National Assessment of Educational Progress.
Even a single test can provide varying results.
Students who take more than one form of a test typically obtain different scores.
These arise partly because the test forms, while designed to be equivalent, have different content,
Fluctuations also occur because students have good and bad days:
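One way to picture why a single test yields varying results is to treat each observed score as a true score plus a form effect plus a day-to-day effect. The sketch below is illustrative only: the additive model, the function name `simulate_observed_scores`, the score of 500, and both standard deviations are assumptions made for this example, not figures from the book.

```python
import random

def simulate_observed_scores(true_score, n_sittings, form_sd=3.0, day_sd=4.0, seed=0):
    """Return hypothetical observed scores for one student over several sittings.

    Each sitting uses a different test form (form effect) on a different
    day (good-day / bad-day effect). Both effects are assumed to be
    independent, zero-mean, normally distributed noise.
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(n_sittings):
        form_effect = rng.gauss(0, form_sd)  # forms differ slightly in content
        day_effect = rng.gauss(0, day_sd)    # students have good and bad days
        scores.append(true_score + form_effect + day_effect)
    return scores

# One student with the same underlying achievement, tested five times.
scores = simulate_observed_scores(true_score=500, n_sittings=5)
print([round(s) for s in scores])
print("spread between best and worst sitting:", round(max(scores) - min(scores), 1))
```

Under these assumptions the student's achievement never changes, yet no two sittings produce the same score; only the average over many sittings settles near the true score.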
Then there is the problem of figuring out how to report performance on a test.
Most of us grew up in a school system with some simple but arbitrary rules for grading tests,
We know that to obtain a grade of “A” can require much more in one class than in another.
Psychometricians therefore have had to create scales for reporting performance on tests.
Further, sometimes a test does not function as it should. A test may be biased,
For example, a mathematics test that requires reading complex text and writing long answers may be biased against immigrant students who are competent in mathematics but have not yet achieved fluency in English.
bias must be distinguished from simple differences in performance
For instance, if poor students in a given city attend inferior schools, a completely unbiased test is likely to give them lower scores because the inferior teaching they received impeded their learning.
For example, the assessment designs that are best for providing descriptive information about the performance of groups (such as schools, districts, states, or even entire nations) are not suitable for systems in which the performance of individual students must be compared. Adding large, complex, demanding tasks to an assessment may extend the range of skills you can assess, but at the cost of making information about individual students less trustworthy.
validity, reliability, bias, scaling, and standard setting,
Chapter 2 What Is a Test?
Note: A test is useful if the skills it measures represent well the skills involved in carrying out a task.
On September 10, 2004, a Zogby International poll of 1,018 likely voters showed George W. Bush with a 4-percentage-point lead over John Kerry in the presidential election campaign. These results were a reasonably good prediction: Bush’s margin when he won two months later was about 2.5 percent.
provide a handy way to explain the workings of achievement tests.
why should we care about these 1,018 people? Because together they represent the 121 million
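A rough sense of how well 1,018 respondents can stand in for a very large population comes from the textbook margin of error for a sample proportion. The sketch below assumes simple random sampling and a 95 percent confidence level (z = 1.96); the poll's actual sampling design and weighting are not described in these highlights, so this is an approximation, and `margin_of_error` is a name chosen for this example.

```python
import math

def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate margin of error for one candidate's share of the vote.

    Uses the standard formula z * sqrt(p * (1 - p) / n), which assumes a
    simple random sample. proportion=0.5 gives the widest (most cautious)
    margin.
    """
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)

moe = margin_of_error(1018)
print(f"approximate margin of error: +/- {moe * 100:.1f} percentage points")
```

Under these assumptions the margin is about three percentage points for each candidate's share. The size of the population does not appear in the formula at all, which is why a sample of about a thousand can represent many millions of voters.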
Accuracy also depends on the way in which survey questions are worded;
changes in the wording of questions can have substantial effects on respondents’ answers.
Original question: “What is the average number of days each week you have butter?” Revised question: “The next question is just about butter. Not including margarine, what is the average number of days each week you have butter?”
Finally, accuracy depends on the ability or willingness of respondents
when students are asked about parental income, for example. They may refuse
“social desirability bias”: a tendency for some respondents to provide socially acceptable
For example, a study published in 1950 documented substantial overreporting of several different types of socially desirable behavior. Thirty-four percent of respondents reported that they had contributed to a specific local charity when they had not, and 13 to 28 percent of respondents claimed to have voted in various elections in which they had not.
Educational achievement tests are in many ways analogous to this Zogby poll in that they are a proxy for a better and more comprehensive measure that we cannot obtain.
The full range of skills or knowledge about which the test provides an estimate—analogous to the votes of the entire population of voters in the Zogby survey—is generally called the domain by those in the trade.
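The poll analogy can be made concrete by treating the domain as a long list of skills and a test as a small sample drawn from it. The sketch below is a deliberate simplification: the 1,000-skill domain, the 70 percent mastery rate, the 40-item test length, and the function name `estimate_mastery` are all hypothetical, and real tests are built by deliberate content sampling, not by purely random draws.

```python
import random

def estimate_mastery(domain, test_length, seed=0):
    """Estimate the share of a domain a student has mastered from sampled items.

    `domain` is a list of booleans, one per skill in the domain (True means
    mastered). The test is modeled as a random sample of `test_length` skills.
    """
    rng = random.Random(seed)
    sampled_items = rng.sample(domain, test_length)
    return sum(sampled_items) / test_length

# Hypothetical domain of 1,000 skills, 70 percent of them mastered.
domain = [i % 10 < 7 for i in range(1000)]
true_mastery = sum(domain) / len(domain)

short_test = estimate_mastery(domain, test_length=40)
print("true share of domain mastered:", true_mastery)
print("estimate from a 40-item test: ", short_test)
```

The 40-item estimate is only a proxy for the full-domain figure, just as the poll is a proxy for the election. Only a "test" that covers every skill in the domain would be guaranteed to reproduce the true share.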
Chapter 3 What We Measure: Just How Good Is the Sample?
there are some aspects of the goals of education that achievement tests are unable to measure.”
Tests measure what is important, their argument goes, and those who focus on other “goals” are softies.
obscure paper published more than half a century ago by E. F. Lindquist
he was remarkably prescient in anticipating controversies that engulfed the world of educational
The evidence shows unambiguously that standardized tests can measure a great deal that is of value,
ITBS manual advises school administrators explicitly to treat test scores as specialized information that is a supplement to, not a replacement for, other information about students’ performance. And for the same reason, it
Second, Lindquist argued that even many of the goals of schooling that are amenable to standardized testing can be assessed only in a less direct fashion than we would like.
focus of daily attention for teachers and students are just proxies
to teach students how to reason algebraically so that they can apply this reasoning to the vast array of circumstances outside of school to which it is relevant. This sort of very general goal, however, is remote from decisions about the algebra content to be taught in a given middle school this Thursday morning.
curriculum designers and teachers must make a large number of specific decisions about what algebra to teach. For example, do students learn to factor quadratic equations? Many considerations shape these decisions, not just a subject’s possible utility in a wide range of work-related and other contexts years later.
difference between learning content specified in a curriculum and later application of that knowledge.
Many years ago, I had Sunday brunch in Manhattan with three New Yorkers. All were highly educated, and all had taken at least one or two semesters of mathematics beyond high school. In my experience, New York natives make their way about town in part by drawing on a prodigious knowledge of the location of various landmarks, such as the original Barnes and Noble store on Fifth Avenue. That Sunday morning, I found to my surprise that none of the three New Yorkers could figure out the location of the restaurant where we were to have brunch. It was on one of the main avenues, and they knew the address, but they could not figure out the cross street. I suggested that the problem might turn out to be a very simple one. I asked if they knew where the addresses on the avenues in that part of Manhattan reached zero and, if so, whether they reached zero at the same street. They quickly agreed that they did and gave me the name of the cross street. I then asked if the addresses increased at the same rate on these avenues, and if so, at what rate. That is, how many numbers did the addresses increase with each cross street? They were quite certain that the rate was the same, but it took a little more work to figure out what it was. Using a few landmarks they knew (including the original Barnes and Noble store), they figured out the rate for a couple of avenues. The rates were the same. At that point, they had the answer, although they had not yet realized it.
All three were competent in dealing with algebra much more complex than this, but they had not developed the habit of thinking of real-world problems in terms of the mathematics they had learned in the classroom.
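The brunch problem reduces to a linear relationship: if addresses start at zero at a known cross street and rise by a constant amount per block, one landmark is enough to find the rate, and the rate gives the cross street for any address. The sketch below uses made-up numbers throughout; the zero street, the landmark, the restaurant address, and both function names are hypothetical and do not describe Manhattan's real numbering.

```python
def address_numbers_per_block(landmark_address, landmark_street, zero_street):
    """Rate at which addresses rise per cross street, found from one known landmark."""
    return landmark_address / (landmark_street - zero_street)

def cross_street(address, rate, zero_street):
    """Invert address = rate * (street - zero_street) to find the cross street."""
    return zero_street + address / rate

# Hypothetical numbers: addresses reach zero at 8th Street, and a known
# landmark at number 200 sits at 18th Street.
zero_street = 8
rate = address_numbers_per_block(landmark_address=200, landmark_street=18,
                                 zero_street=zero_street)

# A restaurant at number 640 on the same avenue.
print("address numbers per block:", rate)
print("cross street:", cross_street(address=640, rate=rate, zero_street=zero_street))
```

With these made-up values the rate is 20 address numbers per block, so number 640 lies 32 blocks past the zero point, at 40th Street. The algebra is far simpler than what the three diners had studied, which is the point of the anecdote.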
in the ideal world we would assess achievement by measuring the ultimate goals
a test author usually has to focus on the proximate goals of educators, even if these are only proxies for the ultimate social goals of education.
we have to put all test-takers in the same environment
Lindquist wanted as much as practical to isolate specific knowledge
attempting to create test items that present complex, “authentic” tasks more similar to those students might encounter out of school.
they conduct a “holistic” review of applicants, considering not only SAT or ACT scores but also grades, personal statements, persistence in extracurricular activities, and so on.
in much of the testing that now dominates K–12 education, Lindquist’s advice that test scores must be seen as incomplete measures is widely ignored.
this good news is often more apparent than real