
Saturday, February 25, 2017

The best school in town

“Excuse me, Dr. Koretz, could you please tell me which is the best school in town to enroll my son in?”
This is the question Daniel Koretz hears every day.
Since he evaluates schools through standardized tests for a living (his book “Measuring Up” is a Bible on the subject), that is hardly surprising.
But his answer almost always disappoints.
He usually suggests weighing
… the strength of the school’s music or athletic programs, some special curricular emphasis, school size, social heterogeneity, and so on…
Then he recommends visiting the schools in person to judge whether they look like promising places.
Observe and describe, in other words. Hard work.
The parent who approached Koretz takes leave of him quickly and coldly; the dissatisfaction is palpable. From a test designer, the parent wants something less complicated. Something less ambiguous. For example, the school that does best on the tests…
… They wanted something simpler: the names of the schools with the highest test scores…
There is a standard answer for these pests…
… “If all you want is high average test scores, tell your realtor that you want to buy into the highest-income neighborhood you can manage. That will buy you the highest average score you can afford.”…
Follow the money: the more you pay, the better the test scores. Go to the neighborhoods with the highest average income and there you will find the schools that do best on tests.
The irritation stems from a misunderstanding: some people believe that a test result tells us everything essential about a student or a school.
Another misplaced belief is that designing and administering a test is simple: no sooner said than done.
President Bush's words when presenting the “No Child Left Behind” program betray this belief…
… “A reading comprehension test is a reading comprehension test. And a math test in the fourth grade—there’s not many ways you can foul up a test … It’s pretty easy to ‘norm’ the results.”…
Wrong: nothing is easier than “fouling up” a test and making it useless, and that is in the lucky case where the test is not already flawed to begin with.
Tests look simple but are extremely hard to design and administer. Doing it well on a mass scale is close to impossible.
These days people talk about school tests even at the bar
… For many years, Parade magazine has featured a regular column by Marilyn vos Savant, who is declared by the magazine to have the highest IQ in the country. Rather than simply saying that Ms. vos Savant is one damned smart person, if indeed she is, the editors use the everyday vocabulary of “IQ”…
But how many bar patrons know what IQ is and how it is tested? Doubtful; the concept is far from obvious.
Another myth: believing that tests are extremely powerful indicators
… it is just another way of saying that she is smart. But it does seem to give the assertion more weight, a patina of scientific credibility…
It would be far more appropriate to say that so-and-so is a smart person (as our grandparents did) than to refer to their IQ.
***
What makes things so damnably complicated?
First of all, the fact that there are so many tests, practically an infinite number.
No single test gives us a complete picture of the work a school does. Not even all the tests put together manage that.
First, because they cover only a subset of educational goals. Second, because they do not measure anything directly: they are estimates based on samples.
A school test is like a poll. You look at a few things to form an idea of the whole.
***
One problem with tests is their frequent lack of validity: it shows up when two tests that are in theory equivalent give different results. An example:
… For example, for more than three decades the federal government has funded a large-scale assessment of students nationwide called the National Assessment of Educational Progress, often simply labeled NAEP (pronounced “nape”), which is widely considered the best single barometer of the achievement of the nation’s youth. There are actually two NAEP assessments, one (the main NAEP) designed for detailed reporting in any given year, and a second designed to provide the most consistent estimates of long-term trends. Both show that mathematics achievement has been improving in both grade four and grade eight—particularly in the fourth grade, where the increase has been among the most rapid nationwide changes in performance, up or down, ever recorded. But the upward trend in the main NAEP has been markedly faster than the improvement in the long-term-trend NAEP. Why? Because the tests measure mathematics somewhat differently,…
Invalsi, PISA, TIMSS… the rankings on this and that keep changing.
They also change over time. When a test has substantial consequences (careers, salaries…), the improvements are, surprise surprise, spectacular. The Texas example…
… The experience in Texas during George Bush’s tenure as governor provides a good illustration. At that time, the state used the Texas Assessment of Academic Skills (TAAS) to evaluate schools, and high-school students were required to pass this test in order to receive a diploma. Texas students showed dramatically more progress on the TAAS than they did on the National Assessment of Educational Progress…
But these improvements are hardly reassuring: they are usually the product of “teaching to the test”.
***
Then there is a reliability problem: students who take the same test twice and get different results.
The SAT, for example, can be taken more than once. But that is not always possible, especially when the number of students is large.
Many tests designed to be equivalent have different content (obviously, you cannot hand out the exact same test twice), but content is never neutral.
Part of the fluctuation is due to the student's form on the day. Maybe the test-taker is nervous or slept badly.
It makes no sense to give much weight to small differences.
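To put a rough number on "small differences", here is a minimal sketch based on the standard error of measurement from classical test theory (SEM = SD × sqrt(1 − reliability)). Koretz's passage does not give this formula; it is brought in here as an assumption, and the score standard deviation, the reliability value and the scores themselves are hypothetical.

```python
import math


def standard_error_of_measurement(score_sd: float, reliability: float) -> float:
    """Classical test theory: SEM = SD * sqrt(1 - reliability)."""
    return score_sd * math.sqrt(1.0 - reliability)


def is_meaningful_difference(score_a: float, score_b: float,
                             score_sd: float, reliability: float,
                             z: float = 1.96) -> bool:
    """Return True if two scores differ by more than measurement noise alone
    would plausibly produce (about 95% confidence when z = 1.96)."""
    sem = standard_error_of_measurement(score_sd, reliability)
    # The difference of two independent scores carries both errors.
    se_difference = sem * math.sqrt(2.0)
    return abs(score_a - score_b) > z * se_difference


# Hypothetical values, chosen only for illustration.
SCORE_SD = 100.0
RELIABILITY = 0.91

sem = standard_error_of_measurement(SCORE_SD, RELIABILITY)
small_gap = is_meaningful_difference(520, 540, SCORE_SD, RELIABILITY)
large_gap = is_meaningful_difference(520, 640, SCORE_SD, RELIABILITY)

print(f"SEM is about {sem:.0f} points")
print(f"20-point gap meaningful? {small_gap}")
print(f"120-point gap meaningful? {large_gap}")
```

With these made-up numbers, a 20-point gap is well within the noise, while a 120-point gap is not.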
***
Then there are problems of scale: how should results be reported?
We are used to grades: an arbitrary scale that makes comparisons impossible…
… We know that to obtain a grade of “A” can require much more in one class than in another…
But these limits are not easy to overcome: different scales give different pictures of performance, and that still limits comparisons.
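A small sketch of how one and the same raw score looks on three common reporting scales. The class scores, the maximum score and the choice of scales are all hypothetical; none of them comes from the book.

```python
from statistics import mean, pstdev


def percent_correct(raw: int, max_raw: int) -> float:
    """Share of available points earned, as a percentage."""
    return 100 * raw / max_raw


def z_score(raw: int, all_raw: list[int]) -> float:
    """Distance from the group mean, in standard deviations."""
    return (raw - mean(all_raw)) / pstdev(all_raw)


def percentile_rank(raw: int, all_raw: list[int]) -> float:
    """Percentage of the group scoring strictly below this score."""
    below = sum(1 for score in all_raw if score < raw)
    return 100 * below / len(all_raw)


# Hypothetical class of five students on a 30-point test.
CLASS_RAW_SCORES = [12, 15, 18, 21, 24]
MAX_RAW = 30
student_raw = 21

as_percent = percent_correct(student_raw, MAX_RAW)
as_z = z_score(student_raw, CLASS_RAW_SCORES)
as_percentile = percentile_rank(student_raw, CLASS_RAW_SCORES)

print(f"percent correct: {as_percent:.0f}")
print(f"z-score: {as_z:.2f}")
print(f"percentile rank: {as_percentile:.0f}")
```

The same performance comes out as 70 percent correct, about 0.7 standard deviations above the mean, or the 60th percentile, and each number invites a different reading.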
***
Then there is the problem of flawed (or biased) tests: tests that do not work as they should.
An example of a test biased against immigrants
… For example, a mathematics test that requires reading complex text and writing long answers may be biased against immigrant students who are competent in mathematics but have not yet achieved fluency in English…
This raises problems: a perfectly neutral test can still look biased against the poor. What then? It is an awkward situation…
… For instance, if poor students in a given city attend inferior schools, a completely unbiased test is likely to give them lower scores because the inferior teaching they received impeded their learning…
And what about tests biased against women? Here we enter philosophical territory. The fact is that a test discriminates: we give it precisely in order to discriminate!
***
Then there is a problem of setting: a test must be targeted to its purpose, which is usually narrower than people think.
For example, do I want to evaluate the school or the students? Different tests are needed depending on the goal…
… For example, the assessment designs that are best for providing descriptive information about the performance of groups (such as schools, districts, states, or even entire nations) are not suitable for systems in which the performance of individual students must be compared. Adding large, complex, demanding tasks to an assessment may extend the range of skills you can assess, but at the cost of making information about individual students less trustworthy….
***
To sum up the five key problems: validity, reliability, scaling, bias, and setting.
These problems call for complicated and fragile solutions. Unfortunately, there is always someone inclined to equate the complicated with the negligible.
***
But there are at least a couple of even more important problems. Let's look at them.
What is a test? Essentially, a poll.
To solve a given problem, for example, we draw on 1,000 different skills, but only some of them can feasibly be measured. From these we have to select a sample that is representative of the whole. If we get the sample wrong, the test can be thrown away.
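A toy sketch of the idea that a test samples from a larger domain of skills. Everything here is hypothetical: the domain of 1,000 skills ordered from basic to advanced, a student who has mastered the 600 most basic ones, and two 40-item tests, one spread across the whole domain and one drawn only from the easiest skills.

```python
DOMAIN_SIZE = 1000      # hypothetical number of skills in the domain
MASTERED_COUNT = 600    # hypothetical: the student masters the 600 most basic skills


def has_mastered(skill_index: int) -> bool:
    """Skills are indexed from most basic (0) to most advanced (999)."""
    return skill_index < MASTERED_COUNT


def estimated_mastery(sampled_skills: list[int]) -> float:
    """Fraction of the sampled skills the student has mastered."""
    return sum(has_mastered(s) for s in sampled_skills) / len(sampled_skills)


true_mastery = MASTERED_COUNT / DOMAIN_SIZE

# A 40-item test that samples every 25th skill across the whole domain.
spread_sample = list(range(0, DOMAIN_SIZE, 25))
# A 40-item test that samples only the 40 easiest skills.
easy_sample = list(range(40))

spread_estimate = estimated_mastery(spread_sample)
easy_estimate = estimated_mastery(easy_sample)

print(f"true mastery: {true_mastery:.2f}")
print(f"estimate from representative sample: {spread_estimate:.2f}")
print(f"estimate from unrepresentative sample: {easy_estimate:.2f}")
```

Both tests have the same length; only the representative one recovers the student's real level, while the other reports perfect mastery.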
The logic of tests is the same as that of polls…
… ON SEPTEMBER 10, 2004, a Zogby International poll of 1,018 likely voters showed George W. Bush with a 4-percentage-point lead over John Kerry in the presidential election campaign. These results were a reasonably good prediction: Bush’s margin when he won two months later was about 2.5 percent…
Sometimes polls like this fail miserably: a historic example is the Dewey vs. Truman race. More recently, Trump and Brexit.
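For a sense of how much uncertainty comes from sampling alone, here is a sketch of the usual margin-of-error formula for a proportion, applied to the Zogby sample size of 1,018. It assumes a simple random sample, which a real poll of likely voters is not, so treat the result as a rough floor rather than a description of that poll.

```python
import math


def margin_of_error(sample_size: int, proportion: float = 0.5,
                    z: float = 1.96) -> float:
    """Approximate 95% margin of error for one estimated proportion,
    assuming a simple random sample. proportion=0.5 is the worst case."""
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


ZOGBY_SAMPLE_SIZE = 1018  # from the poll quoted above

moe = margin_of_error(ZOGBY_SAMPLE_SIZE)
print(f"margin of error: about {100 * moe:.1f} percentage points")
```

Under these assumptions each candidate's share is uncertain by roughly three points, before any error from sample design, question wording or dishonest answers is counted.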
Yet we cannot do without them, and they usually get it right. One thing is certain: the quality of a poll depends on the sample chosen. But also on how the questions are worded. An example…
… Original question: “What is the average number of days each week you have butter?” Revised question: “The next question is just about butter. Not including margarine, what is the average number of days each week you have butter?”…
Above is a case of two equivalent questions that drew very different answers.
Then there is the willingness to answer honestly. Some questions encourage “dishonesty”: if I ask someone how much they earn, they may not feel like telling me.
Ever-present too is “social desirability bias”, the urge to please the interviewer by saying the “right thing”. In polls nobody is racist or sexist, and everybody volunteers…
… For example, a study published in 1950 documented substantial overreporting of several different types of socially desirable behavior. Thirty-four percent of respondents reported that they had contributed to a specific local charity when they had not, and 13 to 28 percent of respondents claimed to have voted in various elections in which they had not…
School tests are polls and therefore have all the flaws of polls…
… Educational achievement tests are in many ways analogous to this Zogby poll in that they are a proxy for a better and more comprehensive measure that we cannot obtain… The full range of skills or knowledge about which the test provides an estimate—analogous to the votes of the entire population of voters in the Zogby survey—is generally called the domain by those in the trade…
***
But what exactly do we measure in a school test? How representative is the chosen sample?
This is where the divisive dispute begins. There are the critics
… there are some aspects of the goals of education that achievement tests are unable to measure…
And there are the enthusiasts…
… Tests measure what is important, their argument goes, and those who focus on other “goals” are softies…
The critics have plenty of arrows in their quiver: one cannot deny that there are limits to our ability to quantify the education a learner has absorbed.
The one saying so is not an anti-meritocratic union man but a father of psychometrics, E. F. Lindquist, in a paper from more than half a century ago that already had it all: “Preliminary Considerations in Objective Test Construction”.
Lindquist anticipated today's controversies by stating that educational goals are varied and only some can be assessed with standardized tests.
Examples of goals that cannot be standardized: the desire to learn. Or: the ability to apply what one has learned appropriately.
Experience tells us that tests measure variables of great importance. But others, no less important, are inevitably neglected.
An example of a sensible attitude
… ITBS manual advises school administrators explicitly to treat test scores as specialized information that is a supplement to, not a replacement for, other information about students’ performance….
Then there is another gap…
… Second, Lindquist argued that even many of the goals of schooling that are amenable to standardized testing can be assessed only in a less direct fashion than we would like
The ultimate purpose of education is too remote and general to tell us whether we are measuring the right variables.
For example, why do we teach algebra? One hypothesis…
… to teach students how to reason algebraically so that they can apply this reasoning to the vast array of circumstances outside of school to which it is relevant. This sort of very general goal, however, is remote from decisions about the algebra content to be taught in a given middle school this Thursday morning… curriculum designers and teachers must make a large number of specific decisions about what algebra to teach. For example, do students learn to factor quadratic equations? Many considerations shape these decisions, not just a subject’s possible utility in a wide range of work-related and other contexts years later…
But it is a vague hypothesis: we risk measuring skills the person will never call on or use.
One can learn many things, but what if one is then unable to tell when and how to use what was learned? A delightful anecdote
… Many years ago, I had Sunday brunch in Manhattan with three New Yorkers. All were highly educated, and all had taken at least one or two semesters of mathematics beyond high school. In my experience, New York natives make their way about town in part by drawing on a prodigious knowledge of the location of various landmarks, such as the original Barnes and Noble store on Fifth Avenue. That Sunday morning, I found to my surprise that none of the three New Yorkers could figure out the location of the restaurant where we were to have brunch. It was on one of the main avenues, and they knew the address, but they could not figure out the cross street. I suggested that the problem might turn out to be a very simple one. I asked if they knew where the addresses on the avenues in that part of Manhattan reached zero and, if so, whether they reached zero at the same street. They quickly agreed that they did and gave me the name of the cross street. I then asked if the addresses increased at the same rate on these avenues, and if so, at what rate. That is, how many numbers did the addresses increase with each cross street? They were quite certain that the rate was the same, but it took a little more work to figure out what it was. Using a few landmarks they knew (including the original Barnes and Noble store), they figured out the rate for a couple of avenues. The rates were the same. At that point, they had the answer, although they had not yet realized it…
To find their way, the three New Yorkers would have had to solve a simple first-degree equation. They did not see it, even though at university they routinely solved far harder math problems…
… All three were competent in dealing with algebra much more complex than this, but they had not developed the habit of thinking of real-world problems in terms of the mathematics they had learned in the classroom…
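The brunch problem can be written out as the linear equation it is. This is only a sketch: the street where addresses reach zero, the landmark and the restaurant address are invented numbers, not real Manhattan data, and the function names are hypothetical.

```python
def rate_from_landmark(landmark_address: int, landmark_street: int,
                       zero_street: int) -> float:
    """Address numbers gained per cross street, inferred from one known landmark."""
    return landmark_address / (landmark_street - zero_street)


def cross_street(address: int, zero_street: int, rate: float) -> float:
    """Solve address = rate * (street - zero_street) for street."""
    return zero_street + address / rate


# Invented numbers, for illustration only.
ZERO_STREET = 8            # cross street where avenue addresses start at zero
LANDMARK_ADDRESS = 500     # address of a landmark whose cross street is known
LANDMARK_STREET = 18
RESTAURANT_ADDRESS = 1250  # the address the group was trying to locate

rate = rate_from_landmark(LANDMARK_ADDRESS, LANDMARK_STREET, ZERO_STREET)
street = cross_street(RESTAURANT_ADDRESS, ZERO_STREET, rate)

print(f"addresses rise by {rate:.0f} per cross street")
print(f"the restaurant is near street {street:.0f}")
```

The two questions Koretz asked at the table correspond to the two unknowns: where the addresses reach zero, and at what rate they rise.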
In an ideal world we would evaluate people by observing them directly at work on the problems they will face later on, but school tests are very far from that ideal world of assessment, so we muddle through somehow…
… a test author usually has to focus on the proximate goals of educators, even if these are only proxies for the ultimate social goals of education…
Lindquist recommended testing specific knowledge
… Lindquist wanted as much as practical to isolate specific knowledge… tests to include tasks that focus narrowly on these specifics… attempting to create test items that present complex, “authentic” tasks more similar to those students might encounter out of school…
The trend has gone in the opposite direction.
***
What can we conclude from all this?
That tests are a useful but incomplete tool.
That it is reckless to attach consequences as important as salary or career to test results (high-stakes testing).
That judgments should take tests into account, but not tests alone (one component among others). A bit like the best universities do
… they conduct a “holistic” review of applicants, considering not only SAT or ACT scores but also grades, personal statements, persistence in extracurricular activities, and so on…

Overall summary: Measuring Up by Daniel M Koretz

Measuring Up by Daniel M Koretz
You have 121 highlighted passages
You have 52 notes
Last annotated on February 24, 2017
Chapter 1: If Only It Were So Simple
Note: 1. THE FIVE PROBLEMS OF A TEST
help her identify good schools.
Note: THE REQUEST
She assumed that because of what I do for a living, I ought to know this.
the strength of the school’s music or athletic programs, some special curricular emphasis, school size, social heterogeneity, and so on.
Note: THINGS TO CONSIDER BEFORE TEST SCORES
visit a few schools that looked promising.
observations and descriptive information
She was not pleased. She clearly wanted an answer that was uncomplicated
Note: THE URGE TO SIMPLIFY
less ambiguity and complexity.
They wanted something simpler: the names of the schools with the highest test scores,
“If all you want is high average test scores, tell your realtor that you want to buy into the highest-income neighborhood you can manage. That will buy you the highest average score you can afford.”
Note: THE ANSWER TO THE SIMPLIFIERS
misunderstandings
that scores on a single test tell us all we need to know about student
to know about school
A third common misconception is that testing is simple and straightforward.
No Child Left Behind,
“A reading comprehension test is a reading comprehension test. And a math test in the fourth grade—there’s not many ways you can foul up a test … It’s pretty easy to ‘norm’ the results.”
Note: BUSH OVERSIMPLIFIES
this claim was entirely wrong: it is all too easy to foul up the design of a test,
testing seems so misleadingly simple
Testing has become a routine part of our vocabulary
For many years, Parade magazine has featured a regular column by Marilyn vos Savant, who is declared by the magazine to have the highest IQ in the country. Rather than simply saying that Ms. vos Savant is one damned smart person, if indeed she is, the editors use the everyday vocabulary of “IQ”—just
Note: A TYPICAL MISUNDERSTANDING
very few readers have any idea what an IQ test contains
another issue: the rhetorical power of testing.
it is just another way of saying that she is smart. But it does seem to give the assertion more weight, a patina of scientific credibility.
So what are some of the complications that make testing and the interpretation of scores so much less straightforward
At first, they may seem discouragingly numerous.
test scores usually do not provide a direct and complete measure of educational achievement.
they are incomplete measures,
these tests can measure only a subset of the goals of education.
Note: FIRST REASON FOR INCOMPLETENESS
tests are generally very small samples of behavior that we use to make estimates of students’
Note: SECOND REASON
an achievement test is in many ways like a political poll,
opinions of a small number of voters are used
different tests often provide somewhat inconsistent results.
Note: CONSEQUENCE: THE VALIDITY PROBLEM
For example, for more than three decades the federal government has funded a large-scale assessment of students nationwide called the National Assessment of Educational Progress, often simply labeled NAEP (pronounced “nape”), which is widely considered the best single barometer of the achievement of the nation’s youth. There are actually two NAEP assessments, one (the main NAEP) designed for detailed reporting in any given year, and a second designed to provide the most consistent estimates of long-term trends. Both show that mathematics achievement has been improving in both grade four and grade eight—particularly in the fourth grade, where the increase has been among the most rapid nationwide changes in performance, up or down, ever recorded. But the upward trend in the main NAEP has been markedly faster than the improvement in the long-term-trend NAEP. Why? Because the tests measure mathematics somewhat differently,
Note: EXAMPLE
When scores have serious consequences, scores on the test that matters often go up far faster than scores on other tests.
Note: HIGH STAKES, RELIABILITY
The experience in Texas during George Bush’s tenure as governor provides a good illustration. At that time, the state used the Texas Assessment of Academic Skills (TAAS) to evaluate schools, and high-school students were required to pass this test in order to receive a diploma. Texas students showed dramatically more progress on the TAAS than they did on the National Assessment of Educational Progress.
Note: THE TEXAS EXAMPLE
Even a single test can provide varying results.
Note: THE RELIABILITY PROBLEM
Students who take more than one form of a test typically obtain different scores.
SAT college-admissions test more than once,
These arise partly because the test forms, while designed to be equivalent, have different content,
Fluctuations also occur because students have good and bad days:
too nervous to sleep well
it makes no sense to place much faith in small differences
Then there is the problem of figuring out how to report performance on a test.
Note: SCORING
Most of us grew up in a school system with some simple but arbitrary rules for grading tests,
We know that to obtain a grade of “A” can require much more in one class than in another.
Psychometricians therefore have had to create scales for reporting performance on tests.
various scales
Note: REPRESENTATION
provide differing views of performance.
Further, sometimes a test does not function as it should. A test may be biased,
For example, a mathematics test that requires reading complex text and writing long answers may be biased against immigrant students who are competent in mathematics but have not yet achieved fluency in English.
Note: BIAS AGAINST IMMIGRANTS
bias must be distinguished from simple differences in performance
For instance, if poor students in a given city attend inferior schools, a completely unbiased test is likely to give them lower scores because the inferior teaching they received impeded their learning.
Note: THE AWKWARD CASE OF UNBIASED TESTS
For example, the assessment designs that are best for providing descriptive information about the performance of groups (such as schools, districts, states, or even entire nations) are not suitable for systems in which the performance of individual students must be compared. Adding large, complex, demanding tasks to an assessment may extend the range of skills you can assess, but at the cost of making information about individual students less trustworthy.
Note: THERE IS NO OPTIMAL TEST. EXAMPLE. THE SETTING VARIES WITH THE PURPOSE
principles of testing are beyond the reach of most people.
validity, reliability, bias, scaling, and standard setting,
Note: KEY CONCEPTS
Many people simply dismiss these complexities,
proclivity to associate the arcane with the unimportant
Chapter 2: What Is a Test?
Note: 2. A TEST IS USEFUL IF THE SKILLS IT MEASURES REPRESENT WELL THOSE INVOLVED IN CARRYING OUT A TASK
ON SEPTEMBER 10, 2004, a Zogby International poll of 1,018 likely voters showed George W. Bush with a 4-percentage-point lead over John Kerry in the presidential election campaign. These results were a reasonably good prediction: Bush’s margin when he won two months later was about 2.5 percent.
Note: TRUST IN POLLS
Occasionally, the polls are substantially wrong—the
classic example is Truman versus Dewey in 1948,
The basic principles underlying polling,
provide a handy way to explain the workings of achievement tests.
why should we care about these 1,018 people? Because together they represent the 121 million
ability to make this prediction
It depends on the design of the sample,
If Zogby had sampled only individuals in Utah
the sample would not have been a good representation
errors of sample design,
Accuracy also depends on the way in which survey questions are worded;
changes in the wording of questions can have substantial effects on respondents’ answers.
Original question: “What is the average number of days each week you have butter?” Revised question: “The next question is just about butter. Not including margarine, what is the average number of days each week you have butter?”
Note: EXAMPLE
Finally, accuracy depends on the ability or willingness of respondents
when students are asked about parental income, for example. They may refuse
“social desirability bias”: a tendency for some respondents to provide socially acceptable
For example, a study published in 1950 documented substantial overreporting of several different types of socially desirable behavior. Thirty-four percent of respondents reported that they had contributed to a specific local charity when they had not, and 13 to 28 percent of respondents claimed to have voted in various elections in which they had not.
Note: EXAMPLE OF SOCIAL DESIRABILITY BIAS
Educational achievement tests are in many ways analogous to this Zogby poll in that they are a proxy for a better and more comprehensive measure that we cannot obtain.
Note: THE TEST IS A POLL
The full range of skills or knowledge about which the test provides an estimate—analogous to the votes of the entire population of voters in the Zogby survey—is generally called the domain by those in the trade.
Note: DOMAIN
Chapter 3: What We Measure: Just How Good Is the Sample?
there are some aspects of the goals of education that achievement tests are unable to measure.”
Note: THE CRITICS
the label “anti-testing”
Tests measure what is important, their argument goes, and those who focus on other “goals” are softies.
Note: THE MERITOCRATS' MOTTO
These critics are not entirely wrong.
recognize this limitation of testing,
obscure paper published more than half a century ago by E. F. Lindquist
Note: A FOUNDATION FOR THE CRITICS
“Preliminary Considerations in Objective Test Construction.”
he was remarkably prescient in anticipating controversies that engulfed the world of educational
Lindquist
goals of education are diverse,
only some of these goals are amenable to standardized
some other types of skills are far more difficult to test.
interest in learning
Note: UNMEASURABLE SKILL 1
ability to apply knowledge
Note: UNMEASURABLE SKILL 2
The evidence shows unambiguously that standardized tests can measure a great deal that is of value,
some of what it omits is very important.
ITBS manual advises school administrators explicitly to treat test scores as specialized information that is a supplement to, not a replacement for, other information about students’ performance. And for the same reason, it
Note: EXAMPLE OF A SENSIBLE ATTITUDE
Second, Lindquist argued that even many of the goals of schooling that are amenable to standardized testing can be assessed only in a less direct fashion than we would like.
Note: THE SECOND GAP
focus of daily attention for teachers and students are just proxies
ultimate goals are too general and too remote
For example, why do we teach students algebra?
to teach students how to reason algebraically so that they can apply this reasoning to the vast array of circumstances outside of school to which it is relevant. This sort of very general goal, however, is remote from decisions about the algebra content to be taught in a given middle school this Thursday morning.
Note: WHY ALGEBRA?
curriculum designers and teachers must make a large number of specific decisions about what algebra to teach. For example, do students learn to factor quadratic equations? Many considerations shape these decisions, not just a subject’s possible utility in a wide range of work-related and other contexts years later.
An anecdote
difference between learning content specified in a curriculum and later application of that knowledge.
Many years ago, I had Sunday brunch in Manhattan with three New Yorkers. All were highly educated, and all had taken at least one or two semesters of mathematics beyond high school. In my experience, New York natives make their way about town in part by drawing on a prodigious knowledge of the location of various landmarks, such as the original Barnes and Noble store on Fifth Avenue. That Sunday morning, I found to my surprise that none of the three New Yorkers could figure out the location of the restaurant where we were to have brunch. It was on one of the main avenues, and they knew the address, but they could not figure out the cross street. I suggested that the problem might turn out to be a very simple one. I asked if they knew where the addresses on the avenues in that part of Manhattan reached zero and, if so, whether they reached zero at the same street. They quickly agreed that they did and gave me the name of the cross street. I then asked if the addresses increased at the same rate on these avenues, and if so, at what rate. That is, how many numbers did the addresses increase with each cross street? They were quite certain that the rate was the same, but it took a little more work to figure out what it was. Using a few landmarks they knew (including the original Barnes and Noble store), they figured out the rate for a couple of avenues. The rates were the same. At that point, they had the answer, although they had not yet realized it.
Note: EXAMPLE OF THE THREE STUDENTS LOST IN NEW YORK
The problem was a simple linear equation
All three were competent in dealing with algebra much more complex than this, but they had not developed the habit of thinking of real-world problems in terms of the mathematics they had learned in the classroom.
Note: THE THREE STUDENTS' PREDICAMENT
in the ideal world we would assess achievement by measuring the ultimate goals
“The only perfectly
would be one based on direct observation
But this sort of measurement is clearly impractical,
a test author usually has to focus on the proximate goals of educators, even if these are only proxies for the ultimate social goals of education.
Note: THE FLAW AGAIN
we have to put all test-takers in the same environment
Note: KNOWLEDGE AND SKILLS
Lindquist wanted as much as practical to isolate specific knowledge
tests to include tasks that focus narrowly on these specifics.
attempting to create test items that present complex, “authentic” tasks more similar to those students might encounter out of school.
they conduct a “holistic” review of applicants, considering not only SAT or ACT scores but also grades, personal statements, persistence in extracurricular activities, and so on.
Note: CONSIDER TESTS BUT OTHER THINGS TOO
in much of the testing that now dominates K–12 education, Lindquist’s advice that test scores must be seen as incomplete measures is widely ignored.
Chapter 10: Inflated Test Scores
performance is getting better, and rapidly.
this good news is often more apparent than real