Computerized Multistage Testing: Theory and Applications [Hardback]

Edited by Duanli Yan, Alina A. von Davier, and Charles Lewis (Educational Testing Service, Princeton, New Jersey, USA)
  • Hardback
  • Price: 152,25 €
  • Delivery takes 3-4 weeks if the book is in stock at the publisher's warehouse. If the publisher needs to print a new run, delivery may be delayed.
  • Delivery time: 4-6 weeks
Devising tests that evaluate a nation's educational standing and implement efficacious educational reforms requires a careful balance among the contributions of technology, psychometrics, test design, and the learning sciences. Unlike other forms of adaptive testing, multistage testing (MST) is highly suitable for testing educational achievement because it can be adapted to educational surveys and student testing. Computerized Multistage Testing: Theory and Applications covers the methodologies, underlying technology, and implementation aspects of this type of test design.

The book discusses current scientific perspectives and practical considerations for each step involved in setting up an MST program. It covers the history of MST, test design and implementation for various purposes, item pool development and maintenance, IRT-based and classical test theory-based methodologies for test assembly, routing and scoring, equating, test security, and existing software. It also explores current research, existing operational programs, and innovative future assessments using MST.

Intended for psychologists, social scientists, and educational measurement scientists, this volume provides the first unified source of information on the design, psychometrics, implementation, and operational use of MST. It shows how to apply theoretical statistical tools to testing in novel and useful ways. It also explains how to explicitly tie the assumptions made by each model to observable (or at least inferable) data conditions.

Winner of the 2016 AERA Award for Significant Contribution to Educational Measurement and Research Methodology

The 2016 American Educational Research Association (AERA) Division D award committee for Significant Contributions to Educational Measurement and Research Methodology unanimously recognized this collaborative work advancing the theory and applications of computerized MST. This annual award recognizes published research judged to represent a significant conceptual advancement in the theory and practice of educational measurement and/or educational research methodology. The 2016 award was made under the heading Measurement, Psychometrics, and Assessment. This collective work, published in 2014 as an edited volume titled Computerized Multistage Testing: Theory and Applications, was cited by the committee both for the originality of the conceptual foundations presented in support of multistage testing and for arguing persuasively for its potential impact on the practice of educational measurement.

Reviews

"this is a terrific book and the editors should be congratulated." Psychometrika, Vol. 80, No. 1, March 2015

Preface
Contributors
List of Figures
List of Tables
I Test Design, Item Pool, and Maintenance
1 Overview of Computerized Multistage Tests
Duanli Yan, Charles Lewis, Alina A. von Davier
1.1 Linear Tests and Computerized Adaptive Tests (CATs)
1.2 Multistage Tests (MSTs)
1.3 MST Designs for Different Purposes
1.4 Implementation Schemes
1.5 Designing MST
1.5.1 Modules and Panels
1.5.2 Number of Stages
1.5.3 Number of Modules per Stage
1.6 Content Balance and Assembly
1.7 Exposure Control
1.8 Routing
1.9 Scoring, Linking, and Equating
1.10 Reliability, Validity, Fairness, and Test Security
1.11 Current and Future Applications
1.12 Logistic Challenges
1.13 Summary
2 Multistage Test Designs: Moving Research Results into Practice
April L. Zenisky, Ronald K. Hambleton
2.1 The MST Design Structure
2.2 The State of Research: MST Development and Design Considerations
2.2.1 Design and Design Complexity
2.2.2 Test and Module Length
2.2.3 Item Banks, Statistical Targets, and Test Assembly
2.2.4 Routing and Scoring
2.2.5 Security and Exposure
2.3 Conclusions and Next Steps
3 Item Pool Design and Maintenance for Multistage Testing
Bernard P. Veldkamp
3.1 Designing an Item Pool Blueprint
3.1.1 The Concept of a Design Space
3.1.2 Models for Blueprint Design
3.1.3 General Model for Integer Programming
3.1.4 Integer Programming Blueprint Design for MST
3.1.5 Overlapping Modules
3.2 Applications in Item Writing
3.2.1 Item Generation
3.2.2 Generating the Modules
3.3 Maintenance
3.4 Discussion
4 Mixed-Format Multistage Tests: Issues and Methods
Jiseon Kim, Barbara G. Dodd
4.1 Literature Review on Design Components in Mixed-Format MST
4.1.1 Item Pool
4.1.2 MST Assembly
4.1.3 MST Panel Structure
4.2 Comparing Other Testing Approaches
4.3 Issues and Future Research Suggestions for Mixed-Format MST
4.4 Conclusion
5 Design and Implementation of Large-Scale Multistage Testing Systems
Richard Luecht
5.1 MST Design and Implementation Considerations
5.1.1 Test Purpose and Measurement Information Targeting
5.1.2 Item Bank Inventory Issues
5.1.3 Test Assembly
5.1.4 Exposure and Item Security Issues
5.1.5 Scoring and Routing
5.1.6 Score Precision
5.1.7 System Performance and Data Management Issues
5.2 Conclusions: A Research Agenda
5.2.1 MST Panel Design and Assembly Issues
5.2.2 Item Banking Issues
5.2.3 New MST Applications
II Test Assembly
6 Overview of Test Assembly Methods in Multistage Testing
Yi Zheng, Chun Wang, Michael J. Culbertson, Hua-Hua Chang
6.1 MST Framework
6.2 MST Assembly Design
6.3 Automated Assembly for MST
6.3.1 Early Test Assembly Methods
6.3.2 The 0-1 Programming Methods
6.3.3 Heuristic Methods
6.3.4 Other ATA Methods
6.4 Setting Difficulty Anchors and Information Targets for Modules
6.5 "On-the-Fly" MST (OMST) Assembly Paradigm
6.5.1 The On-the-Fly MST Assembly Paradigm
6.5.2 Future Research in On-the-Fly Test Assembly
6.6 MST, CAT, and Other Designs: Which Way to Go?
7 Using a Universal Shadow-Test Assembler with Multistage Testing
Wim J. van der Linden, Qi Diao
7.1 Solving Shadow-Test Assembly Problems
7.2 Basic Design Parameters
7.2.1 Alternative Objectives for the Shadow Tests
7.2.2 Alternative Objectives for the Selection of Items from the Shadow Tests
7.2.3 Number of Shadow Tests per Test Taker
7.2.4 Number of Test Takers per Shadow Test
7.3 Different Testing Formats
7.3.1 Linear Formats
7.3.2 Multistage Formats
7.3.3 Adaptive Formats
7.4 Relative Efficiency of Formats
7.5 Empirical Study
7.5.1 Test Specifications
7.5.2 Setup of Simulation
7.5.3 Results
7.6 Concluding Comments
Appendix: Test-Assembly Constraints in Empirical Study
8 Multistage Testing by Shaping Modules on the Fly
Kyung (Chris) T. Han, Fanmin Guo
8.1 MST by Shaping
8.2 MST-S versus MST-R versus CAT
8.2.1 Simulation Design
8.2.2 Data
8.2.3 Results for Measurement Performance
8.2.4 Results for Item Pool Utilization
8.3 Discussion and Conclusion
9 Optimizing the Test Assembly and Routing for Multistage Testing
Angela Verschoor, Theo Eggen
9.1 Optimizing MST Assembly: A Nonexhaustive Search
9.1.1 Constraints
9.1.2 Optimal TIF Target
9.1.3 Optimal Routing Module Length
9.2 Limited Item Pools, Two- and Three-Parameter Models
9.3 Discussion
III Routing, Scoring, and Equating
10 IRT-Based Multistage Testing
Alexander Weissman
10.1 Introduction
10.1.1 Item Response Model
10.1.2 Likelihood Function
10.1.3 Trait Estimation
10.1.4 Information and Error
10.1.5 Classification Decision
10.2 Motivation for Tailored Testing
10.3 Routing Rules
10.3.1 Static Routing Rules
10.3.2 Dynamic Routing Rules
10.3.3 Special Considerations for Routing in Classification Tests
10.4 Scoring and Classification Methodologies
10.5 Final Comments
11 A Tree-Based Approach for Multistage Testing
Duanli Yan, Charles Lewis, Alina A. von Davier
11.1 Regression Trees
11.2 Tree-Based Computerized Adaptive Tests
11.3 Tree-Based Multistage Testing
11.4 Algorithm
11.4.1 Definition of Module Scores
11.4.2 Definition of Cut Scores
11.4.3 Minimizing Mean Squared Residuals
11.4.4 Procedure and Evaluation
11.5 An Application
11.5.1 Data
11.5.2 MST Construction
11.5.3 Calibration
11.5.4 Regression
11.5.5 Application
11.5.6 R² and RMSE
11.6 Discussion
11.7 Limitations and Future Research
12 Multistage Testing for Categorical Decisions
Robert Smith, Charles Lewis
12.1 Computer-Mastery Methods
12.1.1 Sequential Probability Ratio Test (SPRT)
12.1.2 Adaptive Mastery Testing
12.1.3 Computer-Mastery Test
12.1.4 Adaptive Sequential Mastery Test
12.2 Information Targeted at Cut Versus at Ability
12.3 Influence of Multiple Cut Scores
12.4 Factors That Can Reduce Optimal Solutions
12.4.1 Cut Score Location
12.4.2 Satisfying Content and Statistical Specifications
12.4.3 Administering Blocks of Items Versus Individual Items
12.5 Example Based on Smith and Lewis (1995)
13 Adaptive Mastery Multistage Testing Using a Multidimensional Model
Cees A. Glas
13.1 Introduction
13.2 Definition of the Decision Problem
13.2.1 Multidimensional IRT Models
13.2.2 Compensatory Loss Models
13.2.3 Conjunctive Loss Models
13.3 Computation of Expected Loss and Risk Using Backward Induction
13.4 Selection of Items and Testlets
13.5 Simulation Studies
13.5.1 Compensatory Loss Functions
13.5.2 Conjunctive Loss Functions
13.6 Conclusions and Further Research
14 Multistage Testing Using Diagnostic Models
Matthias von Davier, Ying (Alison) Cheng
14.1 The DINA Model and the General Diagnostic Model
14.2 Experience with CD-CATs
14.3 CD-MSTs
14.4 Discussion
15 Considerations on Parameter Estimation, Scoring, and Linking in Multistage Testing
Shelby J. Haberman, Alina A. von Davier
15.1 Notation
15.2 The Item Response Model
15.2.1 The Conditional Distribution of Each Response Score
15.2.2 Local Independence
15.2.3 Sum Scores
15.2.4 The Distribution of the Latent Variable
15.3 The Test Score
15.3.1 Maximum Likelihood Estimation
15.3.2 Expected A Posteriori Estimation
15.3.3 Modal A Posteriori Estimation
15.3.4 Use of Sum Scores
15.3.5 Reporting Scores
15.3.6 Routing Rules and Estimated Scores
15.4 Approaches to Parameter Estimation
15.4.1 Concurrent Calibration
15.4.2 Separate Calibration
15.4.3 Sequential Linking
15.4.4 Simultaneous Linking
15.5 Conclusions
Appendix A Routing Rules
Appendix B Martingales
IV Test Reliability, Validity, Fairness, and Security
16 Reliability of Multistage Tests Using Item Response Theory
Peter W. van Rijn
16.1 Test Reliability
16.1.1 Test Reliability in Classical Test Theory
16.1.2 Standard Error of Measurement in CTT
16.1.3 Test Reliability in IRT
16.1.4 Information Functions
16.2 Application: IRT Reliability for MST in NAEP
16.2.1 Sample and Design
16.2.2 Results
16.3 Conclusion
17 Multistage Test Reliability Estimated via Classical Test Theory
Samuel A. Livingston, Sooyeon Kim
17.1 The Estimation Procedure
17.2 Testing the Accuracy of the Estimation Procedure
17.3 How Accurate Were the Estimates?
18 Evaluating Validity, Fairness, and Differential Item Functioning in Multistage Testing
Rebecca Zwick, Brent Bridgeman
18.1 Content Balancing
18.2 Opportunities for Item Review and Answer Changing
18.3 Skipping Strategies
18.4 MST Routing Algorithms
18.5 The Digital Divide
18.6 Comparability of Computer Platforms
18.7 Accommodations for Students with Disabilities and English Language Learners
18.8 Differential Item Functioning Analysis in MSTs
18.9 Application of the Empirical Bayes DIF Approach to Simulated MST Data
18.9.1 Root Mean Square Residuals of DIF Estimates
18.9.2 Bias of EB and MH Point Estimates
18.9.3 DIF Flagging Decisions for the EB Method
18.9.4 Application of CATSIB to MSTs
18.9.5 DIF Analysis on the GRE MST
18.10 Summary
19 Test Security and Quality Control for Multistage Tests
Yi-Hsuan Lee, Charles Lewis, Alina A. von Davier
19.1 An Overview of a Three-Component Procedure
19.2 Tools to Evaluate Test Security and Quality Control
19.2.1 Short-Term Detection Methods
19.2.2 Long-Term Monitoring Methods
19.3 A Simulation Study Using CUSUM Statistics to Monitor Item Performance
19.4 Discussion
V Applications in Large-Scale Assessments
20 Multistage Test Design and Scoring with Small Samples
Duanli Yan, Charles Lewis, Alina A. von Davier
20.1 Small Data Sample
20.2 Item Pool
20.3 Various MST Module Designs
20.3.1 Module Lengths
20.3.2 Module Difficulty Levels
20.3.3 Biserial Correlation (r_bi)
20.3.4 Module Difficulty Ranges
20.3.5 Characteristics of Modules
20.3.6 Cronbach's α
20.4 Routing and Scoring
20.5 Comparisons of the Six MST Designs
20.5.1 Calibrations
20.5.2 Applications
20.5.3 Evaluation
20.5.4 Cronbach's α for All Designs in the Application Sample
20.6 Discussion
20.7 Limitations and Future Research
21 The Multistage Test Implementation of the GRE Revised General Test
Frederic Robin, Manfred Steffen, Longjuan Liang
21.1 From CAT to MST
21.2 MST Design
21.2.1 Test Specifications
21.2.2 Scoring
21.2.3 Measurement
21.2.4 Test Development
21.3 Implementation
21.3.1 Jump-Start
21.3.2 Steady State
21.4 Monitoring
21.4.1 Tests
21.4.2 Items
21.4.3 Scales
21.5 Summary
22 The Multistage Testing Approach to the AICPA Uniform Certified Public Accounting Examinations
Krista J. Breithaupt, Oliver Y. Zhang, Donovan R. Hare
22.1 Research on Multistage Testing
22.2 Item Bank Development for MST
22.3 Content Security Monitoring for MST
22.4 Inventory Exposure Planning for MST
22.5 Discussion
23 Transitioning a K-12 Assessment from Linear to Multistage Tests
Carolyn Wentzel, Christine M. Mills, Kevin C. Meara
23.1 Administering CTP Items Online
23.2 Creating a New MST Scale Using IRT
23.2.1 Vertical Linking Item Sets
23.2.2 Evaluation of Linear Online Data
23.2.3 IRT Calibration and Item Fit Analysis
23.2.4 Vertical Linking of Grades within a Content Area
23.2.5 Evaluation of the Vertical Scales
23.3 Multistage-Adaptive Test Development
23.3.1 Choosing the MST Design
23.3.2 Assembling the MSTs
23.3.3 Selecting Router Cut Scores
23.4 Score Reporting
23.5 Summary
24 A Multistage Testing Approach to Group-Score Assessments
Andreas Oranje, John Mazzeo, Xueli Xu, Edward Kulick
24.1 Targeted Testing
24.2 Goals of the Study
24.3 Methods
24.3.1 Design, Sample, and Instrument
24.3.2 Routing and Item Selection
24.3.3 Scaling
24.3.4 Estimating Scores
24.4 Results
24.4.1 Measurement Error
24.4.2 Routing Accuracy
24.4.3 General Outcomes
24.5 Discussion
24.5.1 Lessons Learned
24.5.2 Recommendations and Further Research
25 Controlling Multistage Testing Exposure Rates in International Large-Scale Assessments
Haiwen Chen, Kentaro Yamamoto, Matthias von Davier
25.1 Item Exposure Rate Control for Multistage Adaptive Assessments
25.2 Method: How to Compute and Adjust the Item Exposure Rates
25.2.1 PIAAC Routing Diagram
25.2.2 Observed Score Distribution
25.2.3 Cutting Curves for Stage Test Booklets
25.3 Data
25.4 Results
25.4.1 Stage 1 Exposure Rates
25.4.2 Stage 2 Exposure Rates
25.5 Conclusion
26 Software Tools for Multistage Testing Simulations
Kyung (Chris) T. Han, Michal Kosinski
26.1 MSTGen
26.1.1 Functionality
26.1.2 User Interface
26.1.3 Input and Output Examples
26.1.4 Performance, Availability, and Support
26.2 R
26.2.1 Functionality
26.2.2 Using R for Simulating MST
26.2.3 Availability and Support
26.3 Conclusions
VI Closing Remarks
27 Past and Future of Multistage Testing in Educational Reform
Isaac I. Bejar
27.1 Future of MST
27.2 A Model-Based Three-Stage Design
27.3 Item Generation and Automated Scoring and Broadly Accessible Test Content
27.3.1 Producing Items and Test Forms More Efficiently
27.3.2 Accessibility
27.3.3 Automated Scoring
27.4 Summary and Conclusions
Bibliography
Index
Duanli Yan, Alina A. von Davier, Charles Lewis