Authors:
Patricia Gilavert and Valdinei Freire
Affiliation:
School of Arts, Sciences and Humanities, University of São Paulo, São Paulo, Brazil
Keyword(s):
Computerized Adaptive Testing, Stop Criteria, Combine Stop Criteria, Threshold to Stop.
Abstract:
Computerized Adaptive Testing (CAT) is an assessment approach that selects questions one after another, conditioning each selection on the previous questions and answers. CAT is evaluated mainly on its precision (how accurately the examinee's trait is estimated) and its efficiency (the test length). The precision-efficiency trade-off depends mostly on two CAT components: an item selection criterion and a stop criterion. While much research is dedicated to the former, stop criteria have received little attention. We contribute a comprehensive evaluation of stop criteria. First, we test seven stop criteria under different setups of item banks and estimation mechanisms. Second, we propose a precision-efficiency trade-off method to evaluate stop criteria. Finally, we report an experiment based on simulations over a large number of synthetic item banks. We conclude in favor of the Fixed-Length criterion, as long as it can be tuned to the item bank at hand; the Fixed-Length criterion shows a competitive precision-efficiency trade-off curve in every scenario while presenting zero variance in test length. We also highlight that the estimation mechanism and the item-bank distribution influence the performance of stop criteria.
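To make the role of a stop criterion concrete, the following is a minimal sketch of a CAT loop with two interchangeable stop criteria. It is an illustration under stated assumptions, not the authors' experimental setup: the 2PL response model, maximum-information item selection, grid-based EAP estimation, the synthetic bank, and all function names are hypothetical choices made here.

```python
import math
import random

def prob_correct(theta, a, b):
    """Assumed 2PL model: probability of a correct answer given trait theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def fisher_information(theta, a, b):
    """Fisher information of a 2PL item at trait level theta."""
    p = prob_correct(theta, a, b)
    return a * a * p * (1.0 - p)

# Grid of candidate trait values used for numerical integration.
GRID = [-4.0 + 0.1 * i for i in range(81)]

def eap_estimate(responses):
    """Posterior mean and standard deviation of theta (standard normal prior)."""
    weights = []
    for t in GRID:
        log_w = -0.5 * t * t  # log of the unnormalised N(0, 1) prior
        for (a, b), correct in responses:
            p = prob_correct(t, a, b)
            log_w += math.log(p if correct else 1.0 - p)
        weights.append(math.exp(log_w))
    total = sum(weights)
    mean = sum(w * t for w, t in zip(weights, GRID)) / total
    var = sum(w * (t - mean) ** 2 for w, t in zip(weights, GRID)) / total
    return mean, math.sqrt(var)

def fixed_length(n_items):
    """Stop after exactly n_items questions (zero variance in test length)."""
    return lambda responses, sd: len(responses) >= n_items

def standard_error(threshold):
    """Stop once the posterior standard deviation falls to the threshold."""
    return lambda responses, sd: sd <= threshold

def run_cat(true_theta, bank, stop, rng, max_items=30):
    """Simulate one examinee; return (trait estimate, test length)."""
    remaining = list(bank)
    responses = []
    theta_hat, sd = 0.0, 1.0
    while remaining and len(responses) < max_items:
        # Select the unused item that is most informative at the current estimate.
        item = max(remaining, key=lambda it: fisher_information(theta_hat, *it))
        remaining.remove(item)
        correct = rng.random() < prob_correct(true_theta, *item)
        responses.append((item, correct))
        theta_hat, sd = eap_estimate(responses)
        if stop(responses, sd):
            break
    return theta_hat, len(responses)

if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical synthetic bank of (discrimination, difficulty) pairs.
    bank = [(rng.uniform(0.5, 2.0), rng.uniform(-3.0, 3.0)) for _ in range(200)]
    for name, stop in [("fixed-length", fixed_length(10)),
                       ("standard-error", standard_error(0.4))]:
        estimate, length = run_cat(0.5, bank, stop, rng)
        print(name, "estimate:", round(estimate, 3), "length:", length)
```

Sweeping the parameter of each criterion (the length for `fixed_length`, the threshold for `standard_error`) over many simulated examinees and plotting mean estimation error against mean test length would give one precision-efficiency curve per criterion, which is the kind of comparison the abstract describes.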