Application of item response theory to criterion-referenced measurement: An investigation of the effects of model choice, sample size, and test length on reliability and estimation accuracy

Posted on: 1991-02-10
Degree: Ph.D
Type: Dissertation
University: The University of Nebraska - Lincoln
Candidate: Pozehl, Bunny Jo
Full Text: PDF
GTID: 1470390017952416
Subject: Education
Abstract/Summary:
This study focused on the application of item response theory (IRT) to criterion-referenced testing. The first purpose was to investigate the effects of model choice and of reduced test length, achieved through optimal item selection methods, on the reliability of a criterion-referenced examination. A second purpose was to investigate the effects of model choice, sample size, and reduced test length on the accuracy of ability and parameter estimation. Combinations of sample sizes (250 and 500 examinees) and test lengths (50 and 100 items) were used to study the effects on estimation accuracy with five different IRT models. LOGIST 5 was used to obtain item and ability estimates for the five estimation models.

Actual test data were obtained from a 1986 administration of the Psychiatric and Mental Health Nurse Certification Examination given by the American Nurses Association. A total of 2,039 examinees took this 150-item multiple-choice test.

Optimal selection of items resulted in shortened versions of the examination with reliability estimates comparable to those obtained from the 150-item examination and, in the case of the 100-item examination, even higher. Model performance for the full sample of examinees across the three test lengths showed that the two- and three-parameter models provided the best model-data fit.

Comparison of estimation accuracy with the IRT models in the reduced sample size and test length conditions revealed that the one- and modified one-parameter models provided more accurate estimates than the two- or three-parameter models. Little difference was noted in the accuracy of the one-parameter model compared to the modified one-parameter model.

In conclusion, the more general models performed best in the larger sample, while the one- and modified one-parameter models performed best in the smaller samples. The negligible differences between the guessing and non-guessing models across the conditions were thought to be due to the lack of guessing in the data.
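
To make the model comparison concrete, the following is a minimal sketch of the standard logistic item response function and of one common way to shorten a test: keeping the items that carry the most information at the cut score. The abstract does not state which optimal selection method the study used, so the selection rule here is an assumption, as are the function names, the scaling constant D = 1.7, and the item parameter values in the usage note.

```python
import math


def irt_probability(theta, b, a=1.0, c=0.0, D=1.7):
    """Probability of a correct response under the three-parameter logistic model.

    theta: examinee ability; b: item difficulty; a: discrimination;
    c: lower asymptote (guessing). With c = 0 this is the two-parameter
    model, and with a fixed to a common value as well, the one-parameter model.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def item_information(theta, b, a=1.0, c=0.0, D=1.7):
    """Item information at ability theta for the three-parameter logistic model."""
    p = irt_probability(theta, b, a, c, D)
    q = 1.0 - p
    return (D * a) ** 2 * (q / p) * ((p - c) / (1.0 - c)) ** 2


def select_items(items, cut_score, n_items):
    """Assumed selection rule: keep the n_items most informative at the cut score.

    items is a list of dicts with keys "a", "b", and "c".
    """
    ranked = sorted(
        items,
        key=lambda item: item_information(cut_score, item["b"], item["a"], item["c"]),
        reverse=True,
    )
    return ranked[:n_items]
```

For example, with hypothetical parameters `[{"a": 0.5, "b": 0.0, "c": 0.0}, {"a": 1.5, "b": 0.0, "c": 0.0}, {"a": 1.0, "b": 3.0, "c": 0.2}]` and a cut score of 0.0, `select_items(..., 0.0, 1)` keeps the second item, because a highly discriminating item located at the cut score is the most informative there.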
Keywords/Search Tags:Test, Item, Model, Sample size, Criterion-referenced, Estimation, Accuracy, Effects