Application of item response theory to criterion-referenced measurement: An investigation of the effects of model choice, sample size, and test length on reliability and estimation accuracy

Posted on:1991-02-10

Degree:Ph.D.

Type:Dissertation

University:The University of Nebraska - Lincoln

Candidate:Pozehl, Bunny Jo

Full Text:PDF

GTID:1470390017952416

Subject:Education

Abstract/Summary:

This study focused on the application of item response theory (IRT) to criterion-referenced testing. The first purpose was to investigate the effects of model choice and of test length reduced through optimal item selection methods on the reliability of a criterion-referenced examination. The second purpose was to investigate the effects of model choice, sample size, and reduced test length on the accuracy of ability and item parameter estimation. Combinations of sample sizes (250 and 500 examinees) and test lengths (50 and 100 items) were used to study the effects on estimation accuracy with five different IRT models. LOGIST 5 was used to obtain item and ability estimates for the five estimation models.

Actual test data were obtained from a 1986 administration of the Psychiatric and Mental Health Nurse Certification Examination given by the American Nurses Association. A total of 2,039 examinees took this 150-item multiple-choice test.

Optimal selection of items produced shortened versions of the examination with reliability estimates comparable to those obtained from the 150-item examination and, in the case of the 100-item version, even higher. Model performance for the full sample of examinees across the three test lengths showed that the two- and three-parameter models provided the best model-data fit.

Comparison of estimation accuracy across the IRT models in the reduced sample-size and test-length conditions revealed that the one- and modified one-parameter models provided more accurate estimates than the two- or three-parameter models. Little difference was noted between the accuracy of the one-parameter model and that of the modified one-parameter model.

In conclusion, the more general models performed best in the larger sample, while the one- and modified one-parameter models performed best in the smaller samples. The negligible differences between the guessing and non-guessing models across the conditions were thought to be due to the lack of guessing in the data.
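
As an illustration of the model families compared above, the following is a minimal sketch of the three-parameter logistic item response function and of one possible item selection rule. The abstract does not state which selection method or scaling was used, so several things here are assumptions: items are ranked by Fisher information at a cut score (a common choice for criterion-referenced tests), the scaling constant D is omitted, and the function names, the tuple layout, and the example parameter values are hypothetical.

```python
import math

def p_correct(theta, a=1.0, b=0.0, c=0.0):
    """Probability of a correct response under the 3PL model.

    a = discrimination, b = difficulty, c = lower asymptote (guessing).
    Setting c = 0 gives the 2PL model; additionally holding a equal
    across items gives the 1PL model.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a=1.0, b=0.0, c=0.0):
    """Fisher information contributed by one 3PL item at ability theta."""
    p = p_correct(theta, a, b, c)
    q = 1.0 - p
    return a ** 2 * (q / p) * ((p - c) / (1.0 - c)) ** 2

def select_items(items, cut_score, n_keep):
    """Return the ids of the n_keep items most informative at cut_score.

    items is a list of (item_id, a, b, c) tuples. This ranking rule is
    an assumed stand-in, not necessarily the study's procedure.
    """
    ranked = sorted(
        items,
        key=lambda item: item_information(cut_score, *item[1:]),
        reverse=True,
    )
    return [item[0] for item in ranked[:n_keep]]

# Hypothetical item bank: (id, a, b, c)
bank = [
    ("i1", 1.0, 0.0, 0.0),
    ("i2", 2.0, 0.0, 0.0),
    ("i3", 1.0, 3.0, 0.0),
]
shortened = select_items(bank, cut_score=0.0, n_keep=2)
```

With these made-up parameters, the highly discriminating item located at the cut score ranks first and the item located far above the cut score is dropped, which is the general intuition behind shortening a test with little loss of reliability near the passing standard.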

Keywords/Search Tags:

Test, Item, Model, Sample size, Criterion-referenced, Estimation, Accuracy, Effects

Related items

1	The Item Parameters’ Estimation Accuracy In IRT
2	Research On Sample Size Estimation For Accuracy Assessment Of Land Cover Products
3	A Patch Area Scaling Model Based On Large Sample
4	The Sample Size Requirements Of Equivalence Test Of The Two Population And Simulation Study
5	Homogeneity Test And Sample Size Determination For Disease Prevalence Rates Under Stratified Double-sampling Design
6	The Research Of Item Response Models Based On Response Times
7	Application Of IRT Multicategory Scoring Item
8	The Calculation Of Sample Size In The Homogeneity Test Of Finite Mixed Model
9	Model selection in linear mixed-effects models
10	Comprehensive Assessment And Determination Of Sample Size For Omics Study And Web-based Tool Development