| Testlet design has been widely adopted in educational and psychological assessment. A testlet is a cluster of items that share a common stimulus (e.g., a reading comprehension passage or a figure), and the possible local dependence among items within a testlet is called the testlet effect. Various models have been developed to take this testlet effect into account. Examples include the Rasch testlet model, the two-parameter logistic Bayesian testlet model, and the higher-order testlet model. However, these existing models all assume that an item is affected by only a single testlet effect. They are therefore essentially unidimensional testlet-effect models. In practice, multiple testlet effects may simultaneously affect item responses in a testlet. For example, in addition to sharing a common stimulus, items can be grouped according to their domains, knowledge units, or item formats, so that multiple testlet effects are involved. In essence, an item then measures multiple latent traits in addition to the target latent trait that the test was designed to measure. Existing unidimensional testlet-effect models become inapplicable when multiple testlet effects are involved. In Study 1, we developed a class of item response models that account for multiple testlet effects, which can be called (within-item) multidimensional testlet-effect models. The parameters can be estimated with marginal maximum likelihood estimation methods or with Bayesian methods using Markov chain Monte Carlo (MCMC) algorithms. WinBUGS, a popular computer program for statistical modeling, was used. A series of simulations was conducted to evaluate parameter recovery of the new model, the consequences of model misspecification, and the effectiveness of model-data fit statistics. Results indicated that the parameters of the new model can be recovered fairly well, and that ignoring the multiple testlet effects resulted in biased estimates of item parameters.
Additionally, fitting a more complicated model to data with a simple structure did little harm to parameter estimation. In conclusion, the new model is feasible and flexible. In Study 2, the so-called generalized multidimensional testlet-effect models were developed, which extend the multidimensional testlet-effect models by applying separate discrimination parameters to the target latent trait and to the testlet effects. A simulation study examined the differences between the generalized models and the initial models. The freeware WinBUGS was again used for parameter estimation. Results showed that the generalized models provided a better fit to the simulated data, despite their greater complexity. We then discussed the relationships among the new models, the hierarchical models, and the higher-order models, and suggested that the higher-order models and the hierarchical models are formally equivalent to the new models. Finally, an empirical example is provided in which the multidimensional testlet-effect models and the generalized multidimensional testlet-effect models are applied to the 2006 Progress in International Reading Literacy Study (PIRLS). |
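
The abstract gives no explicit model equation, so the following is only a minimal sketch of how a response might be generated when an item is subject to several testlet effects at once. It assumes a logistic form in which the testlet effects are added to the logit, with unit discriminations standing in for the Rasch-type case and separate discriminations standing in for the generalized case. The function names, parameter names, and parameterization are illustrative assumptions, not the authors' specification or their WinBUGS code.

```python
import math
import random


def response_probability(theta, difficulty, testlet_effects,
                         a_theta=1.0, a_testlets=None):
    """Probability of a correct response under an ASSUMED logistic form.

    theta           : person's target latent trait
    difficulty      : item difficulty
    testlet_effects : person-specific effects of every testlet the item
                      belongs to (e.g., passage, domain, item format)
    a_theta         : discrimination on the target trait (1.0 = Rasch-type)
    a_testlets      : one discrimination per testlet effect; defaults to
                      all ones, which is the assumed non-generalized case
    """
    if a_testlets is None:
        a_testlets = [1.0] * len(testlet_effects)
    if len(a_testlets) != len(testlet_effects):
        raise ValueError("need one discrimination per testlet effect")
    # Assumed linear predictor: target trait plus weighted testlet effects.
    logit = a_theta * theta - difficulty
    logit += sum(a * g for a, g in zip(a_testlets, testlet_effects))
    return 1.0 / (1.0 + math.exp(-logit))


def simulate_response(theta, difficulty, testlet_effects, rng,
                      a_theta=1.0, a_testlets=None):
    """Draw a single 0/1 response from the assumed model."""
    p = response_probability(theta, difficulty, testlet_effects,
                             a_theta, a_testlets)
    return 1 if rng.random() < p else 0


# Usage: one item that belongs to a passage testlet and a format testlet.
rng = random.Random(2006)
p_multi = response_probability(theta=0.5, difficulty=0.0,
                               testlet_effects=[0.3, -0.2])
p_ignored = response_probability(theta=0.5, difficulty=0.0,
                                 testlet_effects=[])
y = simulate_response(0.5, 0.0, [0.3, -0.2], rng)
```

Under these assumptions, dropping the testlet effects from the list changes the response probability for the same person and item, which is one way to see why ignoring them could bias item parameter estimates.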