Multi-faceted Rasch analysis and native-English-speaker ratings of Japanese EFL essays

Posted on: 2005-08-17
Degree: Ed.D
Type: Dissertation
University: Temple University
Candidate: Schaefer, Edward Jay
Full Text: PDF
GTID: 1455390008979318
Subject: Education
Abstract/Summary:
In the area of EFL performance testing, establishing the validity and reliability of tests dependent on rater judgment is a challenging issue. Writing tests, for example, are increasingly included in high-stakes tests such as TOEFL. The assessment of such tests involves rater judgment, which inevitably involves rater idiosyncrasies and biases. Multi-faceted Rasch analysis has proved to be a promising technique for exploring the validity of rating scales and performing bias analyses of raters in writing tests.

The present study employed both multi-faceted Rasch analysis and a questionnaire to investigate the factors which influence native-English-speaker (NES) raters when they rate essays. Forty NES raters, recruited from among Assistant Language Teachers (ALTs) in the Tokyo area, rated 40 essays written by female Japanese university students on a single topic adapted from the TOEFL Test of Written English (TWE). The raters assessed the essays using a rating scale based on the Jacobs, Zinkgraf, Wormuth, Hartfiel, and Hughey (1981) scale, which included the categories of Content, Organization, Style and Quality of Expression, Language Use, Mechanics, and Fluency. The raters also answered a questionnaire designed to elicit their reactions to the essays. A multi-faceted Rasch analysis was performed on the rating scales, and its results were compared with those of the questionnaire.

The Rasch analysis revealed two recurring bias patterns among some of the raters. In rater/category bias interactions, if Content and/or Organization was rated severely, then Language Use and/or Mechanics was rated leniently, and vice versa. In rater/writer bias interactions, there tended to be more severe or lenient bias towards high-ability writers than low-ability writers. Some raters also tended to rate higher-ability writers more severely and low-ability writers more leniently than expected. The results also revealed that while raters consistently indicated on the questionnaire that they considered Organization and Content to be far more important than Language Use, in the Rasch analysis Language Use was the most severely rated category.

This study suggests that a larger number of raters is useful in discovering systematic bias patterns among raters, and also that a combination of quantitative and qualitative techniques is a promising approach to gaining insights which might otherwise be indiscernible.
Keywords/Search Tags: Multi-faceted Rasch analysis, Essays, Tests, Raters, Ability writers, Rating