| In testing and assessment, aberrant responses caused by cheating or prior knowledge of the test content can produce a mismatch between an examinee's ability and their demonstrated performance on the test items. Aberrant responses not only reduce the accuracy of item parameter and ability estimation but can also seriously affect test quality and fairness, so identifying and controlling for them is of great importance. Person-fit statistics (PFS) based on change point analysis (CPA), including statistics based on the likelihood ratio test, the Wald test, the score test, and weighted residuals, have recently drawn broad attention. Compared with traditional methods for detecting aberrance, they can both detect aberrant responses and locate the position at which the aberrance occurs, so that as much of the examinee's response data as possible is retained and less precision is lost in subsequent analyses. However, existing CPA-based PFS assume a single change point, an assumption that is hard to satisfy in practical testing, especially in large-scale assessments such as the Programme for International Student Assessment (PISA), where multiple change points are common. For example, an examinee's ability may change several times during a test because of warm-up effects in the early stage, fatigue in the middle stage, and accelerated answering in the later stage. As a result, the examinee's response sequence may contain multiple change points. The binary segmentation (BS) algorithm is the most widely used change point search method; it locates all change points in time series data by repeatedly splitting the sequence. Its distinctive advantage is that it makes full use of existing single-change-point statistics by transforming a multi-change-point problem into a series of single-change-point problems. On this basis, this paper applies the BS algorithm to testing and assessment to address
the multi-change-point problem in practical testing. The research consists of three studies. Study 1 uses simulation to examine whether the four CPA-based PFS generalize to detecting warm-up effects. Study 2 builds on Study 1 by adding a response acceleration model to simulate tests with multiple change points, and then evaluates the statistical power of the BS-based multiple change point analysis method. Study 3 illustrates the feasibility of applying the BS-based method in empirical research by detecting multiple change points in the English final exam of junior high school students. Together, these studies advance the current understanding of the effectiveness and utility of multiple change point analysis methods in educational research. Based on the three studies, the following conclusions are drawn: (1) PFS based on CPA can be used to detect warm-up effects in examinees, where (28(6) performs the best, followed by 8(6) and (28(6), while 8(6) has the poorest detection effectiveness. (2) The BS-based multiple change point analysis method has some statistical power for detecting data with multiple change points, but its detection effectiveness needs further optimization in future studies. |
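
The way the BS algorithm reduces a multi-change-point problem to repeated single-change-point searches can be illustrated with a minimal sketch. It is not the procedure used in the three studies: the statistic below is a simple standardized difference in proportion correct, standing in for the CPA-based PFS, and the function names, the `threshold` value, and the `min_seg` minimum segment length are all assumptions made for illustration. In practice the critical value for a maximum-type statistic would be obtained by simulation rather than fixed at 3.0.

```python
import math


def max_split_statistic(x, min_seg=5):
    """Search one segment of 0/1 item scores for its single best split.

    Returns (statistic, split index). The statistic is a standardized
    difference in proportion correct before and after the split; it is an
    illustrative stand-in, not one of the CPA-based person-fit statistics.
    """
    n = len(x)
    if n < 2 * min_seg:
        return 0.0, None
    total = sum(x)
    p = total / n
    var = p * (1.0 - p)
    if var == 0.0:  # all responses identical: no split can be supported
        return 0.0, None
    best_stat, best_k = 0.0, None
    left_sum = sum(x[:min_seg - 1])
    for k in range(min_seg, n - min_seg + 1):
        left_sum += x[k - 1]  # left_sum now covers the first k items
        p1 = left_sum / k
        p2 = (total - left_sum) / (n - k)
        z = abs(p1 - p2) / math.sqrt(var * (1.0 / k + 1.0 / (n - k)))
        if z > best_stat:
            best_stat, best_k = z, k
    return best_stat, best_k


def binary_segmentation(x, threshold=3.0, min_seg=5, offset=0):
    """Recursively split the sequence wherever the statistic exceeds threshold.

    threshold=3.0 is an arbitrary illustrative value (an assumption).
    Returns the detected change points as item positions in the full sequence.
    """
    stat, k = max_split_statistic(x, min_seg)
    if k is None or stat < threshold:
        return []
    left = binary_segmentation(x[:k], threshold, min_seg, offset)
    right = binary_segmentation(x[k:], threshold, min_seg, offset + k)
    return left + [offset + k] + right


# Toy response pattern with two change points (after items 20 and 45).
responses = [0] * 20 + [1] * 25 + [0] * 15
print(binary_segmentation(responses))  # [20, 45]
```

Each recursive call only ever tests for one change point within its own segment, which is why any existing single-change-point statistic could in principle be substituted for `max_split_statistic`.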