BACKGROUND & OBJECTIVE
Randomized controlled trials (RCTs) are considered the gold standard for clinical research, and they provide the highest level of evidence among study types. Because randomization balances many factors that affect the outcome, the conclusions are more certain. Randomization inevitably raises ethical issues, especially in placebo-controlled trials: if the test drug is effective, subjects assigned to placebo are disadvantaged. To verify the effectiveness of a new drug, the interests of a small number of subjects have to be compromised for the benefit of all future patients. Nevertheless, we still want to minimize the harm to subjects. Response-adaptive randomization (RAR) methods are based on this principle: they continually adjust the allocation probability using the accumulated information on previous allocations and outcomes, so that more subjects are assigned to the superior group and as few as possible to the placebo group. Many RAR methods now exist. By data type, they can be classified into methods for binary, continuous and survival data; by number of groups, into two-arm and multi-arm methods; and by approach, into parametric, nonparametric and Bayesian methods. In this paper, we introduce the thirteen most classic methods suited to two-arm trials with binary outcomes, comprising three kinds of parametric methods (each considered under two target allocations, Neyman and optimal) and seven nonparametric methods. Which of the 13 methods is better? How should the parameter values of the eight methods with parameters be determined? This study answers these two questions through comparative simulation studies.

METHODS
First, we introduce the 13 existing RAR methods. Second, Monte Carlo simulation was used to evaluate the statistical performance and ethicality of the 13 methods. The simulation proceeds as follows: 1) whenever a subject is enrolled, a random number RV1 is generated from the uniform distribution on (0, 1). 2) Calculate
the grouping probability, P(A), from the accumulated trial information and the formula of the specific method; comparing P(A) with RV1 determines which treatment the subject receives. 3) Generate a random number RV2 from the uniform distribution on (0, 1) and compare it with the success rate of the assigned group, A or B, to determine the outcome. 4) Repeat until the total sample size is fully allocated. The simulation study was carried out in R version 3.1.2 and is divided into two parts. The first part examines how to determine the parameters of the eight randomization methods with parameters. The total sample size is 120, and the combinations of success rates for groups A and B are (PA = 0.7, PB = 0.6), (0.7, 0.4) and (0.7, 0.2). The second part compares the statistical performance and ethicality of the 13 randomization methods. The total sample sizes considered are 120, 240 and 480; the rate difference between the two groups is 0 (none), 0.1, 0.3 or 0.5, each covering multiple combinations of success rates. Each combination is simulated 10,000 times.
Statistical performance and ethical evaluation: 1) Grouping ratio of group A: the proportion of subjects assigned to group A once the total sample size is reached. Since group A has the higher success rate in this paper, a higher proportion means more subjects assigned to the superior group and better ethicality. 2) Total number of failures: the number of subjects treated unsuccessfully in groups A and B combined. This also reflects the ethicality of the randomization method, but is more intuitive than the grouping ratio; a smaller number of failures indicates better ethicality. 3) Power: for the allocation produced by each RAR method, Fisher's exact test is used to test the difference between the success rates; lower power means that the RAR method reduces the effectiveness of the statistical test, which is not
conducive to the estimation of the treatment effect. 4) Standard error of grouping ratio: the variation of the grouping ratio across the 10,000 simulations. A large value indicates that the grouping ratio is unstable; because the ratio also affects the total number of failures and the power, the standard deviation of the grouping ratio reflects the stability of an RAR method.

RESULT
How to determine the parameters of response-adaptive randomization
Since SMLE and PW have no parameters, and DBCD and ERAD are targeted only at the optimal allocation, this part of the study considers only the eight randomization methods with parameters, as shown in Table 1.
Parameter γ of DBCD: As γ increases, the grouping ratio of group A increases, the total number of failures decreases, power remains at a specific level, and the standard deviation of the grouping ratio decreases. When γ is equal to or larger than 4, all performance indicators stabilize. From the formula, when the target ratio is greater than the actual grouping ratio, the next subject is assigned to group A with greater probability. The parameter γ determines how strongly this difference influences the allocation probability. If γ is large enough, DBCD attains the optimal allocation ratio, but if γ is ∞, the allocation probability is 0 or 1, which destroys randomness. To ensure good statistical performance, ethicality and randomness, γ = 4 is recommended.
Parameter a in ERAD: When a is smaller than 0.5, the indicators are more stable: the grouping ratio of group A is high, and the total number of failures and the standard deviation of the grouping ratio are small. As a increases, the grouping ratio of group A decreases, the total number of failures and the standard deviation increase, and power is unchanged. From the formula, the closer a is to 1, the weaker the ability to reach the target ratio; when a is larger than 0.5, it is insufficient to achieve the targeted
ratio. When a equals 1, ERAD is equivalent to SMLE, which is the worst case of ERAD; when a equals 0, the allocation probability is 0 or 1 and randomness is lost. Based on the results, we suggest a = 0.5.
The parameter u in RPW, DY, BD, DL, GDL and RRU: All six methods contain this parameter. When u is less than 10, the grouping ratio of group A is high, the total number of failures is small, the power is lower and the standard deviation is large. As u increases, the grouping ratio of group A and its standard deviation decrease rapidly, while the total number of failures and the power increase; eventually these indicators stabilize. The parameter has similar effects on all of these methods. The choice of the initial number of balls in the urn is important. If the initial number of balls is too small, too many subjects are easily allocated to group A, and the imbalance between groups reduces the power of the statistical test; if it is too large, ethicality is weak and the total number of failures is higher. Thus, the initial number of balls should be chosen appropriately in relation to the number of balls added. We recommend u = 5 for RPW, DY, BD and RRU, and u = 1 for DL and GDL, because the power is high, the adaptability is strong, the total number of failures is small and the grouping ratio is stable.
Parameters a and b in RPW: When a = 0, the grouping ratio of group A is small, the total number of failures is large, the power is high and the standard deviation is small; as a increases, the grouping ratio and its standard deviation increase, while the total number of failures and the power show a downward trend. When b < 3, the grouping ratio and its standard deviation are large and the total number of failures is smaller. As b increases, the grouping ratio and its standard deviation decrease and the total number of failures increases. When b > 3, the indicators stabilize. The choice of a and b
is also very important, but the relative relationship among the three parameters should be noted. If a and b are too small relative to u, RPW behaves almost like equal randomization; if a is too large relative to b and u, the grouping ratio easily becomes too large and the power too low. When u is fixed, if a is too small, the proportion assigned to group A decreases and the total number of failures increases. Therefore, attention should be paid to the relationship among the three parameters. Based on the results, we suggest RPW(u = 5, a = 1, b = 1).
Parameter C in GDL: As C increases, the grouping ratio and its standard deviation decrease, the total number of failures increases, and the power does not change significantly. When additional balls are added, the numbers of type A and type B balls added to the urn are the products of C and the respective target ratios. The larger C is, the greater the effect of the target ratio on the composition of the urn, and the final allocation ratio approaches the target ratio. If u increases, the impact of C is weakened, and the final allocation ratio deviates from the target ratio. Therefore, to achieve the optimal allocation, the choice of C and u needs to be considered carefully. The simulation results show that GDL(u = 1, C = 9) has good statistical performance and ethicality.
Comparative study on randomizations
The methods based on unknown parameters each follow either the Neyman allocation or the optimal allocation, and are labeled SMLE1 and SMLE2, DBCD1 and DBCD2, and ERAD1 and ERAD2, respectively. Because the non-randomness of PW easily leads to selection bias, the comparative study considers only the other 12 randomization methods and equal randomization (ER).
Grouping ratio: When the rate difference is 0, the grouping ratios of the RAR methods are balanced, except for DY and BD, which range from 0.54 to 0.55 and from 0.51 to 0.54, respectively. When the rate difference is 0.1, the grouping ratio of equal randomization is still
0.5. The situation for the RAR methods based on unknown parameters is more complex. When PA + PB > 1, the grouping ratios of the methods following the Neyman allocation are less than 0.50; when PA + PB < 1, they are greater than 0.50. The grouping ratio for SMLE2 is 0.51~0.52, and those of DBCD2 and ERAD2 are larger. DY and BD have the largest ratios (0.58~0.59), followed by DL (0.54~0.59). The grouping ratios of the nonparametric methods are larger than those of the parametric methods. When the rate difference is 0.3, the proportion assigned to group A is greater. As the success rates increase, the grouping ratios of the RAR methods based on unknown parameters, DY, BD, GDL and RRU decrease, whereas those of RPW and DL increase. When PA = 0.8 and PB = 0.5, the grouping ratio of DL is 0.69, the same as BD and greater than DY. When the rate difference is 0.5, the shift in grouping ratio toward group A is more pronounced. As the total sample size increases, the final proportion assigned to group A increases for DY, BD and RRU and is unchanged for the other methods.
The number of total failures: When the rate difference is 0, all randomization methods have the same total number of failures; as the success rate increases, the total number of failures decreases. When the rate difference is 0.1, DY, BD and DL reduce the total number of failures. When the rate difference is 0.3, the total number of failures under the RAR methods begins to decrease. The methods with unknown parameters based on the Neyman allocation reduce the total number of failures only when PA + PB < 1; otherwise they increase it. The reduction in the total number of failures for the nonparametric methods is greater than four, with BD and DL reducing more, while the RAR methods with unknown parameters perform slightly worse. When the rate difference equals 0.5, up to 17 failures are avoided. With larger sample sizes, the effect of the nonparametric methods is more pronounced.
Power: When the rate difference is 0, the results of all randomization methods are very close. When the rate difference is 0.1, the sample size of 120 is relatively small, and
the powers of all the randomization methods are low. Among the parametric methods, the power of SMLE2 and DBCD1 is slightly lower than that of ER, and that of the others is higher than ER. The power of the nonparametric methods is lower than that of ER. When the rate difference is 0.3, the power of ER is 90.30%~96.90%. The power of DBCD1, ERAD1 and ERAD2 is higher than that of ER, by 0.200%, 0.200% and 0.067% respectively, while the others are lower than ER. The power of the nonparametric methods is lower than that of ER; the changes given for RPW, DY, BD, DL, GDL and RRU are 0.233%, -2.267%, -2.100%, -0.900%, -0.633% and -1.533%, respectively. The power of the nonparametric methods is lower than that of the parametric methods. The power of RPW, DL and GDL is relatively high. When the rate difference is 0.5, the power of all randomization methods is 100%. As the total sample size increases, power improves greatly. The power of the RAR methods moves closer to that of ER, but the average power of DY, BD and RRU remains lower than that of ER.
Standard error of grouping ratio: When the rate difference is 0, the standard deviation of the grouping ratio under ER is 0.045~0.046, which is less than 10%. Among the parametric methods, SMLE has the largest variation, 0.046~0.057, while the variation of DBCD and ERAD is 0.016~0.041 and 0.007~0.036, respectively. The variation of DY, BD and RRU is large: 0.096~0.096, 0.007~0.095 and 0.101~0.107, respectively. The variation of RPW, DL and GDL is small, but slightly higher than that of ER and the parametric methods: 0.041~0.069, 0.031~0.061 and 0.063~0.065, respectively. As the rate difference increases, the variation of the parametric methods becomes larger but remains low, whereas that of the nonparametric methods decreases. As the total sample size grows, the variation of all randomization methods becomes smaller.
Grouping ratio vs power: Compared with ER, the power of the parametric methods is not inferior, and their grouping ratio is higher. Among the parametric methods with
optimal allocation, DBCD2 and ERAD2 are better than SMLE2. Although the parametric methods are superior to the nonparametric methods in power, they are at a clear disadvantage in grouping ratio. In most cases, RPW, DL and GDL have greater grouping ratios than the parametric methods, and RRU, DY and BD have higher ratios still. Although their grouping ratios are larger, the power of DY and BD is relatively low.

CONCLUSIONS
In this study, we used simulation to examine how to determine the parameter values for eight response-adaptive randomization methods with parameters, and compared the statistical performance and ethicality of nine randomization methods. For parameter selection, we suggest: DBCD with γ = 4; ERAD with a = 0.5; RPW with u = 5, a = 1, b = 1; DY, BD and RRU with u = 5; DL with u = 1; and GDL with u = 1, C = 9. When selecting the parameters of RPW and GDL, the relationship between the parameters should be taken into account. When the rate difference is greater than 0.3, the nonparametric methods can meet the power requirement and their grouping ratio is better than that of the parametric methods. When the rate difference is less than 0.3, the parametric methods DBCD2 and ERAD2 are more effective and their grouping ratio is ideal. When the success rate is greater than 0.6, the DL principle has advantages in both effectiveness and grouping ratio.
The statistical performance and ethicality of the response-adaptive randomization methods: No method is outstanding in both power and grouping ratio. The power of equal randomization is similar to that of the parametric methods, DL and GDL, and slightly greater than that of RPW, DY, BD and RRU. However, in terms of grouping ratio, the response-adaptive randomization methods have a great advantage.
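
ILLUSTRATIVE SKETCH
The four-step simulation procedure described in METHODS can be sketched for one of the urn methods as follows. This is a minimal Python illustration, not the authors' R code. The urn rule is an assumption that the text does not spell out: the urn starts with u balls per arm, a balls of the assigned arm are added after a success, and b balls of the opposite arm are added after a failure. The function names simulate_rpw_trial and summarize are hypothetical, and the power calculation by Fisher's exact test is omitted to keep the example dependency-free.

```python
import random
import statistics


def simulate_rpw_trial(n, p_a, p_b, u=5, a=1, b=1, rng=None):
    """Simulate one two-arm, binary-outcome trial under an RPW-type urn.

    Assumed urn rule (not stated explicitly in the text): start with u balls
    per arm, add `a` balls of the assigned arm after a success and `b` balls
    of the opposite arm after a failure.
    Returns (proportion assigned to A, total number of failures).
    """
    rng = rng or random.Random()
    balls = {"A": float(u), "B": float(u)}
    n_a = 0
    failures = 0
    for _ in range(n):
        # Steps 1-2: draw RV1 and compare it with the current P(A).
        prob_a = balls["A"] / (balls["A"] + balls["B"])
        arm = "A" if rng.random() < prob_a else "B"
        other = "B" if arm == "A" else "A"
        # Step 3: draw RV2 and compare it with the assigned arm's success rate.
        success = rng.random() < (p_a if arm == "A" else p_b)
        if success:
            balls[arm] += a
        else:
            balls[other] += b
            failures += 1
        n_a += arm == "A"
    # Step 4: the loop ends once all n subjects are allocated.
    return n_a / n, failures


def summarize(n, p_a, p_b, reps=2000, seed=1, **urn):
    """Repeat the trial `reps` times and summarize the ethical indicators."""
    rng = random.Random(seed)
    results = [simulate_rpw_trial(n, p_a, p_b, rng=rng, **urn) for _ in range(reps)]
    ratios = [ratio for ratio, _ in results]
    fails = [f for _, f in results]
    return {
        "mean_ratio_A": statistics.mean(ratios),
        "sd_ratio_A": statistics.stdev(ratios),
        "mean_failures": statistics.mean(fails),
    }


if __name__ == "__main__":
    print("RPW(5, 1, 1):", summarize(120, 0.7, 0.4))
    # With a = b = 0 the urn never changes, so P(A) stays at 0.5 (equal randomization).
    print("ER          :", summarize(120, 0.7, 0.4, a=0, b=0))
```

Setting a = b = 0 leaves the urn unchanged and so reproduces equal randomization, which gives a baseline for the grouping ratio, its standard deviation and the total number of failures under the same success rates.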