--- title: "factorial2x2: Design and Analysis of a 2x2 Factorial Trial with a Time-to-Event Endpoint" author: "Eric Leifer and James Troendle" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{factorial2x2} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ## Motivation for the factorial2x2 package The two-by-two factorial design randomizes subjects to receive either treatment *A* alone, treatment *B* alone, both treatment *A* and *B* *(AB)*, or neither treatment *(C)*. When the combined effect of *A* and *B* is less than the sum of the separate *A* and *B* effects, called a *subadditive interaction*, there can be low power to detect the *A* effect using an overall test which compares the *A* and *AB* groups to the *C* and *B* groups. In such an instance, a simple effect test of *A* vs. *C* may be useful since is not affected by an interaction. However, the simple effect tests can have low power since they use half the subjects of the overall test. Suppose we are interested benefit of *A* either by itself, or in combination with *B*. To exploit the sample size advantage of the overall test and robustness to a subadditive interaction of the simple tests, various combinations of overall and simple effects tests may be considered. We consider three multiple testing procedures: 1. The **Equal Allocation 3** procedure which allocates 1/3 of the significance level to testing each of the overall *A*, simple *A*, and simple *AB* effects. 2. The **Proportional Allocation 2** which allocates 2/3 of the significance level to testing the overall *A* effect and the remaining significance level to the simple *AB* effect. 3. The **Equal Allocation 2** which allocates 1/2 of the significance level to testing the simple *A* effect and 1/2 to the simple *AB* effect. As described in Leifer, Troendle, et al. (2020), these multiple testing procedures all use Dunnett-corrected critical values which exploit the correlations between the logrank test statistics. Similar procedures are defined for testing the overall *B* and simple *B* effects to be used in conjunction with results for the simple *AB* effect obtained as described above. `factorial2x2` has two main R functions: `fac2x2analyze` computes the hazard ratio, 95% confidence interval, and nominal p-value for the overall *A*, simple *A*, and simple *AB* hazard effects. It also performs signficance testing for the three multiple testing procedures. `fac2x2design` calculates the power for the overall and simple tests of the three multiple testing procedures. ## Examples using the *fac2x2analyze* function We present two examples using `fac2x2analyze`. ### Example 1 This example uses the simulated data in `simdata` which is a 4600-by-9 matrix which is loaded with the `factorial2x2` package. `simdata` corresponds to a simulated 2x2 factorial clinical trial of 4600 subjects. Subjects are simultaneously randomized to receive either treatment *A* or placebo as well as either treatment *B* or placebo. We are interested in testing and estimating the overall *A* and *B* effects, the simple *A* and *B* effects, and the simple *AB* effect. We consider the *A* and *B* research questions unrelated. As a consequence, we allocate a two-sided 0.05 significance level for testing the overall *A*, simple *A*, and simple *AB* procedures, as well as a two-sided 0.05 significance level for testing the overall *B*, simple *B*, and simple *AB* procedures. In the below output, we report the critical values for each of the three multiple testing procedures for both the Family 1 and Family 2 null hypotheses as described in Leifer, Troendle, et al. (2020). Below is the R code and output. We see the Equal Allocation 3 procedure detects the simple *A* and *AB* effects, the Proportional Allocation 2 procedure detects the simple *AB* effect, and the Equal Allocation 2 procedure detects the simple *A* and simple *AB* effects. Note that the examples which are provided in the `factorial2x2` manual use the 100-by-9 submatrix `simdataSub` which contains the first 100 rows of `simdata` for very rapid computation. `simdataSub` is also loaded with the `factorial2x2` package. ```{r, eval = F, echo = F, results = "hide"} # When I use knitr to run this whole file, I need to # "require" all the packages that are needed to run the code # since knitr uses a new environment. This is in contrast to # running the code chunks separately which use all of the packages # which are listed in the factorial2x2 package's DESCRIPTION file # on the DEPENDS line # require(factorial2x2) # require(survival) # require(mvtnorm) ``` ```{r, eval = F} time <- simdata[, 'time'] # follow-up time event <- simdata[, 'event'] # event indicator indA <- simdata[, 'indA'] # treatment A indicator indB <- simdata[, 'indB'] # treatment B indicator covmat <- simdata[, 6:10] # baseline covariates fac2x2analyze(time, event, indA, indB, covmat, alpha = 0.05, dig = 2, niter = 5) # simdata[, 6:10] corresponds to the baseline covariates which include # a history of cardiovascular disease (yes/no) and four indicator # variables which correspond to which of 5 clinical centers enrolled each of the participants $loghrAoverall [1] -0.1170805 # overall A effect log hazard ratio (HR) $seAoverall [1] 0.06258749 # standard error of overall A effect log HR $ZstatAoverall [1] -1.87067 # Z-statistic for overall A effect $pvalAoverall [1] 0.06139083 # nominal p-value for overall A effect $hrAoverall [1] 0.8895135 # HR for overall A effect $ciAoverall [1] 0.786823 1.005607 # 95% confidence interval for overall A effect HR $loghrAsimple [1] -0.2112048 # simple A effect log HR $seAsimple [1] 0.08655462 # standard error of simple A effect log HR $ZstatAsimple [1] -2.440133 # Z-statistic for simple A effect $pvalAsimple [1] 0.01468184 # nominal p-value for simple A effect $hrAsimple [1] 0.8096082 # HR for simple A effect $ciAsimple [1] 0.6832791 0.9592939 # 95% confidence interval for simple A effect HR $loghrABsimple [1] -0.2766681 # simple AB effect log HR $seABsimple [1] 0.08738966 # standard error of simple AB effect log HR $ZstatABsimple [1] -3.165914 # Z-statistic for simple AB effect $pvalABsimple [1] 0.001545967 # nominal p-value for simple AB effect $hrABsimple [1] 0.7583061 # HR for simple AB effect $ciABsimple [1] 0.6389355 0.8999785 # 95% confidence interval for simple AB effect HR $critEA3_A [1] -2.31 # critical value for Equal Allocation 3 procedure for Family 1 hypotheses $sigEA3_A [1] 0.02088815 # significance level corresponding to the critical value $resultEA3_A [1] "accept overall A" "reject simple A" "reject simple AB" # hypothesis tests results $critPA2_A [1] -2.13 # critical value for overall A effect for Proportional Allocation 2 $sigPA2_A [1] 0.03317161 # significance level corresponding to the critical value $critPA2_ab [1] -2.24 # critical value for simple AB effect for Proportional Allocation 2 $sigPA2_ab [1] 0.02509092 # significance level corresponding to the critical value $resultPA2_A [1] "accept overall A" "reject simple AB" # hypothesis tests results $critEA2_A [1] -2.22 # critical value for Equal Allocation 2 procedure $sigEA2_A [1] 0.02641877 # significance level corresponding to critical value $resultEA2_A [1] "reject simple A" "reject simple AB" # hypothesis tests results $corAa [1] 0.7274961 # correlation between overall A and simple A logrank statistics $corAab [1] 0.7164075 # correlation between overall A and simple AB logrank statistics $coraab [1] 0.4572905 # correlation between simple A and simple AB logrank statistics $loghrBoverall [1] -0.1664725 # overall B effect log hazard ratio (HR) $seBoverall [1] 0.06273747 # standard error of overall B effect log HR $ZstatBoverall [1] -2.653478 # Z-statistic for overall B effect $pvalBoverall [1] 0.007966693 # nominal p-value for overll B effect $hrBoverall [1] 0.8466461 # HR for overall B effect $ciBoverall [1] 0.7486842 0.9574258 # 95% confidence interval for overall B effect HR $loghrBsimple [1] -0.2673257 # simple B effect log HR $seBsimple [1] 0.08711762 # standard error of simple B effect log HR $ZstatBsimple [1] -3.068561 # Z-statistic for simple B effect log HR $pvalBsimple [1] 0.002150925 # nominal p-value for simple B effect $hrBsimple [1] 0.7654237 # HR for simple B effect $ciBsimple [1] 0.6452766 0.9079416 # 95% confidence interval for simple B effect HR $critEA3_B [1] -2.32 # critical value for Equal Allocation 3 procedure for Family 2 hypotheses $sigEA3_B [1] 0.02034088 # significance level corresponding to the critical value $resultEA3_B [1] "reject overall B" "reject simple B" # hypothesis tests results $critPA2_B [1] -2.13 # critical value for overall B effect for Proportional Allocation 2 $sigPA2_B [1] 0.03317161 # significance level corresponding to critical value $resultPA2_B [1] "reject overall B" # hypothesis test result $critEA2_B [1] -2.22 # critical value for Equal Allocation 2 procedure $sigEA2_B [1] 0.02641877 # significance level corresponding to the critical value $resultEA2_B [1] "reject simple B" # hypothesis test result $corBb [1] 0.7149066 # correlation between overall B and simple B logrank statistics $corBab [1] 0.7171287 # correlation between overall B and simple AB logrank statistics $corbab [1] 0.4575956 # correlation between simple B and simple AB logrank statistics ``` ### Example 2 This example reproduces the analysis of the COMBINE Study as reported in Section 4 of Lin, Gong, et al. (Biometrics, 2016). As described in that paper, the COMBINE study was a two-by-two factorial study enrolling 1226 alcohol dependent individuals who were able to abstain from alcohol for at least 4 days prior to the beginning of the trial. It tested the efficacy of the drug naltrexone both with and without a cognitive behavior intervention (CBI) in treating alcoholism. The trial had two co-primary endpoints: the percentage of days abstinent and the time to first heavy drinking day ($\ge 5$ standard drinks for men and $\ge 4$ for women). Each primary endpoint was allocated 0.025 two-sided significance level for efficacy testing. Below we reproduce the analysis for the time to first heavy drinking day. This analysis was adjusted for baseline percentage of days abstinent (within 30 days prior the participant's last drink) and research site. In the below output, the log hazard ratio estimates as well as their corresponding standard errors, Z-statistics, and p-values all agree with the results reported in Table 4 on p.1083 of Lin, Gong, et al. The data file `combine_data.txt` is available from the *Biometrics* website on the Wiley Online Library. ```{r, eval = F} # read the COMBINE data into an R data frame Combine <- read.table("c:\\combine_data.txt", header = T, nrows = 1226, na.strings ="", stringsAsFactors= T) dim(Combine) [1] 1226 9 dimnames(Combine)[[2]] [1] "ID" "AGE" "GENDER" "T0_PDA" "NALTREXONE" [6] "THERAPY" "site" "relapse" "futime" # create the baseline covariate variables T0_PDA <- Combine[,"T0_PDA"] # baseline percentage of days abstinent site_1 <- Combine[,"site"] == "site_1" # research site indicator variables site_2 <- Combine[,"site"] == "site_2" site_3 <- Combine[,"site"] == "site_3" site_4 <- Combine[,"site"] == "site_4" site_5 <- Combine[,"site"] == "site_5" site_6 <- Combine[,"site"] == "site_6" site_7 <- Combine[,"site"] == "site_7" site_8 <- Combine[,"site"] == "site_8" site_9 <- Combine[,"site"] == "site_9" site_10 <- Combine[,"site"] == "site_10" # combine the covariates into a single covariate matrix CombineCovMat <- cbind(T0_PDA, site_1, site_2, site_3, site_4, site_5, site_6, site_7, site_8, site_9, site_10) # define the other required variables relapse <- Combine[,"relapse"] # heavy drinking relapse indicator futime <- Combine[,"futime"] # time to first heavy drinking day or censoring NALTREXONE <- Combine[,"NALTREXONE"] # received naltrexone indicator THERAPY <- Combine[,"THERAPY"] # received cognitive behavioral intervention (CBI) indicator # reproduce the COMBINE analysis using fac2x2analyze fac2x2analyze(futime, relapse, NALTREXONE, THERAPY, CombineCovMat, alpha = 0.025, dig = 4) $loghrAoverall [1] -0.0847782 # log hazard rato estimate for the overall effect of naltrexone $seAoverall [1] 0.06854294 # std error of the log HR estimate for the overall effect of naltrexone $ZstatAoverall [1] -1.236863 # Z-statistic for the overall effect of naltrexone $pvalAoverall [1] 0.2161381 # p-value for the overall effect of naltrexone $hrAoverall [1] 0.918716 # hazard ratio estimate for the overall effect of naltrexone $ciAoverall [1] 0.8032234 1.0508149 # corresponding 95% confidence interval $loghrAsimple [1] -0.2517618 # log hazard rato estimate for the simple effect of naltrexone $seAsimple [1] 0.09786137 # std error of the log HR estimate for the simple effect of naltrexone $ZstatAsimple [1] -2.572637 # Z-statistic for the simple effect of naltrexone $pvalAsimple [1] 0.0100927 # p-value for the simple effect of naltrexone $hrAsimple [1] 0.7774299 # hazard ratio estimate for the simple effect of naltrexone $ciAsimple [1] 0.6417413 0.9418083 # corresponding 95% confidence interval $loghrABsimple [1] -0.09132675 # log hazard ratio estimate for the simple effect of naltrexone and CBI $seABsimple [1] 0.09553005 # std error of the log HR estimate for the simple effect of naltrexone # and CBI $ZstatABsimple [1] -0.9560003 # Z-statistic for the simple effect of naltrexone and CBI $pvalABsimple [1] 0.3390721 # p-value for the simple effect of naltrexone and CBI $hrABsimple [1] 0.9127194 # hazard ratio estimate for the simple effect of naltrexone and CBI $ciABsimple [1] 0.7568686 1.1006624 # corresponding 95% confidence interval $critEA3_A [1] -2.5811 # critical value for the three tests in Table 4 to provide # two-sided 0.025 familywise error; # slightly larger in absolute terms than the # critical value -2.573 reported on p.1083 of Lin, Gong, et al. $sigEA3_A [1] 0.009848605 $resultEA3_A [1] "accept overall A" "accept simple A" "accept simple AB" ``` ## Example using the *fac2x2design* function Here we use `fac2x2design` to compute the power for Scenario 5 in Table 2 from Leifer, Troendle, et al. (2019). ```{r, eval = F} n <- 4600 # total sample size rateC <- 0.0445 # one year event rate in the control group hrA <- 0.80 # simple A effect hazard ratio hrB <- 0.80 # simple B effect hazard ratio hrAB <- 0.72 # simple AB effect hazard ratio mincens <- 4.0 # minimum censoring time in years maxcens <- 8.4 # maximum censoring time in years fac2x2design(n, rateC, hrA, hrB, hrAB, mincens, maxcens, dig = 2, alpha = 0.05) $events [1] 954.8738 # expected number of events $evtprob # event probabilities for the C, A, B, and AB groups, respectively probC probA probB probAB 0.2446365 0.2012540 0.2012540 0.1831806 $powerEA3overallA [1] 0.5861992 # Equal Allocation 3's power to detect the overall A effect $powerEA3simpleA [1] 0.5817954 # Equal Allocation 3's power to detect the simple A effect $powerEA3simplAB [1] 0.9071236 # Equal Allocation 3's power to detect the simple AB effect $powerEA3anyA [1] 0.7060777 # Equal Allocation 3's power to detect either the overall A or simple A effects $powerPA2overallA [1] 0.6582819 # Proportional Allocation 2's power to detect the overall A effect $powerPA2simpleAB [1] 0.9197286 # Proportional Allocation 2's power to detect the simple AB effect $powerEA2simpleA [1] 0.6203837 # Equal Allocation 2's power to detect the simple A effect $powerEA2simpleAB [1] 0.9226679 # Equal Allocation 2's power to detect the simple AB effect $powerA [1] 0.7182932 # power to detect the overall A effect at the two-sided 0.05 level ``` ## References Leifer, E.S., Troendle, J.F., Kolecki, A., Follmann, D. Joint testing of overall and simple effect for the two-by-two factorial design. 2020. Submitted. Lin, D-Y., Gong, J., Gallo, P., et al. Simultaneous inference on treatment effects in survival studies with factorial designs. *Biometrics*. 2016; 72: 1078-1085. Slud, E.V. Analysis of factorial survival experiments. *Biometrics*. 1994; 50: 25-38.