This R Markdown document provides examples of how to analyse a survival trial with rpact and how to provide inference during and at the end of the trial.
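This tutorial provides two examples: adjusted inference for a trial that stopped at an interim analysis (the Gallium trial), and monitoring of an ongoing trial using repeated confidence intervals and conditional power.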
For a general introduction to “Inference in group-sequential designs”, please refer to the book “Group Sequential and Confirmatory Adaptive Designs in Clinical Trials” by Gernot Wassmer & Werner Brannath.
This tutorial only covers survival endpoints. Code for other endpoints is similar, but the dataset needs to be provided in a different format (see ?getDataset for details).
For details about the Gallium trial, we refer to the primary study publication: Marcus et al, N Engl J Med 2017; 377:1331-1344.
Trial characteristics:
Group-sequential design:
Results from standard inference at the futility interim analysis after 113 events:
Results from standard inference at the efficacy interim analysis after 245 events:
First, load the rpact package and define the group-sequential boundaries using the function getDesignGroupSequential. Note that while the Gallium protocol specified a two-sided significance level of 5%, we implement this via a one-sided significance level of 2.5%, as rpact (sensibly) only supports one-sided designs if futility interim analyses are specified.
First, load the rpact package
library(rpact)
packageVersion("rpact")
## [1] '3.2.1'
and define the design:
# futilityBounds = c(0, -6) are on the Z-scale: a value of Z = 0 implies futility
# if the interim estimate is "in the wrong direction" (i.e., HR >= 1 here);
# a value of Z = -6 is essentially the same as Z = -Inf and implies no futility
# boundary at the second interim, as per the Gallium design
design <- getDesignGroupSequential(
    informationRates = c(113, 245, 370) / 370,
    typeOfDesign = "asOF", sided = 1, alpha = 0.025,
    futilityBounds = c(0, -6), bindingFutility = FALSE
)
Note that bindingFutility = FALSE has no impact because it is the default, so it could be omitted (the same holds for sided = 1 and alpha = 0.025).
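As a quick check of the remark above, the following sketch defines the design once with all arguments and once relying on the defaults, and compares the resulting boundaries. The object names info_rates and design_min are introduced here for illustration only. The fields criticalValues and alphaSpent are accessed as described in the rpact documentation for design objects, which is an assumption worth verifying against the installed version.

```r
library(rpact)

info_rates <- c(113, 245, 370) / 370

# Full specification, as in the tutorial
design <- getDesignGroupSequential(
    informationRates = info_rates,
    typeOfDesign = "asOF", sided = 1, alpha = 0.025,
    futilityBounds = c(0, -6), bindingFutility = FALSE
)

# Hypothetical shorter call that relies on the defaults for
# sided, alpha and bindingFutility
design_min <- getDesignGroupSequential(
    informationRates = info_rates,
    typeOfDesign = "asOF",
    futilityBounds = c(0, -6)
)

# Efficacy boundaries on the z-scale and cumulative alpha spent
design$criticalValues
design$alphaSpent

# TRUE if both calls lead to the same efficacy boundaries
same_boundaries <- isTRUE(all.equal(design$criticalValues, design_min$criticalValues))
same_boundaries
```

If the two calls agree, the shorter form can be used without changing the analysis; spelling out the arguments mainly serves as documentation of the design.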
Second, the results after the first and second interim are specified using the function getDataset:
# overallLogRanks: one-sided logrank statistic or Z-score (= log(HR)/SE) from Cox regression
results <- getDataset(
    overallEvents = c(113, 245),
    overallLogRanks = c(-1.86, -3.225),
    overallAllocationRatio = c(1, 1)
)
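If only a hazard ratio and its confidence interval are available, for example from a Cox regression summary, the Z-score mentioned in the comment above can be recovered as log(HR)/SE. The sketch below assumes a Wald-type two-sided confidence interval that is symmetric on the log scale. The helper z_from_hr_ci and the input values are hypothetical and are not taken from the Gallium trial.

```r
# Hypothetical helper: recover the Z-score log(HR)/SE from a hazard ratio
# and its two-sided Wald confidence interval
z_from_hr_ci <- function(hr, lower, upper, level = 0.95) {
    q <- qnorm(1 - (1 - level) / 2)            # normal quantile, 1.96 for 95%
    se <- (log(upper) - log(lower)) / (2 * q)  # standard error of log(HR)
    log(hr) / se
}

# Made-up example values, for illustration only
z_example <- z_from_hr_ci(hr = 0.70, lower = 0.50, upper = 0.98)
z_example
```

A negative value indicates a hazard ratio below 1, which matches the sign convention of the logrank statistics passed to getDataset in this tutorial.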
Finally, the design and the dataset are passed to the function getAnalysisResults to obtain the adjusted inference (directionUpper = FALSE is specified because the power is directed towards negative values of the logrank statistic):
adj_result <- getAnalysisResults(
    design = design,
    dataInput = results, stage = 2, directionUpper = FALSE
)
adj_result
## Analysis results (survival data of 2 groups, group sequential design):
##
## Design parameters:
## Information rates : 0.305, 0.662, 1.000
## Critical values : 3.891, 2.520, 1.992
## Futility bounds (non-binding) : 0.000, -Inf
## Cumulative alpha spending : 4.995e-05, 0.005879, 0.0250
## Local one-sided significance levels : 4.995e-05, 0.005861, 0.02318
## Significance level : 0.0250
## Test : one-sided
##
## User defined parameters:
## Direction upper : FALSE
##
## Default parameters:
## Normal approximation : TRUE
## Theta H0 : 1
##
## Stage results:
## Cumulative effect sizes : 0.7047, 0.6623, NA
## Stage-wise test statistics : -1.860, -2.673, NA
## Stage-wise p-values : 0.031443, 0.003762, NA
## Overall test statistics : -1.860, -3.225, NA
## Overall p-values : 0.0314428, 0.0006299, NA
##
## Analysis results:
## Actions : continue, reject and stop, NA
## Conditional rejection probability : 0.1373, 0.8616, NA
## Conditional power : NA, NA, NA
## Repeated confidence intervals (lower) : 0.3389, 0.4799, NA
## Repeated confidence intervals (upper) : 1.4653, 0.9139, NA
## Repeated p-values : 0.234459, 0.005409, NA
## Final stage : 2
## Final p-value : NA, 0.0006656, NA
## Final CIs (lower) : NA, 0.5157, NA
## Final CIs (upper) : NA, 0.8515, NA
## Median unbiased estimate : NA, 0.6626, NA
The output is explained as follows:
Critical values are group-sequential efficacy boundary values on the \(z\)-scale; Local one-sided significance levels are the corresponding one-sided local significance levels.
Cumulative effect sizes are the hazard ratio estimates based on all data available at each interim analysis.
Stage-wise test statistics and Stage-wise p-values are the \(z\)-scores and \(p\)-values obtained from the first interim analysis, together with the results which would have been obtained at the second interim analysis if only the new data since the first interim, rather than all data up to the second interim, had been included (i.e., per-stage results).
Overall test statistics are the given (overall, not per-stage) \(z\)-scores from each interim analysis and Overall p-values are the corresponding one-sided \(p\)-values.
Repeated confidence intervals provide valid (but conservative) inference at any stage of an ongoing or stopped group-sequential trial. Repeated p-values are the corresponding \(p\)-values.
Final p-value is the final one-sided adjusted \(p\)-value based on the stagewise ordering of the sample space.
Median unbiased estimate and Final CIs are the corresponding adjusted treatment effect estimate and the confidence interval for the hazard ratio at the interim analysis where the trial was stopped.
Note that for this example, the adjusted final hazard ratio of 0.66 and the adjusted confidence interval of (0.52, 0.85) match the results from the conventional analysis almost exactly to two decimal places. This is consistent with the finding that stopping a trial after 50% or more of the events have been collected has a negligible impact on estimation.
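Several quantities in the output can be reproduced with a few lines of base R. The sketch below assumes 1:1 allocation and the usual approximation that the variance of the log hazard ratio is 4/d for d events. Under this assumption it recovers the cumulative effect sizes, the stage-wise test statistic for the second stage, and the overall p-values shown above. The unadjusted confidence interval at the end is a naive normal-approximation interval computed here for comparison; it is not a result reported by the trial, and all variable names are illustrative.

```r
events <- c(113, 245)          # cumulative events at the two interim analyses
z_overall <- c(-1.86, -3.225)  # overall logrank statistics

# Cumulative hazard ratio estimates: log(HR) is approximately 2 * Z / sqrt(d)
hr_cumulative <- exp(2 * z_overall / sqrt(events))

# Stage-wise statistic for stage 2: remove the stage 1 contribution from the
# overall statistic and rescale by the number of new events
new_events <- events[2] - events[1]
z_stage2 <- (sqrt(events[2]) * z_overall[2] -
    sqrt(events[1]) * z_overall[1]) / sqrt(new_events)

# One-sided overall p-values
p_overall <- pnorm(z_overall)

# Naive unadjusted 95% confidence interval at the second interim
se_log_hr <- 2 / sqrt(events[2])
ci_naive <- exp(log(hr_cumulative[2]) + c(-1, 1) * qnorm(0.975) * se_log_hr)

round(hr_cumulative, 4)
round(z_stage2, 3)
signif(p_overall, 4)
round(ci_naive, 2)
```

The naive interval rounds to the same two decimals as the adjusted final confidence interval in the output, which illustrates how small the adjustment is in this example.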
Monitoring ongoing trials is also possible with the function getAnalysisResults introduced above. Repeated confidence intervals, which provide valid (but conservative) inference at any stage of an ongoing or stopped group-sequential trial, can be obtained using the same code as in the previous example. Conditional power calculations require additional specification of the following arguments:
thetaH1: the hazard ratio assumed for future stages.
nPlanned: the number of additional events planned for each future stage.
allocationRatioPlanned: the allocation ratio for future interim stages (default is 1).
Assume the same design as for the Gallium trial introduced above and the following hypothetical interim results:
Hypothetical results from standard inference at the futility interim analysis after 113 events:
Hypothetical results from standard inference at the efficacy interim analysis after 245 events:
Calculation of repeated confidence intervals and conditional power:
# 1) Specify results so far using the function getDataset as before
results <- getDataset(
    overallEvents = c(113, 245),
    overallLogRanks = c(-1.86, -1.716),
    overallAllocationRatio = c(1, 1)
)
# 2) Calculate repeated confidence intervals and conditional power using
#    the function getAnalysisResults as before.
#    Additional arguments for the conditional power calculation are
#    - nPlanned: additional events from second interim until final analysis
#      (370 - 245 for this trial)
#    - thetaH1: true hazard ratio governing future stages
#      (set to 0.74 here as per the original protocol assumptions)
interim_results <- getAnalysisResults(
    design = design,
    dataInput = results, directionUpper = FALSE,
    nPlanned = 370 - 245, thetaH1 = 0.74
)
interim_results
## Analysis results (survival data of 2 groups, group sequential design):
##
## Design parameters:
## Information rates : 0.305, 0.662, 1.000
## Critical values : 3.891, 2.520, 1.992
## Futility bounds (non-binding) : 0.000, -Inf
## Cumulative alpha spending : 4.995e-05, 0.005879, 0.0250
## Local one-sided significance levels : 4.995e-05, 0.005861, 0.02318
## Significance level : 0.0250
## Test : one-sided
##
## User defined parameters:
## Direction upper : FALSE
## Planned sample size : NA, NA, 125
## Assumed effect under alternative : 0.74
##
## Default parameters:
## Normal approximation : TRUE
## Theta H0 : 1
## Planned allocation ratio : 1
##
## Stage results:
## Cumulative effect sizes : 0.7047, 0.8031, NA
## Stage-wise test statistics : -1.860, -0.617, NA
## Stage-wise p-values : 0.03144, 0.26865, NA
## Overall test statistics : -1.860, -1.716, NA
## Overall p-values : 0.03144, 0.04308, NA
##
## Analysis results:
## Actions : continue, continue, NA
## Conditional rejection probability : 0.1373, 0.1527, NA
## Conditional power : NA, NA, 0.7448
## Repeated confidence intervals (lower) : 0.3389, 0.5820, NA
## Repeated confidence intervals (upper) : 1.465, 1.108, NA
## Repeated p-values : 0.2345, 0.1013, NA
## Final stage : NA
## Final p-value : NA, NA, NA
## Final CIs (lower) : NA, NA, NA
## Final CIs (upper) : NA, NA, NA
## Median unbiased estimate : NA, NA, NA
As per the output above, the recommended action after the second interim analysis of this hypothetical trial would be to continue the trial. The repeated confidence interval for the hazard ratio is (0.58, 1.11), and the conditional power to reach significance at the final analysis under protocol assumptions is 0.745. Final estimates and \(p\)-values are not yet available because the trial has not stopped.
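For intuition, the conditional power at the final analysis can be approximated by hand. The sketch below is not the rpact implementation. It assumes 1:1 allocation, that the logrank statistic based on d events is approximately normal with mean log(HR) * sqrt(d) / 2 and unit variance, and that the final test rejects when the overall z-score (with the sign flipped so that benefit is positive) exceeds the final critical value of 1.992 from the output above. The function name cp_final and its argument names are hypothetical.

```r
# Hypothetical helper: approximate conditional power at the final analysis
cp_final <- function(hr_assumed, z_interim = 1.716, d_interim = 245,
                     d_final = 370, crit_final = 1.992) {
    d_new <- d_final - d_interim
    # z-score that the new events alone must exceed for overall rejection
    needed <- (crit_final * sqrt(d_final) -
        z_interim * sqrt(d_interim)) / sqrt(d_new)
    # Expected z-score of the new events under the assumed hazard ratio
    # (sign flipped so that HR < 1 gives a positive drift)
    drift <- -log(hr_assumed) * sqrt(d_new) / 2
    1 - pnorm(needed - drift)
}

cp_protocol <- cp_final(0.74)  # protocol assumption
cp_null <- cp_final(1)         # no treatment effect in future events
round(c(cp_protocol, cp_null), 3)
```

Under these assumptions the two values are close to the conditional power of 0.7448 and the conditional rejection probability of 0.1527 in the output above; small differences can arise from rounding of the critical value.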
To obtain a plot of the conditional power over a range of alternatives, call the rpact plot function and specify the range for theta with thetaRange. This produces the conditional power curve together with the likelihood function over the specified range.
Note that nPlanned is taken from interim_results and can optionally be changed.
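A minimal sketch of such a call is given below. It assumes that the plot method for analysis results accepts thetaRange as a two-element range, as described in the rpact documentation, and that it returns a plot object that can be stored and printed. The range from 0.6 to 1 and the name cp_plot are chosen here for illustration and are not taken from the original text.

```r
library(rpact)

# Design and hypothetical interim data as defined in this tutorial
design <- getDesignGroupSequential(
    informationRates = c(113, 245, 370) / 370,
    typeOfDesign = "asOF", sided = 1, alpha = 0.025,
    futilityBounds = c(0, -6), bindingFutility = FALSE
)
results <- getDataset(
    overallEvents = c(113, 245),
    overallLogRanks = c(-1.86, -1.716),
    overallAllocationRatio = c(1, 1)
)
interim_results <- getAnalysisResults(
    design = design,
    dataInput = results, directionUpper = FALSE,
    nPlanned = 370 - 245, thetaH1 = 0.74
)

# Conditional power and likelihood over a range of assumed hazard ratios
cp_plot <- plot(interim_results, thetaRange = c(0.6, 1))
cp_plot
```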
If one only wants to calculate the conditional power, then it is computationally more efficient to call the functions getStageResults and getConditionalPower instead. The code below illustrates this by plotting the conditional power curve depending on the true treatment effect. The dashed vertical lines in the plot correspond to the protocol hazard ratio of 0.74 and the observed interim hazard ratio of 0.8.
# Get stage results so far
stageResults <- getStageResults(design, results, directionUpper = FALSE)
# Calculate conditional power for true HR ranging from 0.6 to 1
hr <- seq(0.6, 1, by = 0.01)
cpower <- rep(NA, length(hr))
for (i in seq_along(hr)) {
    cpower[i] <- getConditionalPower(stageResults,
        nPlanned = 370 - 245,
        thetaH1 = hr[i]
    )$conditionalPower[3]
}
# Plot results
plot(hr, cpower,
    type = "l", xlab = "True hazard ratio",
    ylab = "Conditional power",
    lwd = 2, ylim = c(0, 1), axes = FALSE,
    main = "Conditional power after second interim analysis"
)
axis(1, at = seq(0.6, 1, by = 0.025))
axis(2, at = seq(0, 1, by = 0.1))
# Light grid lines and dashed reference lines at HR = 0.74 and HR = 0.8
abline(v = seq(0.6, 1, by = 0.05), h = seq(0, 1, by = 0.1), col = gray(0.9))
abline(v = c(0.74, 0.8), lty = 2)