rpact: Confirmatory Adaptive Clinical Trial Design and Analysis

Summary

This R Markdown document provides examples for simulating multi-arm multi-stage (MAMS) designs for testing means with rpact.

1 Introduction

This document provides examples for simulating multi-arm multi-stage (MAMS) designs for testing means in many-to-one comparisons. For designs with multiple arms, rpact enables the simulation of designs that use the closed combination testing principle. For a description of the methodology please refer to Part III of the book “Group Sequential and Confirmatory Adaptive Designs in Clinical Trials” by Gernot Wassmer & Werner Brannath. Essentially, we show in this vignette how to reproduce part of the simulation results provided in the paper “On Sample Size Determination in Multi-Arm Confirmatory Adaptive Designs” by Gernot Wassmer (Journal of Biopharmaceutical Statistics, 2011).

First, load the rpact package:

library(rpact)
packageVersion("rpact") # should be 3.0.1 or later
## [1] '3.3.2'

2 Sample size calculation for the adaptive multi-arm situation

rpact enables the assessment of sample sizes in designs with multiple arms, including the selection of treatment arms at interim. We first consider the simple case of a two-stage design with O’Brien & Fleming boundaries and three active treatment arms that are tested against control. Suppose the three treatment arms correspond to three increasing doses: “low”, “medium”, and “high”. We assume that the highest dose has a response difference of 10 compared to control and that the dose-response relationship is linear. The standard deviation is assumed to be \(\sigma = 15\). At interim, the treatment arm with the largest observed response difference compared to control is selected for testing at the second stage.
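To make the assumed dose-response shape concrete, the following minimal sketch computes the effect of each active arm. It assumes that a linear relationship means effects proportional to the arm index, rising to the maximum effect of 10 in the highest dose. The variable names are illustrative and not part of rpact.

```r
# Assumption: under a linear dose-response shape, arm i out of K active arms
# has effect (i / K) times the maximum effect. Names are illustrative only.
activeArms <- 3
muMax <- 10   # assumed response difference of the highest dose vs. control
sigma <- 15   # assumed standard deviation

effects <- (1:activeArms) / activeArms * muMax
names(effects) <- c("low", "medium", "high")

print(round(effects, 2))          # assumed mean differences vs. control
print(round(effects / sigma, 3))  # the same effects in standard deviation units
```

Under this assumption the low and medium doses have one third and two thirds of the effect of the high dose, which is why the high dose is selected most often at interim.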

One way to adjust for the multiple comparison situation is to use the Bonferroni correction for testing the intersection hypotheses in the closed system of hypotheses. It turns out that using \(\alpha/3\) instead of \(\alpha\) in the sample size calculation for the highest dose in the two-arm fixed sample size case serves as a reasonable first guess for the sample size in the multi-arm case. That is, for \(\alpha = 0.025\) and power \(1 - \beta = 90\%\) we calculate the sample size using the commands

nsFixed <- getSampleSizeMeans(alpha = 0.025 / 3, beta = 0.1, alternative = 10, stDev = 15)
kable(summary(nsFixed))

Sample size calculation for a continuous endpoint

Fixed sample analysis, significance level 0.83% (one-sided). The sample size was calculated for a two-sample t-test, H0: mu(1) - mu(2) = 0, H1: effect = 10, standard deviation = 15, power 90%.

Stage Fixed
Efficacy boundary (z-value scale) 2.394
Number of subjects 124.5
One-sided local significance level 0.0083
Efficacy boundary (t) 6.526

Legend:

  • (t): treatment effect scale
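As a rough cross-check of the output above, the per-arm sample size can be approximated with the usual normal-approximation formula \(n = 2 (z_{1-\alpha/3} + z_{1-\beta})^2 \sigma^2 / \delta^2\). This sketch uses only base R and is not how rpact computes the result: the summary above refers to a two-sample t-test, so the rpact value is somewhat larger than the normal approximation. The call to power.t.test() from the stats package is included as an independent t-test based comparison.

```r
alpha <- 0.025 / 3  # Bonferroni-adjusted one-sided level
beta <- 0.1         # 1 - power
delta <- 10         # assumed effect of the highest dose vs. control
sigma <- 15         # assumed standard deviation

zAlpha <- qnorm(1 - alpha)  # efficacy boundary on the z-value scale
zBeta <- qnorm(1 - beta)

# Normal approximation for the sample size per arm (equal allocation)
nPerArmApprox <- 2 * (zAlpha + zBeta)^2 * sigma^2 / delta^2
print(round(zAlpha, 3))
print(round(nPerArmApprox, 1))

# t-test based sample size per arm; expected to be close to half of the
# total number of subjects reported in the rpact summary
nPerArmT <- power.t.test(
    delta = delta, sd = sigma, sig.level = alpha, power = 1 - beta,
    type = "two.sample", alternative = "one.sided"
)$n
print(round(nPerArmT, 1))
```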

The calculation yields a total of 125 subjects (124.5 rounded up) and hence n = 63 subjects per treatment arm to achieve the desired power. As a first guess for the multi-arm two-stage case we choose 30 subjects per stage and treatment arm and use the following commands to evaluate the MAMS design. Note that plannedSubjects refers to the cumulative sample sizes over the stages per selected active arm:

designIN <- getDesignInverseNormal(kMax = 2, alpha = 0.025, typeOfDesign = "OF")
maxNumberOfIterations <- 1000
simBonfMAMS <- getSimulationMultiArmMeans(
    design = designIN,
    activeArms = 3,
    muMaxVector = c(10),
    stDev = 15,
    plannedSubjects = c(30, 60),
    intersectionTest = "Bonferroni",
    typeOfShape = "linear",
    typeOfSelection = "best",
    successCriterion = "all",
    maxNumberOfIterations = maxNumberOfIterations,
    seed = 1234
)
kable(summary(simBonfMAMS))

Simulation of a continuous endpoint (multi-arm design)

Sequential analysis with a maximum of 2 looks (inverse normal combination test design), overall significance level 2.5% (one-sided). The results were simulated for a multi-arm comparisons for means (3 treatments vs. control), H0: mu(i) - mu(control) = 0, power directed towards larger values, H1: mu_max = 10, standard deviation = 15, planned cumulative sample size = c(30, 60), effect shape = linear, intersection test = Bonferroni, selection = best, effect measure based on effect estimate, success criterion: all, simulation runs = 1000, seed = 1234.

Stage 1 2
Fixed weight 0.707 0.707
Efficacy boundary (z-value scale) 2.797 1.977
Stage Levels 0.0026 0.0240
Reject at least one 0.8850
Rejected arms per stage
Treatment arm 1 0.0130 0.0080
Treatment arm 2 0.1120 0.0890
Treatment arm 3 0.3030 0.4510
Success per stage 0.0060 0.8790
Exit probability for futility 0.0040
Expected number of subjects 179.4
Overall exit probability 0.0100
Stagewise number of subjects
Treatment arm 1 30.0 0.7
Treatment arm 2 30.0 5.6
Treatment arm 3 30.0 23.7
Control arm 30.0 30.0
Selected arms
Treatment arm 1 1.0000 0.0220
Treatment arm 2 1.0000 0.1850
Treatment arm 3 1.0000 0.7830
Number of active arms 3.000 1.000
Conditional power (achieved) 0.5785

Legend:

  • (i): treatment arm i
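The boundary rows of this table can be checked by hand with base R. The sketch below takes the two critical values as printed (so they are rounded to three decimals) and assumes two equally sized stages. It verifies that the values have the O’Brien & Fleming shape, where the critical value times the square root of the information rate is constant, and that the stage levels are the one-sided tail probabilities of the critical values.

```r
# Critical values copied from the summary above (rounded to three decimals)
criticalValues <- c(2.797, 1.977)
informationRates <- c(0.5, 1)  # assumption: two equally sized stages

# O'Brien & Fleming shape: critical value * sqrt(information rate) is constant
obfConstant <- criticalValues * sqrt(informationRates)
print(round(obfConstant, 3))

# Stage levels are the one-sided tail probabilities of the critical values
stageLevels <- 1 - pnorm(criticalValues)
print(round(stageLevels, 4))

# Equal stage sizes correspond to inverse normal weights of sqrt(1/2) per stage
weights <- sqrt(c(0.5, 0.5))
print(round(weights, 3))
```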

We see that the power, which is the probability of rejecting at least one of the three hypotheses, is about 88% if a linear dose-response relationship is assumed. Note that there is a small probability (0.4%) of stopping the trial for futility. This is due to the Bonferroni correction, which can yield an adjusted \(p\)-value equal to 1 at interim and thereby makes a rejection at stage 2 impossible.
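The futility mechanism can be illustrated with a small base R sketch. It assumes the standard inverse normal combination of stage-wise \(p\)-values with the weights and the final critical value shown in the table. The stage-wise \(p\)-values and the helper function names are hypothetical and chosen only for illustration.

```r
# Bonferroni-adjusted p-value for an intersection of several hypotheses
bonferroniAdjust <- function(pValues) {
    min(1, length(pValues) * min(pValues))
}

# Inverse normal combination of two stage-wise p-values (equal weights)
inverseNormalCombination <- function(p1, p2, w1 = sqrt(0.5), w2 = sqrt(0.5)) {
    w1 * qnorm(1 - p1) + w2 * qnorm(1 - p2)
}

# Hypothetical stage 1 p-values of the three comparisons: none looks promising
pStage1 <- c(0.40, 0.55, 0.70)
pAdjustedStage1 <- bonferroniAdjust(pStage1)
print(pAdjustedStage1)  # capped at 1 because 3 * 0.40 exceeds 1

# Even an extremely small stage 2 p-value cannot rescue the intersection test:
# qnorm(0) is -Inf, so the combined statistic is -Inf
zCombined <- inverseNormalCombination(pAdjustedStage1, 1e-6)
print(zCombined)
print(zCombined >= 1.977)  # final critical value from the table
```

Because the global intersection hypothesis must be rejected before any individual hypothesis can be rejected in the closed testing procedure, continuing to stage 2 is pointless in this situation.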

Using the Dunnett test for testing the intersection hypotheses increases the power to about 90%. This is obtained by setting intersectionTest = "Dunnett":

simDunnettMAMS <- getSimulationMultiArmMeans(
    design = designIN,
    activeArms = 3,
    typeOfShape = "linear",
    muMaxVector = c(10),
    stDev = 15,
    plannedSubjects = c(30, 60),
    intersectionTest = "Dunnett",
    typeOfSelection = "best",
    successCriterion = "all",
    maxNumberOfIterations = maxNumberOfIterations,
    seed = 1234
)
kable(summary(simDunnettMAMS))

Simulation of a continuous endpoint (multi-arm design)

Sequential analysis with a maximum of 2 looks (inverse normal combination test design), overall significance level 2.5% (one-sided). The results were simulated for a multi-arm comparisons for means (3 treatments vs. control), H0: mu(i) - mu(control) = 0, power directed towards larger values, H1: mu_max = 10, standard deviation = 15, planned cumulative sample size = c(30, 60), effect shape = linear, intersection test = Dunnett, selection = best, effect measure based on effect estimate, success criterion: all, simulation runs = 1000, seed = 1234.

Stage 1 2
Fixed weight 0.707 0.707
Efficacy boundary (z-value scale) 2.797 1.977
Stage Levels 0.0026 0.0240
Reject at least one 0.8990
Rejected arms per stage
Treatment arm 1 0.0130 0.0080
Treatment arm 2 0.1140 0.0910
Treatment arm 3 0.3080 0.4580
Success per stage 0.0060 0.8930
Expected number of subjects 179.6
Overall exit probability 0.0060
Stagewise number of subjects
Treatment arm 1 30.0 0.7
Treatment arm 2 30.0 5.6
Treatment arm 3 30.0 23.7
Control arm 30.0 30.0
Selected arms
Treatment arm 1 1.0000 0.0220
Treatment arm 2 1.0000 0.1870
Treatment arm 3 1.0000 0.7850
Number of active arms 3.000 1.000
Conditional power (achieved) 0.5761

Legend:

  • (i): treatment arm i

Changing successCriterion = "all" to successCriterion = "atLeastOne" reduces the expected number of subjects considerably (from 179.6 to 159.5 in the results below) because the trial is stopped at interim in many more cases:

simDunnettMAMSatLeastOne <- getSimulationMultiArmMeans(
    design = designIN,
    activeArms = 3,
    typeOfShape = "linear",
    muMaxVector = c(10),
    stDev = 15,
    plannedSubjects = c(30, 60),
    intersectionTest = "Dunnett",
    typeOfSelection = "best",
    successCriterion = "atLeastOne",
    maxNumberOfIterations = maxNumberOfIterations,
    seed = 1234
)
kable(summary(simDunnettMAMSatLeastOne))

Simulation of a continuous endpoint (multi-arm design)

Sequential analysis with a maximum of 2 looks (inverse normal combination test design), overall significance level 2.5% (one-sided). The results were simulated for a multi-arm comparisons for means (3 treatments vs. control), H0: mu(i) - mu(control) = 0, power directed towards larger values, H1: mu_max = 10, standard deviation = 15, planned cumulative sample size = c(30, 60), effect shape = linear, intersection test = Dunnett, selection = best, effect measure based on effect estimate, success criterion: at least one, simulation runs = 1000, seed = 1234.

Stage 1 2
Fixed weight 0.707 0.707
Efficacy boundary (z-value scale) 2.797 1.977
Stage Levels 0.0026 0.0240
Reject at least one 0.8990
Rejected arms per stage
Treatment arm 1 0.0130 0.0080
Treatment arm 2 0.1140 0.0910
Treatment arm 3 0.3080 0.4580
Success per stage 0.3420 0.5570
Expected number of subjects 159.5
Overall exit probability 0.3420
Stagewise number of subjects
Treatment arm 1 30.0 0.8
Treatment arm 2 30.0 6.3
Treatment arm 3 30.0 22.9
Control arm 30.0 30.0
Selected arms
Treatment arm 1 1.0000 0.0180
Treatment arm 2 1.0000 0.1380
Treatment arm 3 1.0000 0.5020
Number of active arms 3.000 1.000
Conditional power (achieved) 0.3989

Legend:

  • (i): treatment arm i
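The expected number of subjects in the three tables can be reproduced from the overall exit probabilities. The sketch below assumes that stage 1 always enrolls 30 subjects in each of the four arms and that stage 2, if reached, enrolls 30 subjects in the selected arm and 30 in the control arm. The exit probabilities are copied from the summaries above, and the variable names are illustrative.

```r
nPerArmAndStage <- 30
arms <- 4  # three active arms plus control

# Overall exit probabilities at interim, copied from the three summaries
exitProbability <- c(
    bonferroniAll = 0.0100,
    dunnettAll = 0.0060,
    dunnettAtLeastOne = 0.3420
)

stage1Subjects <- arms * nPerArmAndStage  # always enrolled
stage2Subjects <- 2 * nPerArmAndStage     # selected arm plus control

# Stage 2 subjects are only enrolled if the trial does not stop at interim
expectedSubjects <- stage1Subjects + (1 - exitProbability) * stage2Subjects
print(round(expectedSubjects, 1))

# Stage 2 selection probabilities from the last table sum to the
# probability of continuing to stage 2
selectedStage2 <- c(0.0180, 0.1380, 0.5020)
print(sum(selectedStage2))
```

The values agree with the reported 179.4, 179.6, and 159.5, which shows that the saving under successCriterion = "atLeastOne" comes entirely from the higher probability of stopping at interim.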

For this example, we might conclude that 30 subjects per treatment arm and stage is a reasonable choice. If, however, the effect sizes are smaller for the low and medium dose, the power might decrease and the sample size should therefore be increased. For example, assuming effect sizes of only 1 and 2 in the low and medium dose group, respectively, the test characteristics can be obtained by using the typeOfShape = "userDefined" option. The effect sizes of interest are specified through effectMatrix (which needs to be a