This R Markdown document provides examples for simulating multi-arm multi-stage (MAMS) designs for testing means with rpact.

The examples concern many-to-one comparisons, i.e., several active
treatment arms that are each tested against a common control. For designs
with multiple arms, rpact enables the simulation of designs that use the
**closed combination testing principle**. For a description
of the methodology, please refer to Part III of the book “Group Sequential and
Confirmatory Adaptive Designs in Clinical Trials” by Gernot Wassmer
& Werner Brannath. In this vignette, we show how to
reproduce part of the simulation results provided in the paper “On Sample Size
Determination in Multi-Arm Confirmatory Adaptive Designs” by Gernot
Wassmer (Journal of Biopharmaceutical Statistics, 2011).

**First, load the rpact package**

```
library(rpact)
packageVersion("rpact") # should be 3.0.1 or later
```

`## [1] '3.3.2'`

rpact enables the assessment of sample sizes in multi-arm designs, including the selection of treatment arms. We first consider the simple case of a two-stage design with O’Brien & Fleming boundaries and three active treatment arms that are tested against a control. Suppose the three treatment arms correspond to three increasing doses, “low”, “medium”, and “high”. We assume that the highest dose has a response difference of 10 compared to control and that the dose-response relationship is linear. The standard deviation is assumed to be \(\sigma = 15\). At interim, the treatment arm with the highest observed response compared to placebo is selected for testing at the second stage.
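The selection step can be sketched in base R without rpact. This is a minimal illustration under stated assumptions: "linear" is taken to mean that arm i has effect i · 10 / 3, the interim look is assumed to take place after 30 subjects per arm (the stage-wise sample size used in the simulations in this document), and the seed and the number of replications are arbitrary. Because all arms are compared with the same control mean, selecting the largest difference to control is the same as selecting the largest arm mean.

```r
set.seed(2011)                      # arbitrary seed, for reproducibility only

active_arms <- 3
mu_max      <- 10                   # assumed effect of the highest dose vs. control
sigma       <- 15                   # assumed standard deviation
n_stage1    <- 30                   # assumed subjects per arm at the interim look

# Assumption: linear shape, i.e. arm i has effect i * mu_max / active_arms
effects <- mu_max * seq_len(active_arms) / active_arms

n_sim <- 10000
# One row per simulated trial, one column per active arm (stage-1 sample means)
arm_means <- matrix(
  rnorm(n_sim * active_arms,
        mean = rep(effects, each = n_sim),
        sd   = sigma / sqrt(n_stage1)),
  nrow = n_sim, ncol = active_arms
)

# Select the arm with the highest observed mean in each simulated trial
selected       <- max.col(arm_means, ties.method = "first")
selection_prob <- tabulate(selected, nbins = active_arms) / n_sim
print(round(selection_prob, 3))
```

Under these assumptions the high dose is selected in most simulated trials, but the medium dose is still selected in a non-negligible share, which is one reason why the power of the adaptive design has to be assessed by simulation.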

One way to adjust for multiplicity is to use the Bonferroni correction for the intersection tests in the closed system of hypotheses. It turns out that using \(\alpha/3\) instead of \(\alpha\) in the sample size calculation for the highest dose in the two-arm fixed sample size case serves as a reasonable first guess for the sample size in the multi-arm case. That is, for \(\alpha = 0.025\) and power \(1 - \beta = 90\%\), we calculate the sample size using the following commands:

```
# Fixed-sample two-arm design with Bonferroni-adjusted level alpha / 3
nsFixed <- getSampleSizeMeans(alpha = 0.025 / 3, beta = 0.1, alternative = 10, stDev = 15)
kable(summary(nsFixed))
```

**Sample size calculation for a continuous endpoint**

Fixed sample analysis, significance level 0.83% (one-sided). The sample size was calculated for a two-sample t-test, H0: mu(1) - mu(2) = 0, H1: effect = 10, standard deviation = 15, power 90%.

Stage | Fixed |
---|---|
Efficacy boundary (z-value scale) | 2.394 |
Number of subjects | 124.5 |
One-sided local significance level | 0.0083 |
Efficacy boundary (t) | 6.526 |

Legend:

*(t)*: treatment effect scale
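As a rough cross-check of the numbers in this table, the calculation can be repeated with base R only. The sketch below assumes a one-sided two-sample comparison with equal allocation; the variable names are illustrative and not part of rpact. The normal approximation gives a slightly smaller sample size than the t-test based calculation.

```r
alpha_adj <- 0.025 / 3              # Bonferroni-adjusted one-sided level
delta     <- 10                     # assumed effect of the highest dose
sigma     <- 15                     # assumed standard deviation

# Critical value on the z-scale (compare with the efficacy boundary above)
z_alpha <- qnorm(1 - alpha_adj)
z_beta  <- qnorm(0.9)

# Normal approximation: subjects per arm for a two-sample comparison
n_normal <- 2 * (z_alpha + z_beta)^2 * sigma^2 / delta^2

# t-test based calculation from the stats package
t_test <- power.t.test(
  delta = delta, sd = sigma, sig.level = alpha_adj, power = 0.9,
  type = "two.sample", alternative = "one.sided"
)
n_t <- t_test$n

print(round(c(z_alpha = z_alpha, n_normal = n_normal, n_t = n_t), 3))
print(ceiling(n_t))                 # subjects per arm after rounding up
```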

The rpact calculation yields a total of 125 subjects (124.5 rounded up) and hence n = 63
subjects per treatment arm in order to achieve the desired power. As a
first guess for the multi-arm two-stage case, we choose 30 subjects per stage and
treatment arm and use the following commands for evaluating the MAMS
design. Note that `plannedSubjects` refers to the
**cumulative sample sizes over the stages per selected active arm**:

```
# Two-stage inverse normal design with O'Brien & Fleming boundaries
designIN <- getDesignInverseNormal(kMax = 2, alpha = 0.025, typeOfDesign = "OF")
maxNumberOfIterations <- 1000

# Bonferroni intersection tests, selection of the best arm at interim
simBonfMAMS <- getSimulationMultiArmMeans(
    design = designIN,
    activeArms = 3,
    muMaxVector = c(10),
    stDev = 15,
    plannedSubjects = c(30, 60),
    intersectionTest = "Bonferroni",
    typeOfShape = "linear",
    typeOfSelection = "best",
    successCriterion = "all",
    maxNumberOfIterations = maxNumberOfIterations,
    seed = 1234
)
kable(summary(simBonfMAMS))
```

**Simulation of a continuous endpoint (multi-arm design)**

Sequential analysis with a maximum of 2 looks (inverse normal combination test design), overall significance level 2.5% (one-sided). The results were simulated for a multi-arm comparisons for means (3 treatments vs. control), H0: mu(i) - mu(control) = 0, power directed towards larger values, H1: mu_max = 10, standard deviation = 15, planned cumulative sample size = c(30, 60), effect shape = linear, intersection test = Bonferroni, selection = best, effect measure based on effect estimate, success criterion: all, simulation runs = 1000, seed = 1234.

Stage | 1 | 2 |
---|---|---|
Fixed weight | 0.707 | 0.707 |
Efficacy boundary (z-value scale) | 2.797 | 1.977 |
Stage Levels | 0.0026 | 0.0240 |
Reject at least one | 0.8850 | |
Rejected arms per stage | | |
Treatment arm 1 | 0.0130 | 0.0080 |
Treatment arm 2 | 0.1120 | 0.0890 |
Treatment arm 3 | 0.3030 | 0.4510 |
Success per stage | 0.0060 | 0.8790 |
Exit probability for futility | 0.0040 | |
Expected number of subjects | 179.4 | |
Overall exit probability | 0.0100 | |
Stagewise number of subjects | | |
Treatment arm 1 | 30.0 | 0.7 |
Treatment arm 2 | 30.0 | 5.6 |
Treatment arm 3 | 30.0 | 23.7 |
Control arm | 30.0 | 30.0 |
Selected arms | | |
Treatment arm 1 | 1.0000 | 0.0220 |
Treatment arm 2 | 1.0000 | 0.1850 |
Treatment arm 3 | 1.0000 | 0.7830 |
Number of active arms | 3.000 | 1.000 |
Conditional power (achieved) | | 0.5785 |

Legend:

*(i)*: treatment arm i

We see that the power, i.e., the probability of rejecting at least one of the three hypotheses, is about 88% if a linear dose-response relationship is assumed. Note that there is a small probability of stopping the trial for futility. This is due to the Bonferroni correction, which can yield adjusted \(p\)-values equal to 1 at interim and thereby makes a rejection at stage 2 impossible.
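The futility effect can be illustrated with a small base R sketch of the inverse normal combination test for the global intersection hypothesis. The weights and critical values are taken from the summary above; the stage-wise \(p\)-values and the helper names `bonferroni_p` and `inverse_normal` are hypothetical and serve only as an illustration.

```r
weights  <- sqrt(c(0.5, 0.5))       # equal weights, 0.707 each as in the summary
critical <- c(2.797, 1.977)         # efficacy boundaries from the summary

# Bonferroni-adjusted p-value for an intersection of length(p) hypotheses
bonferroni_p <- function(p) min(1, length(p) * min(p))

# Inverse normal combination of two stage-wise p-values
inverse_normal <- function(p1, p2) {
  weights[1] * qnorm(1 - p1) + weights[2] * qnorm(1 - p2)
}

p_stage1 <- c(0.40, 0.50, 0.45)     # hypothetical unadjusted stage-1 p-values
p_stage2 <- 1e-6                    # hypothetical, very strong stage-2 p-value

p1_adj <- bonferroni_p(p_stage1)    # 3 * 0.40 exceeds 1 and is capped at 1

# An adjusted p-value of 1 corresponds to a z-value of -Inf, so the
# combined statistic is -Inf whatever happens at stage 2.
z_adjusted   <- inverse_normal(p1_adj, p_stage2)
# For comparison: the same data without the multiplicity adjustment
z_unadjusted <- inverse_normal(min(p_stage1), p_stage2)

print(c(z_adjusted = z_adjusted, z_unadjusted = z_unadjusted))
print(z_adjusted > critical[2])     # no rejection possible at stage 2
```

In such a situation continuing the trial cannot lead to a rejection, which is why these cases show up as stops for futility.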

Using the Dunnett test for the intersection hypotheses
increases the power to about 90%. This is obtained by selecting
`intersectionTest = "Dunnett"`:

```
# Same setting as before, but with Dunnett intersection tests
simDunnettMAMS <- getSimulationMultiArmMeans(
    design = designIN,
    activeArms = 3,
    typeOfShape = "linear",
    muMaxVector = c(10),
    stDev = 15,
    plannedSubjects = c(30, 60),
    intersectionTest = "Dunnett",
    typeOfSelection = "best",
    successCriterion = "all",
    maxNumberOfIterations = maxNumberOfIterations,
    seed = 1234
)
kable(summary(simDunnettMAMS))
```

**Simulation of a continuous endpoint (multi-arm design)**

Sequential analysis with a maximum of 2 looks (inverse normal combination test design), overall significance level 2.5% (one-sided). The results were simulated for a multi-arm comparisons for means (3 treatments vs. control), H0: mu(i) - mu(control) = 0, power directed towards larger values, H1: mu_max = 10, standard deviation = 15, planned cumulative sample size = c(30, 60), effect shape = linear, intersection test = Dunnett, selection = best, effect measure based on effect estimate, success criterion: all, simulation runs = 1000, seed = 1234.

Stage | 1 | 2 |
---|---|---|
Fixed weight | 0.707 | 0.707 |
Efficacy boundary (z-value scale) | 2.797 | 1.977 |
Stage Levels | 0.0026 | 0.0240 |
Reject at least one | 0.8990 | |
Rejected arms per stage | | |
Treatment arm 1 | 0.0130 | 0.0080 |
Treatment arm 2 | 0.1140 | 0.0910 |
Treatment arm 3 | 0.3080 | 0.4580 |
Success per stage | 0.0060 | 0.8930 |
Expected number of subjects | 179.6 | |
Overall exit probability | 0.0060 | |
Stagewise number of subjects | | |
Treatment arm 1 | 30.0 | 0.7 |
Treatment arm 2 | 30.0 | 5.6 |
Treatment arm 3 | 30.0 | 23.7 |
Control arm | 30.0 | 30.0 |
Selected arms | | |
Treatment arm 1 | 1.0000 | 0.0220 |
Treatment arm 2 | 1.0000 | 0.1870 |
Treatment arm 3 | 1.0000 | 0.7850 |
Number of active arms | 3.000 | 1.000 |
Conditional power (achieved) | | 0.5761 |

Legend:

*(i)*: treatment arm i
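Why the Dunnett test gains power over the Bonferroni correction can be sketched with a small Monte Carlo experiment in base R. The sketch assumes known variance, equal allocation (so that the three test statistics share the control arm and have pairwise correlation 0.5), and a hypothetical observed maximum test statistic of 2.2; it is an illustration only and not the computation that rpact performs internally.

```r
set.seed(2011)                      # arbitrary seed, for reproducibility only

n_sim       <- 200000
active_arms <- 3
z_obs       <- 2.2                  # hypothetical largest observed z-statistic

# Standardized arm means under the global null hypothesis
control <- rnorm(n_sim)
active  <- matrix(rnorm(n_sim * active_arms), nrow = n_sim)

# Many-to-one z-statistics; the shared control induces correlation 0.5
z_null   <- (active - control) / sqrt(2)
max_null <- pmax(z_null[, 1], z_null[, 2], z_null[, 3])

# Dunnett-type adjusted p-value: null probability that the maximum exceeds z_obs
p_dunnett    <- mean(max_null >= z_obs)
# Bonferroni-adjusted p-value for the same observation
p_bonferroni <- min(1, active_arms * (1 - pnorm(z_obs)))

print(round(c(dunnett = p_dunnett, bonferroni = p_bonferroni), 4))
```

Because the Dunnett-type \(p\)-value accounts for the correlation between the comparisons, it is smaller than the Bonferroni-adjusted one, so the intersection hypotheses are rejected somewhat more often.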

Changing `successCriterion = "all"` to
`successCriterion = "atLeastOne"` reduces the expected number
of subjects considerably because the trial is stopped at interim in many
more cases:

```
# Stop the trial as soon as at least one hypothesis can be rejected
simDunnettMAMSatLeastOne <- getSimulationMultiArmMeans(
    design = designIN,
    activeArms = 3,
    typeOfShape = "linear",
    muMaxVector = c(10),
    stDev = 15,
    plannedSubjects = c(30, 60),
    intersectionTest = "Dunnett",
    typeOfSelection = "best",
    successCriterion = "atLeastOne",
    maxNumberOfIterations = maxNumberOfIterations,
    seed = 1234
)
kable(summary(simDunnettMAMSatLeastOne))
```

**Simulation of a continuous endpoint (multi-arm design)**

Sequential analysis with a maximum of 2 looks (inverse normal combination test design), overall significance level 2.5% (one-sided). The results were simulated for a multi-arm comparisons for means (3 treatments vs. control), H0: mu(i) - mu(control) = 0, power directed towards larger values, H1: mu_max = 10, standard deviation = 15, planned cumulative sample size = c(30, 60), effect shape = linear, intersection test = Dunnett, selection = best, effect measure based on effect estimate, success criterion: at least one, simulation runs = 1000, seed = 1234.

Stage | 1 | 2 |
---|---|---|
Fixed weight | 0.707 | 0.707 |
Efficacy boundary (z-value scale) | 2.797 | 1.977 |
Stage Levels | 0.0026 | 0.0240 |
Reject at least one | 0.8990 | |
Rejected arms per stage | | |
Treatment arm 1 | 0.0130 | 0.0080 |
Treatment arm 2 | 0.1140 | 0.0910 |
Treatment arm 3 | 0.3080 | 0.4580 |
Success per stage | 0.3420 | 0.5570 |
Expected number of subjects | 159.5 | |
Overall exit probability | 0.3420 | |
Stagewise number of subjects | | |
Treatment arm 1 | 30.0 | 0.8 |
Treatment arm 2 | 30.0 | 6.3 |
Treatment arm 3 | 30.0 | 22.9 |
Control arm | 30.0 | 30.0 |
Selected arms | | |
Treatment arm 1 | 1.0000 | 0.0180 |
Treatment arm 2 | 1.0000 | 0.1380 |
Treatment arm 3 | 1.0000 | 0.5020 |
Number of active arms | 3.000 | 1.000 |
Conditional power (achieved) | | 0.3989 |

Legend:

*(i)*: treatment arm i
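The expected numbers of subjects reported in the three summaries above can be reproduced by simple arithmetic. The sketch below assumes that stage 1 always enrolls three active arms plus control, that stage 2 (if reached) enrolls the one selected arm plus control, and that each arm receives 30 subjects per stage; the exit probabilities are copied from the summaries.

```r
n_per_arm_stage <- 30

stage1_subjects <- (3 + 1) * n_per_arm_stage   # three active arms + control
stage2_subjects <- (1 + 1) * n_per_arm_stage   # selected arm + control

# Overall exit probabilities at interim, taken from the three summaries
exit_prob <- c(
  Bonferroni_all     = 0.0100,
  Dunnett_all        = 0.0060,
  Dunnett_atLeastOne = 0.3420
)

# Stage 2 is only carried out if the trial does not stop at interim
expected_n <- stage1_subjects + (1 - exit_prob) * stage2_subjects
print(round(expected_n, 1))
```

The saving with `successCriterion = "atLeastOne"` thus comes entirely from the larger probability of stopping at interim, which avoids the 60 second-stage subjects in about a third of the simulated trials.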

For this example, we might conclude that choosing 30 subjects per
treatment arm and stage is a reasonable choice. If, however, the effect
sizes are smaller for the low and medium dose, the power might decrease
and the sample size should therefore be increased. For example, assuming
effect sizes of only 1 and 2 in the low and medium dose group,
respectively, the test characteristics can be obtained by using the
`typeOfShape = "userDefined"` option. The effect sizes of
interest are specified through `effectMatrix` (which needs to
be a