Non-compliance to treatment assignment is a common issue in randomized clinical trials (RCTs). The noncomplyR package provides convenient functions for using Bayesian methods to perform inference on the Complier Average Causal Effect, the focus of a compliance-based analysis. The models used in this package are based on Imbens and Rubin (1997). The package currently supports two types of outcome models: the Normal model and the Binary model. Conjugate priors are used to simplify the sampling procedure. Users can, however, supply their own hyperparameter values in order to affect the shape of the prior distribution, thus maintaining some flexibility in the modeling of prior information.
This document takes you through the basic workflow of a compliance-based analysis using the noncomplyR package. To illustrate the workflow, we will work through an analysis of a randomized trial investigating the effect of vitamin A supplementation on childhood mortality.
Individuals in this study were randomly assigned to either receive or
not receive vitamin A supplementation. However, some individuals in the
treatment arm did not receive the treatment. The data are contained in a
dataframe named vitaminA
. The dataset contains 23682
observations on 3 variables: survival (1 = lived, 0 = died), treatment
assignment (1 = vitamin A, 0 = control), and treatment received (1 =
vitamin A, 0 = no vitamin A). Note the order of the variables is
outcome, treatment assignment, treatment received. This is the variable
order that noncomplyR requires.
We now demonstrate how the main functions of the noncomplyR package can be applied in sequence to fit a non-compliance model and perform posterior-based inference on the CACE.
compliance_chain
The primary model-fitting function is
compliance_chain()
. This function uses the data
augmentation algorithm to obtain a sample from the posterior
distribution for the full set of model parameters. The following code
performs 1000 iterations of the data augmentation algorithm and discards
the initial 10. Note that we have specified the outcome_model as well as
the set of assumptions that we will be making when fitting the
model.
model_fit <- compliance_chain(vitaminA, outcome_model = "binary", exclusion_restriction = T,
strong_access = T, n_iter = 1000, n_burn = 10)
head(model_fit)
#> omega_c omega_n p_c0 p_c1 p_n
#> [1,] 0.7974922 0.2025078 0.9935898 0.9981105 0.9899783
#> [2,] 0.8027364 0.1972636 0.9938614 0.9986314 0.9880724
#> [3,] 0.8078972 0.1921028 0.9961371 0.9986386 0.9872045
#> [4,] 0.8070221 0.1929779 0.9969108 0.9983559 0.9822705
#> [5,] 0.7993206 0.2006794 0.9964803 0.9985936 0.9843990
#> [6,] 0.7997129 0.2002871 0.9960020 0.9985101 0.9828294
Since the hyper_parameters
argument was not specified,
compliance_chain
automatically uses a non-informative flat
prior.
cace
Once the model has been fit and a posterior sample obtained for the
full set of model parameters, the corresponding posterior sample for the
CACE can be obtained with the cace
function. This function
takes the following arguments: the results of a call to the
compliance_chain()
function, the outcome model used in
fitting the model, and a logical value indicating whether the Strong
Access Monotonicity assumption was made when fitting the model.
summarize_chain
Give the sample from the posterior, the user can directly compute
quantities of interest. For convenience, the
summarize_chain
function computes the posterior mean,
median, and 50%, 90%, and 95% credible intervals.
summarize_chain(cace_posterior)
#> Posterior Mean: 0.003
#> Posterior Median: 0.003
#> Posterior 50% Credible Interval: (0.002, 0.004)
#> Posterior 90% Credible Interval: (0.001, 0.005)
#> Posterior 95% Credible Interval: (0.001, 0.006)
This example illustrates the basic workflow of a compliance-based
analysis using noncomplyR: fit the full model using
compliance_chain()
, obtain the posterior for the CACE using
cace()
, and summarize using
summarize_cace()
.
The function defaults have been chosen so that a standard compliance-based analysis with non-informative priors can be easily performed. However, it may be the case that users have additional prior information that they would like to incorporate into the analysis. We now give further details about the (hyper)parameters underlying the models in this package, so that interested users can build prior information into the model by supplying their own hyperparameters.
The distribution of the compliance types Compliers, Never Takers, and Always Takers is modeled as a Multinomial distribution with probability parameters ωc, ωn, ωa. The prior distribution on these parameters is a Dirichlet distribution with hyperparameters γc, γn, γa. Note that if the Strong Access Monotonicity assumption holds, then (hyper)parameters for the Always Takers drop out and this reduces to a Binomial data model with conjugate Beta prior.
For the Binary outcome model, the distribution of the outcome within compliance type t under treatment assignment z is modeled as a Bernoulli random variable with probability parameter ptz. We then model the prior on this parameter using a Beta distribution with hyperparameters αtz, βtz.
For the Normal outcome model, the distribution of the outcome within compliance type t under treatment assignment z is modeled as a Normal distribution with mean μtz and variance σtz2. We assume the conjugate Normal-Inverse Gamma prior distribution for the mean and variance parameters. That is, the prior for σtz2 is Inverse Gamma with hyperparameters atz, btz and the prior for μtz conditional on sigmatz2 is Normal with prior mean θtz and prior variance σtz2Vtz.
For both outcome models described above, note that if the Exclusion Restriction holds then the Never Takers under both assignment to control and treatment can be modeled with a single set of (hyper)parameters, and similarly for the Always Takers. And, if the Strong Access Monotonicity assumption holds, then any hyper(parameters) related to the Always Takers drop out of the model.
Users can incorporate prior information in the fitting of the model
by supplying values to the hyper_parameters
argument of
compliance_chain()
. The number of values to supply and the
order in which they should be supplied will depend on the outcome model
and the set of assumptions made when fitting the model. The convention
for the ordering of the (hyper)parameters is the following:
As an example, consider the Binary outcome model with the Exclusion Restriction but without the Strong Access monotonicity assumption. The parameters for this model are
Based on the conventions described above, the full set of model parameters would be ordered as (ωc, ωn, ωa, pc0, pc1, pn, pa). The hyperparameters should therefore be ordered as (γc, γn, γa, αc0, βc0, αc1, βc1, αn, βn, αa, βa).
Now consider the Normal outcome model without the Exclusion Restriction but with the Strong Access monotonicity assumption. The parameters for this model
The full set of model parameters should be ordered as (ωc, ωn, μc0, σc02, μc1, σc12, μn0, σn02, μn1, σn12) based on the ordering conventions. The corresponding hyperparameters are then ordered as (γc, γn, θc0, Vc0, ac0, bc0, θc1, Vc1, ac1, bc1, θn0, Vn0, an0, bn0, θn1, Vn1, an1, bn1).
Guido W. Imbens and Donald B. Rubin. “Bayesian Inference for Causal Effects in Randomized Experiments with Noncompliance”. The Annals of Statistics. 25 (1), 1997.