1 Introduction

1.1 About the Data
1.2 IT Requirements
1.3 Prerequisites
1.4 Helpful Resources

2 Understanding Data from the Missingness Perspective

3 Multilevel Multiple Imputation via Fully Conditional Specification

1 Introduction

This document is the technical appendix to our paper “The Effect of Socioeconomic Determinants of Health on Prescription of ARNIs at Hospital Discharge”, in which we investigated the impact of socioeconomic factors on ARNI prescription at discharge, controlling for a variety of clinical factors. We performed our analyses using the Get with the Guidelines-Heart Failure (GWTG-HF) data set, which provides an incredible wealth of clinical data, but also presents three major challenges:

A high degree of missing data
A hierarchical data structure
A large volume of data

We used multilevel multiple imputation to address missing data and a mixed-effects logistic regression model to address the data’s hierarchical structure. We used parallel processing to decrease computational time on large processes amenable to parallelization.
This document will lay out not only what we did to obtain our final analysis, but also how we set up for large tasks and alternate methods we tried that were unsuccessful. We will have Rmarkdown run code whenever possible, but in some cases the run times are prohibitively long. In these cases we will screen shot console outputs for discussion. We hope this technical appendix will provide transparency to our analyses, as well as guidance for clinicians and researchers looking to perform similar analyses in the future. And as always, despite our best efforts, we may have made mistakes or overlooked more efficient ways of doing things. We would love to hear from you and share in any discussions, so please contact us at our Github page!

1.1 About the Data

The GWTG-HF dataset is available upon request to the AHA. An application for use is required. Proposals can be submitted at www.heart.org/qualityresearch. The R code used to generate our clean data from the raw GWTG-HF can be found in the same GitHub repository as this Rmarkdown file.

1.2 IT Requirements

This analysis was performed on the AHA’s Precision Medicine Platform (PMP) using an r5.12xlarge AWS EC2 instance. If you are also using the PMP for these kinds of analysis, I recommend discussing your task with IT support, as they may have to remove some of the automatic disconnection timers inherent to the platform due to the long run times required for some of the tasks. Note that the r5.12xlarge instance is a fairly large machine - it has 48 vCPU and 384GiB RAM. While having such a large machine is not necessary, it will greatly decrease the time it takes to complete the task.

1.3 Prerequisites

We assume proficiency with the statistical software R, as well as familiarity with generalized linear models and decision tree analysis.

1.4 Helpful Resources

There are a variety of very helpful online resources available for each of the three topics described above. Here are a few of them that we found the most helpful:
Multiple Imputation:

Flexible Impuation of Missing Data by Stef van Buuren (https://stefvanbuuren.name/fimd/)
Applied Missing Data Analysis with SPSS and (R)Studio by Heymans & Eekhout (https://bookdown.org/mwheymans/bookmi/)
Multiple Imputation in STATA by UCLA IDRE (https://stats.idre.ucla.edu/stata/seminars/mi_in_stata_pt1_new/)
miceVignettes by Gerko Vink (https://www.gerkovink.com/miceVignettes/)
ASRM Unit 9: Sampling, sample selection, missing data by Mikko Ronkko (https://www.youtube.com/watch?v=jZ897LOEfKM&list=PL6tc6IBlZmOV0LRP_Hihmc0D-JwF6viA4)

Mixed-effects Regression Models:

Mixed Effects Logistic Regression | R Data Analysis Examples by UCLA IDRE (https://stats.idre.ucla.edu/r/dae/mixed-effects-logistic-regression/)
Keep Calm and Learn Multilevel Logistic Modeling: A simplified 3-Step Procedure Using Stata, R, Mplus, and SPSS by Sommet & Morselli (https://www.rips-irsp.com/articles/10.5334/irsp.90/)
ASRM Unit 7: Multilevel models by Mikko Ronkko (https://www.youtube.com/playlist?list=PL6tc6IBlZmOVEofGo-Yz4MpdOp5K8yk_T)

Parallel Processing:

Wrapper function parlMICE by Schouten & Vink (https://www.gerkovink.com/parlMICE/Vignette_parlMICE.html)
Parallel Processing in R by Josh Errickson (https://dept.stat.lsa.umich.edu/~jerrick/courses/stat701/notes/parallel.html)
How to go parallel in R - basics + tips by Max Gordon (https://gforge.se/2015/02/how-to-go-parallel-in-r-basics-tips/)

2 Understanding Data from the Missingness Perspective

Before starting, I highly recommend Stef van Buuren’s free online book Flexible Imputation of Missing Data. He provides both a theoretical understanding of multiple imputation by chained equations and a comprehensive guide of how to properly perform the procedure. He is also the primary contributor to the R library MICE, which is the core library we will be using to perform multiple imputation.

2.1 Getting Started

Start by loading all the packages we’ll need into our Rmarkdown environment, followed by our dataset.

library(dplyr)
library(mice)
library(miceadds)
library(broom.mixed)
library(lme4)
library(multilevel)
library(parallel)

load('/mnt/workspace/rstudio_projects/data/gdmt_11.11_clean.RData')

dplyr is used for data manipulation. MICE is the core library used to implement multiple imputation. It has been documented extensively by Gerko Vink and Stef van Buuren in both case-based and book formats. The source code is publicly available on GitHub. Miceadds adds quite a few univariate imputation methods, but we use it because it provides a clean way to save imputation objects (mids), which can take quite some time to produce. Boom.mixed is necessary for MICE. lme4 is used to determine whether or not a hierarchical data structure is present and is later used as the core library to implement our mixed model. Multilevel is used to determine hierarchical data structure. Parallel will be used later on in the analysis in the model fitting step.

2.2 Exploring missingness patterns

Now lets explore our data a little bit:

str(df)

## 'data.frame':    136144 obs. of  40 variables:
##  $ SITE_ID               : Factor w/ 560 levels "10001","10140",..: 89 396 261 422 102 23 309 66 123 65 ...
##  $ GDMT_HFBB             : Factor w/ 2 levels "0","1": 2 2 2 2 2 2 2 2 1 2 ...
##  $ GDMT_ACEIARB          : Factor w/ 2 levels "0","1": 2 2 2 2 2 2 1 2 2 2 ...
##  $ GDMT_ARNI             : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
##  $ GDMT_MRA              : Factor w/ 2 levels "0","1": 2 1 2 NA 1 1 1 1 2 2 ...
##  $ GDMT_BBcontra         : Factor w/ 2 levels "0","1": 1 1 NA NA 1 NA 1 1 2 1 ...
##  $ GDMT_RAASIcontra      : Factor w/ 2 levels "0","1": 1 1 2 NA 2 1 2 1 1 2 ...
##  $ GDMT_MRAcontra        : Factor w/ 2 levels "0","1": 1 1 NA NA 1 1 1 1 1 1 ...
##  $ GDMT_ARNIcontra       : Factor w/ 2 levels "0","1": 1 1 NA NA 2 1 1 1 1 2 ...
##  $ GENDERi               : Factor w/ 2 levels "male","female": 2 1 1 1 1 2 1 1 1 1 ...
##  $ race2i                : Factor w/ 5 levels "Asian","Black",..: 5 5 5 2 5 5 5 5 3 3 ...
##  $ insurance             : Factor w/ 4 levels "Medicaid","Medicare",..: 2 3 1 NA 2 2 2 3 3 3 ...
##  $ AGEi                  : num  63 47 57 54 94 78 85 73 38 75 ...
##  $ PMHx_none             : Factor w/ 2 levels "0","1": 1 1 1 NA 1 1 1 1 1 1 ...
##  $ PMHx_CHF              : Factor w/ 2 levels "0","1": 2 1 2 NA 2 1 2 2 2 2 ...
##  $ PMHx_ESRD             : Factor w/ 2 levels "0","1": 1 1 1 NA 1 1 1 1 2 1 ...
##  $ PMHx_Cr2              : Factor w/ 2 levels "0","1": 1 1 1 NA 1 2 1 1 2 1 ...
##  $ PMHx_PPMICDCRT        : Factor w/ 2 levels "0","1": 2 1 1 NA 2 1 2 2 1 1 ...
##  $ PMHx_fibfl            : Factor w/ 2 levels "0","1": 1 1 1 NA 2 1 1 1 1 1 ...
##  $ GH_HEART_TRANSPLANTS  : Factor w/ 2 levels "0","1": 2 1 1 1 1 1 1 2 2 1 ...
##  $ RESIDENTS             : Factor w/ 2 levels "0","1": 2 1 1 1 2 2 1 2 2 2 ...
##  $ OH_EF                 : num  30 27 33 14 20 35 37 21 22 20 ...
##  $ OH_TRANSPLANT         : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
##  $ covar_ino             : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
##  $ inhospRx_BB           : Factor w/ 2 levels "0","1": NA NA 2 NA 2 2 NA NA 1 NA ...
##  $ inhospRx_MRA          : Factor w/ 2 levels "0","1": NA NA 2 NA 1 1 NA NA 2 NA ...
##  $ inhospRx_none         : Factor w/ 2 levels "0","1": NA NA 1 NA 1 1 NA NA 1 NA ...
##  $ inhospRx_ARNI         : Factor w/ 2 levels "0","1": NA NA 1 NA 1 1 NA NA 1 NA ...
##  $ inhospRx_ACEIARB      : Factor w/ 2 levels "0","1": NA NA 2 NA 2 1 NA NA 2 NA ...
##  $ covar_Kdisc           : num  NA NA NA NA 4.5 4.1 NA NA 5.1 NA ...
##  $ covar_Crdisc          : num  NA NA NA NA 1.2 1.6 NA NA 6.7 NA ...
##  $ OH_DISCBPSYST         : num  NA NA 111 NA 142 139 NA NA 119 NA ...
##  $ OH_DISCHR             : num  NA NA NA NA 59 73 NA NA 79 NA ...
##  $ zip_designation       : Factor w/ 4 levels "Rural","SmallTown",..: NA NA NA 3 3 2 NA NA 4 NA ...
##  $ population_total      : num  NA NA NA 35641 7223 ...
##  $ distress_score        : num  NA NA NA 49.2 47.9 ...
##  $ covar_OutsideZIPregion: Factor w/ 2 levels "0","1": NA NA NA 1 2 2 NA NA 2 NA ...
##  $ followup              : Factor w/ 2 levels "0","1": 2 2 2 2 2 NA 2 2 2 2 ...
##  $ OH_ARNI               : Factor w/ 2 levels "0","1": NA NA NA NA 1 1 NA NA 1 NA ...
##  $ DCtoContCare          : Factor w/ 2 levels "0","1": NA NA NA NA 1 2 NA NA 1 2 ...

#how many missing in each variable? 
lapply(df, function(x) 
       sum(is.na(x))
       )

## $SITE_ID
## [1] 0
## 
## $GDMT_HFBB
## [1] 3024
## 
## $GDMT_ACEIARB
## [1] 513
## 
## $GDMT_ARNI
## [1] 0
## 
## $GDMT_MRA
## [1] 7266
## 
## $GDMT_BBcontra
## [1] 25857
## 
## $GDMT_RAASIcontra
## [1] 2128
## 
## $GDMT_MRAcontra
## [1] 14237
## 
## $GDMT_ARNIcontra
## [1] 15122
## 
## $GENDERi
## [1] 322
## 
## $race2i
## [1] 0
## 
## $insurance
## [1] 9060
## 
## $AGEi
## [1] 0
## 
## $PMHx_none
## [1] 8504
## 
## $PMHx_CHF
## [1] 8504
## 
## $PMHx_ESRD
## [1] 8504
## 
## $PMHx_Cr2
## [1] 8504
## 
## $PMHx_PPMICDCRT
## [1] 8504
## 
## $PMHx_fibfl
## [1] 8504
## 
## $GH_HEART_TRANSPLANTS
## [1] 0
## 
## $RESIDENTS
## [1] 0
## 
## $OH_EF
## [1] 0
## 
## $OH_TRANSPLANT
## [1] 0
## 
## $covar_ino
## [1] 0
## 
## $inhospRx_BB
## [1] 55647
## 
## $inhospRx_MRA
## [1] 55647
## 
## $inhospRx_none
## [1] 55647
## 
## $inhospRx_ARNI
## [1] 55647
## 
## $inhospRx_ACEIARB
## [1] 55647
## 
## $covar_Kdisc
## [1] 96076
## 
## $covar_Crdisc
## [1] 94145
## 
## $OH_DISCBPSYST
## [1] 57867
## 
## $OH_DISCHR
## [1] 59064
## 
## $zip_designation
## [1] 59969
## 
## $population_total
## [1] 59969
## 
## $distress_score
## [1] 59969
## 
## $covar_OutsideZIPregion
## [1] 58122
## 
## $followup
## [1] 7650
## 
## $OH_ARNI
## [1] 91588
## 
## $DCtoContCare
## [1] 73568

#how many complete cases? 
sum(complete.cases(df))

## [1] 23687

Note that missingness ranges from 0 to 70%. To understand how this missingness will impact our statistical analysis, we need to better understand the mechanism of missingness. Data that are “missing completely at random” (MCAR) are frequently amenable to “simple” fixes such as complete case analysis (also known as listwise deletion). Data that are “missing at random” (MAR) are amenable to multiple imputation in most cases, or univariate imputation if the data is monotone. Data that are “missing not at random” (MNAR) present the most complex scenario and frequently require additional research into the cause of data missingness. Blindly imputing MNAR data can result in inaccurate estimates in the final model.
MICE has several built-in methods for diagnosing the mechanism of missingness, but due to the size of the GWTG-HF dataset, the visual assessments are not practical.

#what is the pattern of data missingness? 
md.pattern(df, plot=FALSE)

We have intentionally hidden the results of the above code, as the output makes the Rmarkdown somewhat unreadable. The first column indicates the number of observations with that pattern of missingness, the subsequent columns indicate if the associated variable is missing or not, and the final column shows the number of variables that are missing for that particular pattern. Because there are 40 variables and over 1500 patterns of missingness, this matrix is rather large, but is nevertheless worth reviewing in detail. After review, the data appear to be MAR, and the missingess pattern is not monotone. In addition, we can surmise that missingness happens at a systemic level, with some sites simply not reporting entire groups of variables, such as patients’ medical comorbidities, their discharge labs/vitals, etc. Such a pattern is amenable to multiple imputation by chained equations.

2.3 Exploring multilevel data structures

When interpreting data in the context of missingness, we also need to consider if a data has a multilevel structure (also referred to as hierarchical or clustered data). We know that patients are nested within hospitals, and it’s reasonable to suspect that culture, resources, and other unobserved variables at each site are associated with the characteristics of the patients they treat. We can assess this quantitatively by looking at the intraclass correlation coefficient (ICC) for incomplete variables.

#Find the variables with missing data and split the variable names 
#into lists based on class (factor vs numeric)

varlist <- colnames(df)
hasmiss <- list() 
miss_f <- list() 
miss_n <- list() 

for(i in 1:ncol(df)) { 
        if(with(df,sum(is.na(eval(as.name(varlist[i]))))) > 0) {
                hasmiss <- append(hasmiss, varlist[i])
        }
}

for(i in 1:length(hasmiss)) { 
        if (with(df,class(eval(as.name(hasmiss[[i]])))) == "factor") {
                miss_f <- append(miss_f, hasmiss[i])
        } else {
                miss_n <- append(miss_n, hasmiss[i])
        }
}

rm(varlist, hasmiss)

#calculate the ICC for each variable. Note the models used to calculate ICC are dependent 
#on the dependent variable's class (ANOVA for numeric variables, unconditional mean model 
#for factor variables)

fun_f <- function(i) {
        
        m0 <- glmer(eval(as.name(i)) ~ (1|SITE_ID),
                    data=df,
                    family='binomial',
                    control=glmerControl(optimizer='bobyqa', optCtrl=list(maxfun=2e5)))
        
        ICC <- m0@theta[1]^2/(m0@theta[1]^2 + (3.14159^2/3))
        
        print(i)
        print(ICC)
}

fun_n <- function(i) {
        
        ICC <- multilevel::ICC1(aov(eval(as.name(i)) ~ SITE_ID, data=df))
        
        print(i)
        print(ICC)
}

suppress <- lapply(miss_f, fun_f)
suppress <- lapply(miss_n, fun_n)

rm(miss_n, miss_f, fun_n, fun_f, suppress)

There is no cutoff for a “significant” ICC, but we see that SITE_ID alone explains more than 30% of the variance seen in many of the variables to be imputed; hence, clustering cannot be ignored.
To summarize what we have understood about our data up to this point:

The data seems to missing at random (MAR)
Missingness is not monotone
The data has a hierarchical structure with patients nested within sites

Any imputation technique we apply our data to needs to account for these three factors. Because the data is MAR and not monotone, multiple imputation arises as an attractive option for imputation. Because the data is hierarchical we will need to apply a multilevel multiple imputation model.

3 Multilevel Multiple Imputation via Fully Conditional Specification

We will implement our multilevel multiple imputation model via fully conditional specification (FCS). Stef van Buuren’s book comprehensively explains the theory and basic steps for this procedure, while the vignette on multilevel imputation by Gerko Vink provides a nice walk through. We adapt many techniques and terminology from these two sources.
Very broadly speaking, multiple imputation can be broken down into 3 steps, and these steps provide a nice outline for how we will approach the procedure. The steps are as follows:

The imputation step:
We estimate the distribution of missing values (we are NOT imputing point values). Random draws from that distribution are imputed into the missing value slots and this process is repeated multiple times to create multiple datasets with imputed values, whose point values form the distributions we estimated from our initial models. Each of these datasets with imputed values is referred to as an imputation.
We opted to estimate the distributions of the missing values through fully conditional specification (as opposed to joint modeling). In FCS we create a series of models for each independent variable with missing values and flip the models around so that the independent variable with missingess is used as the dependent variable for that model. For example, we saw previously that the variable GDMT_HFBB has 25,857 missing values. We will create a model with GDMT_BBcontra as the dependent variable, and use this model to estimate the distribution of missing values for GDMT_BBcontra. Note that the ICC for this model was 0.360 so the imputation model will need to take this clustering into account.
The fitting step:
We create a model to test our hypothesis, which in our case is testing if a patient is prescribed an ARNI at hospital discharge (GDMT_ARNI). Keep in mind that we may have to take a hierarchical data structure into account. This model is applied to every one of the iterations generated in the imputation step.
The pooling step:
We pool the results of each of the iteration’s models using Rubin’s rules to generate our final model.

3.1 Mulitiple Imputation using GLM-based Imputation Methods (unsuccessful)

In this section we describe how we implemented FCS using generalized linear models. The imputation model ultimately did not converge, so this approach was ultimately abandoned. Nevertheless, we learned much about multiple imputation through this failure, and so present it anyways.

3.1.1 Selecting imputation methods

The first task of the imputation step is to specify the kind of statistical model we are going to use to predict each of our independent variables with missingness. In MICE, this is referred to as choosing the imputation method. This is largely going to depend on the class of that variable (numeric vs factor/binary vs factor/nominal vs factor/ordinal), but it is also determined by the presence of a hierarchical data structure. The presence of clustering in our data greatly complicates multiple imputation, but luckily the MICE library comes with several imputation methods specifically for hierarchical data with 2 levels. These methods typically start with the prefix “2l.”
Note that there are no 2l methods for ordinal or nominal outcomes. In our case this caused a problem for the variables insurance and zip_designation. Our work around was to create binary variables based on nominal variables and impute the binary variable instead. We retained as much information as possible by continuing to use nominal variables as dependent variables in other imputation models. For some nominal variables, like zip_designation, creating a binary derivative was not possible, so it had to be dropped entirely from analysis.

#convert nominal variables to binary 
df <- df %>% 
  mutate(insured = ifelse(insurance %in% 'Medicaid' | 
                          insurance %in% 'Medicare' | 
                          insurance %in% 'Other', 1,  
                   ifelse(insurance %in% 'none', 0, 
                   ifelse(is.na(insurance), NA, 999))))
df$insured <- as.factor(df$insured)

We have now done sufficient set up to continue setting up imputation methods. First we set the cluster variable, which in our case is SITE_ID. Interestingly, MICE requires that cluster variables be changed to numeric class.

#set the cluster variable (note the cluster var must be numeric)
df$SITE_ID <- as.numeric(df$SITE_ID)
cluster_var=df$SITE_ID

Next we create the imputation vector.

impmethod <- character(ncol(df))
names(impmethod) <- colnames(df)

We then assign imputation methods to each variable based on its class. We arbitrarily chose an ICC < 0.05 as representing a negligible clustering effect, and this allows us to use a non-2l imputation method for variables that meet this criteria. This is desirable because all 2l methods call glmer() and assign the cluster variable a random intercept, which becomes computationally expensive for the complex models we use for imputation.

impmethod['SITE_ID'] <- "" 

impmethod['GDMT_HFBB'] <- "2l.binary" 
impmethod['GDMT_ACEIARB'] <- "2l.binary" 
impmethod['GDMT_ARNI'] <- "" 
impmethod['GDMT_MRA'] <- "2l.binary"

impmethod['GDMT_BBcontra'] <- "2l.binary"
impmethod['GDMT_RAASIcontra'] <- "2l.binary"
impmethod['GDMT_MRAcontra'] <- "2l.binary" 
impmethod['GDMT_ARNIcontra'] <- "2l.binary" 

impmethod['GENDERi'] <- "logreg" 
impmethod['race2i'] <- ""
impmethod['insurance'] <- ""
impmethod['AGEi'] <- ""

impmethod['PMHx_none'] <- "2l.binary" 
impmethod['PMHx_CHF'] <- "2l.binary" 
impmethod['PMHx_ESRD'] <- "2l.binary"
impmethod['PMHx_Cr2'] <- "2l.glm.bin"
impmethod['PMHx_PPMICDCRT'] <- "2l.binary"
impmethod['PMHx_fibfl'] <- "logreg"

impmethod['GH_HEART_TRANSPLANTS'] <- ""
impmethod['RESIDENTS'] <- ""

impmethod['OH_EF'] <- ""
impmethod['OH_TRANSPLANT'] <- ""
impmethod['covar_ino'] <- ""

impmethod['inhospRx_BB'] <- "2l.binary"
impmethod['inhospRx_MRA'] <- "2l.binary"
impmethod['inhospRx_none'] <- "2l.binary"
impmethod['inhospRx_ARNI'] <- "2l.binary"
impmethod['inhospRx_ACEIARB'] <- "2l.binary"

impmethod['covar_Kdisc'] <- "pmm"
impmethod['covar_Crdisc'] <- "pmm"
impmethod['OH_DISCBPSYST'] <- "pmm"
impmethod['OH_DISCHR'] <- "pmm"

impmethod['zip_designation'] <- ""
impmethod['population_total'] <- "2l.pmm"
impmethod['distress_score'] <- "2l.pmm"

impmethod['covar_OutsideZIPregion'] <- "2l.binary"
impmethod['followup'] <- "2l.binary"
impmethod['DCtoContCare'] <- "2l.binary"
impmethod['insured'] <- "2l.binary"

print(impmethod)

##                SITE_ID              GDMT_HFBB           GDMT_ACEIARB 
##                     ""            "2l.binary"            "2l.binary" 
##              GDMT_ARNI               GDMT_MRA          GDMT_BBcontra 
##                     ""            "2l.binary"            "2l.binary" 
##       GDMT_RAASIcontra         GDMT_MRAcontra        GDMT_ARNIcontra 
##            "2l.binary"            "2l.binary"            "2l.binary" 
##                GENDERi                 race2i              insurance 
##               "logreg"                     ""                     "" 
##                   AGEi              PMHx_none               PMHx_CHF 
##                     ""            "2l.binary"            "2l.binary" 
##              PMHx_ESRD               PMHx_Cr2         PMHx_PPMICDCRT 
##            "2l.binary"           "2l.glm.bin"            "2l.binary" 
##             PMHx_fibfl   GH_HEART_TRANSPLANTS              RESIDENTS 
##               "logreg"                     ""                     "" 
##                  OH_EF          OH_TRANSPLANT              covar_ino 
##                     ""                     ""                     "" 
##            inhospRx_BB           inhospRx_MRA          inhospRx_none 
##            "2l.binary"            "2l.binary"            "2l.binary" 
##          inhospRx_ARNI       inhospRx_ACEIARB            covar_Kdisc 
##            "2l.binary"            "2l.binary"                  "pmm" 
##           covar_Crdisc          OH_DISCBPSYST              OH_DISCHR 
##                  "pmm"                  "pmm"                  "pmm" 
##        zip_designation       population_total         distress_score 
##                     ""               "2l.pmm"               "2l.pmm" 
## covar_OutsideZIPregion               followup                OH_ARNI 
##            "2l.binary"            "2l.binary"                     "" 
##           DCtoContCare                insured 
##            "2l.binary"            "2l.binary"

The default imputation method for each variable is "“, which indicates to MICE that an imputation model should not be fit to that variable. We explicitly assigned”" where appropriate to make sure that we considered every variable in the dataset, but an explicit assignment was not necessary. Notice that variables might be assigned "" because they have no missing values or because they are nominal. There is value in keeping nominal variables in the dataset because they can still be used to impute other variables, even though they themselves cannot be imputed.

3.1.2 Specifying the predictor matrix

The next step required in FCS is to determine which variables belong in each imputation model. This is done through the prediction matrix, which is created via the following:

pm <- make.predictorMatrix(df)
print(pm)

##                        SITE_ID GDMT_HFBB GDMT_ACEIARB GDMT_ARNI GDMT_MRA
## SITE_ID                      0         1            1         1        1
## GDMT_HFBB                    1         0            1         1        1
## GDMT_ACEIARB                 1         1            0         1        1
## GDMT_ARNI                    1         1            1         0        1
## GDMT_MRA                     1         1            1         1        0
## GDMT_BBcontra                1         1            1         1        1
## GDMT_RAASIcontra             1         1            1         1        1
## GDMT_MRAcontra               1         1            1         1        1
## GDMT_ARNIcontra              1         1            1         1        1
## GENDERi                      1         1            1         1        1
## race2i                       1         1            1         1        1
## insurance                    1         1            1         1        1
## AGEi                         1         1            1         1        1
## PMHx_none                    1         1            1         1        1
## PMHx_CHF                     1         1            1         1        1
## PMHx_ESRD                    1         1            1         1        1
## PMHx_Cr2                     1         1            1         1        1
## PMHx_PPMICDCRT               1         1            1         1        1
## PMHx_fibfl                   1         1            1         1        1
## GH_HEART_TRANSPLANTS         1         1            1         1        1
## RESIDENTS                    1         1            1         1        1
## OH_EF                        1         1            1         1        1
## OH_TRANSPLANT                1         1            1         1        1
## covar_ino                    1         1            1         1        1
## inhospRx_BB                  1         1            1         1        1
## inhospRx_MRA                 1         1            1         1        1
## inhospRx_none                1         1            1         1        1
## inhospRx_ARNI                1         1            1         1        1
## inhospRx_ACEIARB             1         1            1         1        1
## covar_Kdisc                  1         1            1         1        1
## covar_Crdisc                 1         1            1         1        1
## OH_DISCBPSYST                1         1            1         1        1
## OH_DISCHR                    1         1            1         1        1
## zip_designation              1         1            1         1        1
## population_total             1         1            1         1        1
## distress_score               1         1            1         1        1
## covar_OutsideZIPregion       1         1            1         1        1
## followup                     1         1            1         1        1
## OH_ARNI                      1         1            1         1        1
## DCtoContCare                 1         1            1         1        1
## insured                      1         1            1         1        1
##                        GDMT_BBcontra GDMT_RAASIcontra GDMT_MRAcontra
## SITE_ID                            1                1              1
## GDMT_HFBB                          1                1              1
## GDMT_ACEIARB                       1                1              1
## GDMT_ARNI                          1                1              1
## GDMT_MRA                           1                1              1
## GDMT_BBcontra                      0                1              1
## GDMT_RAASIcontra                   1                0              1
## GDMT_MRAcontra                     1                1              0
## GDMT_ARNIcontra                    1                1              1
## GENDERi                            1                1              1
## race2i                             1                1              1
## insurance                          1                1              1
## AGEi                               1                1              1
## PMHx_none                          1                1              1
## PMHx_CHF                           1                1              1
## PMHx_ESRD                          1                1              1
## PMHx_Cr2                           1                1              1
## PMHx_PPMICDCRT                     1                1              1
## PMHx_fibfl                         1                1              1
## GH_HEART_TRANSPLANTS               1                1              1
## RESIDENTS                          1                1              1
## OH_EF                              1                1              1
## OH_TRANSPLANT                      1                1              1
## covar_ino                          1                1              1
## inhospRx_BB                        1                1              1
## inhospRx_MRA                       1                1              1
## inhospRx_none                      1                1              1
## inhospRx_ARNI                      1                1              1
## inhospRx_ACEIARB                   1                1              1
## covar_Kdisc                        1                1              1
## covar_Crdisc                       1                1              1
## OH_DISCBPSYST                      1                1              1
## OH_DISCHR                          1                1              1
## zip_designation                    1                1              1
## population_total                   1                1              1
## distress_score                     1                1              1
## covar_OutsideZIPregion             1                1              1
## followup                           1                1              1
## OH_ARNI                            1                1              1
## DCtoContCare                       1                1              1
## insured                            1                1              1
##                        GDMT_ARNIcontra GENDERi race2i insurance AGEi PMHx_none
## SITE_ID                              1       1      1         1    1         1
## GDMT_HFBB                            1       1      1         1    1         1
## GDMT_ACEIARB                         1       1      1         1    1         1
## GDMT_ARNI                            1       1      1         1    1         1
## GDMT_MRA                             1       1      1         1    1         1
## GDMT_BBcontra                        1       1      1         1    1         1
## GDMT_RAASIcontra                     1       1      1         1    1         1
## GDMT_MRAcontra                       1       1      1         1    1         1
## GDMT_ARNIcontra                      0       1      1         1    1         1
## GENDERi                              1       0      1         1    1         1
## race2i                               1       1      0         1    1         1
## insurance                            1       1      1         0    1         1
## AGEi                                 1       1      1         1    0         1
## PMHx_none                            1       1      1         1    1         0
## PMHx_CHF                             1       1      1         1    1         1
## PMHx_ESRD                            1       1      1         1    1         1
## PMHx_Cr2                             1       1      1         1    1         1
## PMHx_PPMICDCRT                       1       1      1         1    1         1
## PMHx_fibfl                           1       1      1         1    1         1
## GH_HEART_TRANSPLANTS                 1       1      1         1    1         1
## RESIDENTS                            1       1      1         1    1         1
## OH_EF                                1       1      1         1    1         1
## OH_TRANSPLANT                        1       1      1         1    1         1
## covar_ino                            1       1      1         1    1         1
## inhospRx_BB                          1       1      1         1    1         1
## inhospRx_MRA                         1       1      1         1    1         1
## inhospRx_none                        1       1      1         1    1         1
## inhospRx_ARNI                        1       1      1         1    1         1
## inhospRx_ACEIARB                     1       1      1         1    1         1
## covar_Kdisc                          1       1      1         1    1         1
## covar_Crdisc                         1       1      1         1    1         1
## OH_DISCBPSYST                        1       1      1         1    1         1
## OH_DISCHR                            1       1      1         1    1         1
## zip_designation                      1       1      1         1    1         1
## population_total                     1       1      1         1    1         1
## distress_score                       1       1      1         1    1         1
## covar_OutsideZIPregion               1       1      1         1    1         1
## followup                             1       1      1         1    1         1
## OH_ARNI                              1       1      1         1    1         1
## DCtoContCare                         1       1      1         1    1         1
## insured                              1       1      1         1    1         1
##                        PMHx_CHF PMHx_ESRD PMHx_Cr2 PMHx_PPMICDCRT PMHx_fibfl
## SITE_ID                       1         1        1              1          1
## GDMT_HFBB                     1         1        1              1          1
## GDMT_ACEIARB                  1         1        1              1          1
## GDMT_ARNI                     1         1        1              1          1
## GDMT_MRA                      1         1        1              1          1
## GDMT_BBcontra                 1         1        1              1          1
## GDMT_RAASIcontra              1         1        1              1          1
## GDMT_MRAcontra                1         1        1              1          1
## GDMT_ARNIcontra               1         1        1              1          1
## GENDERi                       1         1        1              1          1
## race2i                        1         1        1              1          1
## insurance                     1         1        1              1          1
## AGEi                          1         1        1              1          1
## PMHx_none                     1         1        1              1          1
## PMHx_CHF                      0         1        1              1          1
## PMHx_ESRD                     1         0        1              1          1
## PMHx_Cr2                      1         1        0              1          1
## PMHx_PPMICDCRT                1         1        1              0          1
## PMHx_fibfl                    1         1        1              1          0
## GH_HEART_TRANSPLANTS          1         1        1              1          1
## RESIDENTS                     1         1        1              1          1
## OH_EF                         1         1        1              1          1
## OH_TRANSPLANT                 1         1        1              1          1
## covar_ino                     1         1        1              1          1
## inhospRx_BB                   1         1        1              1          1
## inhospRx_MRA                  1         1        1              1          1
## inhospRx_none                 1         1        1              1          1
## inhospRx_ARNI                 1         1        1              1          1
## inhospRx_ACEIARB              1         1        1              1          1
## covar_Kdisc                   1         1        1              1          1
## covar_Crdisc                  1         1        1              1          1
## OH_DISCBPSYST                 1         1        1              1          1
## OH_DISCHR                     1         1        1              1          1
## zip_designation               1         1        1              1          1
## population_total              1         1        1              1          1
## distress_score                1         1        1              1          1
## covar_OutsideZIPregion        1         1        1              1          1
## followup                      1         1        1              1          1
## OH_ARNI                       1         1        1              1          1
## DCtoContCare                  1         1        1              1          1
## insured                       1         1        1              1          1
##                        GH_HEART_TRANSPLANTS RESIDENTS OH_EF OH_TRANSPLANT
## SITE_ID                                   1         1     1             1
## GDMT_HFBB                                 1         1     1             1
## GDMT_ACEIARB                              1         1     1             1
## GDMT_ARNI                                 1         1     1             1
## GDMT_MRA                                  1         1     1             1
## GDMT_BBcontra                             1         1     1             1
## GDMT_RAASIcontra                          1         1     1             1
## GDMT_MRAcontra                            1         1     1             1
## GDMT_ARNIcontra                           1         1     1             1
## GENDERi                                   1         1     1             1
## race2i                                    1         1     1             1
## insurance                                 1         1     1             1
## AGEi                                      1         1     1             1
## PMHx_none                                 1         1     1             1
## PMHx_CHF                                  1         1     1             1
## PMHx_ESRD                                 1         1     1             1
## PMHx_Cr2                                  1         1     1             1
## PMHx_PPMICDCRT                            1         1     1             1
## PMHx_fibfl                                1         1     1             1
## GH_HEART_TRANSPLANTS                      0         1     1             1
## RESIDENTS                                 1         0     1             1
## OH_EF                                     1         1     0             1
## OH_TRANSPLANT                             1         1     1             0
## covar_ino                                 1         1     1             1
## inhospRx_BB                               1         1     1             1
## inhospRx_MRA                              1         1     1             1
## inhospRx_none                             1         1     1             1
## inhospRx_ARNI                             1         1     1             1
## inhospRx_ACEIARB                          1         1     1             1
## covar_Kdisc                               1         1     1             1
## covar_Crdisc                              1         1     1             1
## OH_DISCBPSYST                             1         1     1             1
## OH_DISCHR                                 1         1     1             1
## zip_designation                           1         1     1             1
## population_total                          1         1     1             1
## distress_score                            1         1     1             1
## covar_OutsideZIPregion                    1         1     1             1
## followup                                  1         1     1             1
## OH_ARNI                                   1         1     1             1
## DCtoContCare                              1         1     1             1
## insured                                   1         1     1             1
##                        covar_ino inhospRx_BB inhospRx_MRA inhospRx_none
## SITE_ID                        1           1            1             1
## GDMT_HFBB                      1           1            1             1
## GDMT_ACEIARB                   1           1            1             1
## GDMT_ARNI                      1           1            1             1
## GDMT_MRA                       1           1            1             1
## GDMT_BBcontra                  1           1            1             1
## GDMT_RAASIcontra               1           1            1             1
## GDMT_MRAcontra                 1           1            1             1
## GDMT_ARNIcontra                1           1            1             1
## GENDERi                        1           1            1             1
## race2i                         1           1            1             1
## insurance                      1           1            1             1
## AGEi                           1           1            1             1
## PMHx_none                      1           1            1             1
## PMHx_CHF                       1           1            1             1
## PMHx_ESRD                      1           1            1             1
## PMHx_Cr2                       1           1            1             1
## PMHx_PPMICDCRT                 1           1            1             1
## PMHx_fibfl                     1           1            1             1
## GH_HEART_TRANSPLANTS           1           1            1             1
## RESIDENTS                      1           1            1             1
## OH_EF                          1           1            1             1
## OH_TRANSPLANT                  1           1            1             1
## covar_ino                      0           1            1             1
## inhospRx_BB                    1           0            1             1
## inhospRx_MRA                   1           1            0             1
## inhospRx_none                  1           1            1             0
## inhospRx_ARNI                  1           1            1             1
## inhospRx_ACEIARB               1           1            1             1
## covar_Kdisc                    1           1            1             1
## covar_Crdisc                   1           1            1             1
## OH_DISCBPSYST                  1           1            1             1
## OH_DISCHR                      1           1            1             1
## zip_designation                1           1            1             1
## population_total               1           1            1             1
## distress_score                 1           1            1             1
## covar_OutsideZIPregion         1           1            1             1
## followup                       1           1            1             1
## OH_ARNI                        1           1            1             1
## DCtoContCare                   1           1            1             1
## insured                        1           1            1             1
##                        inhospRx_ARNI inhospRx_ACEIARB covar_Kdisc covar_Crdisc
## SITE_ID                            1                1           1            1
## GDMT_HFBB                          1                1           1            1
## GDMT_ACEIARB                       1                1           1            1
## GDMT_ARNI                          1                1           1            1
## GDMT_MRA                           1                1           1            1
## GDMT_BBcontra                      1                1           1            1
## GDMT_RAASIcontra                   1                1           1            1
## GDMT_MRAcontra                     1                1           1            1
## GDMT_ARNIcontra                    1                1           1            1
## GENDERi                            1                1           1            1
## race2i                             1                1           1            1
## insurance                          1                1           1            1
## AGEi                               1                1           1            1
## PMHx_none                          1                1           1            1
## PMHx_CHF                           1                1           1            1
## PMHx_ESRD                          1                1           1            1
## PMHx_Cr2                           1                1           1            1
## PMHx_PPMICDCRT                     1                1           1            1
## PMHx_fibfl                         1                1           1            1
## GH_HEART_TRANSPLANTS               1                1           1            1
## RESIDENTS                          1                1           1            1
## OH_EF                              1                1           1            1
## OH_TRANSPLANT                      1                1           1            1
## covar_ino                          1                1           1            1
## inhospRx_BB                        1                1           1            1
## inhospRx_MRA                       1                1           1            1
## inhospRx_none                      1                1           1            1
## inhospRx_ARNI                      0                1           1            1
## inhospRx_ACEIARB                   1                0           1            1
## covar_Kdisc                        1                1           0            1
## covar_Crdisc                       1                1           1            0
## OH_DISCBPSYST                      1                1           1            1
## OH_DISCHR                          1                1           1            1
## zip_designation                    1                1           1            1
## population_total                   1                1           1            1
## distress_score                     1                1           1            1
## covar_OutsideZIPregion             1                1           1            1
## followup                           1                1           1            1
## OH_ARNI                            1                1           1            1
## DCtoContCare                       1                1           1            1
## insured                            1                1           1            1
##                        OH_DISCBPSYST OH_DISCHR zip_designation population_total
## SITE_ID                            1         1               1                1
## GDMT_HFBB                          1         1               1                1
## GDMT_ACEIARB                       1         1               1                1
## GDMT_ARNI                          1         1               1                1
## GDMT_MRA                           1         1               1                1
## GDMT_BBcontra                      1         1               1                1
## GDMT_RAASIcontra                   1         1               1                1
## GDMT_MRAcontra                     1         1               1                1
## GDMT_ARNIcontra                    1         1               1                1
## GENDERi                            1         1               1                1
## race2i                             1         1               1                1
## insurance                          1         1               1                1
## AGEi                               1         1               1                1
## PMHx_none                          1         1               1                1
## PMHx_CHF                           1         1               1                1
## PMHx_ESRD                          1         1               1                1
## PMHx_Cr2                           1         1               1                1
## PMHx_PPMICDCRT                     1         1               1                1
## PMHx_fibfl                         1         1               1                1
## GH_HEART_TRANSPLANTS               1         1               1                1
## RESIDENTS                          1         1               1                1
## OH_EF                              1         1               1                1
## OH_TRANSPLANT                      1         1               1                1
## covar_ino                          1         1               1                1
## inhospRx_BB                        1         1               1                1
## inhospRx_MRA                       1         1               1                1
## inhospRx_none                      1         1               1                1
## inhospRx_ARNI                      1         1               1                1
## inhospRx_ACEIARB                   1         1               1                1
## covar_Kdisc                        1         1               1                1
## covar_Crdisc                       1         1               1                1
## OH_DISCBPSYST                      0         1               1                1
## OH_DISCHR                          1         0               1                1
## zip_designation                    1         1               0                1
## population_total                   1         1               1                0
## distress_score                     1         1               1                1
## covar_OutsideZIPregion             1         1               1                1
## followup                           1         1               1                1
## OH_ARNI                            1         1               1                1
## DCtoContCare                       1         1               1                1
## insured                            1         1               1                1
##                        distress_score covar_OutsideZIPregion followup OH_ARNI
## SITE_ID                             1                      1        1       1
## GDMT_HFBB                           1                      1        1       1
## GDMT_ACEIARB                        1                      1        1       1
## GDMT_ARNI                           1                      1        1       1
## GDMT_MRA                            1                      1        1       1
## GDMT_BBcontra                       1                      1        1       1
## GDMT_RAASIcontra                    1                      1        1       1
## GDMT_MRAcontra                      1                      1        1       1
## GDMT_ARNIcontra                     1                      1        1       1
## GENDERi                             1                      1        1       1
## race2i                              1                      1        1       1
## insurance                           1                      1        1       1
## AGEi                                1                      1        1       1
## PMHx_none                           1                      1        1       1
## PMHx_CHF                            1                      1        1       1
## PMHx_ESRD                           1                      1        1       1
## PMHx_Cr2                            1                      1        1       1
## PMHx_PPMICDCRT                      1                      1        1       1
## PMHx_fibfl                          1                      1        1       1
## GH_HEART_TRANSPLANTS                1                      1        1       1
## RESIDENTS                           1                      1        1       1
## OH_EF                               1                      1        1       1
## OH_TRANSPLANT                       1                      1        1       1
## covar_ino                           1                      1        1       1
## inhospRx_BB                         1                      1        1       1
## inhospRx_MRA                        1                      1        1       1
## inhospRx_none                       1                      1        1       1
## inhospRx_ARNI                       1                      1        1       1
## inhospRx_ACEIARB                    1                      1        1       1
## covar_Kdisc                         1                      1        1       1
## covar_Crdisc                        1                      1        1       1
## OH_DISCBPSYST                       1                      1        1       1
## OH_DISCHR                           1                      1        1       1
## zip_designation                     1                      1        1       1
## population_total                    1                      1        1       1
## distress_score                      0                      1        1       1
## covar_OutsideZIPregion              1                      0        1       1
## followup                            1                      1        0       1
## OH_ARNI                             1                      1        1       0
## DCtoContCare                        1                      1        1       1
## insured                             1                      1        1       1
##                        DCtoContCare insured
## SITE_ID                           1       1
## GDMT_HFBB                         1       1
## GDMT_ACEIARB                      1       1
## GDMT_ARNI                         1       1
## GDMT_MRA                          1       1
## GDMT_BBcontra                     1       1
## GDMT_RAASIcontra                  1       1
## GDMT_MRAcontra                    1       1
## GDMT_ARNIcontra                   1       1
## GENDERi                           1       1
## race2i                            1       1
## insurance                         1       1
## AGEi                              1       1
## PMHx_none                         1       1
## PMHx_CHF                          1       1
## PMHx_ESRD                         1       1
## PMHx_Cr2                          1       1
## PMHx_PPMICDCRT                    1       1
## PMHx_fibfl                        1       1
## GH_HEART_TRANSPLANTS              1       1
## RESIDENTS                         1       1
## OH_EF                             1       1
## OH_TRANSPLANT                     1       1
## covar_ino                         1       1
## inhospRx_BB                       1       1
## inhospRx_MRA                      1       1
## inhospRx_none                     1       1
## inhospRx_ARNI                     1       1
## inhospRx_ACEIARB                  1       1
## covar_Kdisc                       1       1
## covar_Crdisc                      1       1
## OH_DISCBPSYST                     1       1
## OH_DISCHR                         1       1
## zip_designation                   1       1
## population_total                  1       1
## distress_score                    1       1
## covar_OutsideZIPregion            1       1
## followup                          1       1
## OH_ARNI                           1       1
## DCtoContCare                      0       1
## insured                           1       0

Each row is an imputation model with the first column of each row indicating the variable to be imputed. The columns are potential indpendent variables to be included in that imputation model. 0’s indicate the variable in that column will be excluded from the imputation model, while 1 indicates the variable be assigned a fixed effect. Note that 0’s are automatically assigned to the matrix diagonal as variables should not be used to impute themselves. Similarly, note that every other variable has been assigned a fixed effect (“1”) by default.
As a general rule, every variable that is going to be used in the final model needs to be included in the imputation model. The only time to definitely exclude a variable is if it is collinear with another variable. This will definitely occur for “insured”, the binary variable we generated from nominal variable “insurance” a few steps ago. Hence, we need to remove “insurance” from the imputation model for “insured”. Similarly “insurance” provides more information than “insured” so we use “insurance” as an independent variable in all other imputation models instead of “insured”. We can arrange this via the following:

pm['insured','insurance'] <- 0
pm[,'insured'] <- 0

Outside of variables that induce collinearity, we usually want to include as much data as possible when creating imputation models, even if that variable is not going to be used in your final model. You may need to exclude variables from imputation models when the imputation models have become exceedingly complex (oversaturated), but the GTWG-HF dataset is sufficiently large that this is not necessarily. If your dataset is not large, remember that you should never exclude any variable from your imputation models that you plan to include in your final model.
The final set up required before initiating imputation is to set the cluster variable within the predictor matrix. This is done by assigning the cluster variable a value of -2 in all imputation models.

pm[,'SITE_ID'] <- -2
pm['SITE_ID','SITE_ID'] <- 0

3.1.3 Do a test run of your imputation model

We have now done all the set up required for multiple imputation, but before implementing the full imputation process it is advisable to carry out some a test case first, just to make sure the models have been specified correctly. As a test run, I set the number of imputations (m) to 1 and the iteration number (maxit) to 2. Verbose will allow us to see where we are in the computation process. control1 and control2 are options for glmer() (for hierchical models) and lmer() (for non-hierarchical models), respectively. Setting these parameters can help reduce run time. Most importantly, make sure to specify nAGQ=0; this may reduce the accuracy of estimates, but will greatly reduce run time. It is inadvisable to set nAGQ to 0 for the actual model, but doing so for test run shouldn’t cause a problem. Finally, we have the computer keep track of how long it takes the process to run using Sys.time().

start_time <- Sys.time() 
test <- mice(df, m=1,maxit=2, 
             predictorMatrix=pm, method=impmethod, cluster_var=cluster_var, 
             nAGQ=0, seed=88, verbose=2,
             control1=glmerControl(optimizer='bobyqa', optCtrl=list(maxfun=2e5), calc.derivs=TRUE),
             control2=lmerControl(optimizer='nloptwrap', optCtrl=list(maxfun=2e5)))
end_time <- Sys.time() 
end_time - start_time

This will produce a mids object, which is a series of datasets containing imputed values. We can check to see if our model has been specified properly by looking at the trace plots.

plot(test)

We have provided some screen shots of a few of the plots. At minimum make sure the plots are not stationary for at least one of each kind of imputation method you used.

TO DO : insert screen shots of test plots

We see that there are no stationary plots, so at least our model is not exhibiting flat line behavior. We won’t know if the imputations are reasonable until we create the full model. However, note the above task took 2 hours for a single imputation with only 2 iterations. Our full model will require at least 45 imputations with >10 iterations- this would take weeks. To shorten our computational time, we can use parallel processing. R is single-threaded, meaning it will only use a single core for processing tasks, no matter how complex they maybe. Luckily, the imputations performed by MICE are independent of each other, so each imputation can be assigned to a different core in our machine and the datasets can be combined into a single object at that the end. This has been written into MICE using the wrapper function parlMICE(). Hence, we can specify our full imputation task as follows:

start_time <- Sys.time() 
print(start_time)
impobj <- parlmice(df, n.core=46, n.imp.core = 1, cl.type="FORK",
                   maxit=11, predictorMatrix=pm, method=impmethod,
                   cluster_var=cluster_var, 
                   nAGQ=0, cluster.seed=88, 
                   control1=glmerControl(optimizer='bobyqa', optCtrl=list(maxfun=2e5), calc.derivs=TRUE),
                   control2=lmerControl(optimizer='nloptwrap', optCtrl=list(maxfun=2e5)))
end_time <- Sys.time() 
end_time - start_time

Our machine had 48 cores, and in general you should use no more than n-1 of the cores on your machine or you’ll risk being unable to do anything else while the task is running. Additionally cluster type (cl.type) is going to depend on the operating system of your machine. If you are using Windows, you will specify PSOCK, whereas on Linux machines, FORK is preferable. The number of imputations (equivalent to “m” in the non-parallel version of mice()) R will perform is equal to n.core * n.imp.core. We specified that each core only run 1 imputation as we wanted to reduce computation time as much as possible, but if your machine has fewer cores than you the number of imputations you want to run, you can increase the n.imp.core parameter at the cost of geometrically increasing your run time.
We have made instructed Rmarkdown not to run the code above, as the run time ended up being 9.3 days even with our large machine.

The Effect of Socioeconomic Determinaints of Health on ARNI Prescription at Hospital Discharge:
An application of multilevel multiple imputation, mixed-effects logistic regression, and parallel processing using GWTG-HF data

Jeffrey S Tran

29 November 2021

1 Introduction

1.1 About the Data

1.2 IT Requirements

1.3 Prerequisites

1.4 Helpful Resources

2 Understanding Data from the Missingness Perspective

2.1 Getting Started

2.2 Exploring missingness patterns

2.3 Exploring multilevel data structures

3 Multilevel Multiple Imputation via Fully Conditional Specification

3.1 Mulitiple Imputation using GLM-based Imputation Methods (unsuccessful)

3.1.1 Selecting imputation methods

3.1.2 Specifying the predictor matrix

3.1.3 Do a test run of your imputation model

The Effect of Socioeconomic Determinaints of Health on ARNI Prescription at Hospital Discharge: An application of multilevel multiple imputation, mixed-effects logistic regression, and parallel processing using GWTG-HF data

Jeffrey S Tran

29 November 2021

1 Introduction

1.1 About the Data

1.2 IT Requirements

1.3 Prerequisites

1.4 Helpful Resources

2 Understanding Data from the Missingness Perspective

2.1 Getting Started

2.2 Exploring missingness patterns

2.3 Exploring multilevel data structures

3 Multilevel Multiple Imputation via Fully Conditional Specification

3.1 Mulitiple Imputation using GLM-based Imputation Methods (unsuccessful)

3.1.1 Selecting imputation methods

3.1.2 Specifying the predictor matrix

3.1.3 Do a test run of your imputation model

The Effect of Socioeconomic Determinaints of Health on ARNI Prescription at Hospital Discharge:
An application of multilevel multiple imputation, mixed-effects logistic regression, and parallel processing using GWTG-HF data