### 5.1.2 Primary (*a priori*) model

The *a priori* model for the *RSV Exposure* studies was as follows:

**\(y^{ij} = \gamma_{00} + \gamma_{10} x_{1}^{ij} + \gamma_{20} x_{2}^{ij} + \gamma_{30} x_{3}^{ij} + \gamma_{40} x_{4}^{ij} + \gamma_{50} x_{5}^{ij} + \gamma_{01} w_{1}^{j} + \gamma_{02} w_{2}^{j} + \gamma_{60} (x_{1}^{ij}\times x_{2}^{ij}) + \gamma_{11} (x_{1}^{ij}\times w_{2}^{j}) + e^{ij} + u^j\)**

where y^{ij} = log_{e}OR for effect size estimate i within study j; \(\gamma\)_{00} = conditional mean of the distribution of population effect sizes; \(\gamma\)_{10} …\(\gamma\)_{60} = regression parameters for covariates (x^{ij}) whose values vary within studies; \(\gamma\)_{01}, \(\gamma\)_{02} = regression parameters for covariates (w^{j}) whose values are constant within studies; \(\gamma\)_{11} = regression parameter for a cross-level interaction; e^{ij} = within-study sampling error; u^{j} = between-study deviations from the population average effect size. The model covariates (effect modifiers) are defined as follows:

**x**= study design: 0=exposure group membership determined by RSV-LRTI medical event; 1=exposure group membership determined by viral surveillance_{1}^{ij}**x**= age at outcome ascertainment: 0=preschool years (mean/median age < 5); 1=primary school years (mean/median age 5-12); 2=adolescence/adulthood (mean/median age \(\geq\) 13)_{2}^{ij}**x**= genetic confounding: 0=no clear attempt to control potential genetic confounding; 1=confounding controlled at least somewhat_{3}^{ij}**x**= co-infection confounding: 0=no attempt to control potential confounding due to co-infections at time of RSV+ LRTI ascertainment; 1= confounding controlled at least somewhat_{4}^{ij}**x**= neonatal health confounding: 0=no attempt to control potential confounding due to infant neonatal health; 1= confounding controlled at least somewhat_{5}^{ij}**w**= participant risk: 0=study included children regardless of risk status; 1=study included only children at risk for wheezing illness beyond having an LRTI_{1}^{j}**w**= exposure ascertainment period: 0=study exposure ascertainment extends beyond 12 months of age; 1= exposure ascertainment entirely within the first 12 months of life._{2}^{j}

Prior to conducting our robust variance estimation (RVE) meta-regression, it is critical to evaluate the distributions of the potential effect modifiers (covariates in the above equation). Our RVE models calculate degrees of freeedom (*df*) using Satterthwaite approximation. Estimates with *df* < 4 should not be trusted (Tipton, 2015). Imbalance and sparse cells in the covariates could be a cause of insufficient *df*.

There is notable imbalance in the variables coding age category at outcome ascertainment (i.e., relatively few estimates measured in Adolescence/Adulthood) and control of confouding due to co-infection (i.e., few studies minimized this potential confounder). We will run the *a priori* model as is and note whether the *df* indicate a problem.

`( toutagecat2 <- with( whorsvexp, table( outagecat2f ) ) )`

```
## outagecat2f
## Adolesc/Adult Preschool School Age
## 16 67 46
```

`( tminanycoinf <- with( whorsvexp, table( minanycoinff ) ) )`

```
## minanycoinff
## Limited Not limited
## 20 109
```

```
par( mfrow = c(1,1))
barplot( toutagecat2, main = 'Age Category at Outcome Measurement' )
```

`barplot( tminanycoinf, main = 'Any Evidence Co-Infection Confounding Reduced' )`

**Model 1.1:** We run the *a priori* model for RSV-LRTI exposure studies. First, we set the reference level for all effect modifiers to their modal values. Consequently, the intercept represents the conditional mean odds ratio estimate when all effect modifiers are held at their most common value (aOR_{+}).

```
## Set reference levels for all effect modifiers to the modal factor category
whorsvexp$outagecat2f <- relevel( whorsvexp$outagecat2f, ref = 'Preschool' )
whorsvexp$minanycoinff <- relevel( whorsvexp$minanycoinff, ref = 'Not limited' )
whorsvexp$minanyneof <- relevel( whorsvexp$minanyneof, ref = 'Not limited' )
model1.1 <- robu( logor ~ surveilf*outagecat2f + surveilf*exp012f + minanygenf + minanycoinff + minanyneof + participantriskf,
data = whorsvexp,
studynum = studyid,
var.eff.size = se_logor^2,
rho = .8,
small = T )
( model1.1out <- data.frame( Parameters = model1.1$reg_table$labels,
Estimate = round( model1.1$reg_table$b.r, digits = 2),
StdErr = round( model1.1$reg_table$SE, digits = 2 ),
DF = round( model1.1$reg_table$dfs, digits = 2 ),
Lower_CI = round( model1.1$reg_table$CI.L, digits = 2 ),
Upper_CI = round( model1.1$reg_table$CI.U, digits = 2 ) ) )
```

```
## Parameters Estimate StdErr DF Lower_CI Upper_CI
## 1 X.Intercept. 1.19 0.28 10.98 0.56 1.81
## 2 surveilfSurveil 0.15 0.25 3.53 -0.57 0.87
## 3 outagecat2fAdolesc.Adult 0.78 0.64 1.91 -2.11 3.66
## 4 outagecat2fSchool.Age -0.38 0.24 13.19 -0.88 0.13
## 5 exp012fBeyond.12.months -0.56 0.16 8.13 -0.93 -0.20
## 6 minanygenfNot.limited 0.54 0.21 10.70 0.07 1.00
## 7 minanycoinffLimited 0.18 0.43 3.76 -1.03 1.40
## 8 minanyneofLimited 0.22 0.19 12.06 -0.20 0.63
## 9 participantriskfRisk.Based -0.34 0.15 11.45 -0.65 -0.02
## 10 surveilfSurveil.outagecat2fAdolesc.Adult -1.54 0.70 5.71 -3.26 0.18
## 11 surveilfSurveil.outagecat2fSchool.Age -0.04 0.32 4.99 -0.85 0.78
## 12 surveilfSurveil.exp012fBeyond.12.months 0.17 0.32 4.44 -0.68 1.02
```

The *df* for the estimate of the effect modifier coding age category at outcome ascertainment (**outagecat2f**) and comparing effect sizes measured in adolescence/adulthood to estimates measured in preschool are too small to be trusted (*df*=1.91). This may be due to the fact that there is high imbalance across the levels of this factor, with few estimates measuring effects in adolescence/adulthood. Additionally, the effect modifier coding whether effect sizes were adjusted for non-RSV co-infections (**minanycoinff**) also had *df* < 4. The likely consequence is that the confidence interval around the **minanycoinf** estimate is too narrow (Tipton, 2015). As the confidence interval for **minanycoinf** easily contains 0 (the null value) and is likely too narrow, we can clearly not reject the null hypothesis.

**Model 1.2:** We rerun Model 1.1 after collapsing the school age and adolescence/adulthood categories for the outcome age variable. It is now a binary variable called **preschoolf**. This effect modifier now compares effect size esitmates measured during the preschool years to estimates measured outside of the preschool years: preschool vs. schoolage/adolescence/adulthood. This may fix the df problem. Additionally, we dropped the two interaction terms as there was no evidence that they were significant and they had relatively small *df*. Because the **minanycoinf** effect modifier tests an * a priori * hypothesis, we retained it in our model even though it has *df* < 4.

```
( model1.2 <- robu( logor ~ surveilf + preschoolf + exp012f + minanygenf + minanycoinff + minanyneof + participantriskf,
data = whorsvexp,
studynum = studyid,
var.eff.size = se_logor^2,
rho = .8,
small = T )
)
```

```
## RVE: Correlated Effects Model with Small-Sample Corrections
##
## Model: logor ~ surveilf + preschoolf + exp012f + minanygenf + minanycoinff + minanyneof + participantriskf
##
## Number of studies = 35
## Number of outcomes = 129 (min = 1 , mean = 3.69 , median = 3 , max = 12 )
## Rho = 0.8
## I.sq = 64.22221
## Tau.sq = 0.2497
##
## Estimate StdErr t-value dfs P(|t|>) 95% CI.L 95% CI.U Sig
## 1 X.Intercept. 0.895 0.301 2.976 8.26 0.0171 0.2053 1.5842 **
## 2 surveilfSurveil 0.101 0.182 0.558 8.14 0.5918 -0.3163 0.5192
## 3 preschoolfPreschool 0.307 0.208 1.475 17.83 0.1578 -0.1305 0.7436
## 4 exp012fBeyond.12.months -0.446 0.171 -2.609 13.10 0.0215 -0.8143 -0.0770 **
## 5 minanygenfNot.limited 0.533 0.224 2.383 11.50 0.0354 0.0434 1.0222 **
## 6 minanycoinffLimited 0.179 0.388 0.461 3.84 0.6694 -0.9152 1.2729
## 7 minanyneofLimited 0.176 0.199 0.886 11.60 0.3934 -0.2582 0.6101
## 8 participantriskfRisk.Based -0.359 0.154 -2.325 12.23 0.0381 -0.6941 -0.0232 **
## ---
## Signif. codes: < .01 *** < .05 ** < .10 *
## ---
## Note: If df < 4, do not trust the results
```

With the exception of **minanycofinff**, all effect modifiers have df > 4 in Model 1.2. In both versions of the model (Model 1.1 and Model 1.2), the full confidence interval for the intercept—representing the adjusted weighted mean log odds ratio (log_{e}OR_{+}) when all effect modifiers are set to the modal category—is positive, suggesting a positive association between RSV-LRTI exposure and subsequent wheezing illness. We can convert the intercept estimate to the OR scale by exponentiating it:

```
model1.2.est <- round( exp( c(model1.2$reg_table$b.r[1]) ), digits = 2 )
model1.2.lci <- round( exp( c(model1.2$reg_table$CI.L[1]) ), digits = 2 )
model1.2.uci <- round( exp( c(model1.2$reg_table$CI.U[1]) ), digits = 2 )
## Estimates on the OR scale
( model1.2.table <- data.frame(parameter=model1.2$reg_table$labels[1], AdjustedMeanOddsRatio=model1.2.est[1], LCI=model1.2.lci[1], UCI=model1.2.uci[1] ) )
```

```
## parameter AdjustedMeanOddsRatio LCI UCI
## 1 X.Intercept. 2.45 1.23 4.88
```

The adjusted (conditional) weighted mean OR = 2.45 (95% CI: 1.23, 4.88) when holding all effect modifiers at their modal levels.

Effect estimates were larger among studies that did not control genetic confounding: \(\hat\gamma\)=0.53, 95% CI [0.04, 1.02]. This is consistent with the hypothesis that RSV-LRTI exposure is at least partly a marker of genetic susceptibility rather than a strictly causal factor. There was no evidence of effect modification by whether estimates were based on analyses controlling for non-RSV co-infections or neonatal health markers. Although these effect modifiers did not represent primary hypotheses, estimates from studies using a targeted enrollment strategy (i.e., enrolling only those with a known risk factor for wheezing illness other than RSV-LRTI exposure: **participantriskf**) and those in which the exposure ascertainment period was not entirely limited to the first 12 months of life (**exp012f**) yielded smaller effect sizes.

**Model 1.3:** To gauge how much smaller the expected weighted mean effect estimate when controlling for genetic confounding, we reran Model 1.2 changing the reference level of the **minanygenf** variable so that *not* controlling for genetic confounding was the reference category and controlling genetic confounding *at least somewhat* was the comparison level. The intercept in this model represents the log_{e}OR_{+} estimate among studies that did not control genetic confounding, holding all other effect modifiers at their modal categories.

```
## Create a new factor for the genetic confounding effect modifier with "Not controlled" as the reference level
whorsvexp$minanygenrev <- factor(whorsvexp$minanygen,levels=0:1,labels=c('Not controlled','Controlled'))
model1.3 <- robu( logor ~ surveilf + preschoolf + exp012f + minanygenrev + minanycoinff + minanyneof + participantriskf,
data = whorsvexp,
studynum = studyid,
var.eff.size = se_logor^2,
rho = .8,
small = T )
print( model1.3 )
```

```
## RVE: Correlated Effects Model with Small-Sample Corrections
##
## Model: logor ~ surveilf + preschoolf + exp012f + minanygenrev + minanycoinff + minanyneof + participantriskf
##
## Number of studies = 35
## Number of outcomes = 129 (min = 1 , mean = 3.69 , median = 3 , max = 12 )
## Rho = 0.8
## I.sq = 64.22221
## Tau.sq = 0.2497
##
## Estimate StdErr t-value dfs P(|t|>) 95% CI.L 95% CI.U Sig
## 1 X.Intercept. 1.428 0.252 5.660 9.10 0.000297 0.858 1.9971 ***
## 2 surveilfSurveil 0.101 0.182 0.558 8.14 0.591828 -0.316 0.5192
## 3 preschoolfPreschool 0.307 0.208 1.475 17.83 0.157751 -0.130 0.7436
## 4 exp012fBeyond.12.months -0.446 0.171 -2.609 13.10 0.021512 -0.814 -0.0770 **
## 5 minanygenrevControlled -0.533 0.224 -2.383 11.50 0.035367 -1.022 -0.0434 **
## 6 minanycoinffLimited 0.179 0.388 0.461 3.84 0.669416 -0.915 1.2729
## 7 minanyneofLimited 0.176 0.199 0.886 11.60 0.393354 -0.258 0.6101
## 8 participantriskfRisk.Based -0.359 0.154 -2.325 12.23 0.038059 -0.694 -0.0232 **
## ---
## Signif. codes: < .01 *** < .05 ** < .10 *
## ---
## Note: If df < 4, do not trust the results
```

```
model1.3.est <- round( exp( c(model1.3$reg_table$b.r[1]) ), digits = 2 )
model1.3.lci <- round( exp( c(model1.3$reg_table$CI.L[1]) ), digits = 2 )
model1.3.uci <- round( exp( c(model1.3$reg_table$CI.U[1]) ), digits = 2 )
## Estimates on the OR scale
( model1.3.table <- data.frame(parameter=model1.3$reg_table$labels[1], ConditionalOddsRatio=model1.3.est[1], LCI=model1.3.lci[1], UCI=model1.3.uci[1] ) )
```

```
## parameter ConditionalOddsRatio LCI UCI
## 1 X.Intercept. 4.17 2.36 7.37
```

The intercept in Model 1.3 is considerably larger when it represents the log_{e}OR_{+} among studies not controlling for genetic confounding compared to Model 1.2 when it represents studies that did control fo genetic confounding. The point estimate for the intercept increases from aOR_{+}=2.45 in Model 1.2 to aOR_{+}=4.17 in Model 1.3. The intercept is positive and signifcant in both models, indicating a positive conditional mean effect of RSV-LRTI on subsequent wheezing illness. But we expect a substantial drop in the weighted mean adjusted odds ratio (aOR_{+}) for the effect of RSV-LRTI on subsequent wheezing illness when studies do something to control for genetic confounding.

Below we visualize the modifying effect of controlling for genetic confounding at least somewhat by plotting boxplots for the observared effect size estimates by whether or not the estimates controlled for genetic confounding. We superimpose the point estimates to show the observed distributions with the point size proportional to the estimate’s inverse variance (i.e., more precise estimates have larger points). Additionally, we superimpose red diamonds, the center of which represents the log_{e}OR_{+} estimates based on models 1.2 and 1.3. The bottoms and tops of the diamonds represent, respectively, the lower and upper bound 95% confidence intervals. Finally, we highlight with a red triangle the estimate from the study by Poorisrisak et al. 2010 because this was the only estimate that eliminated genetic confounding by comparing monozygotic twins who were discordant for RSV-LRTI hospitalization. This point estimate was smaller than 88% (67/76) of the estimates that only partly controlled for genetic confounding. As the sample size was relatively small in this study, there was considerable uncertainty in this estimate with plausible estimates ranging from RSV-LRTI hospitalization being protective (lower 95% CI = 0.36) to highly damaging (upper 95% CI = 4.00).

```
whorsvexp$minanygenplot <- factor( whorsvexp$minanygen, levels = 0:1, labels=c('Not at all', 'At least somewhat') )
mingenfull <- with( whorsvexp, ifelse( studyname == 'Poorisrisak 2010', 1, 0 ) )#
mingenfullf <- factor( mingenfull, levels=0:1, labels=c('Not fully minimized','Fully minimized' ) )
x1 <- c(1,.9,1,1.1)
y1 <- c(model1.3$reg_table$CI.L[1],model1.3$reg_table$b.r[1], model1.3$reg_table$CI.U[1],model1.3$reg_table$b.r[1])
df1 <- data.frame(x1,y1)
x2 <- c(2,1.9,2,2.1)
y2 <- c(model1.2$reg_table$CI.L[1],model1.2$reg_table$b.r[1], model1.2$reg_table$CI.U[1],model1.2$reg_table$b.r[1])
df2 <- data.frame(x2,y2)
#tiff( 'Fig3.tiff', res=300, height = 8, width = 10, units = 'in' )
( box2 <- ggplot( whorsvexp, aes( x=minanygenplot, y=logor ) ) +
geom_boxplot( outlier.shape = NA ) +
geom_polygon(data=df1, aes(x=x1, y=y1 ), fill=NA, color='firebrick', size=1.25) +
geom_polygon(data=df2, aes(x=x2, y=y2 ), fill=NA, color='firebrick', size=1.25) +
annotate("text", x = 2.43, y =whorsvexp[ whorsvexp$studyname=='Poorisrisak 2010','logor'],
label = "Poorisrisak 2010", color='red', fontface='bold', size = 4.5) +
annotate("text", x = 1, y = max(whorsvexp$logor*1.1), size=5, label = paste( "aOR+ =",
round(exp(model1.3$reg_table$b.r[1]), digits=2),
"(95% CI:",round(exp(model1.3$reg_table$CI.L[1]), digits=2),
",",round(exp(model1.3$reg_table$CI.U[1]), digits=2),
")"),
color='firebrick', fontface='bold') +
annotate("text", x = 2, y =max(whorsvexp$logor*1.1), size=5, label = paste( "aOR+ =",
round(exp(model1.2$reg_table$b.r[1]), digits=2),
"(95% CI:",
round(exp(model1.2$reg_table$CI.L[1]), digits=2),
",",round(exp(model1.2$reg_table$CI.U[1]), digits=2),
")"),
color='firebrick', fontface='bold') +
geom_point( aes(size = whorsvexp$w ), position = position_jitterdodge(),
show.legend = F ) +
geom_point(data=whorsvexp[whorsvexp$studyname=='Poorisrisak 2010', ], aes(x=2.25, y=logor), colour='red', shape=17) +
ggtitle( 'Observed Effect Size Distributions and Conditional Mean Effect Sizes\nby whether Estimates Controlled for Genetic Confounding') +
ylab( 'Natural Log of the Odds Ratio' ) +
xlab( 'Controlled for Genetic Confounding?' ) +
theme( plot.title = element_text( size=14, face='bold', hjust=.5 ),
axis.title.x = element_text( size=14, face='bold' ),
axis.title.y = element_text( size=14, face='bold' ),
axis.text.x = element_text( size=13, face='bold'),
axis.text.y = element_text( size=13, face = 'bold')) +
theme(legend.position = "none") +
geom_hline(yintercept = 0, linetype = 'dashed', size = 1, colour = 1 )
)
```

`#dev.off()`