Applied Bayesian Analyses in R

Part 3

Sven De Maeyer

New example data WritingData.RData

  • Experimental study on Writing instructions

  • 2 conditions:

    • Control condition (Business as usual)
    • Experimental condition (Observational learning)

Your Turn

  • Open WritingData.RData

  • Estimate 3 models with SecondVersion as dependent variable

    • M1: fixed effect of FirstVersion_GM + a random intercept for Class ((1 | Class))
    • M2: M1 + a random slope for FirstVersion_GM ((1 + FirstVersion_GM | Class))
    • M3: M2 + a fixed effect of Experimental_condition
  • Compare the models on their fit

  • What do we learn?

  • Make a summary of the best fitting model (a code sketch of these steps follows below)
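Not the only possible solution, but a minimal sketch of what the three model calls and a fit comparison could look like. The loo-based comparison is one choice among several (WAIC or k-fold would work too), and the backend, cores, and seed settings simply mirror the M3 call shown later:

library(brms)

# Load the example data (assumes WritingData.RData sits in the working directory)
load("WritingData.RData")

# M1: fixed effect of FirstVersion_GM + random intercept for Class
M1 <- brm(
  SecondVersion ~ FirstVersion_GM + (1 | Class),
  data = WritingData, backend = "cmdstanr", cores = 4, seed = 1975
)

# M2: M1 + random slope for FirstVersion_GM
M2 <- brm(
  SecondVersion ~ FirstVersion_GM + (1 + FirstVersion_GM | Class),
  data = WritingData, backend = "cmdstanr", cores = 4, seed = 1975
)

# M3: M2 + fixed effect of Experimental_condition
M3 <- brm(
  SecondVersion ~ FirstVersion_GM + Experimental_condition +
    (1 + FirstVersion_GM | Class),
  data = WritingData, backend = "cmdstanr", cores = 4, seed = 1975
)

# Compare model fit with leave-one-out cross-validation
M1 <- add_criterion(M1, "loo")
M2 <- add_criterion(M2, "loo")
M3 <- add_criterion(M3, "loo")
loo_compare(M1, M2, M3)

# Summarize the best fitting model
summary(M3)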

Divergent transitions…

Exploring the parameter space…

Divergent transitions…

  • Something to worry about!

  • Essentially: the sampling of parameter values went wrong, so the sampler could not explore part of the posterior properly

  • Fixes:

    • sometimes fine-tuning the sampling algorithm (e.g., control = list(adapt_delta = 0.9)) works
    • sometimes you need more informative priors (a sketch follows after this list)
    • sometimes the model is just not a good model
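Purely to illustrate the second option: a hypothetical sketch of what tighter priors could look like in brms. The normal(0, 5) and exponential(1) values are placeholders, not recommendations; the fix actually used here follows on the next slide.

# Hypothetical example of more informative priors (values are placeholders)
M3_priors <- brm(
  SecondVersion ~ FirstVersion_GM + Experimental_condition +
    (1 + FirstVersion_GM | Class),
  data = WritingData,
  prior = c(
    prior(normal(0, 5), class = "b"),        # regression weights
    prior(exponential(1), class = "sd"),     # between-class SDs
    prior(exponential(1), class = "sigma")   # residual SD
  ),
  backend = "cmdstanr",
  cores = 4,
  seed = 1975
)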

Our fix here for model 3

M3 <- brm(
  SecondVersion ~ FirstVersion_GM + Experimental_condition + (1 + FirstVersion_GM |Class),
  data = WritingData,
  backend = "cmdstanr",
  cores = 4,
  control = list(adapt_delta = 0.9),
  seed = 1975 
)

Our fix here for model 3

Interpretation of results…

Different ways to summarize our results:

  • visually

  • credible intervals (eti & hdi)

  • rope + hdi rule

  • hypothesis tests

Visually summarizing the posterior distribution

Functions in bayesplot package

  • mcmc_areas() function

  • mcmc_areas_ridges() function

  • mcmc_intervals() function

The mcmc_areas() function

Plots the posterior distribution of each parameter, shaded with a credible interval that you can set manually via the prob argument:

mcmc_areas(
  M3,
  pars = c(
    "b_FirstVersion_GM",
    "b_Experimental_condition"
  ),
  prob = .89
)

The mcmc_areas() function

The mcmc_areas_ridges() function


Very similar to the previous one; only the spacing of the densities changes a bit…


Meanwhile, note how easily you can change the color scheme for bayesplot graphs with color_scheme_set()


color_scheme_set(scheme = "red")

mcmc_areas_ridges(
  M3,
  pars = c(
    "b_FirstVersion_GM",
    "b_Experimental_condition"
  ),
  prob = .89
)

The mcmc_areas_ridges() function

The mcmc_intervals() function

Summarizes the posterior as a horizontal bar with markers for two credible intervals (CIs).

Here we set one to a 50% and one to an 89% CI.

color_scheme_set(scheme = "gray")

mcmc_intervals(
  M3,
  pars = c(
    "b_FirstVersion_GM",
    "b_Experimental_condition"
  ),
  prob = .5,
  prob_outer = .89
)

The mcmc_intervals() function

Manually create visualizations


Power combo: as_draws_df() + ggplot2 + ggdist


What does as_draws_df() do?


posterior_PD <- as_draws_df(M3)
head(posterior_PD)

Manually create visualizations

# A draws_df: 6 iterations, 1 chains, and 49 variables
  b_Intercept b_FirstVersion_GM b_Experimental_condition sd_Class__Intercept
1         113              0.98                      4.7                 5.2
2         112              1.47                      5.1                 2.8
3         113              1.07                      3.5                 2.8
4         111              1.09                      4.7                 4.2
5         112              1.08                      4.0                 1.7
6         111              1.04                      5.0                 2.4
  sd_Class__FirstVersion_GM cor_Class__Intercept__FirstVersion_GM sigma
1                      0.93                                  0.77   7.4
2                      1.06                                  0.68   8.1
3                      0.88                                  0.79   7.1
4                      0.75                                  0.74   8.0
5                      0.94                                  0.78   7.1
6                      0.97                                  0.82   8.1
  r_Class[1,Intercept]
1                 1.84
2                -1.59
3                -0.60
4                 0.95
5                 0.81
6                 0.38
# ... with 41 more variables
# ... hidden reserved variables {'.chain', '.iteration', '.draw'}

Use draws to create a plot using ggdist geoms

The ggdist package has a set of functions to visualize distributions

An example


Before we start, we set our own plot theme (not strictly necessary)


# Setting a plotting theme
theme_set(theme_linedraw() +
            theme(
              text = element_text(family = "Times", size = 14),
              panel.grid = element_blank()
              )
)

An example


We use posterior_PD as a starting point (our draws)

library(ggdist)

Plot <- ggplot(
  posterior_PD,
  aes(x = b_Experimental_condition)
  ) +
  stat_halfeye()

Plot + scale_y_continuous(name = "", breaks = NULL)

An example

Change the CIs


Change the CIs to 50% and 89%


Plot <- ggplot(
  posterior_PD,
  aes(x = b_Experimental_condition)
  ) +
  stat_halfeye(
    .width = c(.50,.89)
  )

Plot + scale_y_continuous(name = "", breaks = NULL)

Change the CIs

Use another visualization


Let’s make a dot plot with 100 dots (research suggests this format is interpreted best)…


Plot <- ggplot(
  posterior_PD,
  aes(x = b_Experimental_condition)
  ) +
  stat_dotsinterval(
    quantiles = 100,
    .width = c(.50,.89)
  )

Plot + scale_y_continuous(name = "", breaks = NULL)

Use another visualization

Plot two parameters each in a facet


We use pivot_longer(everything()) to stack the draws of multiple parameters


posterior_PD %>% 
  select(
    b_Experimental_condition, b_FirstVersion_GM
  ) %>% 
  pivot_longer(everything()) %>%
  ggplot(
    aes(x = value)
  ) +
  stat_halfeye(
    .width = c(.50,.89)
  ) +
facet_wrap(name ~ .) +
scale_y_continuous(name = "", breaks = NULL)

Plot two parameters each in a facet

Visualize calculated predictions based on posterior

Our example: 2 groups according to Experimental_condition

How do we visualize the posterior distribution of the mean for both groups?

posterior_PD %>% 
  select(
    b_Intercept, b_Experimental_condition
  ) %>% 
  mutate(
    Mean_Control_condition = b_Intercept,
    Mean_Experimental_condition = b_Intercept + b_Experimental_condition
  ) %>% 
  select(
    Mean_Control_condition, Mean_Experimental_condition
  ) %>% 
  pivot_longer(everything()) %>%
  ggplot(
    aes(x = value, color = name, fill = name)
  ) +
  stat_halfeye(
    .width = c(.50,.89),
    alpha = .40
  ) + 
  scale_y_continuous(name = "", breaks = NULL)

Visualize calculated predictions based on posterior

Hypothetical Outcome Plots (HOPs)

Code: see separate script called HOP_script.R

# A tibble: 6 × 3
   draw     X Pred1
  <int> <dbl> <dbl>
1     1 -15    95.8
2     1 -14.9  95.9
3     1 -14.8  96.0
4     1 -14.7  96.1
5     1 -14.6  96.2
6     1 -14.5  96.3
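The script itself is not reproduced here; the following is only a rough sketch of the underlying idea, producing a tibble like the one above. It uses the fixed effects from the posterior draws to compute a predicted regression line per draw and overlays a subset of them. The grid of X values, the 50 retained draws, and the seed are arbitrary choices:

library(brms)     # as_draws_df()
library(dplyr)
library(tidyr)
library(ggplot2)

# Rough sketch of the HOP idea (not the actual HOP_script.R):
# one predicted regression line per posterior draw
set.seed(1975)

HOP_data <- as_draws_df(M3) %>%
  slice_sample(n = 50) %>%                        # keep a random subset of draws
  mutate(draw = row_number()) %>%
  select(draw, b_Intercept, b_FirstVersion_GM) %>%
  crossing(X = seq(-15, 15, by = 0.1)) %>%        # grid of FirstVersion_GM values
  mutate(Pred1 = b_Intercept + b_FirstVersion_GM * X) %>%
  select(draw, X, Pred1)

ggplot(HOP_data, aes(x = X, y = Pred1, group = draw)) +
  geom_line(alpha = .2) +
  labs(x = "FirstVersion_GM", y = "Predicted SecondVersion")
# animating over `draw` (e.g., with gganimate) turns this static spaghetti plot into a true HOP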

Visualizing Random Effects

Plotting the residuals

To plot differences between classes we can use class-specific residuals:

head(as_draws_df(M3) %>% 
  select(ends_with(",Intercept]")) %>%
  select(1:3),
  5
)
# A tibble: 5 × 3
  `r_Class[1,Intercept]` `r_Class[2,Intercept]` `r_Class[3,Intercept]`
                   <dbl>                  <dbl>                  <dbl>
1                  1.84                  1.36                  -4.38  
2                 -1.59                  3.12                   0.0290
3                 -0.595                 2.02                  -0.347 
4                  0.945                 2.50                  -2.55  
5                  0.815                -0.0404                -1.59  

Plotting the residuals

as_draws_df(M3) %>% 
  select(ends_with(",Intercept]")) %>%            # class-specific intercept residuals
  pivot_longer(starts_with("r_Class")) %>% 
  mutate(u_j = value) %>%                         # u_j: deviation of class j from the overall intercept
  ggplot(aes(x = u_j, y = reorder(name, u_j))) +  # order classes by their residual
  stat_pointinterval(
    point_interval = mean_qi, 
    .width = .89, 
    size = 1/6) +
  scale_y_discrete(expression(italic(j)), breaks = NULL) +
  labs(x = expression(italic(u)[italic(j)])) +
  coord_flip()

Plotting the residuals

ICC estimation

head(
  as_draws_df(M3) %>%
    mutate(
      ICC = (sd_Class__Intercept^2/(sd_Class__Intercept^2 + sigma^2))) %>%
    select(sigma, sd_Class__Intercept, ICC), 
  5
  ) 
# A tibble: 5 × 3
  sigma sd_Class__Intercept    ICC
  <dbl>               <dbl>  <dbl>
1  7.36                5.24 0.336 
2  8.07                2.77 0.106 
3  7.10                2.76 0.131 
4  8.01                4.18 0.214 
5  7.11                1.71 0.0545

ICC estimation

as_draws_df(M3) %>%
  mutate(
    ICC = (sd_Class__Intercept^2/(sd_Class__Intercept^2 + sigma^2))
    ) %>%
  select(ICC) %>%                           
  ggplot(aes(x = ICC)) +                    
   stat_dotsinterval(
     quantiles = 100,
     .width = c(.50,.89)
   ) +
   scale_x_continuous("marginal posterior", 
                      breaks = seq(.00,.60,by =.05)) + 
   scale_y_continuous("ICC", breaks = NULL)

ICC estimation

HOP per higher level unit

Code: see separate script called HOP_MixedEffects_script.R

Reporting stats about the posterior

Credible Intervals


  • ETI: Equal Tailed Interval (cuts off an equal share of probability in each tail)


  • HDI: Highest Density Interval (the narrowest interval containing the requested probability mass)
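Both interval types can be computed with the bayestestR package; a minimal sketch (the 89% width simply mirrors the plots shown earlier and is just a choice):

library(bayestestR)

# 89% Equal Tailed Interval: cuts off 5.5% of the posterior in each tail
eti(M3, ci = .89)

# 89% Highest Density Interval: the narrowest interval holding 89% of the posterior
hdi(M3, ci = .89)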

Concept of ROPE

ROPE: Region Of Practical Equivalence

Set a region of parameter values that can be considered equivalent to having no effect

  • in standard effect sizes the advised default is a range of -0.1 to 0.1

  • this equals 1/2 of a small effect size (Cohen, 1988)

    • all parameter values in that range are treated as practically equivalent to no effect (a code sketch follows below)
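As a sketch of how this looks in bayestestR: for linear models, rope_range() derives the default range as roughly ±0.1 × the SD of the outcome, which is where the default range on the following slides comes from.

library(bayestestR)

# Default ROPE for M3: roughly +/- 0.1 * SD of the outcome (SecondVersion)
rope_range(M3)

# Share of the posterior that falls inside that region, per parameter
rope(M3, range = rope_range(M3))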

ROPE + HDI

ROPE + HDI rule


  • 95% HDI completely outside the ROPE → H0 rejected

  • 95% HDI completely inside the ROPE → H0 accepted

  • 95% HDI partially overlapping the ROPE → undecided

Applying the HDI + ROPE rule with the bayestestR package


We can use the equivalence_test() function of the bayestestR package


library(bayestestR)
equivalence_test(M3)
# Test for Practical Equivalence

  ROPE: [-1.68 1.68]

Parameter              |       H0 | inside ROPE |          95% HDI
------------------------------------------------------------------
Intercept              | Rejected |      0.00 % | [109.18, 113.43]
FirstVersion_GM        | Accepted |    100.00 % |     [0.48, 1.39]
Experimental_condition | Rejected |      0.00 % |     [1.71, 7.47]

Visualizing the HDI + ROPE rule


We visualize the result of equivalence_test() by piping it into plot()


equivalence_test(M3) %>%
  plot()

Probability of Direction (PD) with the parameters package

library(parameters)
model_parameters(
  M3,
  ci_method = "hdi",
  rope_range = c(-1.8, 1.8), # manually set ROPE: roughly +/- 0.1 * SD of the dependent variable
  test = c("rope", "pd")
  )
# Fixed Effects

Parameter              | Median |           95% CI |     pd | % in ROPE |  Rhat |     ESS
-----------------------------------------------------------------------------------------
(Intercept)            | 111.36 | [109.34, 113.58] |   100% |        0% | 1.000 | 1925.00
FirstVersion_GM        |   0.94 | [  0.49,   1.40] | 99.98% |      100% | 1.001 | 1718.00
Experimental_condition |   4.54 | [  1.77,   7.52] | 99.75% |     0.21% | 0.999 | 1814.00

# Sigma

Parameter | Median |       95% CI |   pd | % in ROPE |  Rhat |     ESS
----------------------------------------------------------------------
sigma     |   7.68 | [7.19, 8.20] | 100% |        0% | 1.001 | 5235.00

Outro

Some books

Some books

Some free online books

  • Bayes Rules!:

https://www.bayesrulesbook.com/

  • Or this book:

https://vasishth.github.io/bayescogsci/book/

Rens van de Schoot

In Nature Reviews

THE Podcast

If you like running - like I do - this could be a great companion on your run!

https://www.learnbayesstats.com/

Site on creating the graphs

There are many blogs and websites that you can consult if you want to find out more about making graphs.

One that I often fall back on is:


http://mjskay.github.io/tidybayes/

Questions?


Do not hesitate to contact me!


THANK YOU!