TuringModels

Beta-binomial

This beta-binomial model is m11.5 in Statistical Rethinking 1st Edition. In Statistical Rethinking 2nd Edition it is m12.1 and defined as

AiBetaBinomial(Ni,pi,θ)logit(pi)=αGID[i]αjNormal(0,1.5)θ=ϕ+2ϕExponential(1) \begin{aligned} A_i &\sim \text{BetaBinomial}(N_i, \overline{p}_i, \theta) \\ \text{logit}(\overline{p}_i) &= \alpha_{\text{GID}[i]} \\ \alpha_j &\sim \text{Normal}(0, 1.5) \\ \theta &= \phi + 2 \\ \phi &\sim \text{Exponential(1)} \end{aligned}

  1. Data
  2. Model
  3. Output
  4. Original output

Data

import CSV
import Random
import TuringModels

using DataFrames

Random.seed!(1)

data_path = joinpath(TuringModels.project_root, "data", "UCBadmit.csv")
df = CSV.read(data_path, DataFrame; delim=';')
12×5 DataFrame
 Row │ dept     gender   admit  reject  applications
     │ String1  String7  Int64  Int64   Int64
─────┼───────────────────────────────────────────────
   1 │ A        male       512     313           825
   2 │ A        female      89      19           108
   3 │ B        male       353     207           560
   4 │ B        female      17       8            25
   5 │ C        male       120     205           325
   6 │ C        female     202     391           593
   7 │ D        male       138     279           417
   8 │ D        female     131     244           375
   9 │ E        male        53     138           191
  10 │ E        female      94     299           393
  11 │ F        male        22     351           373
  12 │ F        female      24     317           341

Model

using StatsFuns: logistic
using Turing

@model function m11_5(admit, applications)
  θ ~ truncated(Exponential(1), 0, Inf)
  α ~ Normal(0, 2)

  # alpha and beta for the BetaBinomial must be provided.
  # The two parameterizations are related by
  # alpha = prob * theta, and beta = (1-prob) * theta.
  # See https://github.com/rmcelreath/rethinking/blob/master/man/dbetabinom.Rd

  prob = logistic(α)
  alpha = prob * θ
  beta = (1 - prob) * θ
  admit .~ BetaBinomial.(applications, alpha, beta)
end

model = m11_5(df.admit, df.applications);

Output

chns = sample(model, NUTS(), 1000)
Chains MCMC chain (1000×14×1 Array{Float64, 3}):

Iterations        = 501:1:1500
Number of chains  = 1
Samples per chain = 1000
Wall duration     = 5.5 seconds
Compute duration  = 5.5 seconds
parameters        = θ, α
internals         = lp, n_steps, is_accept, acceptance_rate, log_density, hamiltonian_energy, hamiltonian_energy_error, max_hamiltonian_energy_error, tree_depth, numerical_error, step_size, nom_step_size

Summary Statistics
  parameters      mean       std   naive_se      mcse         ess      rhat   ess_per_sec
      Symbol   Float64   Float64    Float64   Float64     Float64   Float64       Float64

           θ    2.7681    0.9337     0.0295    0.0374    769.4660    1.0002      139.8012
           α   -0.3660    0.3067     0.0097    0.0103   1057.6928    1.0004      192.1680

Quantiles
  parameters      2.5%     25.0%     50.0%     75.0%     97.5%
      Symbol   Float64   Float64   Float64   Float64   Float64

           θ    1.3553    2.1160    2.6309    3.2988    4.8986
           α   -1.0371   -0.5501   -0.3631   -0.1651    0.2537

using StatsPlots

StatsPlots.plot(chns)
"/home/runner/work/TuringModels.jl/TuringModels.jl/__site/assets/models/beta-binomial/code/output/chns.svg"

Original output

"""
        mean   sd  5.5% 94.5% n_eff Rhat
theta   2.74 0.96  1.43  4.37  3583    1
a       -0.37 0.31 -0.87  0.12  3210    1
""";