TuringModels

Beta-binomial

This beta-binomial model is m11.5 in Statistical Rethinking 1st Edition. In Statistical Rethinking 2nd Edition it is m12.1 and defined as

AiBetaBinomial(Ni,pi,θ)logit(pi)=αGID[i]αjNormal(0,1.5)θ=ϕ+2ϕExponential(1) \begin{aligned} A_i &\sim \text{BetaBinomial}(N_i, \overline{p}_i, \theta) \\ \text{logit}(\overline{p}_i) &= \alpha_{\text{GID}[i]} \\ \alpha_j &\sim \text{Normal}(0, 1.5) \\ \theta &= \phi + 2 \\ \phi &\sim \text{Exponential(1)} \end{aligned}

  1. Data
  2. Model
  3. Output
  4. Original output

Data

import CSV
import Random
import TuringModels

using DataFrames

Random.seed!(1)

data_path = joinpath(TuringModels.project_root, "data", "UCBadmit.csv")
df = CSV.read(data_path, DataFrame; delim=';')
12×5 DataFrame
 Row │ dept       gender     admit  reject  applications
     │ InlineSt…  InlineSt…  Int64  Int64   Int64
─────┼───────────────────────────────────────────────────
   1 │ A          male         512     313           825
   2 │ A          female        89      19           108
   3 │ B          male         353     207           560
   4 │ B          female        17       8            25
   5 │ C          male         120     205           325
   6 │ C          female       202     391           593
   7 │ D          male         138     279           417
   8 │ D          female       131     244           375
   9 │ E          male          53     138           191
  10 │ E          female        94     299           393
  11 │ F          male          22     351           373
  12 │ F          female        24     317           341

Model

using StatsFuns: logistic
using Turing

@model function m11_5(admit, applications)
  θ ~ truncated(Exponential(1), 0, Inf)
  α ~ Normal(0, 2)

  # alpha and beta for the BetaBinomial must be provided.
  # The two parameterizations are related by
  # alpha = prob * theta, and beta = (1-prob) * theta.
  # See https://github.com/rmcelreath/rethinking/blob/master/man/dbetabinom.Rd

  prob = logistic(α)
  alpha = prob * θ
  beta = (1 - prob) * θ
  admit .~ BetaBinomial.(applications, alpha, beta)
end

model = m11_5(df.admit, df.applications);

Output

chns = sample(model, NUTS(), 1000)
Chains MCMC chain (1000×14×1 Array{Float64, 3}):

Iterations        = 501:1:1500
Number of chains  = 1
Samples per chain = 1000
Wall duration     = 4.85 seconds
Compute duration  = 4.85 seconds
parameters        = α, θ
internals         = lp, n_steps, is_accept, acceptance_rate, log_density, hamiltonian_energy, hamiltonian_energy_error, max_hamiltonian_energy_error, tree_depth, numerical_error, step_size, nom_step_size

Summary Statistics
  parameters      mean       std   naive_se      mcse        ess      rhat   ess_per_sec
      Symbol   Float64   Float64    Float64   Float64    Float64   Float64       Float64

           θ    2.7853    1.0053     0.0318    0.0303   835.9390    0.9990      172.3230
           α   -0.3711    0.3363     0.0106    0.0117   960.6136    1.0020      198.0238

Quantiles
  parameters      2.5%     25.0%     50.0%     75.0%     97.5%
      Symbol   Float64   Float64   Float64   Float64   Float64

           θ    1.2359    2.0549    2.6934    3.3417    5.2711
           α   -1.0433   -0.5914   -0.3674   -0.1478    0.2438

using StatsPlots

StatsPlots.plot(chns)

Original output

"""
        mean   sd  5.5% 94.5% n_eff Rhat
theta   2.74 0.96  1.43  4.37  3583    1
a       -0.37 0.31 -0.87  0.12  3210    1
""";