TuringModels

Africa first candidate model

In Edition 2 on page 244, McElreath defines the model as

log(yi)Normal(μi,σ)μi=α+β(rir)αNormal(1,1)βNormal(0,1)σExponential(1) \begin{aligned} \log(y_i) &\sim \text{Normal}(\mu_i, \sigma) \\ \mu_i &= \alpha + \beta(r_i - \overline{r}) \\ \alpha &\sim \text{Normal}(1, 1) \\ \beta &\sim \text{Normal}(0, 1) \\ \sigma &\sim \text{Exponential}(1) \end{aligned}

To define this in Turing.jl, we change it to

σExponential(1)αNormal(1,0.1)βNormal(0,0.3)μi=α+β(rir)yiLogitNormal(μi,σ) \begin{aligned} \sigma &\sim \text{Exponential(1)} \\ \alpha &\sim \text{Normal}(1, 0.1) \\ \beta &\sim \text{Normal}(0, 0.3) \\ \mu_i &= \alpha + \beta(r_i - \overline{r}) \\ y_i &\sim \text{LogitNormal}(\mu_i, \sigma) \end{aligned}

where yiy_i is the GDPs for nation ii, rir_i is terrain ruggedness for nation ii and r\overline{r} is the mean of the ruggedness in the whole sample.

McElreath calls two models m8.1. We have tried to get close to the precis output on page 245, but both m8.1 seem to give a different output for σ\sigma. For \sigma and \beta, our results on this page correspond to the resutls by McElreath.

Data

using DataFrames
using Turing
using TuringModels

import CSV

data_path = joinpath(TuringModels.project_root, "data", "rugged.csv")
df = CSV.read(data_path, DataFrame)

df.log_gdp = log.(df.rgdppc_2000)
dropmissing!(df)

df = select(df, :log_gdp, :rugged, :cont_africa);

df.log_gdp_std = df.log_gdp ./ mean(df.log_gdp)
df.rugged_std = df.rugged ./ maximum(df.rugged)

first(df, 8)
8×5 DataFrame
 Row │ log_gdp  rugged   cont_africa  log_gdp_std  rugged_std
     │ Float64  Float64  Int64        Float64      Float64
─────┼────────────────────────────────────────────────────────
   1 │ 7.49261    0.858            1     1.00276    0.138342
   2 │ 6.43238    1.78             1     0.860869   0.287004
   3 │ 6.86612    0.141            1     0.918918   0.0227346
   4 │ 6.90617    0.236            1     0.924278   0.0380522
   5 │ 8.9493     0.181            1     1.19772    0.0291841
   6 │ 7.04582    0.197            1     0.942968   0.0317639
   7 │ 7.36241    0.224            1     0.985338   0.0361174
   8 │ 7.54046    0.515            1     1.00917    0.0830377

Model

@model function model_fn(log_gdp_std, rugged_std, mean_rugged)
    α ~ Normal(1, 0.1)
    β ~ Normal(0, 0.3)
    σ ~ Exponential(1)

    μ = α .+ β * (rugged_std .- mean_rugged)
    log_gdp_std ~ MvNormal(μ, σ)
end

model = model_fn(df.log_gdp, df.rugged_std, mean(df.rugged_std));

Output

chns = sample(model, NUTS(), MCMCThreads(), 1000, 3)
Chains MCMC chain (1000×15×3 Array{Float64, 3}):

Iterations        = 501:1:1500
Number of chains  = 3
Samples per chain = 1000
Wall duration     = 6.04 seconds
Compute duration  = 5.2 seconds
parameters        = α, σ, β
internals         = lp, n_steps, is_accept, acceptance_rate, log_density, hamiltonian_energy, hamiltonian_energy_error, max_hamiltonian_energy_error, tree_depth, numerical_error, step_size, nom_step_size

Summary Statistics
  parameters      mean       std   naive_se      mcse         ess      rhat   ess_per_sec
      Symbol   Float64   Float64    Float64   Float64     Float64   Float64       Float64

           α    1.0773    0.1020     0.0019    0.0016   3933.1725    0.9997      756.2339
           β   -0.0036    0.3041     0.0056    0.0047   4361.3315    1.0009      838.5563
           σ    6.1941    0.6380     0.0116    0.0099   4279.9277    1.0001      822.9048

Quantiles
  parameters      2.5%     25.0%     50.0%     75.0%     97.5%
      Symbol   Float64   Float64   Float64   Float64   Float64

           α    0.8754    1.0076    1.0786    1.1478    1.2798
           β   -0.6023   -0.2118    0.0015    0.2058    0.5884
           σ    5.1002    5.7461    6.1492    6.5780    7.6333

using StatsPlots

StatsPlots.plot(chns)

Original output

From page 245:

meansd5.5%94.5%
α1.000.010.981.02
β0.000.05-0.090.09
σ0.140.010.120.15