Heights problem
We estimate the mean and standard deviation of adult heights, with a half-T prior on σ.
using DynamicHMCModels
ProjDir = rel_path_d("..", "scripts", "04")
cd(ProjDir)
Import the dataset.
howell1 = CSV.read(rel_path("..", "data", "Howell1.csv"), delim=';');
df = convert(DataFrame, howell1);
Row | height | weight | age | male
---|---|---|---|---
 | Float64⍰ | Float64⍰ | Float64⍰ | Int64⍰
1 | 151.765 | 47.8256 | 63.0 | 1 |
2 | 139.7 | 36.4858 | 63.0 | 0 |
3 | 136.525 | 31.8648 | 65.0 | 0 |
4 | 156.845 | 53.0419 | 41.0 | 1 |
5 | 145.415 | 41.2769 | 51.0 | 0 |
6 | 163.83 | 62.9926 | 35.0 | 1 |
7 | 149.225 | 38.2435 | 32.0 | 0 |
8 | 168.91 | 55.48 | 27.0 | 1 |
9 | 147.955 | 34.8699 | 19.0 | 0 |
10 | 165.1 | 54.4877 | 54.0 | 1 |
11 | 154.305 | 49.8951 | 47.0 | 0 |
12 | 151.13 | 41.2202 | 66.0 | 1 |
13 | 144.78 | 36.0322 | 73.0 | 0 |
14 | 149.9 | 47.7 | 20.0 | 0 |
15 | 150.495 | 33.8493 | 65.3 | 0 |
16 | 163.195 | 48.5627 | 36.0 | 1 |
17 | 157.48 | 42.3258 | 44.0 | 1 |
18 | 143.942 | 38.3569 | 31.0 | 0 |
19 | 121.92 | 19.6179 | 12.0 | 1 |
20 | 105.41 | 13.948 | 8.0 | 0 |
21 | 86.36 | 10.4893 | 6.5 | 0 |
22 | 161.29 | 48.9879 | 39.0 | 1 |
23 | 156.21 | 42.7227 | 29.0 | 0 |
24 | 129.54 | 23.5868 | 13.0 | 1 |
⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
Use only the adults (age >= 18).
df2 = filter(row -> row[:age] >= 18, df);
Row | height | weight | age | male
---|---|---|---|---
 | Float64⍰ | Float64⍰ | Float64⍰ | Int64⍰
1 | 151.765 | 47.8256 | 63.0 | 1 |
2 | 139.7 | 36.4858 | 63.0 | 0 |
3 | 136.525 | 31.8648 | 65.0 | 0 |
4 | 156.845 | 53.0419 | 41.0 | 1 |
5 | 145.415 | 41.2769 | 51.0 | 0 |
6 | 163.83 | 62.9926 | 35.0 | 1 |
7 | 149.225 | 38.2435 | 32.0 | 0 |
8 | 168.91 | 55.48 | 27.0 | 1 |
9 | 147.955 | 34.8699 | 19.0 | 0 |
10 | 165.1 | 54.4877 | 54.0 | 1 |
11 | 154.305 | 49.8951 | 47.0 | 0 |
12 | 151.13 | 41.2202 | 66.0 | 1 |
13 | 144.78 | 36.0322 | 73.0 | 0 |
14 | 149.9 | 47.7 | 20.0 | 0 |
15 | 150.495 | 33.8493 | 65.3 | 0 |
16 | 163.195 | 48.5627 | 36.0 | 1 |
17 | 157.48 | 42.3258 | 44.0 | 1 |
18 | 143.942 | 38.3569 | 31.0 | 0 |
19 | 161.29 | 48.9879 | 39.0 | 1 |
20 | 156.21 | 42.7227 | 29.0 | 0 |
21 | 146.4 | 35.4936 | 56.0 | 1 |
22 | 148.59 | 37.9033 | 45.0 | 0 |
23 | 147.32 | 35.4652 | 19.0 | 0 |
24 | 147.955 | 40.313 | 29.0 | 1 |
⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
Show the first six rows of the dataset.
first(df2, 6)
Row | height | weight | age | male
---|---|---|---|---
 | Float64⍰ | Float64⍰ | Float64⍰ | Int64⍰
1 | 151.765 | 47.8256 | 63.0 | 1 |
2 | 139.7 | 36.4858 | 63.0 | 0 |
3 | 136.525 | 31.8648 | 65.0 | 0 |
4 | 156.845 | 53.0419 | 41.0 | 1 |
5 | 145.415 | 41.2769 | 51.0 | 0 |
6 | 163.83 | 62.9926 | 35.0 | 1 |
Half-T prior for σ; see below.
struct HeightsProblem{TY <: AbstractVector, Tν <: Real}
    "Observations."
    y::TY
    "Degrees of freedom for prior on sigma."
    ν::Tν
end;
Then make the type callable with the parameters as a single argument.
function (problem::HeightsProblem)(θ)
    @unpack y, ν = problem   # extract the data
    @unpack μ, σ = θ         # extract the parameters
    loglikelihood(Normal(μ, σ), y) + logpdf(TDist(ν), σ)
end;
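As a cross-check on what the callable computes, here is a hypothetical Python sketch of the same log density (not the Julia code above): the sum of Normal observation log-likelihoods plus the T(ν) log prior on σ, using ν = 1 so the T prior reduces to a standard Cauchy.

```python
import math

def normal_logpdf(x, mu, sigma):
    # log density of Normal(mu, sigma) at x
    return -0.5 * math.log(2 * math.pi) - math.log(sigma) - 0.5 * ((x - mu) / sigma) ** 2

def t1_logpdf(x):
    # log density of TDist(1), i.e. the standard Cauchy
    return -math.log(math.pi * (1.0 + x * x))

def log_density(y, mu, sigma):
    # mirrors loglikelihood(Normal(μ, σ), y) + logpdf(TDist(ν), σ) with ν = 1
    return sum(normal_logpdf(yi, mu, sigma) for yi in y) + t1_logpdf(sigma)

# toy observations (not the full Howell1 data)
y = [151.765, 139.7, 136.525]
print(log_density(y, mu=178.0, sigma=5.0))
```

Note that, as in the Julia code, the constant from truncating the T distribution to σ > 0 is dropped; it does not affect sampling.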
Set up the problem with data and initial values.
obs = convert(Vector{Float64}, df2[:height]);
p = HeightsProblem(obs, 1.0);
p((μ = 178, σ = 5.0,))
-5170.976519811121
Write a function to return a properly dimensioned transformation.
problem_transformation(p::HeightsProblem) =
    as((σ = asℝ₊, μ = as(Real, 100, 250)))
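The two transforms can be sketched in Python (hypothetical helpers, not the TransformVariables API): `asℝ₊` maps ℝ to (0, ∞) via `exp`, and `as(Real, 100, 250)` maps ℝ to the interval via a scaled, shifted logistic; each contributes a log-Jacobian term to the transformed density.

```python
import math

def to_positive(x):
    # asℝ₊: exp maps the real line to (0, inf); log |d exp(x)/dx| = x
    return math.exp(x), x

def to_interval(x, lo, hi):
    # as(Real, lo, hi): scaled, shifted logistic maps the real line to (lo, hi)
    s = 1.0 / (1.0 + math.exp(-x))
    logjac = math.log(hi - lo) + math.log(s) + math.log(1.0 - s)
    return lo + (hi - lo) * s, logjac

# a chain position lives in unconstrained coordinates, e.g. [2.05, -0.55]
sigma, _ = to_positive(2.05)            # ≈ 7.77, a plausible σ
mu, _ = to_interval(-0.55, 100.0, 250.0)  # ≈ 154.9, a plausible μ
print(sigma, mu)
```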
Wrap the problem with a transformation, then use ForwardDiff for the gradient.
P = TransformedLogDensity(problem_transformation(p), p)
∇P = LogDensityRejectErrors(ADgradient(:ForwardDiff, P));
LogDensityRejectErrors{InvalidLogDensityException, …}(ForwardDiff AD wrapper for TransformedLogDensity of dimension 2, w/ chunk size 2)
Tune and sample.
chain, NUTS_tuned = NUTS_init_tune_mcmc(∇P, 1000);
MCMC, adapting ϵ (75 steps)
6.0e-5 s/step ...done
MCMC, adapting ϵ (25 steps)
0.00012 s/step ...done
MCMC, adapting ϵ (50 steps)
0.00015 s/step ...done
MCMC, adapting ϵ (100 steps)
5.2e-5 s/step ...done
MCMC, adapting ϵ (200 steps)
4.4e-5 s/step ...done
MCMC, adapting ϵ (400 steps)
4.1e-5 s/step ...done
MCMC, adapting ϵ (50 steps)
5.8e-5 s/step ...done
MCMC (1000 steps)
8.9e-5 s/step ...done
(NUTS_Transition{Array{Float64,1},Float64}[NUTS_Transition{Array{Float64,1},Float64}([1.994720739108177, -0.5631089300311671], -1221.0826263333533, 2, DynamicHMC.AdjacentTurn, 0.9724704685537139, 7), …], NUTS sampler in 2 dimensions
stepsize (ϵ) ≈ 0.805
maximum depth = 10
Gaussian kinetic energy, √diag(M⁻¹): [0.036595615443562254, 0.012125721081556478]
)
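The warmup log above shows a doubling-window adaptation schedule: fixed initial (75-step) and final (50-step) windows, with middle windows that double in size (25, 50, 100, 200, 400). A hypothetical Python sketch of that schedule (not the DynamicHMC implementation):

```python
def adaptation_schedule(total=900, init=75, first_middle=25, final=50):
    # doubling-window warmup: a fixed initial window, middle windows
    # that double until the budget is used, then a fixed final window
    windows = [init]
    remaining = total - init - final
    w = first_middle
    while remaining > 0:
        step = min(w, remaining)
        windows.append(step)
        remaining -= step
        w *= 2
    windows.append(final)
    return windows

print(adaptation_schedule())  # [75, 25, 50, 100, 200, 400, 50], as in the log
```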
We use the transformation to obtain the posterior from the chain.
posterior = TransformVariables.transform.(Ref(problem_transformation(p)), get_position.(chain));
1000-element Array{NamedTuple{(:σ, :μ),Tuple{Float64,Float64}},1}:
(σ = 7.350150131776505, μ = 154.4242628365013)
(σ = 8.061322885753835, μ = 155.05438774235822)
(σ = 7.747909107876504, μ = 154.37094918639053)
(σ = 7.721981334156189, μ = 154.77514917016993)
(σ = 7.671643131538068, μ = 154.47344896014653)
(σ = 7.749625926453404, μ = 154.68527490500372)
(σ = 7.444752144443818, μ = 155.14439821968602)
(σ = 7.424051705748457, μ = 155.08562459342804)
(σ = 8.016548958850887, μ = 154.4840348890945)
(σ = 7.67657390151768, μ = 154.85106793838517)
⋮
(σ = 8.159816635201654, μ = 154.29356765769734)
(σ = 7.8973787354404, μ = 154.8844304717932)
(σ = 8.000852417372156, μ = 155.18445013252105)
(σ = 7.74414560584879, μ = 155.2410161502807)
(σ = 7.558108549039558, μ = 155.21340884891666)
(σ = 7.558108549039558, μ = 155.21340884891666)
(σ = 7.131542896010126, μ = 154.46979152589668)
(σ = 8.16733612387344, μ = 154.7027475879425)
(σ = 7.546371637990715, μ = 155.38576437860715)
Extract the parameter posterior means, first μ:
posterior_μ = mean(last, posterior)
154.61321360849348
then σ:
posterior_σ = mean(first, posterior)
7.7577066705866695
Effective sample sizes (of untransformed draws)
ess = mapslices(effective_sample_size,
                get_position_matrix(chain); dims = 1)
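`effective_sample_size` discounts the raw draw count by the chain's autocorrelation. A minimal Python sketch of one common estimator (an initial positive-sequence truncation, assumed here for illustration; not the DynamicHMC implementation):

```python
import random
import statistics

def autocov(x, k):
    # lag-k autocovariance (biased estimator, divides by n)
    n, m = len(x), statistics.fmean(x)
    return sum((x[i] - m) * (x[i + k] - m) for i in range(n - k)) / n

def effective_sample_size(x):
    # ESS = n / (1 + 2 * sum of positive-lag autocorrelations),
    # truncated at the first negative autocorrelation
    n, c0 = len(x), autocov(x, 0)
    tau = 1.0
    for k in range(1, n // 2):
        rho = autocov(x, k) / c0
        if rho < 0:
            break
        tau += 2.0 * rho
    return n / tau

random.seed(0)
iid = [random.gauss(0.0, 1.0) for _ in range(1000)]
ar = [0.0]
for _ in range(999):
    ar.append(0.95 * ar[-1] + random.gauss(0.0, 1.0))  # strongly autocorrelated

print(effective_sample_size(iid))  # near the raw draw count
print(effective_sample_size(ar))   # far smaller than the raw draw count
```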
NUTS-specific statistics
NUTS_statistics(chain)
cmdstan_result = "
Iterations = 1:1000
Thinning interval = 1
Chains = 1,2,3,4
Samples per chain = 1000
Empirical Posterior Estimates:
Mean SD Naive SE MCSE ESS
sigma 7.7641872 0.29928194 0.004732063 0.0055677898 1000
mu 154.6055177 0.41989355 0.006639100 0.0085038356 1000
Quantiles:
2.5% 25.0% 50.0% 75.0% 97.5%
sigma 7.21853 7.5560625 7.751355 7.9566775 8.410391
mu 153.77992 154.3157500 154.602000 154.8820000 155.431000
";
Collect the posterior means [μ, σ] for comparison with the cmdstan result:
[posterior_μ, posterior_σ]
2-element Array{Float64,1}:
154.61321360849348
7.7577066705866695
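The DynamicHMC posterior means agree with the CmdStan reference summary quoted above to within MCMC error; a quick Python check:

```python
# posterior means from this run vs the CmdStan summary
dynamic_hmc = {"mu": 154.61321360849348, "sigma": 7.7577066705866695}
cmdstan = {"mu": 154.6055177, "sigma": 7.7641872}

for name in dynamic_hmc:
    diff = abs(dynamic_hmc[name] - cmdstan[name])
    print(name, round(diff, 4))
    assert diff < 0.05  # differences are on the scale of the reported MCSE
```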
End of m4.5d.jl

This page was generated using Literate.jl.