Load Julia packages (libraries) needed for the snippets in chapter 0
using StatisticalRethinking
using CmdStan, StanMCMCChain
gr(size=(500,800))Plots.GRBackend()CmdStan uses a tmp directory to store the output of cmdstan
ProjDir = rel_path("..", "chapters", "04")
cd(ProjDir)snippet 4.7
howell1 = CSV.read(joinpath(dirname(Base.pathof(StatisticalRethinking)), "..", "data", "Howell1.csv"), delim=';')
df = convert(DataFrame, howell1);Use only adults
df2 = filter(row -> row[:age] >= 18, df)
female_df = filter(row -> row[:male] == 0, df2)
male_df = filter(row -> row[:male] == 1, df2)| height | weight | age | male | |
|---|---|---|---|---|
| Float64⍰ | Float64⍰ | Float64⍰ | Int64⍰ | |
| 1 | 151.765 | 47.8256 | 63.0 | 1 |
| 2 | 156.845 | 53.0419 | 41.0 | 1 |
| 3 | 163.83 | 62.9926 | 35.0 | 1 |
| 4 | 168.91 | 55.48 | 27.0 | 1 |
| 5 | 165.1 | 54.4877 | 54.0 | 1 |
| 6 | 151.13 | 41.2202 | 66.0 | 1 |
| 7 | 163.195 | 48.5627 | 36.0 | 1 |
| 8 | 157.48 | 42.3258 | 44.0 | 1 |
| 9 | 161.29 | 48.9879 | 39.0 | 1 |
| 10 | 146.4 | 35.4936 | 56.0 | 1 |
| 11 | 147.955 | 40.313 | 29.0 | 1 |
| 12 | 161.925 | 55.1114 | 30.0 | 1 |
| 13 | 160.655 | 47.8823 | 24.0 | 1 |
| 14 | 151.765 | 49.4132 | 30.0 | 1 |
| 15 | 162.865 | 49.3848 | 24.0 | 1 |
| 16 | 171.45 | 56.5573 | 52.0 | 1 |
| 17 | 154.305 | 41.2485 | 55.0 | 1 |
| 18 | 146.7 | 42.4 | 20.0 | 1 |
| 19 | 157.48 | 44.6505 | 18.0 | 1 |
| 20 | 165.735 | 58.5984 | 42.0 | 1 |
| 21 | 164.465 | 45.8978 | 50.0 | 1 |
| 22 | 161.29 | 52.2198 | 31.0 | 1 |
| 23 | 152.4 | 36.4858 | 79.3 | 1 |
| 24 | 163.83 | 55.9336 | 35.0 | 1 |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
Plot the densities.
density(df2[:height], lab="All heights")
Is it bi-modal?
density!(female_df[:height], lab="Female heights")
density!(male_df[:height], lab="Male heights")
Define the Stan language model
heightsmodel = "
// Inferring a Rate
data {
int N;
real<lower=0> h[N];
}
parameters {
real<lower=0> sigma;
real<lower=0,upper=250> mu;
}
model {
// Priors for mu and sigma
mu ~ normal(178, 20);
sigma ~ uniform( 0 , 50 );
// Observed heights
h ~ normal(mu, sigma);
}
""\n// Inferring a Rate\ndata {\n int N;\n real<lower=0> h[N];\n}\nparameters {\n real<lower=0> sigma;\n real<lower=0,upper=250> mu;\n}\nmodel {\n // Priors for mu and sigma\n mu ~ normal(178, 20);\n sigma ~ uniform( 0 , 50 );\n\n // Observed heights\n h ~ normal(mu, sigma);\n}\n"Define the Stanmodel and set the output format to :mcmcchain.
stanmodel = Stanmodel(name="heights", monitors = ["mu", "sigma"],model=heightsmodel,
output_format=:mcmcchain)Input data for cmdstan
heightsdata = [
Dict("N" => length(df2[:height]), "h" => df2[:height])
]=====> /home/travis/build/StanJulia/StatisticalRethinking.jl/docs/build/04
File /home/travis/build/StanJulia/StatisticalRethinking.jl/docs/build/04/tmp/heights.stan will be updated.
1-element Array{Dict{String,Any},1}:
Dict("N"=>352,"h"=>Union{Missing, Float64}[151.765, 139.7, 136.525, 156.845, 145.415, 163.83, 149.225, 168.91, 147.955, 165.1 … 156.21, 160.655, 146.05, 156.21, 152.4, 162.56, 142.875, 162.56, 156.21, 158.75])Sample using cmdstan
rc, chn, cnames = stan(stanmodel, heightsdata, ProjDir, diagnostics=false,
CmdStanDir=CMDSTAN_HOME)Describe the draws
describe(chn)
--- Translating Stan model to C++ code ---
bin/stanc /home/travis/build/StanJulia/StatisticalRethinking.jl/docs/build/04/tmp/heights.stan --o=/home/travis/build/StanJulia/StatisticalRethinking.jl/docs/build/04/tmp/heights.hpp
Model name=heights_model
Input file=/home/travis/build/StanJulia/StatisticalRethinking.jl/docs/build/04/tmp/heights.stan
Output file=/home/travis/build/StanJulia/StatisticalRethinking.jl/docs/build/04/tmp/heights.hpp
--- Linking C++ model ---
g++ -Wall -I . -isystem stan/lib/stan_math/lib/eigen_3.3.3 -isystem stan/lib/stan_math/lib/boost_1.66.0 -isystem stan/lib/stan_math/lib/sundials_3.1.0/include -std=c++1y -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -DBOOST_DISABLE_ASSERTS -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION -Wno-unused-function -Wno-uninitialized -I src -isystem stan/src -isystem stan/lib/stan_math/ -DFUSION_MAX_VECTOR_SIZE=12 -Wno-unused-local-typedefs -DEIGEN_NO_DEBUG -DNO_FPRINTF_OUTPUT -pipe src/cmdstan/main.cpp -O3 -o /home/travis/build/StanJulia/StatisticalRethinking.jl/docs/build/04/tmp/heights -include /home/travis/build/StanJulia/StatisticalRethinking.jl/docs/build/04/tmp/heights.hpp stan/lib/stan_math/lib/sundials_3.1.0/lib/libsundials_nvecserial.a stan/lib/stan_math/lib/sundials_3.1.0/lib/libsundials_cvodes.a stan/lib/stan_math/lib/sundials_3.1.0/lib/libsundials_idas.a
Length of data array is not equal to nchains,
all chains will use the first data dictionary.
Calling /home/travis/cmdstan/bin/stansummary to infer across chains.
Inference for Stan model: heights_model
4 chains: each with iter=(1000,1000,1000,1000); warmup=(0,0,0,0); thin=(1,1,1,1); 4000 iterations saved.
Warmup took (0.028, 0.027, 0.027, 0.026) seconds, 0.11 seconds total
Sampling took (0.057, 0.036, 0.040, 0.041) seconds, 0.17 seconds total
Mean MCSE StdDev 5% 50% 95% N_Eff N_Eff/s R_hat
lp__ -892 2.4e-02 1.00 -894 -891 -891 1.8e+03 1.0e+04 1.0e+00
accept_stat__ 0.92 1.5e-03 0.10 0.72 0.96 1.0 4.5e+03 2.6e+04 1.0e+00
stepsize__ 0.78 2.4e-02 0.034 0.75 0.80 0.83 2.0e+00 1.1e+01 3.8e+13
treedepth__ 2.1 4.7e-02 0.79 1.0 2.0 4.0 2.8e+02 1.6e+03 1.0e+00
n_leapfrog__ 5.8 6.3e-01 7.2 1.0 3.0 19 1.3e+02 7.6e+02 1.0e+00
divergent__ 0.00 -nan 0.00 0.00 0.00 0.00 -nan -nan -nan
energy__ 893 3.5e-02 1.4 891 892 895 1.7e+03 9.7e+03 1.0e+00
sigma 7.8 5.8e-03 0.29 7.3 7.8 8.3 2.6e+03 1.5e+04 1.0e+00
mu 155 6.7e-03 0.42 154 155 155 3.9e+03 2.2e+04 1.0e+00
Samples were drawn using hmc with nuts.
For each parameter, N_Eff is a crude measure of effective sample size,
and R_hat is the potential scale reduction factor on split chains (at
convergence, R_hat=1).
Iterations = 1:1000
Thinning interval = 1
Chains = 1,2,3,4
Samples per chain = 1000
Empirical Posterior Estimates:
Mean SD Naive SE MCSE ESS
sigma 7.774819 0.29394661 0.0046477040 0.0056923954 1000
mu 154.609931 0.41523612 0.0065654595 0.0061268403 1000
Quantiles:
2.5% 25.0% 50.0% 75.0% 97.5%
sigma 7.21527 7.57163 7.767005 7.9676525 8.376470
mu 153.80978 154.32500 154.616000 154.8822500 155.445175Plot the density of posterior draws
density(chn, lab="All heights")
This page was generated using Literate.jl.