clip_07.0s

Load Julia packages (libraries) needed for the snippets in chapter 0

using StatisticalRethinking
using CmdStan, StanMCMCChain
gr(size=(500,800))
Plots.GRBackend()

CmdStan uses a tmp directory to store the output of cmdstan

ProjDir = rel_path("..", "chapters", "04")
cd(ProjDir)

snippet 4.7

howell1 = CSV.read(joinpath(dirname(Base.pathof(StatisticalRethinking)), "..", "data", "Howell1.csv"), delim=';')
df = convert(DataFrame, howell1);

Use only adults

df2 = filter(row -> row[:age] >= 18, df)
female_df = filter(row -> row[:male] == 0, df2)
male_df = filter(row -> row[:male] == 1, df2)

165 rows × 4 columns

heightweightagemale
Float64⍰Float64⍰Float64⍰Int64⍰
1151.76547.825663.01
2156.84553.041941.01
3163.8362.992635.01
4168.9155.4827.01
5165.154.487754.01
6151.1341.220266.01
7163.19548.562736.01
8157.4842.325844.01
9161.2948.987939.01
10146.435.493656.01
11147.95540.31329.01
12161.92555.111430.01
13160.65547.882324.01
14151.76549.413230.01
15162.86549.384824.01
16171.4556.557352.01
17154.30541.248555.01
18146.742.420.01
19157.4844.650518.01
20165.73558.598442.01
21164.46545.897850.01
22161.2952.219831.01
23152.436.485879.31
24163.8355.933635.01

Plot the densities.

density(df2[:height], lab="All heights")
130 140 150 160 170 180 0.00 0.01 0.02 0.03 0.04 All heights

Is it bi-modal?

density!(female_df[:height], lab="Female heights")
density!(male_df[:height], lab="Male heights")
130 140 150 160 170 180 0.00 0.02 0.04 0.06 0.08 All heights Female heights Male heights

Define the Stan language model

heightsmodel = "
// Inferring a Rate
data {
  int N;
  real<lower=0> h[N];
}
parameters {
  real<lower=0> sigma;
  real<lower=0,upper=250> mu;
}
model {
  // Priors for mu and sigma
  mu ~ normal(178, 20);
  sigma ~ uniform( 0 , 50 );

  // Observed heights
  h ~ normal(mu, sigma);
}
"
"\n// Inferring a Rate\ndata {\n  int N;\n  real<lower=0> h[N];\n}\nparameters {\n  real<lower=0> sigma;\n  real<lower=0,upper=250> mu;\n}\nmodel {\n  // Priors for mu and sigma\n  mu ~ normal(178, 20);\n  sigma ~ uniform( 0 , 50 );\n\n  // Observed heights\n  h ~ normal(mu, sigma);\n}\n"

Define the Stanmodel and set the output format to :mcmcchain.

stanmodel = Stanmodel(name="heights", monitors = ["mu", "sigma"],model=heightsmodel,
  output_format=:mcmcchain)

Input data for cmdstan

heightsdata = [
  Dict("N" => length(df2[:height]), "h" => df2[:height])
]
=====> /home/travis/build/StanJulia/StatisticalRethinking.jl/docs/build/04


File /home/travis/build/StanJulia/StatisticalRethinking.jl/docs/build/04/tmp/heights.stan will be updated.

1-element Array{Dict{String,Any},1}:
 Dict("N"=>352,"h"=>Union{Missing, Float64}[151.765, 139.7, 136.525, 156.845, 145.415, 163.83, 149.225, 168.91, 147.955, 165.1  …  156.21, 160.655, 146.05, 156.21, 152.4, 162.56, 142.875, 162.56, 156.21, 158.75])

Sample using cmdstan

rc, chn, cnames = stan(stanmodel, heightsdata, ProjDir, diagnostics=false,
  CmdStanDir=CMDSTAN_HOME)

Describe the draws

describe(chn)


--- Translating Stan model to C++ code ---
bin/stanc  /home/travis/build/StanJulia/StatisticalRethinking.jl/docs/build/04/tmp/heights.stan --o=/home/travis/build/StanJulia/StatisticalRethinking.jl/docs/build/04/tmp/heights.hpp
Model name=heights_model
Input file=/home/travis/build/StanJulia/StatisticalRethinking.jl/docs/build/04/tmp/heights.stan
Output file=/home/travis/build/StanJulia/StatisticalRethinking.jl/docs/build/04/tmp/heights.hpp

--- Linking C++ model ---
g++ -Wall -I . -isystem stan/lib/stan_math/lib/eigen_3.3.3 -isystem stan/lib/stan_math/lib/boost_1.66.0 -isystem stan/lib/stan_math/lib/sundials_3.1.0/include -std=c++1y -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -DBOOST_DISABLE_ASSERTS -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION -Wno-unused-function -Wno-uninitialized -I src -isystem stan/src -isystem stan/lib/stan_math/ -DFUSION_MAX_VECTOR_SIZE=12 -Wno-unused-local-typedefs -DEIGEN_NO_DEBUG -DNO_FPRINTF_OUTPUT -pipe   src/cmdstan/main.cpp  -O3 -o /home/travis/build/StanJulia/StatisticalRethinking.jl/docs/build/04/tmp/heights -include /home/travis/build/StanJulia/StatisticalRethinking.jl/docs/build/04/tmp/heights.hpp stan/lib/stan_math/lib/sundials_3.1.0/lib/libsundials_nvecserial.a stan/lib/stan_math/lib/sundials_3.1.0/lib/libsundials_cvodes.a stan/lib/stan_math/lib/sundials_3.1.0/lib/libsundials_idas.a

Length of data array is not equal to nchains,
all chains will use the first data dictionary.

Calling /home/travis/cmdstan/bin/stansummary to infer across chains.

Inference for Stan model: heights_model
4 chains: each with iter=(1000,1000,1000,1000); warmup=(0,0,0,0); thin=(1,1,1,1); 4000 iterations saved.

Warmup took (0.028, 0.027, 0.027, 0.026) seconds, 0.11 seconds total
Sampling took (0.057, 0.036, 0.040, 0.041) seconds, 0.17 seconds total

                Mean     MCSE  StdDev    5%   50%   95%    N_Eff  N_Eff/s    R_hat
lp__            -892  2.4e-02    1.00  -894  -891  -891  1.8e+03  1.0e+04  1.0e+00
accept_stat__   0.92  1.5e-03    0.10  0.72  0.96   1.0  4.5e+03  2.6e+04  1.0e+00
stepsize__      0.78  2.4e-02   0.034  0.75  0.80  0.83  2.0e+00  1.1e+01  3.8e+13
treedepth__      2.1  4.7e-02    0.79   1.0   2.0   4.0  2.8e+02  1.6e+03  1.0e+00
n_leapfrog__     5.8  6.3e-01     7.2   1.0   3.0    19  1.3e+02  7.6e+02  1.0e+00
divergent__     0.00     -nan    0.00  0.00  0.00  0.00     -nan     -nan     -nan
energy__         893  3.5e-02     1.4   891   892   895  1.7e+03  9.7e+03  1.0e+00
sigma            7.8  5.8e-03    0.29   7.3   7.8   8.3  2.6e+03  1.5e+04  1.0e+00
mu               155  6.7e-03    0.42   154   155   155  3.9e+03  2.2e+04  1.0e+00

Samples were drawn using hmc with nuts.
For each parameter, N_Eff is a crude measure of effective sample size,
and R_hat is the potential scale reduction factor on split chains (at
convergence, R_hat=1).

Iterations = 1:1000
Thinning interval = 1
Chains = 1,2,3,4
Samples per chain = 1000

Empirical Posterior Estimates:
         Mean        SD       Naive SE       MCSE      ESS
sigma   7.774819 0.29394661 0.0046477040 0.0056923954 1000
   mu 154.609931 0.41523612 0.0065654595 0.0061268403 1000

Quantiles:
         2.5%     25.0%      50.0%      75.0%       97.5%
sigma   7.21527   7.57163   7.767005   7.9676525   8.376470
   mu 153.80978 154.32500 154.616000 154.8822500 155.445175

Plot the density of posterior draws

density(chn, lab="All heights")
6.5 7.0 7.5 8.0 8.5 9.0 0.0 0.2 0.4 0.6 0.8 1.0 1.2 sigma Sample value Density 153 154 155 156 0.00 0.25 0.50 0.75 1.00 mu Sample value Density

This page was generated using Literate.jl.