clip_07.1s

Load Julia packages (libraries) needed for the snippets in chapter 0

using StatisticalRethinking
using CmdStan, StanMCMCChain
gr(size=(500,800))
Plots.GRBackend()

CmdStan uses a tmp directory to store the output of cmdstan

ProjDir = rel_path("..", "chapters", "04")
cd(ProjDir)

snippet 4.7

howell1 = CSV.read(joinpath(dirname(Base.pathof(StatisticalRethinking)), "..", "data", "Howell1.csv"), delim=';')
df = convert(DataFrame, howell1);

Use only adults

df2 = filter(row -> row[:age] >= 18, df)

352 rows × 4 columns

heightweightagemale
Float64⍰Float64⍰Float64⍰Int64⍰
1151.76547.825663.01
2139.736.485863.00
3136.52531.864865.00
4156.84553.041941.01
5145.41541.276951.00
6163.8362.992635.01
7149.22538.243532.00
8168.9155.4827.01
9147.95534.869919.00
10165.154.487754.01
11154.30549.895147.00
12151.1341.220266.01
13144.7836.032273.00
14149.947.720.00
15150.49533.849365.30
16163.19548.562736.01
17157.4842.325844.01
18143.94238.356931.00
19161.2948.987939.01
20156.2142.722729.00
21146.435.493656.01
22148.5937.903345.00
23147.3235.465219.00
24147.95540.31329.01

Define the Stan language model

heightsmodel = "
// Inferring a Rate
data {
  int N;
  real<lower=0> h[N];
}
parameters {
  real<lower=0> sigma;
  real<lower=0,upper=250> mu;
}
model {
  // Priors for mu and sigma
  mu ~ uniform(100, 250);
  sigma ~ cauchy( 0 , 1 );

  // Observed heights
  h ~ normal(mu, sigma);
}
"
"\n// Inferring a Rate\ndata {\n  int N;\n  real<lower=0> h[N];\n}\nparameters {\n  real<lower=0> sigma;\n  real<lower=0,upper=250> mu;\n}\nmodel {\n  // Priors for mu and sigma\n  mu ~ uniform(100, 250);\n  sigma ~ cauchy( 0 , 1 );\n\n  // Observed heights\n  h ~ normal(mu, sigma);\n}\n"

Define the Stanmodel and set the output format to :mcmcchain.

stanmodel = Stanmodel(name="heights", monitors = ["mu", "sigma"],model=heightsmodel,
  output_format=:mcmcchain)

Input data for cmdstan

heightsdata = [
  Dict("N" => length(df2[:height]), "h" => df2[:height])
]
=====> /home/travis/build/StanJulia/StatisticalRethinking.jl/docs/build/04


File /home/travis/build/StanJulia/StatisticalRethinking.jl/docs/build/04/tmp/heights.stan will be updated.

1-element Array{Dict{String,Any},1}:
 Dict("N"=>352,"h"=>Union{Missing, Float64}[151.765, 139.7, 136.525, 156.845, 145.415, 163.83, 149.225, 168.91, 147.955, 165.1  …  156.21, 160.655, 146.05, 156.21, 152.4, 162.56, 142.875, 162.56, 156.21, 158.75])

Sample using cmdstan

rc, chn, cnames = stan(stanmodel, heightsdata, ProjDir, diagnostics=false,
  CmdStanDir=CMDSTAN_HOME)

Describe the draws

describe(chn)


--- Translating Stan model to C++ code ---
bin/stanc  /home/travis/build/StanJulia/StatisticalRethinking.jl/docs/build/04/tmp/heights.stan --o=/home/travis/build/StanJulia/StatisticalRethinking.jl/docs/build/04/tmp/heights.hpp
Model name=heights_model
Input file=/home/travis/build/StanJulia/StatisticalRethinking.jl/docs/build/04/tmp/heights.stan
Output file=/home/travis/build/StanJulia/StatisticalRethinking.jl/docs/build/04/tmp/heights.hpp

--- Linking C++ model ---
g++ -Wall -I . -isystem stan/lib/stan_math/lib/eigen_3.3.3 -isystem stan/lib/stan_math/lib/boost_1.66.0 -isystem stan/lib/stan_math/lib/sundials_3.1.0/include -std=c++1y -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -DBOOST_DISABLE_ASSERTS -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION -Wno-unused-function -Wno-uninitialized -I src -isystem stan/src -isystem stan/lib/stan_math/ -DFUSION_MAX_VECTOR_SIZE=12 -Wno-unused-local-typedefs -DEIGEN_NO_DEBUG -DNO_FPRINTF_OUTPUT -pipe   src/cmdstan/main.cpp  -O3 -o /home/travis/build/StanJulia/StatisticalRethinking.jl/docs/build/04/tmp/heights -include /home/travis/build/StanJulia/StatisticalRethinking.jl/docs/build/04/tmp/heights.hpp stan/lib/stan_math/lib/sundials_3.1.0/lib/libsundials_nvecserial.a stan/lib/stan_math/lib/sundials_3.1.0/lib/libsundials_cvodes.a stan/lib/stan_math/lib/sundials_3.1.0/lib/libsundials_idas.a

Length of data array is not equal to nchains,
all chains will use the first data dictionary.

Calling /home/travis/cmdstan/bin/stansummary to infer across chains.

Inference for Stan model: heights_model
4 chains: each with iter=(1000,1000,1000,1000); warmup=(0,0,0,0); thin=(1,1,1,1); 4000 iterations saved.

Warmup took (0.029, 0.026, 0.025, 0.027) seconds, 0.11 seconds total
Sampling took (0.037, 0.046, 0.040, 0.052) seconds, 0.17 seconds total

                Mean     MCSE  StdDev    5%   50%   95%    N_Eff  N_Eff/s    R_hat
lp__            -895  2.3e-02     1.0  -897  -895  -894  2.0e+03  1.1e+04  1.0e+00
accept_stat__   0.92  1.4e-03   0.098  0.71  0.96   1.0  4.7e+03  2.7e+04  1.0e+00
stepsize__      0.81  2.4e-02   0.034  0.77  0.82  0.86  2.0e+00  1.1e+01  3.0e+13
treedepth__      2.1  1.0e-01    0.83   1.0   2.0   4.0  6.8e+01  3.9e+02  1.0e+00
n_leapfrog__     6.0  1.1e+00     8.4   1.0   3.0    23  5.9e+01  3.4e+02  1.0e+00
divergent__     0.00     -nan    0.00  0.00  0.00  0.00     -nan     -nan     -nan
energy__         896  3.5e-02     1.5   894   896   899  1.7e+03  9.9e+03  1.0e+00
sigma            7.7  5.4e-03    0.30   7.3   7.7   8.2  3.0e+03  1.7e+04  1.0e+00
mu               155  6.8e-03    0.42   154   155   155  3.8e+03  2.2e+04  1.0e+00

Samples were drawn using hmc with nuts.
For each parameter, N_Eff is a crude measure of effective sample size,
and R_hat is the potential scale reduction factor on split chains (at
convergence, R_hat=1).

Iterations = 1:1000
Thinning interval = 1
Chains = 1,2,3,4
Samples per chain = 1000

Empirical Posterior Estimates:
         Mean        SD       Naive SE      MCSE     ESS
sigma   7.746845 0.29698506 0.004695746 0.005029119 1000
   mu 154.597421 0.41688256 0.006591492 0.006623318 1000

Quantiles:
         2.5%       25.0%       50.0%      75.0%      97.5%
sigma   7.183481   7.5444175   7.740005   7.938735   8.362853
   mu 153.803975 154.3120000 154.598000 154.877000 155.411100

Compare with previous result

clip_07s_example_output = "

Samples were drawn using hmc with nuts.
For each parameter, N_Eff is a crude measure of effective sample size,
and R_hat is the potential scale reduction factor on split chains (at
convergence, R_hat=1).

Iterations = 1:1000
Thinning interval = 1
Chains = 1,2,3,4
Samples per chain = 1000

Empirical Posterior Estimates:
          Mean        SD      Naive SE      MCSE      ESS
sigma   7.7641718 0.3055115 0.004830561 0.0047596714 1000
   mu 154.6042417 0.4158242 0.006574758 0.0068304868 1000

Quantiles:
         2.5%       25.0%       50.0%      75.0%      97.5%
sigma   7.198721   7.5573575   7.749435   7.960795   8.393317
   mu 153.795975 154.3307500 154.610000 154.884000 155.391050

"
"\n\nSamples were drawn using hmc with nuts.\nFor each parameter, N_Eff is a crude measure of effective sample size,\nand R_hat is the potential scale reduction factor on split chains (at\nconvergence, R_hat=1).\n\nIterations = 1:1000\nThinning interval = 1\nChains = 1,2,3,4\nSamples per chain = 1000\n\nEmpirical Posterior Estimates:\n          Mean        SD      Naive SE      MCSE      ESS\nsigma   7.7641718 0.3055115 0.004830561 0.0047596714 1000\n   mu 154.6042417 0.4158242 0.006574758 0.0068304868 1000\n\nQuantiles:\n         2.5%       25.0%       50.0%      75.0%      97.5%\nsigma   7.198721   7.5573575   7.749435   7.960795   8.393317\n   mu 153.795975 154.3307500 154.610000 154.884000 155.391050\n\n"

Plot the density of posterior draws

density(chn)
7.0 7.5 8.0 8.5 9.0 9.5 0.0 0.3 0.6 0.9 1.2 sigma Sample value Density 153 154 155 156 0.0 0.2 0.4 0.6 0.8 mu Sample value Density

Close cd(ProjDir) do block

This page was generated using Literate.jl.