sr_path
StatisticalRethinking.sr_path
— Method
Relative path using the StatisticalRethinking src/ directory.
Example to get access to the data subdirectory
sr_path("..", "data")
Note that in the projects, e.g. SR2StanPluto.jl and SR2TuringPluto.jl, the DrWatson approach is a better choice, i.e. sr_datadir(filename).
sr_datadir
StatisticalRethinking.sr_datadir
— Method
Relative path using the StatisticalRethinking src/ directory.
Example to access Howell1.csv in StatisticalRethinking:
df = CSV.read(sr_datadir("Howell1.csv"), DataFrame)
link
StatisticalRethinking.link
— Function
Compute the link function for standardized variables.
link(dfa, vars, xrange)
Required arguments
* `dfa::DataFrame` : Chain samples converted to a DataFrame
* `vars::Vector{Symbol}` : Variables in DataFrame (2 variables)
* `xrange::range` : Range over which link values are computed
Optional arguments
* `xbar::Float64` : Mean value of observed predictor
* `ybar::Float64` : Mean value of observed outcome (requires xbar argument)
Return values
* `result` : Vector of link values
link
Generalized link function to evaluate callable for all parameters in dataframe over range of x values.
link(dfa, rx_to_val, xrange)
Required arguments
dfa::DataFrame
: data frame with parameters
rx_to_val::Function
: function of two arguments: row object and x
xrange
: sequence of x values to be evaluated on
Return values
A vector with one entry per value in xrange. Each entry is itself a vector holding the result of rx_to_val for every row of the data frame.
Examples
julia> using StatisticalRethinking, DataFrames
julia> d = DataFrame(:a => [1,2], :b=>[1,1])
2×2 DataFrame
Row │ a b
│ Int64 Int64
─────┼──────────────
1 │ 1 1
2 │ 2 1
julia> link(d, (r,x) -> r.a+x*r.b, 1:2)
2-element Vector{Vector{Int64}}:
[2, 3]
[3, 4]
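The nesting of the returned value can be reproduced without the package. The following is a minimal sketch of the behaviour described above, not the package's source: `link_sketch` is a hypothetical name, and rows are modelled as plain NamedTuples instead of DataFrame rows.

```julia
# Hypothetical stand-in for the generalized link; rows are plain NamedTuples.
function link_sketch(rows, rx_to_val, xrange)
    # Outer vector: one entry per x value. Inner vector: one value per row.
    return [[rx_to_val(r, x) for r in rows] for x in xrange]
end

rows = [(a = 1, b = 1), (a = 2, b = 1)]
result = link_sketch(rows, (r, x) -> r.a + x * r.b, 1:2)
println(result)
```

With the same inputs as the example above, the sketch yields the same nested structure: the outer index follows xrange and the inner index follows the rows.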
lppd
StatisticalRethinking.lppd
— Function
Generic version of the log pointwise predictive density (LPPD) computation. It is similar to the simulate function, but additionally computes the log density of the target values.
lppd(df, rx_to_dist, xseq, yseq)
Required arguments
df::DataFrame
: data frame with parameters
rx_to_dist::Function
: callable with two arguments: row object and x value. Has to return a Distribution instance
xseq
: sequence of x values to be passed to the callable
yseq
: sequence of target values for log density calculation
Return values
Vector of float values with the same size as xseq and yseq.
Examples
julia> using StatisticalRethinking, DataFrames, Distributions
julia> df = DataFrame(:mu => [0.0, 1.0])
2×1 DataFrame
Row │ mu
│ Float64
─────┼─────────
1 │ 0.0
2 │ 1.0
julia> lppd(df, (r, x) -> Normal(r.mu + x, 1.0), 0:3, 3:-1:0)
4-element Vector{Float64}:
-3.5331959794720684
-1.1380087295845114
-1.9106724357818656
-6.082335295491998
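The numbers in the example above are consistent with averaging the predictive density over all parameter rows and then taking the logarithm. The sketch below shows that calculation with no dependencies. It is an assumption about what the function computes, not its source: `normal_logpdf`, `log_mean_exp` and `lppd_sketch` are hypothetical helpers, and the callable returns a log-density function rather than a Distribution instance.

```julia
# Log density of Normal(mu, sigma) at y, written out to avoid dependencies.
normal_logpdf(mu, sigma, y) = -0.5 * log(2π) - log(sigma) - 0.5 * ((y - mu) / sigma)^2

# Numerically stable log of the mean of exp.(v).
function log_mean_exp(v)
    m = maximum(v)
    return m + log(sum(exp.(v .- m)) / length(v))
end

# Hypothetical sketch: for each (x, y) pair, average the density over all rows.
function lppd_sketch(rows, rx_to_logpdf, xseq, yseq)
    return [log_mean_exp([rx_to_logpdf(r, x)(y) for r in rows])
            for (x, y) in zip(xseq, yseq)]
end

rows = [(mu = 0.0,), (mu = 1.0,)]
vals = lppd_sketch(rows, (r, x) -> (y -> normal_logpdf(r.mu + x, 1.0, y)), 0:3, 3:-1:0)
println(vals)
```

Subtracting the maximum before exponentiating keeps the average stable when individual log densities are very negative.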
rescale
StatisticalRethinking.rescale
— Method
Rescale a vector to "un-standardize", the opposite of scale!().
rescale(x, xbar, xstd)
Extended help
Required arguments
* `x::Vector{Float64}` : Vector to be rescaled
* `xbar` : Mean value for rescaling
* `xstd` : Std for rescaling
Return values
* `result::AbstractVector` : Rescaled vector
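As an illustration of what "un-standardize" means here, the sketch below standardizes a vector and then reverses the step. It assumes the usual definition z = (x - xbar) / xstd; `rescale_sketch` is a hypothetical helper, not the package function.

```julia
using Statistics

# Hypothetical helper: undo a standardization z = (x - xbar) / xstd.
rescale_sketch(z, xbar, xstd) = z .* xstd .+ xbar

x = [2.0, 4.0, 6.0, 8.0]
xbar, xstd = mean(x), std(x)
z = (x .- xbar) ./ xstd              # standardize
x_back = rescale_sketch(z, xbar, xstd)   # recover the original scale
println(x_back)
```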
sample
StatsBase.sample
— Method
Sample rows from a DataFrame
Method
sample(df, n; replace, ordered)
Required arguments
* `df::DataFrame` : DataFrame
* `n::Int` : Number of samples
Optional arguments
* `rng::AbstractRNG` : Random number generator
* `replace::Bool=true` : Sample with replacement
* `ordered::Bool=false` : Sort sample
Return values
* `result` : Array of samples
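The effect of the `replace` and `ordered` options can be sketched by sampling row indices directly. This is only an illustration under stated assumptions: `sample_rows_sketch` is a hypothetical helper, rows are held in a plain vector, and `ordered` is taken to mean that the drawn indices are sorted.

```julia
using Random

# Hypothetical sketch: draw row indices, then pick those rows.
function sample_rows_sketch(rng, rows, n; replace = true, ordered = false)
    idx = replace ? rand(rng, 1:length(rows), n) : randperm(rng, length(rows))[1:n]
    ordered && sort!(idx)          # keep the original row order if requested
    return rows[idx]
end

rows = [(id = i,) for i in 1:10]
rng = MersenneTwister(42)
with_repl = sample_rows_sketch(rng, rows, 20)
without_repl = sample_rows_sketch(rng, rows, 5; replace = false, ordered = true)
```

Sampling without replacement requires n to be no larger than the number of rows.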
hpdi
StatisticalRethinking.hpdi
— Method
Compute high density region.
hpdi(x; alpha)
Derived from hpd in MCMCChains.jl.
By default alpha=0.11, which corresponds to a 2-sided tail area of p < 0.055 and p > 0.945 (that is, 5.5% in each tail).
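One common way to obtain a highest-density interval from samples is to take the narrowest window of sorted values that still covers a fraction 1 - alpha of them. The sketch below follows that idea. It is not the MCMCChains.jl algorithm itself, and `hpdi_sketch` is a hypothetical name.

```julia
# Hypothetical sketch of a highest-density interval: the narrowest window
# of sorted samples that still contains a fraction (1 - alpha) of them.
function hpdi_sketch(x; alpha = 0.11)
    n = length(x)
    m = max(1, ceil(Int, (1 - alpha) * n))   # number of samples inside the interval
    y = sort(x)
    widths = [y[i + m - 1] - y[i] for i in 1:(n - m + 1)]
    i = argmin(widths)                        # start of the narrowest window
    return [y[i], y[i + m - 1]]
end

samples = [0.0, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 5.0]
interval = hpdi_sketch(samples; alpha = 0.25)
println(interval)
```

Unlike a percentile interval, this window is not forced to leave equal mass in both tails, so it shifts toward the densest part of the samples.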
meanlowerupper
StatisticalRethinking.meanlowerupper
— Function
Compute a NamedTuple with means, lower and upper PI values.
meanlowerupper(data)
meanlowerupper(data, PI)
compare
StatisticalRethinking.compare
— Method
Compare waic and psis values for models.
compare(models, criterium; mnames)
Required arguments
* `models` : Vector of logprob matrices
* `criterium` : Either ::Val{:waic} or ::Val{:psis}
Optional argument
* `mnames::Vector{Symbol}` : Vector of model names
Return values
* `df` : DataFrame with statistics
create_observation_matrix
StatisticalRethinking.create_observation_matrix
— Method
Create a polynomial observation matrix.
create_observation_matrix(x, k)
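A polynomial observation matrix places successive powers of the predictor in its columns. The sketch below assumes that column j holds x.^j for j = 1..k and that no intercept column is included. Both the layout and the name `poly_matrix_sketch` are assumptions, not taken from the package.

```julia
# Hypothetical sketch: column j holds x.^j for j = 1..k (no intercept column assumed).
poly_matrix_sketch(x, k) = reduce(hcat, [x .^ j for j in 1:k])

m = poly_matrix_sketch([1.0, 2.0, 3.0], 3)
println(size(m))
```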
r2_is_bad
StatisticalRethinking.r2_is_bad
— Method
Compute R^2 values.
r2_is_bad(model, df)
PI
StatisticalRethinking.PI
— Function
Compute percentile central interval of data. Returns vector of bounds.
PI(data; perc_prob)
Required arguments
data : iterable over data values
Optional arguments
perc_prob::Float64=0.89 : percentile interval to calculate
Examples
julia> using StatisticalRethinking
julia> PI(1:10)
2-element Vector{Float64}:
1.495
9.505
julia> PI(1:10; perc_prob=0.1)
2-element Vector{Float64}:
5.05
5.95
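The values in the examples above match symmetric lower and upper quantiles of the data. A minimal sketch of that calculation, using only the Statistics standard library, is shown here. `pi_sketch` is a hypothetical name and this is not the package's implementation.

```julia
using Statistics

# Hypothetical sketch: central interval from symmetric lower and upper quantiles.
function pi_sketch(data; perc_prob = 0.89)
    d = (1 - perc_prob) / 2            # probability mass left in each tail
    return quantile(collect(data), [d, 1 - d])
end

bounds = pi_sketch(1:10)
narrow = pi_sketch(1:10; perc_prob = 0.1)
println(bounds)
```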
var2
StatisticalRethinking.var2
— Method
Variance without n-1 correction.
var2(x)
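"Without n-1 correction" means dividing the sum of squared deviations by n rather than n - 1. The sketch below illustrates this and compares it with `var(x; corrected=false)` from the Statistics standard library. `var2_sketch` is a hypothetical name.

```julia
using Statistics

# Hypothetical sketch: population variance, dividing by n instead of n - 1.
var2_sketch(x) = mean((x .- mean(x)) .^ 2)

x = [1.0, 2.0, 3.0, 4.0]
v_uncorrected = var2_sketch(x)
v_corrected = var(x)                 # default in Statistics divides by n - 1
println((v_uncorrected, v_corrected))
```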
sim_happiness
StatisticalRethinking.sim_happiness
— Function
sim_happiness(; seed, n_years, max_age, n_births, aom)
Simulates happiness using rules from section 6.3 of the book:
- Each year, 20 people are born with uniformly distributed happiness values.
- Each year, each person ages one year. Happiness does not change.
- At age 18, individuals can become married. The odds of marriage each year are proportional to an individual’s happiness.
- Once married, an individual remains married.
- After age 65, individuals leave the sample. (They move to Spain.)
Arguments:
seed
: random seed, default is no seed
n_years
: number of years to simulate
max_age
: maximum age people live to
n_births
: number of people born every year
aom
: age at which people can get married
Examples
julia> using StatisticalRethinking
julia> sim_happiness(n_years=4, n_births=10)
40×3 DataFrame
Row │ age happiness married
│ Int64 Float64 Int64
─────┼───────────────────────────
1 │ 4 -2.0 0
2 │ 4 -1.55556 0
3 │ 4 -1.11111 0
simulate
StatisticalRethinking.simulate
— Function
Used for counterfactual simulations.
simulate(df, coefs, var_seq)
Required arguments
* `df` : DataFrame with coefficient samples
* `coefs` : Vector of coefficients
* `var_seq` : Input values for simulated effect
Return values
* `m_sim::NamedTuple` : Array with predictions
simulate
Counterfactual predictions after manipulating a variable.
simulate(df, coefs, var_seq, coefs_ext)
Required arguments
* `df` : DataFrame with coefficient samples
* `coefs` : Vector of coefficients
* `var_seq` : Input values for simulated effect
* `coefs_ext` : Vector of simulated variable coefficients
Return values
* `(m_sim, d_sim)` : Arrays with predictions
simulate
Generic simulation of predictions using a callable that returns the distribution to sample from.
simulate(df, rx_to_dist, xrange; return_dist, seed)
Required arguments
df::DataFrame
: data frame with parameters in each row
rx_to_dist::Function
: callable with two arguments: row object and x value. Has to return a Distribution instance.
xrange
: iterable with arguments
Optional arguments
return_dist::Bool = false
: if set to true, distributions will be returned, not their samples
seed::Int = missing
: sets the random seed
Return value
A vector with one item per value in the xrange argument. Each item is itself a vector with one entry per row of df, obtained by calling rx_to_dist to get a distribution and then sampling from it. If return_dist=true, the sampling step is omitted and the distributions themselves are returned.
Examples
julia> using StatisticalRethinking, DataFrames, Distributions
julia> d = DataFrame(:mu => [1.0, 2.0], :sigma => [0.0, 0.0])
2×2 DataFrame
Row │ mu sigma
│ Float64 Float64
─────┼──────────────────
1 │ 1.0 0.0
2 │ 2.0 0.0
julia> simulate(d, (r,x) -> Normal(r.mu+x, r.sigma), 0:1)
2-element Vector{Vector{Float64}}:
[1.0, 2.0]
[2.0, 3.0]
julia> simulate(d, (r,x) -> Normal(r.mu+x, r.sigma), 0:1, return_dist=true)
2-element Vector{Vector{Normal{Float64}}}:
[Normal{Float64}(μ=1.0, σ=0.0), Normal{Float64}(μ=2.0, σ=0.0)]
[Normal{Float64}(μ=2.0, σ=0.0), Normal{Float64}(μ=3.0, σ=0.0)]
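The two-level structure of the return value and the effect of `return_dist` can be sketched without Distributions.jl. In this illustration a "distribution" is just a NamedTuple with `mu` and `sigma`, and sampling is `mu + sigma * randn`. `simulate_sketch` and `to_dist` are hypothetical names, and the seed handling is an assumption rather than the package's behaviour.

```julia
using Random

# Hypothetical sketch: a "distribution" is a (mu, sigma) NamedTuple.
function simulate_sketch(rows, rx_to_dist, xrange; return_dist = false, seed = nothing)
    rng = seed === nothing ? Random.default_rng() : MersenneTwister(seed)
    # One inner vector per x value, one distribution per parameter row.
    dists = [[rx_to_dist(r, x) for r in rows] for x in xrange]
    return_dist && return dists
    # Draw one normal sample from every distribution.
    return [[d.mu + d.sigma * randn(rng) for d in ds] for ds in dists]
end

rows = [(mu = 1.0, sigma = 0.0), (mu = 2.0, sigma = 0.0)]
to_dist = (r, x) -> (mu = r.mu + x, sigma = r.sigma)
draws = simulate_sketch(rows, to_dist, 0:1; seed = 1)
dists = simulate_sketch(rows, to_dist, 0:1; return_dist = true)
println(draws)
```

Because sigma is zero in this input, every draw equals its mean, which makes the nesting easy to check by eye.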