StructuralCausalModels

StructuralCausalModels.StructuralCausalModels - Module

StructuralCausalModels

StructuralCausalModels.jl provides functionality to analyse causal models based on directed acyclic graphs (DAGs), as described in StatisticalRethinking, Causal Inference in Statistics, and Cause and Correlation in Biology.

My initial goal for this package is to have a way to apply SCM ideas to the examples in the context of StatisticalRethinking, i.e. a working version of basis_set(), d_separation(), m_separation() and adjustment_sets().

SCM can be used as an alias for StructuralCausalModels.

source

SCM

scm_path

StructuralCausalModels.scm_path - Method

scm_path

Returns a path relative to the StructuralCausalModels.jl src/ directory.

Example to get access to the data subdirectory

scm_path("..", "data")
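As a rough illustration of how a helper of this kind can work, the sketch below joins path components onto a source directory and normalizes the result. The name `rel_path` and the `/pkg/src` directory are hypothetical; this is not the package's implementation.

```julia
# Hypothetical stand-in for a src-relative path helper.
rel_path(src_dir::AbstractString, parts::AbstractString...) =
    normpath(joinpath(src_dir, parts...))

# ".." steps out of src/ before descending into data/
data_dir = rel_path("/pkg/src", "..", "data")
```

This is why the example passes ".." first: the base directory is src/, and the data subdirectory sits next to it rather than inside it.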

Part of the API, exported.

source

DAG

StructuralCausalModels.DAG - Method

DAG

Directed acyclic graph constructor

DAG(name, model; df)

Required arguments

* `name::AbstractString`               : Name for the DAG object
* `model::ModelDefinition`             : DAG definition

where

ModelDefinition = Union{OrderedDict, AbstractString, NamedArray}

See the extended help for a usage example.

Keyword arguments

* `df::DataFrame`                      : DataFrame with observations

Returns

* `dag::DAG`                           : DAG object

Extended help

In the definition of the OrderedDict, read => as ~ in regression models or <- in causal models, e.g.:

d = OrderedDict(
  :u => [:x, :v],
  :s1 => [:u],
  :w => [:v, :y],
  :s2 => [:w]
);
dag = DAG("my_name", d)
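To make the "child => parents" reading concrete, here is a minimal sketch that turns such a definition into an adjacency matrix. The helper `adjacency` is hypothetical, it uses a plain vector of pairs instead of an OrderedDict to stay dependency-free, and the orientation of the matrix (rows are parents, columns are children) is an assumption rather than the package's documented convention.

```julia
# Hypothetical helper: turn child => parents pairs into an adjacency matrix.
# Assumption: amat[i, j] == 1 means an edge from vars[i] to vars[j].
function adjacency(model::Vector{Pair{Symbol, Vector{Symbol}}})
    # Collect node names in order of first appearance
    vars = Symbol[]
    for (child, parents) in model, v in vcat(parents, child)
        v in vars || push!(vars, v)
    end
    idx = Dict(v => i for (i, v) in enumerate(vars))
    amat = zeros(Int, length(vars), length(vars))
    # Each parent gets an edge pointing at the child
    for (child, parents) in model, p in parents
        amat[idx[p], idx[child]] = 1
    end
    return vars, amat
end

model = [:u => [:x, :v], :s1 => [:u], :w => [:v, :y], :s2 => [:w]]
vars, amat = adjacency(model)
```

Each pair contributes one edge per listed parent, so the four entries above yield six directed edges over seven nodes.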

Coming from R's dagitty:

amat <- dagitty("dag { {X V} -> U; S1 <- U; {Y V} -> W; S2 <- W}")

dag = DAG("my_name", "dag { {X V} -> U; S1 <- U; {Y V} -> W; S2 <- W}")
display(dag) # Show the DAG

Coming from R's ggm:

amat <- DAG(U~X+V, S1~U, W~V+Y, S2~W, order=FALSE)

dag = DAG("my_name", "DAG(U~X+V, S1~U, W~V+Y, S2~W)")
display(dag) # Show the DAG

Acknowledgements

Original author: Giovanni M. Marchetti

Translated to Julia: Rob J Goedman

License

The R package ggm is licensed under GPL-2.

The Julia translation is licensed under MIT.

Part of the API, exported.

source

d_separation

StructuralCausalModels.d_separation - Method

d_separation

d_separation(d, f, s; c, debug)

Tests whether two sets of nodes are d-separated given a third, conditioning set.

Required arguments

d_separation(
* `d::DAG`                             : DAG
* `f::SymbolList`                      : First set
* `s::SymbolList`                      : Second set
)

Keyword arguments

* `c::SymbolListOrNothing=nothing`     : Conditioning set
* `debug=false`                        : Trace execution

Returns

* `res::Bool`                          : Boolean result of test

Extended help

Example

d_separation between mechanics and statistics, conditioning on algebra

using StructuralCausalModels, CSV, DataFrames

# Load the mathematics marks data shipped with the package
df = DataFrame(CSV.File(scm_path("..", "data", "marks.csv")));

# Each key is a child node; its value lists the parents
d = OrderedDict(
  :mechanics => [:vectors, :algebra],
  :vectors => [:algebra],
  :analysis => [:algebra],
  :statistics => [:algebra, :analysis]
);

dag = DAG("marks", d; df=df);
d_separation(dag, [:statistics], [:mechanics]; c=[:algebra])
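For readers who want to see what a d-separation query involves, the following is a minimal, dependency-free sketch using the classical moralization criterion: restrict the DAG to the queried nodes and their ancestors, moralize it, and check whether the conditioning set cuts every undirected path. The function `is_d_separated` and its plain `Dict` input are hypothetical and are not claimed to match the algorithm used inside the package.

```julia
# Hypothetical d-separation check on a DAG given as child => parents.
function is_d_separated(parents::Dict{Symbol, Vector{Symbol}}, f, s, c = Symbol[])
    # 1. Keep only f, s, c and all of their ancestors.
    keep = Set{Symbol}()
    stack = collect(Symbol, union(f, s, c))
    while !isempty(stack)
        v = pop!(stack)
        v in keep && continue
        push!(keep, v)
        append!(stack, get(parents, v, Symbol[]))
    end
    # 2. Moralize: undirected child-parent links plus links between co-parents.
    nbrs = Dict(v => Set{Symbol}() for v in keep)
    link!(a, b) = (push!(nbrs[a], b); push!(nbrs[b], a))
    for v in keep
        ps = get(parents, v, Symbol[])
        for (i, p) in enumerate(ps)
            link!(v, p)
            for q in ps[i+1:end]
                link!(p, q)
            end
        end
    end
    # 3. Search from f without ever stepping onto a conditioning node.
    blocked = Set{Symbol}(c)
    seen = Set{Symbol}()
    stack = [v for v in f if !(v in blocked)]
    while !isempty(stack)
        v = pop!(stack)
        v in seen && continue
        push!(seen, v)
        v in s && return false      # reached the second set: not separated
        for w in nbrs[v]
            w in blocked || push!(stack, w)
        end
    end
    return true
end

marks = Dict(
    :mechanics => [:vectors, :algebra],
    :vectors => [:algebra],
    :analysis => [:algebra],
    :statistics => [:algebra, :analysis],
)
sep_given_algebra = is_d_separated(marks, [:statistics], [:mechanics], [:algebra])
sep_marginal = is_d_separated(marks, [:statistics], [:mechanics])
```

In the marks model every path between statistics and mechanics passes through algebra, so conditioning on algebra separates them while the unconditional query does not.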

Acknowledgements

Original author: Giovanni M. Marchetti

Translated to Julia: Rob J Goedman

License

The R package ggm is licensed under GPL-2.

The Julia translation is licensed under MIT.

Part of the API, exported.

source

m_separation

StructuralCausalModels.m_separation - Method

m_separation

m_separation(d, f, s; c, debug)

Tests whether two sets of nodes are m-separated given a third, conditioning set.

Required arguments

m_separation(
* `d::DAG`                             : DAG
* `f::SymbolList`                      : First vertex or set
* `s::SymbolList`                      : Second vertex or set
)

Keyword arguments

* `c::SymbolListOrNothing=nothing`     : Conditioning set
* `debug=false`                        : Trace execution

Returns

* `res::Bool`                          : Boolean result of test

Extended help

Example

m_separation between mechanics and statistics, conditioning on algebra

using StructuralCausalModels, CSV, DataFrames

# Load the mathematics marks data shipped with the package
df = DataFrame(CSV.File(scm_path("..", "data", "marks.csv")));

# Each key is a child node; its value lists the parents
d = OrderedDict(
  :mechanics => [:vectors, :algebra],
  :vectors => [:algebra],
  :analysis => [:algebra],
  :statistics => [:algebra, :analysis]
);

dag = DAG("marks", d; df=df);
m_separation(dag, [:statistics], [:mechanics]; c=[:algebra])

Acknowledgements

Original author: Giovanni M. Marchetti

Translated to Julia: Rob J Goedman

License

The R package ggm is licensed under GPL-2.

The Julia translation is licensed under MIT.

Part of the API, exported.

source

shipley_test

StructuralCausalModels.shipley_test - Method

shipley_test

shipley_test(d)

Test of all independencies implied by a given DAG

Computes a simultaneous test of all independence relationships implied by a given Gaussian model defined according to a directed acyclic graph, based on the sample covariance matrix.

The test statistic is C = -2 sum(ln(pj)), where the pj are the p-values of the conditional independence tests in the basis set computed by basis_set(). Under the null hypothesis the p-values are independent uniform variables on (0,1), and the statistic has exactly a chi-square distribution with 2k degrees of freedom, where k is the number of elements in the basis set. Shipley (2002) calls this test Fisher's C test.
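The statistic described above can be sketched in a few lines of plain Julia. The helper `fisher_c` is hypothetical and only combines p-values that are already available; it does not compute the basis set or the individual tests. Because the degrees of freedom are always even, the chi-square tail probability has a closed form, so no statistics package is needed.

```julia
# Hypothetical helper: Fisher's C from k independent p-values.
function fisher_c(pvals::AbstractVector{<:Real})
    k = length(pvals)
    c = -2 * sum(log, pvals)
    # For a chi-square with even dof 2k:
    # P(X > c) = exp(-c/2) * sum_{i=0}^{k-1} (c/2)^i / i!
    half = c / 2
    term = 1.0
    total = 1.0
    for i in 1:(k - 1)
        term *= half / i      # builds (c/2)^i / i! incrementally
        total += term
    end
    return (ctest = c, dof = 2k, pval = exp(-half) * total)
end

res = fisher_c([0.5, 0.5])
```

With a single p-value the combined p-value equals that p-value, which is a quick sanity check on the formula.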

Method

shipley_test(
* `d::DAG`                             : Directed acyclic graph
)

Returns

* `res::NamedTuple`                    : (ctest=..., dof=..., pval=...)

where:

* `ctest`                              : Test statistic C
* `dof`                                : Degrees of freedom
* `pval`                               : The p-value of the test, assuming a two-sided alternative

Extended help

Example

shipley_test for the mathematics marks data

using StructuralCausalModels, RData

# Load the marks data frame from the R data file
objs = RData.load(scm_path("..", "data", "marks.rda"));
marks_df = objs["marks"]

d = OrderedDict(
  :mechanics => [:vectors, :algebra],
  :vectors => [:algebra],
  :statistics => [:algebra, :analysis],
  :analysis => [:algebra]
);
dag = DAG("marks", d; df=marks_df)
shipley_test(dag)

See also

?DAG
?basis_set
?pcor_test

Acknowledgements

Original author: Giovanni M. Marchetti

Translated to Julia: Rob J Goedman

References

Shipley, B. (2000). A new inferential test for path models based on directed acyclic graphs. Structural Equation Modeling, 7(2), 206–218.

License

The R package ggm is licensed under GPL-2.

The Julia translation is licensed under MIT.

Part of the API, exported.

source

basis_set

ancestral_graph

StructuralCausalModels.ancestral_graph - Method

ancestral_graph

ancestral_graph(d; m, c)

Ancestral graphs after marginalization and conditioning.

Required arguments

* `d::DAG`                             : DAG object

Optional arguments

* `m::Vector{Symbol}`                  : Nodes in DAG that are marginalized
* `c::Vector{Symbol}`                  : Nodes in DAG that are conditioned on

Returns

* `ag::NamedArray`                     : Ancestral graph remaining
source

ribbon_graph

StructuralCausalModels.ribbon_graph - Method

ribbon_graph

ribbon_graph(d; m, c)

Ribbon graphs after marginalization and conditioning.

Required arguments

* `d::DAG`                             : DAG object

Optional arguments

* `m::Vector{Symbol}`                  : Nodes in DAG that are marginalized
* `c::Vector{Symbol}`                  : Nodes in DAG that are conditioned on

Returns

* `rg::NamedArray`                     : Ribbon graph remaining
source

adjustment_sets

StructuralCausalModels.adjustment_sets - Method

adjustment_sets

adjustment_sets(dag, f, l; debug)

Computes the covariance adjustment vertex set.

Required arguments

* `dag::DAG`                           : DAG
* `f::Symbol`                          : Start variable
* `l::Symbol`                          : End variable

Optional arguments

* `debug::Bool`                        : Show debug trace

Returns

* `adjustmentsets=Vector{Symbol}[]`    : Array of adjustment sets

Extended help

Acknowledgements

Original author : Rob J Goedman

License

Licensed under MIT.

Part of the API, exported.

source

paths

Support_functions

StructuralCausalModels.ancestral_graph - Method

ancestral_graph

ancestral_graph(amat; m, c)

Ancestral graphs after marginalization and conditioning.

Required arguments

* `amat::NamedArray{Int, 2}`           : Adjacency matrix of a DAG

Optional arguments

* `m::Vector{Symbol}`                  : Nodes in DAG that are marginalized
* `c::Vector{Symbol}`                  : Nodes in DAG that are conditioned on

Returns

* `ag::NamedArray`                     : Ancestral graph remaining

Extended help

Example

Adjacency matrix used for testing in ggm

using StructuralCausalModels

amat_data = transpose(reshape([
  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,
  0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,
  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,
  0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,
  0,0,0,0,1,0,1,0,1,1,0,0,0,0,0,0,
  1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,
  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,
  0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0
], (16,16)));

vars = [Symbol("n$i") for i in 1:size(amat_data, 1)]
amat = NamedArray(Int.(amat_data), (vars, vars), ("Rows", "Cols"));
m = [:n3, :n5, :n6, :n15, :n16];
c = [:n4, :n7];

ag = ancestral_graph(amat; m = m, c = c)

Acknowledgements

Original author: Kayvan Sadeghi

Translated to Julia: Rob J Goedman

References

Sadeghi, K. (2011). Stable classes of graphs containing directed acyclic graphs.

Richardson, T.S. and Spirtes, P. (2002). Ancestral graph Markov models. Annals of Statistics, 30(4), 962-1030.

License

The R package ggm is licensed under GPL-2.

The Julia translation is licensed under MIT.

Part of the API, exported.

source
StructuralCausalModels.DAG - Type

DAG

Directed acyclic graph struct

Struct

DAG(
* `name::AbstractString`                    : Name for the DAG object
* `d::OrderedDictOrNothing`                 : DAG definition as an OrderedDict
* `a::NamedArrayOrNothing`                  : Adjacency matrix
* `e::NamedArrayOrNothing`                  : Edge matrix
* `s::NamedArrayOrNothing`                  : Covariance matrix
* `df::DataFrameOrNothing`                  : Variable observations
* `vars::Vector{Symbol}`                    : Names of variables in DAG
)

Part of the API, exported.

source
StructuralCausalModels.pcor - Method

pcor

pcor(d, u)

Computes the partial correlation between two variables given a set of other variables.

Method

pcor(
* `d::DAG`                             : DAG object
* `u::Vector{Symbol}`                  : Variables used to compute correlation
)

where:

u[1], u[2]: Variables used to compute correlation between, remaining variables are the conditioning set

Returns

* `res::Float64`                       : Correlation between u[1] and u[2]
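As background, a partial correlation can be read off the inverse of the covariance submatrix of the variables involved. The sketch below shows that calculation in plain Julia; `partial_cor` is a hypothetical, index-based helper that takes a covariance matrix directly, not the package's `pcor`, which takes a DAG and symbols.

```julia
using LinearAlgebra

# Hypothetical helper: partial correlation of variables u[1] and u[2]
# given u[3:end], computed from a covariance matrix S.
function partial_cor(S::AbstractMatrix{<:Real}, u::AbstractVector{<:Integer})
    K = inv(S[u, u])                      # precision matrix of the selected variables
    return -K[1, 2] / sqrt(K[1, 1] * K[2, 2])
end

# Three variables with pairwise correlation 0.5
S = [1.0 0.5 0.5;
     0.5 1.0 0.5;
     0.5 0.5 1.0]
r_given_third = partial_cor(S, [1, 2, 3])   # condition on variable 3
r_marginal = partial_cor(S, [1, 2])         # empty conditioning set
```

With an empty conditioning set the result reduces to the ordinary correlation, which matches the convention that the first two entries of u are the variables of interest and the rest are conditioned on.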

Extended help

Example

Correlation between vectors and algebra, conditioning on analysis and statistics

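using StructuralCausalModels, CSV, DataFrames

df = DataFrame(CSV.File(scm_path("..", "data", "marks.csv")));
d = OrderedDict(
  :mechanics => [:vectors, :algebra],
  :vectors => [:algebra],
  :analysis => [:algebra],
  :statistics => [:algebra, :analysis]
);
dag = DAG("marks", d; df=df);

# u[1] and u[2] are correlated; the remaining entries are the conditioning set
u = [:vectors, :algebra, :statistics, :analysis]
StructuralCausalModels.pcor(dag, u)   # pcor is not exported, so qualify the call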

Acknowledgements

Original author: Giovanni M. Marchetti

Translated to Julia: Rob J Goedman

License

The R package ggm is licensed under GPL-2.

The Julia translation is licensed under MIT.

Part of the API, not exported.

source
StructuralCausalModels.pcor_test - Method

pcor_test

pcor_test(d, u, q, n)

Tests the partial correlation between two variables given a set of other variables.

Method

pcor_test(
* `u::Vector{Symbol}`                  : Variables used to compute correlation
* `q::Int`                             : Number of variables in conditioning set
* `n::Int`                             : Number of observations
* `S::Matrix`                          : Sample covariance matrix
)

where:

u[1], u[2]: Variables used to compute correlation between, remaining variables are the conditioning set

Returns

* `res::Float64`                       : Correlation between u[1] and u[2]

Extended help

Acknowledgements

Original author: Giovanni M. Marchetti

Translated to Julia: Rob J Goedman

License

The R package ggm is licensed under GPL-2.

The Julia translation is licensed under MIT.

Part of the API, not exported.

source
StructuralCausalModels.ribbon_graph - Method

ribbon_graph

ribbon_graph(amat; m, c)

Ribbon graphs after marginalization and conditioning.

Required arguments

* `amat::NamedArray{Int, 2}`           : Adjacency matrix of a DAG

Optional arguments

* `m::Vector{Symbol}`                  : Nodes in DAG that are marginalized
* `c::Vector{Symbol}`                  : Nodes in DAG that are conditioned on

Returns

* `rg::NamedArray`                     : Ribbon graph remaining

Extended help

Example

Adjacency matrix used for testing in ggm

using StructuralCausalModels

amat_data = transpose(reshape([
  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,
  0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,
  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,
  0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,
  0,0,0,0,1,0,1,0,1,1,0,0,0,0,0,0,
  1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,
  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,
  0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0
], (16,16)));

vars = [Symbol("n$i") for i in 1:size(amat_data, 1)]
amat = NamedArray(Int.(amat_data), (vars, vars), ("Rows", "Cols"));
m = [:n3, :n5, :n6, :n15, :n16];
c = [:n4, :n7];

rg = ribbon_graph(amat; m = m, c = c)

Acknowledgements

Original author: Kayvan Sadeghi

Translated to Julia: Rob J Goedman

References

Sadeghi, K. (2011). Stable classes of graphs containing directed acyclic graphs.

Richardson, T.S. and Spirtes, P. (2002). Ancestral graph Markov models. Annals of Statistics, 30(4), 962-1030.

Sadeghi, K. and Lauritzen, S.L. (2011). Markov properties for loopless mixed graphs.

License

The R package ggm is licensed under GPL-2.

The Julia translation is licensed under MIT.

Part of the API, exported.

source
StructuralCausalModels.set_dag_df! - Method

set_dag_df!

Set or update the DataFrame associated with a DAG

set_dag_df!(d, df; force)

Required arguments

* `d::DAG`                                  : Previously defined DAG object
* `df::DataFrameOrNothing`                  : DataFrame associated with DAG

Optional arguments

* `force=false`                             : Force assignment of df

The force = true option can be used if the DAG involves unobserved nodes.

Part of the API, exported.

source
StructuralCausalModels.set_dag_cov_matrix! - Method

set_dag_cov_matrix!

Set or update the covariance matrix associated with a DAG

set_dag_cov_matrix!(d, cm; force)

Required arguments

* `d::DAG`                                  : Previously defined DAG object
* `cm::NamedArrayOrNothing`                 : Covariance matrix in NamedArray format

Optional arguments

* `force=false`                             : Force assignment of cm

The force = true option can be used if the DAG involves unobserved nodes.

Part of the API, exported.

source