StructuralCausalModels API · StructuralCausalModels.jl

StructuralCausalModels

StructuralCausalModels.StructuralCausalModels — Module

StructuralCausalModels

StructuralCausalModels.jl provides functionality to analyse directed acyclic graph (DAG) based causal models as described in StatisticalRethinking, Causal Inference in Statistics and Cause and Correlation in Biology.

My initial goal for this package is to have a way to apply SCM ideas to the examples in the context of StatisticalRethinking, i.e. a working version of basis_set(), d_separation(), m_separation() and adjustment_sets().

SCM can be used as an alias to StructuralCausalModels.

SCM

StructuralCausalModels.SCM — Module

SCM

Alias for StructuralCausalModels

scm_path

StructuralCausalModels.scm_path — Method

scm_path

Relative path using the StructuralCausalModels.jl src/ directory.

Example to get access to the data subdirectory

scm_path("..", "data")

Part of the API, exported.

DAG

StructuralCausalModels.DAG — Method

DAG

Directed acyclic graph constructor

DAG(name, model; df)

Required arguments

* `name::AbstractString`               : Name for the DAG object
* `d::ModelDefinition`                 : DAG definition

where

ModelDefinition = Union{OrderedDict, AbstractString, NamedArray}

See the extended help for a usage example.

Keyword arguments

* `df::DataFrame`                      : DataFrame with observations

Returns

* `dag::DAG`                           : The constructed DAG object

Extended help

In the definition of the OrderedDict, read => as ~ in regression models or <- in causal models, e.g.:

d = OrderedDict(
  :u => [:x, :v],
  :s1 => [:u],
  :w => [:v, :y],
  :s2 => [:w]
);
dag = DAG("my_name", d)
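To make the "child => parents" reading concrete, the sketch below turns the same definition into a plain adjacency matrix using only Base Julia. It is an illustration, not the package's implementation: `to_adjacency` is a hypothetical helper, a vector of pairs stands in for the OrderedDict, and the convention that `a[i, j] == 1` means an edge from node i to node j is an assumption.

```julia
# Illustrative only: `to_adjacency` is a hypothetical helper, not a package function.
# Each pair reads "child => parents", i.e. child ~ parents or child <- parents.
d = [
  :u  => [:x, :v],
  :s1 => [:u],
  :w  => [:v, :y],
  :s2 => [:w],
]

function to_adjacency(d)
  # Collect every node in order of first appearance.
  vars = Symbol[]
  for (child, parents) in d
    for v in (child, parents...)
      v in vars || push!(vars, v)
    end
  end
  idx = Dict(v => i for (i, v) in enumerate(vars))
  a = zeros(Int, length(vars), length(vars))
  for (child, parents) in d, p in parents
    a[idx[p], idx[child]] = 1   # assumed convention: a[i, j] == 1 means i -> j
  end
  return vars, a
end

vars, a = to_adjacency(d)
```

Each parent listed on the right-hand side contributes one directed edge into the node on the left-hand side, so the four pairs above yield six edges over seven nodes.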

Coming from R's dagitty:

amat <- dagitty("dag { {X V} -> U; S1 <- U; {Y V} -> W; S2 <- W}")

dag = DAG("my_name", "dag { {X V} -> U; S1 <- U; {Y V} -> W; S2 <- W}")
display(dag) # Show the DAG

Coming from R's ggm:

amat <- DAG(U~X+V, S1~U, W~V+Y, S2~W, order=FALSE)

dag = DAG("my_name", "DAG(U~X+V, S1~U, W~V+Y, S2~W)")
display(dag) # Show the DAG

Acknowledgements

Original author: Giovanni M. Marchetti

Translated to Julia: Rob J Goedman

License

The R package ggm is licensed under License: GPL-2.

The Julia translation is licenced under: MIT.

Part of API, exported.

d_separation

StructuralCausalModels.d_separation — Method

d_separation

d_separation(d, f, s; c, debug)

Computes the d_separation between 2 sets of nodes conditioned on a third set.

Required arguments

d_separation(
* `d::DAG`                             : DAG
* `f::SymbolList`                      : First set
* `s::SymbolList`                      : Second set
)

Keyword arguments

* `c::SymbolListOrNothing=nothing`     : Conditioning set
* `debug=false`                        : Trace execution

Returns

* `res::Bool`                          : Boolean result of test
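For readers who want to see what such a test involves, here is a minimal sketch of one textbook way to decide d-separation: restrict the graph to the ancestors of the three sets, moralize it, and check whether the conditioning set cuts every undirected path. This is not necessarily how the package does it; `ancestors_of` and `is_d_separated` are hypothetical names, and the DAG is given as a plain Dict from each node to its parents.

```julia
# Illustrative sketch; `ancestors_of` and `is_d_separated` are hypothetical names.
# `parents` maps each node to its parent list (nodes without parents may be omitted).
function ancestors_of(parents, nodes)
  seen = Set{Symbol}()
  stack = collect(nodes)
  while !isempty(stack)
    v = pop!(stack)
    v in seen && continue
    push!(seen, v)
    append!(stack, get(parents, v, Symbol[]))
  end
  return seen   # the nodes themselves plus all of their ancestors
end

function is_d_separated(parents, f, s, c = Symbol[])
  keep = ancestors_of(parents, vcat(f, s, c))
  # Moralize the ancestral subgraph: link each child to its parents
  # and "marry" every pair of parents that share a child.
  nbrs = Dict(v => Set{Symbol}() for v in keep)
  for v in keep
    ps = get(parents, v, Symbol[])
    for p in ps
      push!(nbrs[v], p)
      push!(nbrs[p], v)
      for q in ps
        p == q || push!(nbrs[p], q)
      end
    end
  end
  # Search from f without entering conditioning nodes; reaching s means "not separated".
  blocked = Set(c)
  seen = Set{Symbol}()
  stack = [v for v in f if !(v in blocked)]
  while !isempty(stack)
    v = pop!(stack)
    v in seen && continue
    push!(seen, v)
    v in s && return false
    for w in nbrs[v]
      w in blocked || push!(stack, w)
    end
  end
  return true
end

marks = Dict(
  :mechanics => [:vectors, :algebra],
  :vectors => [:algebra],
  :analysis => [:algebra],
  :statistics => [:algebra, :analysis],
)
is_d_separated(marks, [:statistics], [:mechanics], [:algebra])
```

In this marks graph every path between statistics and mechanics passes through algebra as a non-collider, so conditioning on algebra separates them, while the empty conditioning set does not.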

Extended help

Example

d_separation between mechanics and statistics, conditioning on algebra

using StructuralCausalModels, CSV

df = DataFrame!(CSV.File(scm_path("..", "data", "marks.csv")));

d = OrderedDict(
  :mechanics => [:vectors, :algebra],
  :vectors => [:algebra],
  :analysis => [:algebra],
  :statistics => [:algebra, :analysis]
);

dag = DAG("marks", d; df=df);
d_separation(dag, [:statistics], [:mechanics]; c=[:algebra])

Acknowledgements

Original author: Giovanni M. Marchetti

Translated to Julia: Rob J Goedman

License

The R package ggm is licensed under License: GPL-2.

The Julia translation is licenced under: MIT.

Part of the API, exported.

m_separation

StructuralCausalModels.m_separation — Method

m_separation

m_separation(d, f, s; c, debug)

Computes the m_separation between 2 sets of nodes conditioned on a third set.

Required arguments

m_separation(
* `d::DAG`                             : DAG
* `f::SymbolList`                      : First vertex or set
* `s::SymbolList`                      : Second vertex or set
)

Keyword arguments

* `c::SymbolListOrNothing=nothing`     : Conditioning set
* `debug=false`                        : Trace execution

Returns

* `res::Bool`                          : Boolean result of test

Extended help

Example

m_separation between mechanics and statistics, conditioning on algebra

using StructuralCausalModels, CSV

df = DataFrame!(CSV.File(scm_path("..", "data", "marks.csv")));

d = OrderedDict(
  :mechanics => [:vectors, :algebra],
  :vectors => [:algebra],
  :analysis => [:algebra],
  :statistics => [:algebra, :analysis]
);

dag = DAG("marks", d; df=df);
m_separation(dag, [:statistics], [:mechanics]; c=[:algebra])

Acknowledgements

Original author: Giovanni M. Marchetti

Translated to Julia: Rob J Goedman

License

The R package ggm is licensed under License: GPL-2.

The Julia translation is licenced under: MIT.

Part of the API, exported.

shipley_test

StructuralCausalModels.shipley_test — Method

shipley_test

shipley_test(d)

Test of all independencies implied by a given DAG

Computes a simultaneous test of all independence relationships implied by a given Gaussian model defined according to a directed acyclic graph, based on the sample covariance matrix.

The test statistic is C = -2 sum(ln(pj)), where the pj are the p-values of the conditional independence tests in the basis set computed by basis_set(dag). Under the model, the p-values are independent uniform variables on (0,1) and the statistic has exactly a chi-square distribution on 2k degrees of freedom, where k is the number of elements of the basis set. Shipley (2002) calls this test Fisher's C test.
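The arithmetic behind the statistic can be sketched in a few lines of Base Julia. This is an illustration under stated assumptions, not the package's implementation: `fisher_c` is a hypothetical name, the p-values are made up, and the upper tail uses the closed form that holds for a chi-square distribution with an even number of degrees of freedom.

```julia
# Illustrative sketch; `fisher_c` is a hypothetical name, not a package function.
function fisher_c(pvals)
  k = length(pvals)
  ctest = -2 * sum(log, pvals)
  # Upper tail of a chi-square with 2k degrees of freedom (closed form for even dof):
  # P(X > c) = exp(-c/2) * sum_{i=0}^{k-1} (c/2)^i / i!
  h = ctest / 2
  term, tail = 1.0, 1.0
  for i in 1:(k - 1)
    term *= h / i
    tail += term
  end
  return (ctest = ctest, dof = 2k, pval = exp(-h) * tail)
end

res = fisher_c([0.32, 0.61, 0.08])   # made-up p-values, for illustration only
```

With a single p-value the combined p-value equals the input, which is a quick sanity check on the formula.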

Method

shipley_test(;
* `d::DAG`                             : Directed acyclic graph
)

Returns

* `res::NamedTuple`                    : (ctest=..., dof=..., pval=...)

where:

ctest: Test statistic C.

dof: Degrees of freedom.

pval: The p-value of the test, assuming a two-sided alternative.

Extended help

Example

Shipley_test for the mathematics marks data

using StructuralCausalModels, RData

objs = RData.load(scm_path("..", "data", "marks.rda"));
marks_df = objs["marks"]

d = OrderedDict(
  :mechanics => [:vectors, :algebra],
  :vectors => [:algebra],
  :statistics => [:algebra, :analysis],
  :analysis => [:algebra]
);
dag = DAG("marks", d; df=marks_df)
shipley_test(dag)

See also

?DAG
?basis_set
?pcor_test

Acknowledgements

Original author: Giovanni M. Marchetti

Translated to Julia: Rob J Goedman

References

Shipley, B. (2000). A new inferential test for path models based on directed acyclic graphs. Structural Equation Modeling, 7(2), 206–218.

Licence

The R package ggm is licensed under License: GPL-2.

The Julia translation is licenced under: MIT.

Part of the API, exported.

basis_set

StructuralCausalModels.basis_set — Method

basis_set

Determine basis_set

basis_set(dag; debug)

Part of API, exported.

ancestral_graph

StructuralCausalModels.ancestral_graph — Method

ancestral_graph

ancestral_graph(d; m, c)

Ancestral graphs after marginalization and conditioning.

Required arguments

* `d::DAG`                             : DAG object

Optional arguments

* `m::Vector{Symbol}`                  : Nodes in DAG that are marginalized
* `c::Vector{Symbol}`                  : Nodes in DAG that are conditioned on

Returns

* `ag::NamedArray`                     : Ancestral graph remaining

ribbon_graph

StructuralCausalModels.ribbon_graph — Method

ribbon_graph

ribbon_graph(d; m, c)

Ribbon graphs after marginalization and conditioning.

Required arguments

* `d::DAG`                             : DAG object

Optional arguments

* `m::Vector{Symbol}`                  : Nodes in DAG that are marginalized
* `c::Vector{Symbol}`                  : Nodes in DAG that are conditioned on

Returns

* `rg::NamedArray`                     : Ribbon graph remaining

adjustment_sets

StructuralCausalModels.adjustment_sets — Method

adjustment_sets

adjustment_sets(dag, f, l; debug)

Computes the covariance adjustment vertex set.

Required arguments

* `dag::DAG`                           : DAG
* `f::Symbol`                          : Start variable
* `l::Symbol`                          : End variable

Optional arguments

* `debug::Bool`                        : Show debug trace

Returns

* `adjustmentsets=Vector{Symbol}[]`    : Array of adjustment sets

Extended help

Acknowledgements

Original author : Rob J Goedman

Licence

Licenced under: MIT.

Part of the API, exported.

paths

StructuralCausalModels.all_paths — Method

all_paths

all_paths(d, f, l; debug)

Part of the API, exported.

StructuralCausalModels.backdoor_paths — Method

backdoor_paths

backdoor_paths(d, paths, f)

Internal.

Support_functions

StructuralCausalModels.adjacency_matrix — Method

adjacency_matrix

adjacency_matrix(d)

Part of the API, exported

StructuralCausalModels.adjacency_matrix — Method

adjacency_matrix

adjacency_matrix(e)

Part of the API, exported

StructuralCausalModels.adjacency_matrix_to_dict — Method

adjacency_matrix_to_dict

adjacency_matrix_to_dict(a)

Part of the API, exported.

StructuralCausalModels.ancester_graph — Method

ancestor_graph

ancester_graph(e)

Internal

StructuralCausalModels.ancestral_graph — Method

ancestral_graph

ancestral_graph(amat; m, c)

Ancestral graphs after marginalization and conditioning.

Required arguments

* `amat::NamedArray{Int, 2}`           : Adjacency matrix of a DAG

Optional arguments

* `m::Vector{Symbol}`                  : Nodes in DAG that are marginalized
* `c::Vector{Symbol}`                  : Nodes in DAG that are conditioned on

Returns

* `ag::NamedArray`                     : Ancestral graph remaining

Extended help

Example

Adjacency matrix used for testing in ggm

amat_data = transpose(reshape([
  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,
  0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,
  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,
  0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,
  0,0,0,0,1,0,1,0,1,1,0,0,0,0,0,0,
  1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,
  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,
  0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0
], (16,16)));

vars = [Symbol("n$i") for i in 1:size(amat_data, 1)]
amat = NamedArray(Int.(amat_data), (vars, vars), ("Rows", "Cols"));
m = [:n3, :n5, :n6, :n15, :n16];
c = [:n4, :n7];

ag = ancestral_graph(amat; m = m, c = c)

Acknowledgements

Original author: Kayvan Sadeghi

Translated to Julia: Rob J Goedman

References

Sadeghi, K. (2011). Stable classes of graphs containing directed acyclic graphs.

Richardson, T.S. and Spirtes, P. (2002). Ancestral graph Markov models. Annals of Statistics, 30(4), 962-1030.

Licence

The R package ggm is licensed under License: GPL-2.

The Julia translation is licenced under: MIT.

Part of the API, exported.

StructuralCausalModels.check_open — Method

check_open

check_open(d, path, conditioning_set; debug)

Internal.

StructuralCausalModels.DAG — Type

DAG

Directed acyclic graph struct

Struct

DAG(
* `name::AbstractString`                    : Name for the DAG object
* `d::OrderedDictOrNothing`                 : DAG definition as an OrderedDict
* `a::NamedArrayOrNothing`                  : Adjacency matrix
* `e::NamedArrayOrNothing`                  : Edge matrix
* `s::NamedArrayOrNothing`                  : Covariance matrix
* `df::DataFrameOrNothing`                  : Variable observations
* `vars::Vector{Symbol}`                    : Names of variables in DAG
)

Part of API, exported.

StructuralCausalModels.dag_show — Method

dag_show

dag_show(io, d)

Internal

StructuralCausalModels.dag_vars — Method

dag_vars

dag_vars(d)

Part of the API, exported

StructuralCausalModels.edge_matrix — Method

edge_matrix

edge_matrix(d)

Part of the API, exported

StructuralCausalModels.edge_matrix — Function

edge_matrix

edge_matrix(a)
edge_matrix(a, inv)

Part of the API, exported

StructuralCausalModels.indicator_matrix — Method

indicator_matrix

indicator_matrix(e)

Internal

StructuralCausalModels.induced_covariance_graph — Method

induced_covariance_graph

induced_covariance_graph(d, sel, c; debug)

Internal

StructuralCausalModels.node_edges — Method

node_edges

node_edges(p, s, l; debug)

Internal.

StructuralCausalModels.open_paths — Method

open_paths

open_paths(d, paths, cond; debug)

Internal.

StructuralCausalModels.pcor — Method

pcor

pcor(d, u)

Computes the partial correlation between two variables given a set of other variables.

Method

pcor(;
* `d::DAG`                             : DAG object
* `u::Vector{Symbol}`                  : Variables used to compute correlation
)

where:

u[1], u[2]: Variables used to compute correlation between, remaining variables are the conditioning set

Returns

* `res::Float64`                       : Correlation between u[1] and u[2]
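As a rough illustration of the computation, a partial correlation can be read off the inverse of the covariance submatrix of the selected variables. The sketch below assumes integer indices into a plain covariance matrix rather than a DAG and symbols; `partial_cor` is a hypothetical name and the matrix is made up for the example.

```julia
using LinearAlgebra

# Illustrative sketch; `partial_cor` is a hypothetical name. `u` holds integer indices
# into S: u[1] and u[2] are the pair, the remaining entries are the conditioning set.
function partial_cor(u::Vector{Int}, S::AbstractMatrix)
  K = inv(S[u, u])   # concentration (precision) matrix of the selected variables
  return -K[1, 2] / sqrt(K[1, 1] * K[2, 2])
end

# Made-up covariance matrix with all pairwise correlations equal to 0.5.
S = [1.0 0.5 0.5;
     0.5 1.0 0.5;
     0.5 0.5 1.0]

partial_cor([1, 2, 3], S)   # correlation of variables 1 and 2 given variable 3
```

With an empty conditioning set, as in `partial_cor([1, 2], S)`, the same formula reduces to the ordinary correlation.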

Extended help

Example

Correlation between vectors and algebra, conditioning on analysis and statistics

using StructuralCausalModels, CSV

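df = DataFrame!(CSV.File(scm_path("..", "data", "marks.csv")));

d = OrderedDict(
  :mechanics => [:vectors, :algebra],
  :vectors => [:algebra],
  :analysis => [:algebra],
  :statistics => [:algebra, :analysis]
);
dag = DAG("marks", d; df=df);

u = [:vectors, :algebra, :statistics, :analysis]
pcor(dag, u)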

Acknowledgements

Original author: Giovanni M. Marchetti

Translated to Julia: Rob J Goedman

License

The R package ggm is licensed under License: GPL-2.

The Julia translation is licenced under: MIT.

Part of the API, not exported.

StructuralCausalModels.pcor_test — Method

pcor_test

pcor_test(d, u, q, n)

Computes the partial correlation between two variables given a set of other variables.

Method

pcor_test(;
* `u::Vector{Symbol}`                  : Variables used to compute correlation
* `q::Int`                             : Number of variables in conditioning set
* `n::Int`                             : Number of observations
* `S::Matrix`                          : Sample covariance matrix
)

where:

u[1], u[2]: Variables used to compute correlation between, remaining variables are the conditioning set

Returns

* `res::Float64`                       : Correlation between u[1] and u[2]
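The arguments q and n are what the standard t-test for a zero partial correlation needs. The sketch below shows only that arithmetic, assuming the usual statistic with n - 2 - q degrees of freedom; `pcor_tstat` is a hypothetical name, the inputs are made up, and what the package function actually returns may differ.

```julia
# Illustrative sketch; `pcor_tstat` is a hypothetical name.
# r: partial correlation, q: size of the conditioning set, n: number of observations.
function pcor_tstat(r, q, n)
  dof = n - 2 - q
  tval = r * sqrt(dof) / sqrt(1 - r^2)
  return (tval = tval, dof = dof)
end

t = pcor_tstat(0.5, 2, 88)   # made-up inputs, for illustration only
```

A p-value would follow from a Student t distribution with `dof` degrees of freedom, which needs a statistics library and is left out here.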

Extended help

Acknowledgements

Original author: Giovanni M. Marchetti

Translated to Julia: Rob J Goedman

License

The R package ggm is licensed under License: GPL-2.

The Julia translation is licenced under: MIT.

Part of the API, not exported.

StructuralCausalModels.ribbon_graph — Method

ribbon_graph

ribbon_graph(amat; m, c)

Ribbon graphs after marginalization and conditioning.

Required arguments

* `amat::NamedArray{Int, 2}`           : Adjacency matrix of a DAG

Optional arguments

* `m::Vector{Symbol}`                  : Nodes in DAG that are marginalized
* `c::Vector{Symbol}`                  : Nodes in DAG that are conditioned on

Returns

* `rg::NamedArray`                     : Ribbon graph remaining

Extended help

Example

Adjacency matrix used for testing in ggm

amat_data = transpose(reshape([
  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,
  0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,
  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,
  0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,
  0,0,0,0,1,0,1,0,1,1,0,0,0,0,0,0,
  1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,
  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
  1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,
  0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0
], (16,16)));

vars = [Symbol("n$i") for i in 1:size(amat_data, 1)]
amat = NamedArray(Int.(amat_data), (vars, vars), ("Rows", "Cols"));
m = [:n3, :n5, :n6, :n15, :n16];
c = [:n4, :n7];

rg = ribbon_graph(amat; m = m, c = c)

Acknowledgements

Original author: Kayvan Sadeghi

Translated to Julia: Rob J Goedman

References

Sadeghi, K. (2011). Stable classes of graphs containing directed acyclic graphs.

Richardson, T.S. and Spirtes, P. (2002). Ancestral graph Markov models. Annals of Statistics, 30(4), 962-1030.

Sadeghi, K. and Lauritzen, S.L. (2011). Markov properties for loopless mixed graphs.

Licence

The R package ggm is licensed under License: GPL-2.

The Julia translation is licenced under: MIT.

Part of the API, exported.

StructuralCausalModels.sym_in_all_paths — Method

sym_in_all_paths

Check if a vertex is part of all paths.

sym_in_all_paths(paths, sym)

Part of the API, Exported

StructuralCausalModels.syms_in_paths — Method

syms_in_paths

Collect vertices in all paths.

syms_in_paths(paths, f, l)

Part of the API, Exported

StructuralCausalModels.topological_order — Method

topological_order

topological_order(a)

Part of the API, exported

StructuralCausalModels.topological_sort — Method

topological_sort

topological_sort(a)
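For orientation, one common way to order the nodes of a DAG from its adjacency matrix is Kahn's algorithm, sketched below in Base Julia. This is not necessarily what the package does: `kahn_order` is a hypothetical name, it returns node indices rather than a sorted matrix, and it assumes that `a[i, j] != 0` means an edge from node i to node j.

```julia
# Illustrative sketch of Kahn's algorithm; `kahn_order` is a hypothetical name.
# Assumed convention: a[i, j] != 0 means an edge i -> j.
function kahn_order(a::AbstractMatrix{<:Integer})
  n = size(a, 1)
  indeg = vec(sum(a; dims = 1))   # number of incoming edges per node
  ready = [j for j in 1:n if indeg[j] == 0]
  order = Int[]
  while !isempty(ready)
    i = popfirst!(ready)
    push!(order, i)
    for j in 1:n
      if a[i, j] != 0
        indeg[j] -= 1
        indeg[j] == 0 && push!(ready, j)
      end
    end
  end
  length(order) == n || error("graph has a cycle")
  return order
end

a = [0 0 0;
     1 0 0;
     1 1 0]        # edges 2 -> 1, 3 -> 1, 3 -> 2
kahn_order(a)
```

Every node appears in the result after all of its parents, and a cyclic input raises an error instead of returning a partial order.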

Part of the API, exported

StructuralCausalModels.topological_sort — Method

topological_sort

topological_sort(dag)

Part of the API, exported

StructuralCausalModels.topological_sort! — Method

topological_sort!

topological_sort!(dag)

Part of the API, exported

StructuralCausalModels.transitive_closure — Method

transitive_closure

transitive_closure(a)

Internal

StructuralCausalModels.set_dag_df! — Method

set_dag_df!

Set or update the DataFrame associated with the DAG

set_dag_df!(d, df; force)

Required arguments

* `d::DAG`                                  : Previously defined DAG object
* `df::DataFrameOrNothing`                  : DataFrame associated with DAG

Optional arguments

* `force=false`                             : Force assignment of df

The force = true option can be used if the DAG involves unobserved nodes.

Part of API, exported.

StructuralCausalModels.set_dag_cov_matrix! — Method

set_dag_cov_matrix!

Set or update the covariance matrix associated with the DAG

set_dag_cov_matrix!(d, cm; force)

Required arguments

* `d::DAG`                                  : Previously defined DAG object
* `cm::NamedArrayOrNothing`                 : Covariance matrix in NamedArray format

Optional arguments

* `force=false`                             : Force assignment of cm

The force = true option can be used if the DAG involves unobserved nodes.

Part of API, exported.

StructuralCausalModels.undirected_matrix — Method

undirected_matrix

undirected_matrix(d)

Internal