
Commit 477bd4c

Allow LocalPermutationTest to be used with TEShannon estimated using dedicated TE estimators (#350)
* Fix issue #348
* More effective estimation for `Lindner` when doing e.g. surrogate tests; partially addresses #344
* Typos
* Add tests
* Up patch version
* Correctly scale
* Make sure we have enough samples for tests
* Better test organization
* It is the estimator that controls what happens, not the measure
* Add note to `LocalPermutationTest` docstring about transfer entropy
* Error should occur only for `TransferEntropyEstimator`s
* More tests
* Improve test comments.
* Fix #349. Also mention conditioning in `Zhu1` docs
1 parent 5909d72 commit 477bd4c

14 files changed: +250 -76 lines changed

Project.toml

+1 -1
@@ -2,7 +2,7 @@ name = "CausalityTools"
 uuid = "5520caf5-2dd7-5c5d-bfcb-a00e56ac49f7"
 authors = ["Kristian Agasøster Haaga <kahaaga@gmail.com>", "Tor Einar Møller <temolle@gmail.com>", "George Datseris <datseris.george@gmail.com>"]
 repo = "https://github.com/kahaaga/CausalityTools.jl.git"
-version = "2.9.1"
+version = "2.9.2"

 [deps]
 Accessors = "7d9f7c33-5ae7-4f3b-8dc6-eff91059b697"

changelog.md

+2 -0
@@ -5,6 +5,8 @@
 ### Bug fixes

 - Fixed bug in `transferentropy` function which yielded identical results in both directions for the bivariate case.
+- Fixed bug that occurred when using `LocalPermutationTest` with `TEShannon` as the measure and a dedicated `TransferEntropyEstimator` (e.g. `Zhu1` or `Lindner`). This occurred because the `LocalPermutationTest` is, strictly speaking, a test using conditional mutual information as the measure. Therefore, naively applying a `TransferEntropy` measure such as `TEShannon` would error. This is fixed by performing a similar procedure where the source marginal is shuffled according to local neighborhoods in the conditional marginal. This is similar, but not identical to the CMI-based `LocalPermutationTest`, and adapts to the specific case of transfer entropy estimation using dedicated transfer entropy estimators instead of some lower-level estimator.
+- Fixed bug in `Zhu1` transfer entropy estimator where when box volumes were extremely small, taking the logarithm of volume ratios resulted in `Inf` values. This was solved by simply ignoring these volumes.

 ## 2.8.0

src/independence_tests/local_permutation/LocalPermutationTest.jl

+7 -7
@@ -72,13 +72,13 @@ instead of `Z` and we `I(X; Y)` and `Iₖ(X̂; Y)` instead of `I(X; Y | Z)` and

 ## Compatible measures

-| Measure | Pairwise | Conditional | Requires `est` |
-| ----------------------------- | :------: | :---------: | :------------: |
-| [`PartialCorrelation`](@ref) | ✖ | ✓ | No |
-| [`DistanceCorrelation`](@ref) | ✖ | ✓ | No |
-| [`CMIShannon`](@ref) | ✖ | ✓ | Yes |
-| [`TEShannon`](@ref) | ✓ | ✓ | Yes |
-| [`PMI`](@ref) | ✖ | ✓ | Yes |
+| Measure | Pairwise | Conditional | Requires `est` | Note |
+| ----------------------------- | :------: | :---------: | :------------: | :-------------------------------------------------------------------------------------------------------------------------------: |
+| [`PartialCorrelation`](@ref) | ✖ | ✓ | No | |
+| [`DistanceCorrelation`](@ref) | ✖ | ✓ | No | |
+| [`CMIShannon`](@ref) | ✖ | ✓ | Yes | |
+| [`TEShannon`](@ref) | ✓ | ✓ | Yes | Pairwise tests not possible with `TransferEntropyEstimator`s, only lower-level estimators, e.g. `FPVP`, `GaussianMI` or `Kraskov` |
+| [`PMI`](@ref) | ✖ | ✓ | Yes | |

 The `LocalPermutationTest` is only defined for conditional independence testing.
 Exceptions are for measures like [`TEShannon`](@ref), which use conditional

src/independence_tests/local_permutation/transferentropy.jl

+49 -5
@@ -10,17 +10,61 @@ end

 function independence(test::LocalPermutationTest{<:TransferEntropy{<:E}}, x::AbstractVector...) where E
     measure, est, nshuffles = test.measure, test.est, test.nshuffles
+
+    if !(length(x) == 3) && est isa TransferEntropyEstimator
+        msg = "`LocalPermutationTest` is not defined for pairwise transfer entropy with " *
+            " `TransferEntropyEstimators`. " *
+            "Either provide a third timeseries to condition on, or use some other estimator."
+        throw(ArgumentError(msg))
+    end
     # Below, the T variable also includes any conditional variables.
     S, T, T⁺, C = individual_marginals_te(measure.embedding, x...)
     TC = StateSpaceSet(T, C)
     @assert length(T⁺) == length(S) == length(TC)
     N = length(x)

-    X, Y = T⁺, S
-    Z = TC # The conditional variable
-    cmi = te_to_cmi(measure)
-    Î = estimate(cmi, est, X, Y, Z)
-    Îs = permuted_Îs(X, Y, Z, cmi, est, test)
+    if est isa TransferEntropyEstimator
+        Î = estimate(measure, est, S, T, T⁺, C)
+        Îs = permuted_Îs_te(S, T, T⁺, C, measure, est, test)
+    else
+        X, Y = S, T⁺ # The source marginal `S` is the one being shuffled.
+        Z = TC # The conditional variable
+        cmi = te_to_cmi(measure)
+        Î = estimate(cmi, est, X, Y, Z)
+        Îs = permuted_Îs(X, Y, Z, cmi, est, test)
+    end
+
     p = count(Î .<= Îs) / nshuffles
     return LocalPermutationTestResult(length(x), Î, Îs, p, nshuffles)
 end
+
+# Runge's local permutation test can't be directly translated to transfer entropy specific
+# estimators like `Lindner`. However, we can use a similar principle where
+# the source marginal `S` is shuffled according to local closeness in the
+# conditional marginal `C`. The `T` and `T⁺` marginals (i.e. all information
+# about the target variable) are left untouched.
+function permuted_Îs_te(S, T, T⁺, C, measure::TransferEntropy, est, test)
+    rng, kperm, nshuffles, replace, w = test.rng, test.kperm, test.nshuffles, test.replace, test.w
+
+    N = length(S)
+    test.kperm < N || throw(ArgumentError("kperm must be smaller than input data length"))
+
+    # Search for neighbors in the conditional marginal
+    tree_C = KDTree(C, Chebyshev())
+    idxs_C = bulkisearch(tree_C, C, NeighborNumber(kperm), Theiler(w))
+
+    # Shuffle source marginal `S` based on local closeness in C.
+    Ŝ = deepcopy(S)
+    Nᵢ = MVector{kperm, Int}(zeros(kperm)) # A statically sized copy
+    πs = shuffle(rng, 1:N)
+    Îs = zeros(nshuffles)
+    for n in 1:nshuffles
+        if replace
+            shuffle_with_replacement!(Ŝ, S, idxs_C, rng)
+        else
+            shuffle_without_replacement!(Ŝ, S, idxs_C, kperm, rng, Nᵢ, πs)
+        end
+        Îs[n] = estimate(measure, est, Ŝ, T, T⁺, C)
+    end
+    return Îs
+end
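The shuffling principle described in the comment above can be sketched without any package dependencies. This is an illustration only, not the package's implementation: `neighbor_indices`, `local_shuffle`, `toy_statistic` and `local_permutation_pvalue` are hypothetical names, the neighbor search is brute force on scalar data instead of a `KDTree`, and the absolute Pearson correlation stands in for `estimate(measure, est, Ŝ, T, T⁺, C)`. Only the with-replacement variant is shown.

```julia
using Random
using Statistics

# Hypothetical helper: brute-force indices of the `k` nearest neighbors of each
# point in the conditional marginal `c`, excluding the point itself.
function neighbor_indices(c::AbstractVector{<:Real}, k::Int)
    N = length(c)
    k < N || throw(ArgumentError("k must be smaller than input data length"))
    idxs = Vector{Vector{Int}}(undef, N)
    for i in 1:N
        d = abs.(c .- c[i])
        d[i] = Inf  # never pick the point itself
        idxs[i] = collect(partialsortperm(d, 1:k))
    end
    return idxs
end

# Hypothetical helper: replace each source value by the source value at a
# randomly drawn neighbor in `c` (sampling with replacement).
local_shuffle(rng, s, idxs) = [s[rand(rng, idxs[i])] for i in eachindex(s)]

# Stand-in statistic (an assumption; the real test calls a transfer entropy estimator).
toy_statistic(s, t) = abs(cor(s, t))

function local_permutation_pvalue(rng, s, t, c; kperm = 5, nshuffles = 100)
    idxs = neighbor_indices(c, kperm)
    Î = toy_statistic(s, t)
    Îs = [toy_statistic(local_shuffle(rng, s, idxs), t) for _ in 1:nshuffles]
    # Same p-value rule as in `independence` above.
    return count(Î .<= Îs) / nshuffles
end

rng = MersenneTwister(1234)
c = randn(rng, 200)
s = c .+ 0.1 .* randn(rng, 200)  # source depends on the conditional variable
t = randn(rng, 200)              # target is independent of both
p = local_permutation_pvalue(rng, s, t, c)
```

Because only the source is shuffled, and only among points that are close in `c`, the surrogates keep the relationship between source and conditional variable roughly intact while breaking any direct source-target link.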

src/methods/infomeasures/transferentropy/estimators/Lindner.jl

+30 -16
@@ -15,6 +15,8 @@ also used in the Trentool MATLAB toolbox, and is based on nearest neighbor searc
 during neighbor searches (defaults to `0`, meaning that only the point itself is excluded
 when searching for neighbours).

+The estimator can be used both for pairwise and conditional transfer entropy estimation.
+
 ## Description

 For a given point in the joint embedding space `jᵢ`, this estimator first computes the
@@ -32,7 +34,8 @@ TE(X \\to Y) =
```

 where the index `k` references the three marginal subspaces `T`, `TTf` and `ST` for which
-neighbor searches are performed.
+neighbor searches are performed. Here this estimator has been modified to allow for
+conditioning too (a simple modification to [Lindner2011](@citet)'s equation 5 and 6).
 """
 Base.@kwdef struct Lindner{B} <: TransferEntropyEstimator
     k::Int = 2 # number of neighbors in joint space.
@@ -61,11 +64,22 @@ function estimate(measure::TEShannon, est::Lindner,
         C::AbstractStateSpaceSet)
     (; k, w, base) = est

-    joint = StateSpaceSet(S, T, T⁺, C)
-    ST = StateSpaceSet(S, T, C)
-    TT⁺ = StateSpaceSet(T, T⁺, C)
-    T = StateSpaceSet(T, C)
+    # This layer ensures that the number of `StateSpaceSet`s that must be
+    # constructed is minimal when doing e.g. surrogate testing (then,
+    # `S` is the only marginal changing).
+    TT⁺C = StateSpaceSet(T, T⁺, C)
+    TC = StateSpaceSet(T, C)
+    return estimate_with_premade_embeddings(measure, est, S, TT⁺C, TC)
+end
+
+function estimate_with_premade_embeddings(measure::TEShannon, est::Lindner,
+        S::AbstractStateSpaceSet,
+        TT⁺C::AbstractStateSpaceSet,
+        TC::AbstractStateSpaceSet)
+    (; k, w, base) = est

+    joint = StateSpaceSet(S, TT⁺C)
+    STC = StateSpaceSet(S, TC)
     N = length(joint)
     W = Theiler(w)
     metric = Chebyshev()
@@ -75,19 +89,19 @@ function estimate(measure::TEShannon, est::Lindner,
     # points within distance `ds[i]` from the point. Then count, for each point in each
     # of the marginals, how many neighbors each `xᵢ` has given `ds[i]`.
     ds = last.(ds_joint) # only care about distance to the k-th neighbor
-    tree_ST = KDTree(ST, metric)
-    tree_TT⁺ = KDTree(TT⁺, metric)
-    tree_T = KDTree(T, metric)
-    nns_ST = [isearch(tree_ST, pᵢ, WithinRange(ds[i])) for (i, pᵢ) in enumerate(ST)]
-    nns_TT⁺ = [isearch(tree_TT⁺, pᵢ, WithinRange(ds[i])) for (i, pᵢ) in enumerate(TT⁺)]
-    nns_T = [isearch(tree_T, pᵢ, WithinRange(ds[i])) for (i, pᵢ) in enumerate(T)]
-
-    n_ST = length.(nns_ST)
-    n_TT⁺ = length.(nns_TT⁺)
-    n_T = length.(nns_T)
+    tree_STC = KDTree(STC, metric)
+    tree_TT⁺C = KDTree(TT⁺C, metric)
+    tree_TC = KDTree(TC, metric)
+    nns_STC = [isearch(tree_STC, pᵢ, WithinRange(ds[i])) for (i, pᵢ) in enumerate(STC)]
+    nns_TT⁺C = [isearch(tree_TT⁺C, pᵢ, WithinRange(ds[i])) for (i, pᵢ) in enumerate(TT⁺C)]
+    nns_TC = [isearch(tree_TC, pᵢ, WithinRange(ds[i])) for (i, pᵢ) in enumerate(TC)]
+
+    n_STC = length.(nns_STC)
+    n_TT⁺C = length.(nns_TT⁺C)
+    n_TC = length.(nns_TC)
     te = 0.0
     for i = 1:N
-        te += digamma(n_T[i] + 1) - digamma(n_TT⁺[i] + 1) - digamma(n_ST[i])
+        te += digamma(n_TC[i] + 1) - digamma(n_TT⁺C[i] + 1) - digamma(n_STC[i])
     end
     te /= N
     # The "unit" is nats
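
Read directly from the loop above, and only from the lines visible in this hunk, the averaged quantity accumulated in `te` can be written as follows. Here $\psi$ is the digamma function and $n_{TC}(i)$, $n_{TT^{+}C}(i)$, $n_{STC}(i)$ are the neighbor counts within distance `ds[i]` in the respective marginals. Any terms added after the hunk ends (for example a term depending on `k`, or a change of logarithm base) are not shown in the diff and are therefore not included.

```latex
\overline{te}
  = \frac{1}{N} \sum_{i=1}^{N}
    \Big[ \psi\big(n_{TC}(i) + 1\big)
        - \psi\big(n_{TT^{+}C}(i) + 1\big)
        - \psi\big(n_{STC}(i)\big) \Big]
```

Compared with the pre-commit version, the only change to this expression is that every marginal now also contains the conditional variable `C`.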

src/methods/infomeasures/transferentropy/estimators/Zhu1.jl

+13 -3
@@ -13,7 +13,10 @@ export Zhu1

 The `Zhu1` transfer entropy estimator [Zhu2015](@cite).

-Assumes that the input data have been normalized as described in (Zhu et al., 2015).
+Assumes that the input data have been normalized as described in [Zhu2015](@citet).
+The estimator can be used both for pairwise and conditional transfer entropy.
+
+## Description

 This estimator approximates probabilities within hyperrectangles
 surrounding each point `xᵢ ∈ x` using `k` nearest neighbor searches. However,
@@ -102,10 +105,17 @@ end

 function mean_volumes(vols_joint, vols_ST, vols_TT⁺, vols_T, N::Int)
     vol = 0.0
+    n_ignore = 0
     for i = 1:N
-        vol += log((vols_TT⁺[i] * vols_ST[i]) / (vols_joint[i] * vols_T[i]))
+        num = vols_TT⁺[i] * vols_ST[i]
+        den = vols_joint[i] * vols_T[i]
+        if den != 0
+            vol += log(num / den)
+        else
+            n_ignore += 1
+        end
     end
-    return vol / N
+    return vol / (N - n_ignore)
 end

 function mean_digamma(ks_ST, ks_TT⁺, ks_T, k::Int, N::Int,
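
As a compact restatement of the patched `mean_volumes`, the returned value is the mean log volume ratio over the points whose denominator is nonzero. The symbols $V_i$ below are shorthand introduced here for the entries of `vols_joint`, `vols_ST`, `vols_TT⁺` and `vols_T`; they do not appear in the source.

```latex
\overline{v}
  = \frac{1}{N - n_{\mathrm{ignore}}}
    \sum_{\substack{i = 1 \\ d_i \neq 0}}^{N}
    \log \frac{V^{TT^{+}}_i \, V^{ST}_i}{d_i},
\qquad
d_i = V^{\mathrm{joint}}_i \, V^{T}_i,
\qquad
n_{\mathrm{ignore}} = \#\{\, i : d_i = 0 \,\}
```

The guard as written checks only the denominator. A zero numerator with a nonzero denominator would still give $\log 0 = -\infty$, and if every point were ignored the final division would be by zero.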
@@ -0,0 +1,32 @@
+using Test
+using CausalityTools
+using StableRNGs
+
+rng = StableRNG(123)
+x, y, z = rand(rng, 30), rand(rng, 30), rand(rng, 30)
+
+independence_test = LocalPermutationTest(CMIShannon(), FPVP())
+# We should get back a convenience wrapper containing the result.
+res = independence(independence_test, x, z, y)
+@test res isa LocalPermutationTestResult
+
+# We should be able to compute p-values for the result.
+@test pvalue(res) isa Real
+@test pvalue(res) ≥ 0
+
+# Only conditional analyses are possible, meaning that we need three inputs.
+# Pairwise analyses won't work, because only two inputs are given.
+@test_throws ArgumentError independence(independence_test, x, y)
+
+# Sampling with/without replacement
+test_cmi_replace = LocalPermutationTest(CMIShannon(), FPVP(), replace = true)
+test_cmi_nonreplace = LocalPermutationTest(CMIShannon(), FPVP(), replace = false)
+@test independence(test_cmi_replace, x, y, z) isa LocalPermutationTestResult
+@test independence(test_cmi_nonreplace, x, y, z) isa LocalPermutationTestResult
+
+# Measure definition AND estimator must be provided for info measures
+@test_throws ArgumentError LocalPermutationTest(TEShannon()) # estimator needed
+
+# The number of local neighbors can't exceed the number of input datapoints
+test_kperm_toolarge = LocalPermutationTest(CMIShannon(), FPVP(); kperm = 200, rng)
+@test_throws ArgumentError independence(test_kperm_toolarge, x, y, z)
@@ -0,0 +1,21 @@
+using Test
+using CausalityTools
+using StableRNGs
+
+rng = StableRNG(123)
+x, y, z = rand(rng, 30), rand(rng, 30), rand(rng, 30)
+
+X = StateSpaceSet(x)
+Y = StateSpaceSet(y)
+Z = StateSpaceSet(z)
+
+nshuffles = 5
+lptest_sp = LocalPermutationTest(CMIShannon(), SymbolicPermutation(); nshuffles, rng)
+lptest_vh = LocalPermutationTest(CMIShannon(), ValueHistogram(4); nshuffles, rng)
+lptest_dp = LocalPermutationTest(CMIShannon(), Dispersion(); nshuffles, rng)
+@test independence(lptest_sp, x, y, z) isa LocalPermutationTestResult
+@test independence(lptest_vh, x, y, z) isa LocalPermutationTestResult
+@test independence(lptest_dp, x, y, z) isa LocalPermutationTestResult
+@test independence(lptest_sp, X, Y, Z) isa LocalPermutationTestResult
+@test independence(lptest_vh, X, Y, Z) isa LocalPermutationTestResult
+@test independence(lptest_dp, X, Y, Z) isa LocalPermutationTestResult
@@ -0,0 +1,9 @@
+using Test
+using CausalityTools
+using StableRNGs
+
+rng = StableRNG(123)
+x, y, z = rand(rng, 30), rand(rng, 30), rand(rng, 30)
+
+independence_test = LocalPermutationTest(DistanceCorrelation())
+@test independence(independence_test, x, y, z) isa LocalPermutationTestResult
@@ -0,0 +1,9 @@
+include("api.jl")
+
+# Measure-specific implementations. One file per method that is listed in the docstring
+# of `LocalPermutationTest`.
+include("conditional_mutual_information.jl")
+include("part_mutual_information.jl")
+include("transferentropy.jl")
+include("partial_correlation.jl")
+include("distance_correlation.jl")

test/independence/LocalPermutation.jl renamed to test/independence/LocalPermutationTest/part_mutual_information.jl

+4 -43
@@ -1,49 +1,10 @@
-
+using Test
+using CausalityTools
 using StableRNGs
-rng = StableRNG(123)
-x, y, z = rand(rng, 100), rand(rng, 100), rand(rng, 100)
-
-test_cmi_replace = LocalPermutationTest(CMIShannon(), FPVP())
-test_cmi_nonreplace = LocalPermutationTest(CMIShannon(), FPVP())
-
-test_teshannon = LocalPermutationTest(TEShannon(), FPVP())
-@test_throws ArgumentError LocalPermutationTest(TEShannon()) # estimator needed
-
-@test independence(test_cmi_replace, x, y, z) isa LocalPermutationTestResult
-@test independence(test_cmi_nonreplace, x, y, z) isa LocalPermutationTestResult

-@test independence(test_teshannon, x, y, z) isa LocalPermutationTestResult
-
-test_kperm_toolarge = LocalPermutationTest(CMIShannon(), FPVP(); kperm = 200, rng)
-@test_throws ArgumentError independence(test_kperm_toolarge, x, y, z)
-
-# CMI
-# ------------------------
-# Independence tests
-x = rand(rng, 50)
-y = rand(rng, 50)
-z = rand(rng, 50)
-X = StateSpaceSet(x)
-Y = StateSpaceSet(y)
-Z = StateSpaceSet(z)
-
-nshuffles = 5
-lptest_sp = LocalPermutationTest(CMIShannon(), SymbolicPermutation(); nshuffles, rng)
-lptest_vh = LocalPermutationTest(CMIShannon(), ValueHistogram(4); nshuffles, rng)
-lptest_dp = LocalPermutationTest(CMIShannon(), Dispersion(); nshuffles, rng)
-@test independence(lptest_sp, x, y, z) isa LocalPermutationTestResult
-@test independence(lptest_vh, x, y, z) isa LocalPermutationTestResult
-@test independence(lptest_dp, x, y, z) isa LocalPermutationTestResult
-@test independence(lptest_sp, X, Y, Z) isa LocalPermutationTestResult
-@test independence(lptest_vh, X, Y, Z) isa LocalPermutationTestResult
-@test independence(lptest_dp, X, Y, Z) isa LocalPermutationTestResult
+rng = StableRNG(123)
+x, y, z = rand(rng, 30), rand(rng, 30), rand(rng, 30)

-# Part mutual information
-# ------------------------
-# Independence tests
-x = rand(rng, 50)
-y = rand(rng, 50)
-z = rand(rng, 50)
 X = StateSpaceSet(x)
 Y = StateSpaceSet(y)
 Z = StateSpaceSet(z)
@@ -0,0 +1,9 @@
+using Test
+using CausalityTools
+using StableRNGs
+
+rng = StableRNG(123)
+x, y, z = rand(rng, 30), rand(rng, 30), rand(rng, 30)
+
+independence_test = LocalPermutationTest(PartialCorrelation())
+@test independence(independence_test, x, y, z) isa LocalPermutationTestResult
