Commit 4a92f3e

Add filtering file names when collecting results (#312)
* Added filtering functionality
* Add filtering tests
* Tests passed
* Add changelog entry
* Increment version in Project.toml
* Add example, assert rgx expr
* Fix versioning, bump patch version
* Fix silly spelling mistake
1 parent 57deeb4 commit 4a92f3e

5 files changed: +53 -2 lines changed

CHANGELOG.md

Lines changed: 2 additions & 0 deletions
@@ -1,3 +1,5 @@
+# 2.8.0
+* Add filtering of `collect_results` using `rinclude` and `rexclude` keyword arguments.
 # 2.7.2
 * By default `storepatch` keywords are `false`. This means that `gitpatch` is NOT stored by default. This is a BUGFIX, because there is an unknown problem of non-halting when storing the patch.
 # 2.7.0

Project.toml

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 name = "DrWatson"
 uuid = "634d3b9d-ee7a-5ddf-bec9-22491ea816e1"
 repo = "https://github.com/JuliaDynamics/DrWatson.jl.git"
-version = "2.7.6"
+version = "2.8.0"

 [deps]
 Dates = "ade2ca70-3891-5945-98fb-dc099432e06a"

docs/src/real_world.md

Lines changed: 16 additions & 1 deletion
@@ -433,7 +433,22 @@ As `@onlyif` is meant to be used with [`dict_list`](@ref), it supports the vecto
 This is achieved by automatically broadcasting every `@onlyif` call over `Vector` arguments, which allows chaining those calls to combine conditions.
 So in terms of the result, `@onlyif( :a == 2, [5, @onlyif(:b == 4, 6)])` is equivalent to `[@onlyif( :a == 2, 5), @onlyif(:a == 2 && :b == 4, 6)]`.

-## Advanced Usage of collect_results
+## Filtering by name with collect_results
+
+Using [`collect_results`](@ref) on a folder with many (e.g. 1,000) files in it can be noticeably slow. To speed this up, you can use the `rinclude` and `rexclude` keyword arguments, both of which are vectors of [Regex expressions](https://docs.julialang.org/en/v1/manual/strings/#man-regex-literals). The results returned will have a filename which matches **any** of the Regex expressions in `rinclude` and does not match **any** of the Regex expressions in `rexclude`.
+
+```julia
+df = collect_results(datadir("results"); rinclude=[r"a=1"])
+# Only include results whose filename contains "a=1"
+
+df = collect_results(datadir("results"); rexclude=[r"a=3"])
+# Exclude any results whose filename contains "a=3"
+
+df = collect_results(datadir("results"); rinclude=[r"a=1", r"b=5"], rexclude=[r"a=3"])
+# Only include results whose filename contains "a=1" OR "b=5" and exclude any which contain "a=3"
+```
+
+## Advanced usage of collect_results
 At some point in your work you may want to run a single function
 that returns multiple fields that you want to include in your
 results `DataFrame`.

src/result_collection.jl

Lines changed: 19 additions & 0 deletions
@@ -41,6 +41,8 @@ See also [`collect_results`](@ref).
 * `verbose = true` : Print (using `@info`) information about the process.
 * `update = false` : Update data from modified files and remove entries for deleted
   files.
+* `rinclude = [r\"\"]` : Only include files whose name matches any of these Regex expressions. Default value includes all files.
+* `rexclude = [r\"^\\b\$\"]` : Exclude any files whose name matches any of these Regex expressions. Default value does not exclude any files.
 * `white_list` : List of keys to use from result file. By default
   uses all keys from all loaded result-files.
 * `black_list = [:gitcommit, :gitpatch, :script]`: List of keys not to include from result-file.
@@ -84,8 +86,12 @@ function collect_results!(filename, folder;
     verbose = true,
     update = false,
     newfile = false, # keyword only for defining collect_results without !
+    rinclude = [r""],
+    rexclude = [r"^\b$"],
     kwargs...)

+    @assert all(eltype(r) <: Regex for r in (rinclude, rexclude)) "Elements of `rinclude` and `rexclude` must be Regex expressions."
+
     if newfile || !isfile(filename)
         !newfile && verbose && @info "Starting a new result collection..."
         df = DataFrames.DataFrame()
@@ -116,6 +122,19 @@ function collect_results!(filename, folder;
     else
         allfiles = joinpath.(Ref(folder), readdir(folder))
     end
+
+    if (rinclude == [r""] && rexclude == [r"^\b$"]) == false
+        idx_filt = Int[]
+        for i in eachindex(allfiles)
+            file = allfiles[i]
+            include_bool = any(match(rgx, file) !== nothing for rgx in rinclude)
+            exclude_bool = any(match(rgx, file) !== nothing for rgx in rexclude)
+            if include_bool == false || exclude_bool == true
+                push!(idx_filt, i)
+            end
+        end
+        deleteat!(allfiles, idx_filt)
+    end

     n = 0 # new entries added
     u = 0 # entries updated
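
The include/exclude rule added in this hunk can be restated as a small standalone function. The sketch below is not part of the commit: `filter_names` and the sample file names are hypothetical, and it uses `occursin` and `filter` from Base in place of the `match`/`deleteat!` loop. It assumes the same defaults as the diff, where `r""` matches every name and `r"^\b$"` matches none.

```julia
# Hypothetical helper (not in DrWatson): keep the names that match ANY
# pattern in `rinclude` and match NO pattern in `rexclude`.
function filter_names(names; rinclude = [r""], rexclude = [r"^\b$"])
    keep(name) = any(rgx -> occursin(rgx, name), rinclude) &&
                 !any(rgx -> occursin(rgx, name), rexclude)
    return filter(keep, names)
end

# Made-up file names, for illustration only.
names = ["a=1_b=5.jld2", "a=2_b=5.jld2", "a=3_b=5.jld2", "a=1_b=7.jld2"]

kept = filter_names(names; rinclude = [r"a=1", r"b=5"], rexclude = [r"a=3"])
# kept == ["a=1_b=5.jld2", "a=2_b=5.jld2", "a=1_b=7.jld2"]
```

Note that "a=3_b=5.jld2" matches the include pattern `r"b=5"` but is still dropped, because exclusion takes precedence over inclusion.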

test/update_results_tests.jl

Lines changed: 15 additions & 0 deletions
@@ -64,6 +64,21 @@ cres_relpath = collect_results!(relpathname, folder;
 rpath = projectdir())
 @info all(startswith.(cres[!,"path"], "data"))

+###############################################################################
+# Include or exclude files #
+###############################################################################
+
+@test_throws AssertionError collect_results(datadir("results"); rinclude=["a=1"])
+
+df = collect_results(datadir("results"); rinclude=[r"a=1", r"b=3"])
+@test all(row -> row["a"] == 1 || row["b"] == "2", eachrow(df))
+
+df = collect_results(datadir("results"); rexclude=[r"a=3"])
+@test all(df[:,"a"] .!== 3)
+
+df = collect_results(datadir("results"); rinclude=[r"a=3"], rexclude=[r"a=3"])
+@test isempty(df)
+
 ###############################################################################
 # Add another file in a sub sub folder #
 ###############################################################################
