|
1 | 1 | # Data manipulation frameworks
|
2 | 2 |
|
3 | | -Two popular frameworks provide convenience methods to manipulate `DataFrame`s: |
4 | | -DataFramesMeta.jl and Query.jl. They implement a functionality similar to |
| 3 | +Three frameworks provide convenience methods to manipulate `DataFrame`s: |
| 4 | +DataFramesMeta.jl, DataFrameMacros.jl and Query.jl. They implement functionality similar to |
5 | 5 | [dplyr](https://dplyr.tidyverse.org/) or
|
6 | 6 | [LINQ](https://en.wikipedia.org/wiki/Language_Integrated_Query).
|
7 | 7 |
|
@@ -117,6 +117,84 @@ julia> @chain df begin
|
117 | 117 | You can find more details about how this package can be used on the
|
118 | 118 | [DataFramesMeta.jl GitHub page](https://github.com/JuliaData/DataFramesMeta.jl).
|
119 | 119 |
|
| 120 | +## DataFrameMacros.jl |
| 121 | + |
| 122 | +[DataFrameMacros.jl](https://github.com/jkrumbiegel/DataFrameMacros.jl) is |
| 123 | +an alternative to DataFramesMeta.jl with an additional focus on convenient |
| 124 | +solutions for the transformation of multiple columns at once. |
| 125 | +The instructions below are for version 0.3 of DataFrameMacros.jl. |
| 126 | + |
| 127 | +First, install the DataFrameMacros.jl package: |
| 128 | + |
| 129 | +```julia |
| 130 | +using Pkg |
| 131 | +Pkg.add("DataFrameMacros") |
| 132 | +``` |
| 133 | + |
| 134 | +In DataFrameMacros.jl, all macros except `@combine` operate row-wise by default. |
| 135 | +There is also a `@groupby` macro, which creates grouping columns on the fly |
| 136 | +using the same syntax as `@transform`, so you can group by new columns |
| 137 | +without writing them out twice. |
| 138 | + |
| 139 | +In the example below, you can also see some of DataFrameMacros.jl's multi-column |
| 140 | +features, where `mean` is applied to both age columns at once by selecting |
| 141 | +them with the `r"age"` regex. The new column names are then derived using the |
| 142 | +`"{}"` shortcut which splices the transformed column names into a string. |
| 143 | + |
| 144 | +```jldoctest dataframemacros |
| 145 | +julia> using DataFrames, DataFrameMacros, Chain, Statistics |
| 146 | +
|
| 147 | +julia> df = DataFrame(name=["John", "Sally", "Roger"], |
| 148 | + age=[54.0, 34.0, 79.0], |
| 149 | + children=[0, 2, 4]) |
| 150 | +3×3 DataFrame |
| 151 | + Row │ name age children |
| 152 | + │ String Float64 Int64 |
| 153 | +─────┼─────────────────────────── |
| 154 | + 1 │ John 54.0 0 |
| 155 | + 2 │ Sally 34.0 2 |
| 156 | + 3 │ Roger 79.0 4 |
| 157 | +
|
| 158 | +julia> @chain df begin |
| 159 | + @transform :age_months = :age * 12 |
| 160 | + @groupby :has_child = :children > 0 |
| 161 | + @combine "mean_{}" = mean({r"age"}) |
| 162 | + end |
| 163 | +2×3 DataFrame |
| 164 | + Row │ has_child mean_age mean_age_months |
| 165 | + │ Bool Float64 Float64 |
| 166 | +─────┼────────────────────────────────────── |
| 167 | + 1 │ false 54.0 648.0 |
| 168 | + 2 │ true 56.5 678.0 |
| 169 | +``` |
| 170 | + |
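To make the row-wise default and the on-the-fly grouping more concrete, here is a rough sketch of the same pipeline written with plain DataFrames.jl functions. It assumes the macros lower to ordinary `transform`, `groupby` and `combine` calls with `ByRow` wrappers. The intermediate names `with_months`, `with_flag` and `result` are made up for illustration, and the expansion that DataFrameMacros.jl actually generates may differ in detail.

```julia
using DataFrames, Statistics

df = DataFrame(name=["John", "Sally", "Roger"],
               age=[54.0, 34.0, 79.0],
               children=[0, 2, 4])

# Row-wise by default: ByRow applies the function to each element of :age.
with_months = transform(df, :age => ByRow(a -> a * 12) => :age_months)

# @groupby creates the grouping column and groups in one step;
# here the two steps are written out separately.
with_flag = transform(with_months, :children => ByRow(c -> c > 0) => :has_child)
grouped = groupby(with_flag, :has_child)

# Column-wise aggregation: mean receives each whole column per group.
# The target names are spelled out by hand instead of using "mean_{}".
result = combine(grouped,
                 [:age, :age_months] .=> mean .=> ["mean_age", "mean_age_months"])
```

Under these assumptions, `result` should hold the same per-group means as the macro version above. The macro form avoids repeating the column names and the `ByRow` wrappers.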
| 171 | +The `{{ }}` syntax references a group of columns as a single unit, |
| 172 | +for example to run an aggregation over them. |
| 173 | +In the following example, the first quarter is compared to the maximum of the other three: |
| 174 | + |
| 175 | +```jldoctest dataframemacros |
| 176 | +julia> df = DataFrame(q1 = [12.0, 0.4, 42.7], |
| 177 | + q2 = [6.4, 2.3, 40.9], |
| 178 | + q3 = [9.5, 0.2, 13.6], |
| 179 | + q4 = [6.3, 5.4, 39.3]) |
| 180 | +3×4 DataFrame |
| 181 | + Row │ q1 q2 q3 q4 |
| 182 | + │ Float64 Float64 Float64 Float64 |
| 183 | +─────┼──────────────────────────────────── |
| 184 | + 1 │ 12.0 6.4 9.5 6.3 |
| 185 | + 2 │ 0.4 2.3 0.2 5.4 |
| 186 | + 3 │ 42.7 40.9 13.6 39.3 |
| 187 | +
|
| 188 | +julia> @transform df :q1_best = :q1 > maximum({{Not(:q1)}}) |
| 189 | +3×5 DataFrame |
| 190 | + Row │ q1 q2 q3 q4 q1_best |
| 191 | + │ Float64 Float64 Float64 Float64 Bool |
| 192 | +─────┼───────────────────────────────────────────── |
| 193 | + 1 │ 12.0 6.4 9.5 6.3 true |
| 194 | + 2 │ 0.4 2.3 0.2 5.4 false |
| 195 | + 3 │ 42.7 40.9 13.6 39.3 true |
| 196 | +``` |
| 197 | + |
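For comparison, the comparison against the other quarters can be sketched in plain DataFrames.jl by passing all four columns to a row-wise function. The helper `q1_beats_rest` is a hypothetical name introduced here, and this is only one possible equivalent, not necessarily what `{{ }}` expands to internally.

```julia
using DataFrames

df = DataFrame(q1 = [12.0, 0.4, 42.7],
               q2 = [6.4, 2.3, 40.9],
               q3 = [9.5, 0.2, 13.6],
               q4 = [6.3, 5.4, 39.3])

# Hypothetical helper: the first argument is q1, and the remaining
# row values are collected into the tuple `rest`.
q1_beats_rest(q1, rest...) = q1 > maximum(rest)

# ByRow calls the helper once per row with the four column values.
result = transform(df, [:q1, :q2, :q3, :q4] => ByRow(q1_beats_rest) => :q1_best)
```

Unlike `Not(:q1)` in the macro version, this sketch lists the columns explicitly, so it has to be updated by hand if more quarters are added.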
120 | 198 | ## Query.jl
|
121 | 199 |
|
122 | 200 | The [Query.jl](https://github.com/queryverse/Query.jl) package provides advanced
|
|