1
1
2
2
# ' @title Group by Datacube
3
- # ' @description Group by datacube wraps the aggregate_temporal_period(https://processes.openeo.org/#aggregate_temporal_period),
4
- # ' aggregate_spatial (https://processes.openeo.org/#aggregate_spatial),
5
- # ' and aggregate_temporal(https://processes.openeo.org/#aggregate_temporal),
6
- # ' functions into a simulated dplyr's \code{\link[dplyr]{group_by}}.
3
+ # ' @description Group by datacube works similarly to dplyr's \code{\link[dplyr]{group_by}}.
4
+ # ' It does not modify the datacube itself, but it registers a grouping or aggregation
5
+ # ' strategy. A datacube can be aggregated over its spatial dimension, over its
6
+ # ' temporal dimension, or over a geometry (sf object).
7
+ # '
8
+ # ' The group_by function works together with summarise: it creates
9
+ # ' a subclass called "grouped datacube" and stores the aggregation method in its environment,
10
+ # ' where the summarise function looks it up when summarising.
7
11
# ' @name group_by
8
12
# ' @rdname group_by
9
13
# ' @param .data datacube object from tidyopeneo
10
14
# ' @param ... any parameter inherited from dplyr
11
- # ' @param .period (optional) For **aggregate_temporal_period** : The time intervals to aggregate.
12
- # ' The following pre-defined values are available:* `hour`: Hour of the day* `day`:
13
- # ' Day of the year* `week`: Week of the year* `dekad`: Ten day periods,
14
- # ' counted per year with three periods per month (day 1 - 10, 11 - 20 and 21 -
15
- # ' end of month). The third dekad of the month can range from 8 to 11 days.
16
- # ' For example, the fourth dekad is Feb, 1 - Feb, 10 each year.
17
- # ' * `month`: Month of the year* `season`: Three month periods of the calendar
18
- # ' seasons (December - February, March - May, June - August, September - November).
19
- # ' * `tropical-season`: Six month periods of the tropical seasons (November -
20
- # ' April, May - October).* `year`: Proleptic years* `decade`: Ten year periods
21
- # ' ([0-to-9 decade](https://en.wikipedia.org/wiki/Decade#0-to-9_decade)), from a
22
- # ' year ending in a 0 to the next year ending in a 9.* `decade-ad`: Ten year
23
- # ' periods ([1-to-0 decade](https://en.wikipedia.org/wiki/Decade#1-to-0_decade))
24
- # ' better aligned with the anno Domini (AD) calendar era, from a year ending in
25
- # ' a 1 to the next year ending in a 0.
26
- # ' @param .reducer A reducer to be applied for the values contained in each period.
27
- # ' A reducer is a single process such as ``mean()`` or a set of processes, which
28
- # ' computes a single value for a list of values, see the category 'reducer' for
29
- # ' such processes. Periods may not contain any values, which for most reducers
30
- # ' leads to no-data (`null`) values by default. It may also be a character referring to one
31
- # ' of openeo reducing functions, such as, mean, sum, min, max, etc.
32
- # ' @param .dimension (optional). For **aggregate_temporal_period** and **aggregate_temporal** (optional) :
33
- # ' The name of the temporal dimension for aggregation. All
34
- # ' data along the dimension is passed through the specified reducer. If the
35
- # ' dimension is not set or set to `null`, the data cube is expected to only
36
- # ' have one temporal dimension. Fails with a `TooManyDimensions` exception if
37
- # ' it has more dimensions. Fails with a `DimensionNotAvailable` exception if the
38
- # ' specified dimension does not exist.
39
- # ' @param .context (optional) Additional data to be passed to the reducer.
40
- # ' @param .geometries (optional). For **aggregate_spatial** : Geometries as GeoJSON on which
41
- # ' the aggregation will be based.
42
- # ' One value will be computed per GeoJSON `Feature`, `Geometry` or
43
- # ' `GeometryCollection`. For a `FeatureCollection` multiple values will be computed,
44
- # ' one value per contained `Feature`. For example, a single value will be computed
45
- # ' for a `MultiPolygon`, but two values will be computed for a `FeatureCollection`
46
- # ' containing two polygons.- For **polygons**, the process considers all
47
- # ' pixels for which the point at the pixel centre intersects with the corresponding
48
- # ' polygon (as defined in the Simple Features standard by the OGC).
49
- # ' For **points**, the process considers the closest pixel centre.
50
- # ' For **lines** (line strings), the process considers all the pixels whose centres
51
- # ' are closest to at least one point on the line.Thus, pixels may be part of
52
- # ' multiple geometries and be part of multiple aggregations.To maximize
53
- # ' interoperability, a nested `GeometryCollection` should be avoided.
54
- # ' Furthermore, a `GeometryCollection` composed of a single type of geometries
55
- # ' should be avoided in favour of the corresponding multi-part type
56
- # ' (e.g. `MultiPolygon`).
57
- # ' @param .target_dimension (optional). For **aggregate-spatial** (optional) : The new dimension name
58
- # ' to be used for storing the results. Defaults to `result`.
59
- # ' @param .intervals (optional). For **aggregate_temporal** : Left-closed temporal intervals,
60
- # ' which are allowed to overlap.
61
- # ' Each temporal interval in the array has exactly two elements:1.
62
- # ' The first element is the start of the temporal interval. The specified instance
63
- # ' in time is **included** in the interval.2. The second element is the end of
64
- # ' the temporal interval. The specified instance in time is **excluded** from the
65
- # ' interval.The specified temporal strings follow
66
- # ' RFC 3339(https://www.rfc-editor.org/rfc/rfc3339.html). Although RFC 3339 prohibits
67
- # ' the hour to be '24'(https://www.rfc-editor.org/rfc/rfc3339.html#section-5.7),
68
- # ' **this process allows the value '24' for the hour** of an end time in order
69
- # ' to make it possible that left-closed time intervals can fully cover the day.
70
- # ' @param .labels (optional). For **aggregate_temporal** (optional) : Distinct labels for the intervals, which can contain dates
71
- # ' and/or times. Is only required to be specified if the values for the start of
72
- # ' the temporal intervals are not distinct and thus the default labels would not
73
- # ' be unique. The number of labels and the number of groups need to be equal.
74
- # ' @param .con (optional) openeo connection. Default to NULL
75
- # ' @param .p (optional) processes available at .con
76
- # ' @return datacube
77
- # ' @import dplyr openeo cli sf
78
- # ' @details If .period is defined, aggregate_temporal_period is run. Else if
79
- # ' .geometries is defined, aggregate_spatial runs. Otherwise, if .intervals is passed,
80
- # ' aggregate_temporal runs.
15
+ # ' @param .by aggregation method, such as:
16
+ # ' 1. "hour", "day", "week", "dekad", "month", "season", "tropical-season", "year", "decade", "decade-ad" for
17
+ # ' aggregate temporal period.
18
+ # '
19
+ # ' 2. sf object for aggregate spatial
20
+ # '
21
+ # ' 3. list with 2 intervals for aggregate temporal period
22
+ # '
23
+ # ' 4. 'time', 'temporal', 't' for reduce temporal dimension
24
+ # '
25
+ # ' 5. 'space', 'spatial', 's' for reduce spatial dimension
26
+ # ' @return grouped datacube
27
+ # ' @import dplyr openeo sf
81
28
# ' @seealso [openeo::list_processes()]
82
29
# ' @importFrom dplyr group_by
83
30
# ' @examples
119
66
# ' p = openeo::processes()
120
67
# '
121
68
# ' # aggregate spatially
122
- # ' dc_mean <- dc %>% group_by(.reducer = function(data, context) { p$mean(data) },
123
- # ' .geometries = polygons)
69
+ # ' dc_mean <- dc %>%
70
+ # ' group_by(polygons) %>%
71
+ # ' summarise("mean")
72
+ # '
73
+ # ' # reduce temporal dimension
74
+ # ' dc_sum <- dc %>%
75
+ # ' group_by("t") %>%
76
+ # ' summarise("sum")
124
77
# '
125
- # ' # the same result can be obtained with the simplified version ...
126
- # ' dc_mean <- dc %>% group_by(.reducer = "mean",
127
- # ' .geometries = polygons)
128
78
# ' @export
129
- group_by.datacube <- function (.data = NULL , ... , .period = NULL , .reducer = NULL ,
130
- .dimension = NULL , .context = NULL ,
131
- .geometries = NULL , .target_dimension = " result" ,
132
- .intervals = NULL , .labels = array (),
133
- .p = openeo :: processes(.con ), .con = NULL ) {
79
+ group_by.datacube <- function (.data = NULL , .by = NULL , ... ) {
134
80
135
81
# check dots ...
136
- dots = list (... )
137
-
138
- for (i in dots ){
139
- if (length(dots ) != 0 ){
140
- inherits(dots )
141
- }
82
+ if (length(list (... )) > 0 ) {
83
+ cli :: cli_alert_warning(" Additional arguments were passed" )
142
84
}
143
85
144
86
# check mandatory argument
@@ -148,42 +90,13 @@ group_by.datacube <- function(.data = NULL, ..., .period = NULL, .reducer = NULL
148
90
tidyopeneo MUST be passed"
149
91
))}
150
92
151
- # if reducer is present, it can be either a function call or a function name as string
152
- if (! is.null(.reducer )){
153
-
154
- if (inherits(.reducer , " character" )){
155
- reducing_process = .reducer
156
- .reducer = function (data , context ) {.p [[reducing_process ]](data )}
157
- }
158
-
159
- }else {
160
- stop(cli :: format_error(" ERROR : no reducer passed or not implemented" ))
161
- }
162
-
163
- # aggregate_temporal_period
164
- if (all(! is.null(.data ), ! is.null(.period ), is.null(.geometries ), is.null(.intervals ))) {
165
- dc = .p $ aggregate_temporal_period(data = .data , period = .period , reducer = .reducer ,
166
- dimension = .dimension , context = .context )
167
- cli :: cli_alert_success(" aggregate_temporal_period applied" )
168
- }
169
-
170
- # aggregate_spatial
171
- if (all(! is.null(.data ), ! is.null(.geometries ), is.null(.period ), is.null(.intervals ))) {
172
-
173
- dc = .p $ aggregate_spatial(data = .data , geometries = .geometries ,
174
- reducer = .reducer , target_dimension = .target_dimension ,
175
- context = .context )
176
- cli :: cli_alert_success(" aggregate_spatial applied" )
177
- }
178
-
179
- # aggregate_temporal
180
- if (all(! is.null(.data ), is.null(.geometries ), is.null(.period ), ! is.null(.intervals ))) {
181
- dc = .p $ aggregate_temporal(data = .data , intervals = .intervals ,
182
- reducer = .reducer , dimension = .dimension ,
183
- context = .context )
184
- cli :: cli_alert_success(" aggregate_temporal applied" )
185
- }
186
-
187
- structure(dc , class = c(" datacube" , class(dc )))
93
+ # add a tag to grouped dc
94
+ # .data$group = .by
95
+ group_env <- environment(.data )
96
+ group_env $ group <- .by
188
97
98
+ # attach the grouping environment to the datacube
99
+ attr(.data, "group_env") <- group_env
100
+ .data <- structure(.data, class = unique(c("grouped datacube", class(.data)))) # Class definition
101
+ return(.data)
189
102
}
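
The pattern described in the new roxygen text (group_by only tags the cube, summarise later reads the tag and picks a strategy) can be sketched in plain R without an openEO backend. This is a minimal illustration under stated assumptions, not the tidyopeneo implementation: `toy_cube`, `group_cube()`, `summarise_cube()`, `resolve_strategy()` and the strategy labels are hypothetical names, and the grouping is stored in a simple attribute rather than in the datacube's environment.

```r
# Hypothetical stand-ins for illustration only; none of these names
# exist in tidyopeneo or openeo.

# Period keywords listed in the `.by` documentation.
periods <- c("hour", "day", "week", "dekad", "month", "season",
             "tropical-season", "year", "decade", "decade-ad")

# Classify a `.by` value into an (illustrative) strategy label.
resolve_strategy <- function(by) {
  # sf objects are also lists, so test for them first
  if (inherits(by, "sf")) return("geometry")
  if (is.list(by)) return("intervals")
  if (is.character(by) && length(by) == 1) {
    if (by %in% periods) return("temporal_period")
    if (by %in% c("time", "temporal", "t")) return("reduce_time")
    if (by %in% c("space", "spatial", "s")) return("reduce_space")
  }
  stop("unsupported grouping value")
}

# "group_by" step: record the grouping and add a subclass, nothing else.
group_cube <- function(cube, by) {
  attr(cube, "group") <- by
  class(cube) <- unique(c("grouped_cube", class(cube)))
  cube
}

# "summarise" step: read the recorded grouping and decide what to run.
summarise_cube <- function(cube, reducer) {
  if (!inherits(cube, "grouped_cube")) stop("cube is not grouped")
  list(strategy = resolve_strategy(attr(cube, "group")),
       reducer = reducer)
}

toy_cube <- structure(list(id = "demo"), class = "toy_cube")
plan <- summarise_cube(group_cube(toy_cube, "month"), "mean")
print(plan$strategy)
```

Keeping the grouping step free of backend calls means the process graph is only extended once the reducer is known, which is presumably why the `.reducer` argument moved out of `group_by` in this commit.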