Commit 21793f2

updating documentation and example
1 parent 03b4aed commit 21793f2

7 files changed: +103 -43 lines changed

DESCRIPTION

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 Package: cbsodataR
 Type: Package
 Title: Statistics Netherlands (CBS) Open Data API Client
-Version: 1.0.1.9000
+Version: 1.1.0.9000
 Authors@R: c(person("Edwin", "de Jonge", email="edwindjonge@gmail.com", role = c("aut","cre")),
              person("Sara", "Houweling", role=c("ctb")))
 Description: The data and meta data from Statistics

NEWS.md

Lines changed: 3 additions & 3 deletions
@@ -1,13 +1,13 @@
-# cbsodataR 1.0.2
+# cbsodataR 1.1.0
+
+* Added `cbs_add_unit_column` to add unit columns to the data set, thanks to Marieke Rensman en Martin van Elp for the suggestion
 
 * Bug fix for issue #39 default selection which includes a substring of is not parsed correctly, thanks to @guyhill for reporting
 
 * Bug fix for issue #38 cbs_get_data argument typed=FALSE not working correctly, thanks to @guyhill for reporting
 
 * Bug fix for issue #37 cbs_download_data: catalog error, thanks to @guyhill for reporting
 
-* Added `cbs_add_unit_column` to add unit columns to the data set, thanks to Marieke Rensman en Martin van Elp for the suggestion
-
 # cbsodataR 1.0.1
 
 * fixed example in `cbs_get_catalogs`, which failed when catalog was temporarily

R/cbs_add_unit_column.R

Lines changed: 4 additions & 2 deletions
@@ -5,15 +5,17 @@
 #' The unit columns will be named `<topic_column>_unit`, and are a `character`
 #'
 #' By default all topic columns will be with a unit column. The name
-#' of each unit column will be `<code_column>_unit`.
+#' of each unit column will be `<topic_column>_unit`.
 #' @export
 #' @param x `data.frame` retrieved using [cbs_get_data()].
-#' @param columns `character` with the names of the columns for which labels will be added
+#' @param columns `character` with the names of the columns for which units will be added,
+#' non-topic columns will be ignored.
 #' @param ... not used.
 #' @return the original data.frame `x` with extra unit
 #' columns. (see description)
 #' @family data retrieval
 #' @family meta data
+#' @example example/cbs_add_unit_column.R
 cbs_add_unit_column <- function(x, columns=colnames(x), ...){
   add <- list()
   nms <- colnames(x)

example/cbs_add_unit_column.R

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+if (interactive()) {
+  x <- cbs_get_data( id = "7196ENG" # table id
+                   , Periods = "2000MM03" # March 2000
+                   , CPI = "000000" # Category code for total
+                   , verbose = TRUE # show the url that is used
+                   )
+
+
+  # adds two extra columns
+  x_with_units <-
+    x |>
+    cbs_add_unit_column()
+
+  x_with_units[,1:4]
+}
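
The example above needs a network connection. As a rough offline illustration of the same idea, the sketch below appends a `<topic_column>_unit` column next to each topic column and ignores non-topic columns. The helper `add_unit_columns` and the `units` lookup vector are hypothetical stand-ins, not part of cbsodataR; the package presumably takes the units from the table metadata instead. The values are the 2000 and 2001 figures shown in the vignette output.

```r
# Hypothetical stand-in for a result of cbs_get_data(): two dimension
# columns and one topic column (values taken from the vignette output).
x <- data.frame(
  FruitFarmingRegions = c("1", "1"),
  Periods = c("2000JJ00", "2001JJ00"),
  TotalAppleVarieties_1 = c(461L, 408L),
  stringsAsFactors = FALSE
)

# Hypothetical lookup of units per topic column (assumption: in the real
# package this information comes from the table metadata).
units <- c(TotalAppleVarieties_1 = "mln kg")

# Illustrative helper, not the package implementation.
add_unit_columns <- function(x, units, columns = colnames(x)) {
  # Columns without a known unit (non-topic columns) are ignored.
  topics <- intersect(columns, names(units))
  out <- list()
  for (nm in colnames(x)) {
    out[[nm]] <- x[[nm]]
    if (nm %in% topics) {
      # Place the unit column directly after its topic column.
      out[[paste0(nm, "_unit")]] <- rep(units[[nm]], nrow(x))
    }
  }
  as.data.frame(out, stringsAsFactors = FALSE)
}

x_with_units <- add_unit_columns(x, units)
colnames(x_with_units)
```

Placing each unit column right after its topic column mirrors the layout visible in the vignette output, where `TotalAppleVarieties_1_unit` follows `TotalAppleVarieties_1`.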

inst/extra/cbsodataR.Rmd

Lines changed: 13 additions & 1 deletion
@@ -103,7 +103,7 @@ cbs_get_data('71509ENG') |>
 head()
 ```
 
-### Adding Date column
+### Adding a Date column
 
 The period/time columns of Statistics Netherlands (CBS) contain coded time periods:
 e.g. 2018JJ00 (i.e. 2018), 2018KW03 (i.e. 2018 Q3), 2016MM04 (i.e. 2016 April).
@@ -126,6 +126,18 @@ cbs_get_data('71509ENG') |>
 ```
 
 
+### Adding unit columns
+
+Each topic in the CBS data can have a unit, e.g. "%" or "mln kg".
+Using `cbs_add_unit_column` for each (specified) topic a unit column will be added.
+
+```{r, add_unit, message=FALSE}
+cbs_get_data('71509ENG') |>
+  cbs_add_unit_column() |>
+  subset(,1:4) |>
+  head()
+```
+
 
 ## Select and filter
 

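
The coded periods quoted in this file (2018JJ00, 2018KW03, 2016MM04) appear to share one layout: a four-digit year, a two-letter period type and a two-digit number. A minimal base R sketch of splitting such codes follows. The function `split_period` is hypothetical, and the frequency letters `Q` and `M` are assumptions; only `Y` is visible in the vignette output.

```r
# Illustrative helper, not part of cbsodataR: split CBS period codes
# such as "2018KW03" into year, frequency and period number.
split_period <- function(code) {
  year <- as.integer(substr(code, 1, 4))
  type <- substr(code, 5, 6)
  number <- as.integer(substr(code, 7, 8))
  # JJ = year, KW = quarter, MM = month (Q and M labels are assumed).
  freq <- unname(c(JJ = "Y", KW = "Q", MM = "M")[type])
  data.frame(
    code = code, year = year, freq = freq, number = number,
    stringsAsFactors = FALSE
  )
}

p <- split_period(c("2018JJ00", "2018KW03", "2016MM04"))
p
```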
man/cbs_add_unit_column.Rd

Lines changed: 20 additions & 2 deletions
Some generated files are not rendered by default.

vignettes/cbsodataR.md

Lines changed: 47 additions & 34 deletions
@@ -23,7 +23,7 @@ datasets |>
 select(Identifier, ShortTitle)
 ```
 
-## # A tibble: 962 × 2
+## # A tibble: 1,005 × 2
 ## Identifier ShortTitle
 ## <chr> <chr>
 ## 1 80783eng Agriculture; general farm type, region
@@ -36,7 +36,7 @@ datasets |>
 ## 8 84312ENG Caribbean NL; students MBO
 ## 9 84732ENG Caribbean NL; pupils and students
 ## 10 81154eng Caribbean NL; electricity and water
-## # ℹ 952 more rows
+## # ℹ 995 more rows
 
 ## Search for tables
 
@@ -50,7 +50,7 @@ toc_apples[, c("Identifier", "ShortTitle", "score")]
 ## # A tibble: 1 × 3
 ## Identifier ShortTitle score
 ## <chr> <chr> <dbl>
-## 1 71509ENG Yield apples and pears, 1997 - 2017 2.62
+## 1 71509ENG Yield apples and pears, 1997 - 2017 2.64
 
 ## Other catalogs
 
@@ -61,7 +61,8 @@ catalogs <- cbs_get_catalogs()
 catalogs$Identifier
 ```
 
-## [1] "CBS" "MKB" "IV3" "MLZ" "JM" "RIVM" "Politie" "MVstat" "AZW" "InterReg" "SXstat"
+## [1] "CBS" "MKB" "IV3" "MLZ" "JM" "RIVM"
+## [7] "Politie" "MVstat" "AZW" "InterReg" "SXstat"
 
 ## Metadata
 
@@ -101,7 +102,8 @@ SN table.
 names(apples)
 ```
 
-## [1] "TableInfos" "DataProperties" "CategoryGroups" "FruitFarmingRegions" "Periods"
+## [1] "TableInfos" "DataProperties" "CategoryGroups"
+## [4] "FruitFarmingRegions" "Periods"
 
 ## Data download
 
@@ -114,8 +116,6 @@ cbs_get_data('71509ENG') |>
 head()
 ```
 
-## | | | 0% | |====================================================================================================================================| 100%
-
 ## # A tibble: 6 × 4
 ## FruitFarmingRegions Periods TotalAppleVarieties_1 CoxSOrangePippin_2
 ## <chr> <chr> <int> <int>
@@ -144,8 +144,6 @@ cbs_get_data_from_link("https://opendata.cbs.nl/dataportaal/#/CBS/en/dataset/715
 ## Executing:
 ## cbs_get_data(id = "71509ENG", select = c("FruitFarmingRegions", "Periods", "TotalAppleVarieties_1", "CoxSOrangePippin_2", "DelbarestivaleDelcorf_3", "Elstar_4", "GoldenDelicious_5", "Jonagold_6", "Jonagored_7", "Junami_8", "Kanzi_9", "RodeBoskoopRennetApple_10", "Rubens_11", "OtherAppleVarieties_12", "TotalAppleVarieties_20", "CoxSOrangePippin_21", "DelbarestivaleDelcorf_22", "Elstar_23", "GoldenDelicious_24", "Jonagold_25", "Jonagored_26", "Junami_27", "Kanzi_28", "RodeBoskoopRennetApple_29", "Rubens_30", "OtherAppleVarieties_31"), FruitFarmingRegions = c("1", "2", "3", "4", "5"), Periods = c("1997JJ00", "2012JJ00", "2013JJ00", "2016JJ00"), deeplink = "https://opendata.cbs.nl/dataportaal/#/CBS/en/dataset/71509ENG/table?dl=193CB", base_url = "http://opendata.cbs.nl")
 
-## | | | 0% | |====================================================================================================================================| 100%
-
 ## # A tibble: 6 × 4
 ## ID FruitFarmingRegions Periods TotalAppleVarieties_1
 ## <int> <chr> <chr> <int>
@@ -169,8 +167,6 @@ cbs_get_data('71509ENG') |>
 head()
 ```
 
-## | | | 0% | |====================================================================================================================================| 100%
-
 ## # A tibble: 6 × 4
 ## FruitFarmingRegions FruitFarmingRegions_label Periods Periods_label
 ## <chr> <fct> <chr> <fct>
@@ -181,7 +177,7 @@ cbs_get_data('71509ENG') |>
 ## 5 1 Total Netherlands 2001JJ00 2001
 ## 6 1 Total Netherlands 2002JJ00 2002
 
-### Adding Date column
+### Adding a Date column
 
 The period/time columns of Statistics Netherlands (CBS) contain coded
 time periods: e.g. 2018JJ00 (i.e. 2018), 2018KW03 (i.e. 2018 Q3),
@@ -195,8 +191,6 @@ cbs_get_data('71509ENG') |>
 head()
 ```
 
-## | | | 0% | |====================================================================================================================================| 100%
-
 ## # A tibble: 6 × 3
 ## Periods Periods_Date Periods_freq
 ## <chr> <date> <fct>
@@ -217,8 +211,6 @@ cbs_get_data('71509ENG') |>
 head()
 ```
 
-## | | | 0% | |====================================================================================================================================| 100%
-
 ## # A tibble: 6 × 3
 ## Periods Periods_numeric Periods_freq
 ## <chr> <int> <fct>
@@ -229,6 +221,29 @@ cbs_get_data('71509ENG') |>
 ## 5 2001JJ00 2001 Y
 ## 6 2002JJ00 2002 Y
 
+### Adding unit columns
+
+Each topic in the CBS data can have a unit, e.g. “%” or “mln kg”. Using
+`cbs_add_unit_column` for each (specified) topic a unit column will be
+added.
+
+``` r
+cbs_get_data('71509ENG') |>
+  cbs_add_unit_column() |>
+  subset(,1:4) |>
+  head()
+```
+
+## # A tibble: 6 × 4
+## FruitFarmingRegions Periods TotalAppleVarieties_1 TotalAppleVarieties_1_unit
+## <chr> <chr> <int> <chr>
+## 1 1 1997JJ00 420 mln kg
+## 2 1 1998JJ00 518 mln kg
+## 3 1 1999JJ00 568 mln kg
+## 4 1 2000JJ00 461 mln kg
+## 5 1 2001JJ00 408 mln kg
+## 6 1 2002JJ00 354 mln kg
+
 ## Select and filter
 
 It is possible restrict the download using filter statements. This may
@@ -248,7 +263,8 @@ apples <- cbs_get_meta('71509ENG')
 names(apples)
 ```
 
-## [1] "TableInfos" "DataProperties" "CategoryGroups" "FruitFarmingRegions" "Periods"
+## [1] "TableInfos" "DataProperties" "CategoryGroups"
+## [4] "FruitFarmingRegions" "Periods"
 
 ``` r
 # meta data for column Periods
290306
cbs_add_label_columns()
291307
```
292308

293-
## | | | 0% | |====================================================================================================================================| 100%
294-
295309
## # A tibble: 2 × 5
296-
## FruitFarmingRegions FruitFarmingRegions_label Periods Periods_label TotalAppleVarieties_1
297-
## <chr> <fct> <chr> <fct> <int>
298-
## 1 1 Total Netherlands 2000JJ00 2000 461
299-
## 2 1 Total Netherlands 2001JJ00 2001 408
310+
## FruitFarmingRegions FruitFarmingRegions_label Periods Periods_label
311+
## <chr> <fct> <chr> <fct>
312+
## 1 1 Total Netherlands 2000JJ00 2000
313+
## 2 1 Total Netherlands 2001JJ00 2001
314+
## # ℹ 1 more variable: TotalAppleVarieties_1 <int>
300315

301316
- To filter for values in a column that have a substring e.g. “JJ” you
302317
can use `<column_name> = has_substring(<substring>)` to `cbs_get_data`
@@ -314,12 +329,11 @@ head(apples$FruitFarmingRegions[,1:2 ])
 cbs_add_label_columns()
 ```
 
-## | | | 0% | |====================================================================================================================================| 100%
-
 ## # A tibble: 1 × 5
-## FruitFarmingRegions FruitFarmingRegions_label Periods Periods_label TotalAppleVarieties_1
-## <chr> <fct> <chr> <fct> <int>
-## 1 1 Total Netherlands 2000JJ00 2000 461
+## FruitFarmingRegions FruitFarmingRegions_label Periods Periods_label
+## <chr> <fct> <chr> <fct>
+## 1 1 Total Netherlands 2000JJ00 2000
+## # ℹ 1 more variable: TotalAppleVarieties_1 <int>
 
 - To combine values and substring use the “\|” operator:
 `Periods = eq("2020JJ00") | has_substring("KW")`
@@ -336,13 +350,12 @@ head(apples$FruitFarmingRegions[,1:2 ])
 cbs_add_label_columns()
 ```
 
-## | | | 0% | |====================================================================================================================================| 100%
-
 ## # A tibble: 2 × 5
-## FruitFarmingRegions FruitFarmingRegions_label Periods Periods_label TotalAppleVarieties_1
-## <chr> <fct> <chr> <fct> <int>
-## 1 1 Total Netherlands 2000JJ00 2000 461
-## 2 1 Total Netherlands 2010JJ00 2010 334
+## FruitFarmingRegions FruitFarmingRegions_label Periods Periods_label
+## <chr> <fct> <chr> <fct>
+## 1 1 Total Netherlands 2000JJ00 2000
+## 2 1 Total Netherlands 2010JJ00 2010
+## # ℹ 1 more variable: TotalAppleVarieties_1 <int>
 
 # Download data
 