Commit 50e58ed

presentation and solution 4A&4b
1 parent d29e4e5 commit 50e58ed

6 files changed

+282
-29
lines changed

presentations/presentation4A_conditionsNloops.qmd

Lines changed: 110 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -52,7 +52,6 @@ Now we have three possible outcomes:
5252

5353

5454
```{r}
55-
5655
#try with different values for num2
5756
num2 <- 10
5857
@@ -67,6 +66,51 @@ if (num1 > num2){
6766
print(statement)
6867
```
6968

69+
### And and or operations
70+
71+
Check values of num1 and num2
72+
```{r}
73+
num1
74+
num2
75+
76+
# num1 <- 14
77+
# num2 <- 12
78+
```
79+
80+
You can give multiple conditions and check if both of them are true with the `&` (and-operation).
81+
```{r}
82+
if (num1 < 10 & num2 < 10) {
83+
print('Both numbers are lower than 10')
84+
} else {
85+
print('At least one of the numbers is not lower than 10')
86+
}
87+
```
88+
89+
You can also check if either one or the other is true with the `|` (or-operation).
90+
```{r}
91+
if (num1 < 10 | num2 < 10) {
92+
print('One or both of the numbers are lower than 10.')
93+
} else {
94+
print('Neither of the numbers is lower than 10.')
95+
}
96+
```
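
As a small illustration of the two operators, the sketch below uses example values for `num1` and `num2` (assumptions chosen for illustration, not necessarily the values used earlier in the presentation). It also shows that `&` and `|` work element-wise on vectors, while the related `&&` operator expects single values and stops evaluating as soon as the result is known.

```r
# Example values, chosen only for illustration
num1 <- 8
num2 <- 12

# TRUE only if both conditions are TRUE
both_low <- num1 < 10 & num2 < 10

# TRUE if at least one condition is TRUE
either_low <- num1 < 10 | num2 < 10

# & and | compare vectors element by element
elementwise <- c(8, 12) < 10 & c(5, 3) < 10

# && works on single values and short-circuits:
# the right-hand side is never evaluated when the left-hand side is FALSE
short_circuit <- FALSE && stop('this is never evaluated')

print(both_low)
print(either_low)
print(elementwise)
print(short_circuit)
```

Inside `if ()` the condition must be a single TRUE or FALSE, so both forms behave the same for single numbers like `num1` and `num2`.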
97+
98+
If you add more conditions, parentheses can be useful to control the order in which the conditions are evaluated.
99+
```{r}
100+
# num1 <- 8
101+
# num2 <- 5
102+
103+
num3 <- 10
104+
```
105+
106+
```{r}
107+
if ((num1 < 10 | num2 < 10) & num3 == 10) {
108+
print('Yes')
109+
} else {
110+
print('No')
111+
}
112+
```
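
To see why the parentheses matter, here is a minimal sketch with example values (assumptions chosen so that the two versions disagree). In R, `&` is evaluated before `|`, so leaving out the parentheses changes which conditions are grouped together.

```r
# Example values, chosen only for illustration
num1 <- 8
num2 <- 5
num3 <- 11

# With parentheses: (TRUE | TRUE) & FALSE
with_parentheses <- (num1 < 10 | num2 < 10) & num3 == 10

# Without parentheses, & binds tighter than |:
# this is read as num1 < 10 | (num2 < 10 & num3 == 10), i.e. TRUE | FALSE
without_parentheses <- num1 < 10 | num2 < 10 & num3 == 10

print(with_parentheses)
print(without_parentheses)
```

When in doubt, adding parentheses makes the intended grouping explicit for both R and the reader.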
113+
70114
## For-loops
71115

72116
### Defining a for loop
@@ -115,14 +159,10 @@ THIS_VARIABLE
115159

116160
### Loop control
117161

118-
There are two loop control statements we can use to
119-
120-
* jump to the next iteration: `next`
121-
* end the loop before finishing: `break`
162+
There are two loop control statements we can use: `next` and `break`.
122163

164+
`next` jumps to the next iteration. Here, we print every element in list1 and when the element is 'hello' we jump to the next iteration.
123165
```{r}
124-
#example for next
125-
126166
for (element in list1) {
127167
if(element == 'hello'){
128168
next
@@ -132,8 +172,8 @@ for (element in list1) {
132172
}
133173
```
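
The difference between the two loop control statements can be sketched side by side. The vector `words` below is a hypothetical stand-in for `list1`, and the results are collected in vectors instead of being printed so that they are easy to compare.

```r
# Hypothetical example data standing in for list1
words <- c('good', 'hello', 'morning')

# next: skip 'hello' but continue with the remaining elements
kept_with_next <- c()
for (word in words) {
  if (word == 'hello') {
    next
  }
  kept_with_next <- c(kept_with_next, word)
}

# break: stop the loop entirely once 'hello' is reached
kept_with_break <- c()
for (word in words) {
  if (word == 'hello') {
    break
  }
  kept_with_break <- c(kept_with_break, word)
}

print(kept_with_next)
print(kept_with_break)
```

With `next`, everything except 'hello' is kept; with `break`, only the elements before 'hello' are kept.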
134174

175+
`break` ends the loop before finishing. Here, we print every element in list1 and when the element is 'hello' we break (end) the loop.
135176
```{r}
136-
#example for break
137177
for (element in list1) {
138178
if(element == 'hello'){
139179
break
@@ -192,6 +232,68 @@ for (i in 1:nrow(my_df)) {
192232
193233
```
194234

235+
### Plotting in loops
236+
Create data
237+
```{r}
238+
plot_data_1 <- tibble(Name = c('Marie', 'Marie', 'Emma', 'Sofie', 'Sarah', 'Sofie', 'Hannah', 'Lise', 'Emma'),
239+
Class = c('1.A', '1.A', '1.A', '1.A', '1.B', '1.B', '1.B', '1.C', '1.C'),
240+
Food = c('Lasagna', 'Pizza', 'Pizza', 'Burger', 'Lasagna', 'Lasagna', 'Lasagna', 'Burger', 'Lasagna'),
241+
Age = c(6, 6, 6, 6, 6, 5, 7, 6, 6))
242+
243+
head(plot_data_1, n = 2)
244+
```
245+
Barplot of each variable.
246+
```{r}
247+
ggplot(plot_data_1,
248+
aes(x = Name)) +
249+
geom_bar()
250+
251+
ggplot(plot_data_1,
252+
aes(x = Class)) +
253+
geom_bar()
254+
255+
# and so on ...
256+
```
257+
258+
Let's do it in a for loop!
259+
260+
First, let's check that the variables we are interested in are iterated correctly.
261+
```{r}
262+
for (col in colnames(plot_data_1)){
263+
print(col)
264+
}
265+
```
266+
267+
Great! Now, let's add the plot function to our for loop.
268+
```{r}
269+
for (col in colnames(plot_data_1)){
270+
p <- ggplot(plot_data_1,
271+
aes(x = col)) +
272+
geom_bar()
273+
274+
print(p)
275+
}
276+
```
277+
278+
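That was not what we wanted... `aes(x = col)` treats the string stored in `col` as a constant value rather than as a column name, so every plot shows a single bar.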
279+
280+
Wrap the column name, which is stored as a string, in `!!sym()` when passing it to `aes()`.
281+
282+
* `sym()` turns a string into a symbol (column reference)
283+
* `!!` unquotes the symbol to use it in `aes()`
284+
285+
```{r}
286+
for (col in colnames(plot_data_1)){
287+
p <- ggplot(plot_data_1,
288+
aes(x = !!sym(col))) +
289+
geom_bar()
290+
291+
print(p)
292+
}
293+
```
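
For comparison, ggplot2 also accepts the `.data` pronoun for the same purpose. The sketch below uses a small hypothetical data frame (`toy`, not the presentation's data) and checks that `!!sym(col)` and `.data[[col]]` produce the same bars; which of the two to use is largely a matter of preference.

```r
library(ggplot2)
library(rlang)

# Hypothetical toy data, only for illustration
toy <- data.frame(Class = c('1.A', '1.A', '1.B', '1.C'),
                  Age = c(6, 6, 5, 7))

# Column name stored as a string
col <- 'Class'

# Option 1: turn the string into a symbol and unquote it
p_sym <- ggplot(toy, aes(x = !!sym(col))) +
  geom_bar()

# Option 2: look the column up by name with the .data pronoun
p_data <- ggplot(toy, aes(x = .data[[col]])) +
  geom_bar()

# Extract the computed bar data to compare the two plots
bars_sym <- layer_data(p_sym)
bars_data <- layer_data(p_data)

print(bars_sym$count)
print(bars_data$count)
```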
294+
295+
`!!sym(col)` can also be used in other tidyverse operations (`filter`, `select`, ...) when the column name is stored as a string.
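
A minimal sketch of this with dplyr, assuming a small hypothetical tibble called `toy`. It also shows two commonly used alternatives, `.data[[col]]` for `filter()` and `all_of(col)` for `select()`, which give the same result here.

```r
library(dplyr)
library(rlang)

# Hypothetical toy data, only for illustration
toy <- tibble(Name = c('Marie', 'Emma', 'Sofie'),
              Age = c(6, 5, 7))

# Column name stored as a string
col <- 'Age'

# filter() with the string converted to a symbol and unquoted
older <- toy %>% filter(!!sym(col) > 5)

# Alternative: the .data pronoun
older_alt <- toy %>% filter(.data[[col]] > 5)

# select() with the unquoted symbol, and with all_of() as an alternative
picked <- toy %>% select(!!sym(col))
picked_alt <- toy %>% select(all_of(col))

print(older)
print(picked)
```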
296+
195297

196298
### If-else in loops
197299

presentations/presentation4B_functions.qmd

Lines changed: 37 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -106,9 +106,9 @@ if (age >= 18){
106106
Or we can choose to execute our function once for every element of an iterable, e.g. every row in a dataframe:
107107

108108
```{r}
109-
df <- data.frame(row.names = 1:5,
110-
age = c(45, 16, 31, 56, 19),
111-
weight_kg = c(85, 65, 100, 45, 76),
109+
df <- data.frame(row.names = 1:5,
110+
age = c(45, 16, 31, 56, 19),
111+
weight_kg = c(85, 65, 100, 45, 76),
112112
height_m = c(1.75, 1.45, 1.95, 1.51, 1.89))
113113
114114
df
@@ -118,13 +118,13 @@ Print ID, weight, and height of all individuals.
118118

119119
```{r}
120120
for (id in rownames(df)){
121-
121+
122122
weight <- df[id, 'weight_kg']
123-
123+
124124
height <- df[id, 'height_m']
125-
125+
126126
print(c(id, weight, height))
127-
127+
128128
}
129129
```
130130

@@ -200,6 +200,36 @@ Have a look at the data frame.
200200
df
201201
```
202202

203+
## Plotting in functions
204+
205+
Define a function that creates a boxplot.
206+
```{r}
207+
my_boxplot <- function(dataframe, variable = ''){
208+
209+
p <- ggplot(data = dataframe,
210+
aes(y = !!sym(variable))) + # Use variable as column reference
211+
geom_boxplot(color = 'blue') +
212+
theme_bw() +
213+
labs(title = paste('Boxplot of', variable)) # Use variable as string
214+
215+
return(p)
216+
217+
}
218+
```
219+
220+
Look at column names of df
221+
```{r}
222+
colnames(df)
223+
colnames(df)[1]
224+
```
225+
226+
Run function on age.
227+
```{r}
228+
my_boxplot(dataframe = df, variable = colnames(df)[1])
229+
```
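
Because the function takes the column name as a string, it combines naturally with a for loop over `colnames()`. The sketch below repeats the data frame and a version of the function (without a default for `variable`) so that it runs on its own, and stores the plots in a named list; the name `plots` is an assumption made for this example.

```r
library(ggplot2)
library(rlang)

# Same example data frame as in the presentation
df <- data.frame(row.names = 1:5,
                 age = c(45, 16, 31, 56, 19),
                 weight_kg = c(85, 65, 100, 45, 76),
                 height_m = c(1.75, 1.45, 1.95, 1.51, 1.89))

# Boxplot function taking the column name as a string
my_boxplot <- function(dataframe, variable){
  p <- ggplot(data = dataframe,
              aes(y = !!sym(variable))) + # Use variable as column reference
    geom_boxplot(color = 'blue') +
    theme_bw() +
    labs(title = paste('Boxplot of', variable)) # Use variable as string
  return(p)
}

# Collect one plot per column in a named list
plots <- list()
for (col in colnames(df)){
  plots[[col]] <- my_boxplot(dataframe = df, variable = col)
}

# Show a single plot by name
print(plots[['age']])
```

Storing the plots instead of printing them inside the loop makes it possible to display or save individual plots later.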
230+
231+
232+
203233
## Error handling in user-defined functions
204234

205235
Currently our BMI function accepts all kinds of inputs. However, what happens if we give a negative weight?

program_draft.txt

Lines changed: 23 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,23 @@
1+
DAY 1
2+
(30 minutes) Introduction
3+
(30 minutes) Exercise 0: Getting started
4+
(45 minutes) Presentation 1: Data Cleanup & Summary Statistics
5+
(45-60 minutes) Exercise 1: Data Cleanup and Summary Statistics
6+
(45 minutes) Presentation 2: Data Transformation and Integration
7+
(60 minutes) Exercise 2: Data Transformation and Integration
8+
9+
DAY 2
10+
(XX minutes) Presentation 3: Exploratory Data Analysis (EDA)
11+
(XX minutes) Exercise 3, 3A, 3B ???
12+
(45 minutes) Presentation 4A: Scripting in R - Conditions and For-loops
13+
(60 minutes) Exercise 4A: Scripting in R - Conditions and For-loops
14+
(30 minutes) Presentation 4B: Scripting in R - Functions
15+
(60 minutes) Exercise 4B: Scripting in R - Functions
16+
17+
DAY 3
18+
(XX minutes) Presentation 5A: Models and Model Evaluation in R
19+
(XX minutes) Exercise 5A: Models and Model Evaluation in R
20+
(XX minutes) Presentation 5B: Models and Model Evaluation in R
21+
(XX minutes) Exercise 5B: Models and Model Evaluation in R
22+
23+

solutions/solution4A.qmd

Lines changed: 99 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -106,6 +106,7 @@ for (i in 1:10) {
106106
} else {
107107
print('Normal blood pressure')
108108
}
109+
109110
}
110111
```
111112

@@ -144,4 +145,102 @@ for (i in 1:10) {
144145
145146
```
146147

148+
7. Do the same as above, but instead of printing the risk status, append it to a list. Start by initializing an empty list.
149+
150+
```{r}
151+
# Initialize list
152+
risk_status <- list()
153+
```
154+
155+
```{r}
156+
for (i in 1:nrow(diabetes_glucose)) {
157+
Smoker <- diabetes_glucose$Smoker[i]
158+
BMI <- diabetes_glucose$BMI[i]
159+
160+
#skip rows where either of the values is NA
161+
if (is.na(Smoker) | is.na(BMI)){
162+
next
163+
}
164+
165+
if (Smoker == 'Smoker' & BMI > 35){
166+
risk_status <- append(risk_status, 'High risk')
167+
} else if (Smoker == 'Smoker' | BMI > 35) {
168+
risk_status <- append(risk_status, 'Moderate risk')
169+
} else {
170+
risk_status <- append(risk_status, 'Low risk')
171+
}
172+
}
173+
174+
```
175+
176+
```{r}
177+
risk_status %>% head()
178+
```
179+
180+
8. Check the length of the list. Is it as expected?
181+
182+
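Since we looped through all the rows in the `diabetes_glucose` dataframe, the list should have as many elements as the dataframe has rows. Note that rows skipped with `next` because of NA values add nothing to the list, so the list will be shorter if any such rows exist.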
183+
```{r}
184+
length(risk_status)
185+
```
186+
187+
```{r}
188+
nrow(diabetes_glucose)
189+
```
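
If rows are skipped because of missing values, one way to keep the result aligned with the data frame is to pre-allocate a vector filled with `NA` and assign by row index instead of appending. The sketch below uses a small hypothetical data frame (`toy`, with made-up values) rather than `diabetes_glucose`; it is one possible approach, not the only one.

```r
# Hypothetical toy data with one missing value, only for illustration
toy <- data.frame(Smoker = c('Smoker', NA, 'Never', 'Smoker', 'Never'),
                  BMI = c(36, 30, 40, 22, 25))

# Pre-allocate one NA entry per row
risk_status <- rep(NA_character_, nrow(toy))

for (i in seq_len(nrow(toy))) {
  Smoker <- toy$Smoker[i]
  BMI <- toy$BMI[i]

  # Skipped rows keep their NA entry, so positions stay aligned with the rows
  if (is.na(Smoker) | is.na(BMI)) {
    next
  }

  if (Smoker == 'Smoker' & BMI > 35) {
    risk_status[i] <- 'High risk'
  } else if (Smoker == 'Smoker' | BMI > 35) {
    risk_status[i] <- 'Moderate risk'
  } else {
    risk_status[i] <- 'Low risk'
  }
}

# The vector always has the same length as the data frame
toy$risk_status <- risk_status
print(toy)
```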
190+
191+
192+
9. Add the list as a column in the `diabetes_glucose` data frame.
193+
194+
```{r}
195+
diabetes_glucose$risk_status <- risk_status
196+
```
197+
198+
```{r}
199+
diabetes_glucose %>% select(BMI, Smoker, risk_status) %>% unnest(risk_status)
200+
```
201+
202+
10. Make a list of all the column names in `diabetes_glucose` that contain categorical variables. Make a for loop that goes through the list and prints a barplot for each of the categorical variables.
203+
204+
```{r}
205+
categorical <- list('Sex', 'Smoker', 'Diabetes', 'Married', 'Work')
206+
```
207+
208+
```{r}
209+
for (var in categorical){
210+
211+
p <- ggplot(diabetes_glucose,
212+
aes(x = !!sym(var))) +
213+
geom_bar() +
214+
labs(title = paste('Barplot of', var))
215+
216+
print(p)
217+
218+
}
219+
```
220+
221+
11. Make a list of all the column names in `diabetes_glucose` that contain numeric variables. Make a for loop that goes through the list and prints a boxplot for each of the numeric variables.
222+
223+
```{r}
224+
head(diabetes_glucose)
225+
```
226+
227+
```{r}
228+
numeric <- list('Age', 'BloodPressure', 'BMI', 'PhysicalActivity', 'Serum_ca2')
229+
```
230+
231+
```{r}
232+
for (var in numeric){
233+
234+
p <- ggplot(diabetes_glucose,
235+
aes(y = !!sym(var))) +
236+
geom_boxplot() +
237+
labs(title = paste('Boxplot of', var))
238+
239+
print(p)
240+
241+
}
242+
```
243+
244+
245+
147246

solutions/solution4B.R

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -6,7 +6,7 @@ make_boxplot <- function(df, plot_column){
66
stop('The column to plot must be numerical.')
77
}
88

9-
p <- ggplot(df, aes(y = .data[[plot_column]])) +
9+
p <- ggplot(df, aes(y = !!sym(plot_column))) +
1010
geom_boxplot(fill = "#03579A") +
1111
labs(title = paste("Boxplot of", plot_column)) +
1212
theme_bw()
