Skip to content

Commit 6867c74

Browse files
committed
Update group-by file.
1 parent c82a372 commit 6867c74

File tree

1 file changed

+27
-15
lines changed

1 file changed

+27
-15
lines changed

examples/group-by.md

Lines changed: 27 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -1,39 +1,51 @@
11
# Group check results by category with Soda Core
22

3-
You can use a SQL query in a failed row check to group failed check results by one or more categories using Soda Core.
3+
You can use a SQL query in a failed row check to group failed check results by one or more categories using Soda Core. This approach is particularly useful when you want to identify specific subsets of data that fail quality checks and understand patterns in your data quality issues.
44

55
Use a SQL editor to build and test a SQL query with your data source, then add the query to a failed rows check to execute it during a Soda scan.
66

7-
The following example illustrates how to build a query that identifies the countries where the average age of people is less than 25.
7+
## Example: Identifying countries with low average age
8+
9+
The following example demonstrates how to build a query that identifies countries where the average age of people is less than 25. This step-by-step approach helps you develop and test your query before implementing it in Soda Core.
10+
11+
1. Beginning with a basic query, the output shows the data this example works with.
812

9-
1. Begining with a basic query, the output shows the data this example works with.
1013
```sql
1114
SELECT * FROM Customers;
1215
```
13-
![group-by-1](/docs/assets/images/group-by-1.png){:height="600px" width="600px"}
16+
17+
![group-by-1](/docs/assets/images/group-by-1.png)
18+
1419
2. Build a query to select groups with the relevant aggregations.
20+
1521
```sql
1622
SELECT country, AVG(age) as avg_age
1723
FROM Customers
1824
GROUP BY country
1925
```
20-
![group-by-2](/docs/assets/images/group-by-2.png){:height="600px" width="600px"}
26+
27+
![group-by-2](/docs/assets/images/group-by-2.png)
28+
2129
3. Identify the "bad" group (where the average age is less than 25) from among the grouped results.
30+
2231
```sql
23-
SELECT country, AVG(age) as avg_age
24-
FROM Customers
25-
GROUP BY country
26-
HAVING AVG(age) < 25
32+
SELECT country, AVG(age) as avg_age
33+
FROM Customers
34+
GROUP BY country
35+
HAVING AVG(age) < 25
2736
```
28-
![group-by-3](/docs/assets/images/group-by-3.png){:height="600px" width="600px"}
37+
38+
![group-by-3](/docs/assets/images/group-by-3.png)
39+
2940
4. Now that the query yields the expected results, add the query to a failed row check, as per the following example.
41+
3042
```yaml
3143
checks for dim_customers:
3244
- failed rows:
3345
name: Average age of citizens is less than 25
3446
fail query: |
35-
SELECT country, AVG(age) as avg_age
36-
FROM Customers
37-
GROUP BY country
38-
HAVING AVG(age) < 25
39-
```
47+
SELECT country, AVG(age) as avg_age
48+
FROM Customers
49+
GROUP BY country
50+
HAVING AVG(age) < 25
51+
```

0 commit comments

Comments
 (0)