Skip to content

Commit 9c61f92

Browse files
authored
Create tool to automate reviews of metrics files (#3620)
This adds an automatic tool to review metrics files. It should run as part of PRB and post back the results of computing the diff. You can see a version of its sample output here: https://github.com/FoundationDB/fdb-record-layer/actions/runs/17975684135?pr=3620 That particular version was modified so that it would use `4.5.10.0` as the base branch. The extra queries there account for why there were so many identified in the analysis. The basic structure of the report is: 1. A basic summary of the number of queries added, dropped, and changed 2. The number of new queries (per file) 3. A list of dropped queries 4. Statistics for query metric changes for queries that have modified their plans, along with a list of outlier queries with large regressions 5. The same for queries where only the metrics changed Those two categories of queries are separated out as we expect some amount of churn whenever a query plan changes, as it requires that there is some larger kind of planner change. When only the metrics change (particularly when there is a regression), that can be indicative of some kind of problem, as the planner may have done more work without seeing a better outcome. One thing that I have not been able to test is the ability to post back comments to the PR because `pull_request` events don't have write access (even for comments). The current version uses `pull_request_target` to do that, which should be able to post comments back to the PR (including in-line comments), though I have not been able to test that. I did also look at using GitHub annotations, which would be more forgiving, but those have the downside of being limited to something like 10 in a single step as well as not supporting markdown formatting. So comments seem preferable if we can get them to work.
1 parent 4b2321c commit 9c61f92

18 files changed

+2923
-20
lines changed
Lines changed: 147 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,147 @@
1+
name: Metrics Diff Analysis
2+
3+
on:
4+
pull_request_target:
5+
paths:
6+
- '**/*.metrics.yaml'
7+
8+
jobs:
9+
metrics-analysis:
10+
runs-on: ubuntu-latest
11+
permissions:
12+
issues: write
13+
pull-requests: write
14+
15+
steps:
16+
- name: Checkout code
17+
uses: actions/checkout@v4
18+
with:
19+
fetch-depth: 0 # Need full history for git diff
20+
21+
- name: Setup Base Environment
22+
uses: ./actions/setup-base-env
23+
24+
- name: Run metrics diff analysis
25+
id: metrics-diff
26+
run: |
27+
# Run the analysis. Compare against the base hash of this PR
28+
./gradlew :yaml-tests:analyzeMetrics \
29+
-PmetricsAnalysis.baseRef="${{ github.sha }}" \
30+
-PmetricsAnalysis.headRef="${{ github.event.pull_request.head.sha }}" \
31+
-PmetricsAnalysis.urlBase="${{ github.server_url }}/${{ github.repository }}/blob" \
32+
-PmetricsAnalysis.repositoryRoot="${{ github.workspace }}" \
33+
-PmetricsAnalysis.output="${{ github.workspace }}/metrics-analysis-output.txt" \
34+
-PmetricsAnalysis.outlierQueries="${{ github.workspace }}/outlier-queries.txt"
35+
36+
- name: Add Report To Summary
37+
run: cat metrics-analysis-output.txt > $GITHUB_STEP_SUMMARY
38+
39+
- name: Check for outliers
40+
id: check-changes
41+
run: |
42+
if [[ -f outlier-queries.txt ]] ; then
43+
echo "SIGNIFICANT_CHANGES=true" >> $GITHUB_OUTPUT
44+
else
45+
echo "SIGNIFICANT_CHANGES=false" >> $GITHUB_OUTPUT
46+
fi
47+
48+
- name: Comment on PR
49+
uses: actions/github-script@v7
50+
continue-on-error: true
51+
with:
52+
github-token: ${{ secrets.GITHUB_TOKEN }}
53+
script: |
54+
const fs = require('fs');
55+
const report = fs.readFileSync("${{ github.workspace }}/metrics-analysis-output.txt", { encoding: 'utf8', flag: 'r'});
56+
57+
// Find existing comment
58+
const { data: comments } = await github.rest.issues.listComments({
59+
owner: context.repo.owner,
60+
repo: context.repo.repo,
61+
issue_number: context.issue.number,
62+
});
63+
64+
const existingComment = comments.find(comment =>
65+
comment.user.type === 'Bot' && comment.body.includes('📊 Metrics Diff Analysis')
66+
);
67+
68+
if (existingComment) {
69+
// Delete previous comment
70+
await github.rest.issues.deleteComment({
71+
owner: context.repo.owner,
72+
repo: context.repo.repo,
73+
comment_id: existingComment.id,
74+
});
75+
}
76+
77+
// Create new comment
78+
await github.rest.issues.createComment({
79+
owner: context.repo.owner,
80+
repo: context.repo.repo,
81+
issue_number: context.issue.number,
82+
body: report
83+
});
84+
85+
- name: Add inline comments for outliers
86+
uses: actions/github-script@v7
87+
if: steps.check-changes.outputs.SIGNIFICANT_CHANGES == 'true'
88+
continue-on-error: true
89+
with:
90+
github-token: ${{ secrets.GITHUB_TOKEN }}
91+
script: |
92+
// First, delete any existing comments from previous runs
93+
const { data: reviewComments } = await github.rest.pulls.listReviewComments({
94+
owner: context.repo.owner,
95+
repo: context.repo.repo,
96+
pull_number: context.issue.number,
97+
});
98+
99+
const metricsComments = reviewComments.filter(comment =>
100+
comment.user.type === 'Bot' && comment.body.includes('**Significant Metrics Change**')
101+
);
102+
103+
for (const comment of metricsComments) {
104+
try {
105+
await github.rest.pulls.deleteReviewComment({
106+
owner: context.repo.owner,
107+
repo: context.repo.repo,
108+
comment_id: comment.id,
109+
});
110+
console.log(`Deleted previous metrics comment ${comment.id}`);
111+
} catch (error) {
112+
console.log(`Could not delete comment ${comment.id}: ${error.message}`);
113+
}
114+
}
115+
116+
// Parse the outliers report analysis output to find files with notable queries
117+
const fs = require('fs');
118+
const output = fs.readFileSync("${{ github.workspace }}/outlier-queries.txt", { encoding: 'utf8', flag: 'r' });
119+
120+
// Each query appears separated by a blank line
121+
const queries = output.split('\n\n');
122+
123+
for (const query of queries) {
124+
var newl = query.indexOf('\n');
125+
if (newl < 0) {
126+
continue;
127+
}
128+
const info = query.substring(0, newl);
129+
const match = info.match(/^(.+\.metrics\.yaml):(\d+): (.+)$/);
130+
if (match) {
131+
const [, filePath, lineNumber, query] = match;
132+
const data = query.substring(newl, query.length);
133+
try {
134+
await github.rest.pulls.createReviewComment({
135+
owner: context.repo.owner,
136+
repo: context.repo.repo,
137+
pull_number: context.issue.number,
138+
body: "**Significant Metrics Change**\n\nThis query's metrics have changed significantly.\n\n" + data + "\n",
139+
path: filePath,
140+
line: lineNumber,
141+
side: 'RIGHT'
142+
});
143+
} catch (error) {
144+
console.log(`Could not add comment to ${filePath}: ${error.message}`);
145+
}
146+
}
147+
}
Lines changed: 183 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,183 @@
1+
/*
2+
* GitMetricsFileFinder.java
3+
*
4+
* This source file is part of the FoundationDB open source project
5+
*
6+
* Copyright 2015-2025 Apple Inc. and the FoundationDB project authors
7+
*
8+
* Licensed under the Apache License, Version 2.0 (the "License");
9+
* you may not use this file except in compliance with the License.
10+
* You may obtain a copy of the License at
11+
*
12+
* http://www.apache.org/licenses/LICENSE-2.0
13+
*
14+
* Unless required by applicable law or agreed to in writing, software
15+
* distributed under the License is distributed on an "AS IS" BASIS,
16+
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
17+
* See the License for the specific language governing permissions and
18+
* limitations under the License.
19+
*/
20+
21+
package com.apple.foundationdb.relational.yamltests;
22+
23+
import com.apple.foundationdb.record.logging.KeyValueLogMessage;
24+
import com.apple.foundationdb.relational.api.exceptions.ErrorCode;
25+
import com.apple.foundationdb.relational.api.exceptions.RelationalException;
26+
import com.google.common.collect.ImmutableSet;
27+
import org.slf4j.Logger;
28+
import org.slf4j.LoggerFactory;
29+
30+
import javax.annotation.Nonnull;
31+
import javax.annotation.Nullable;
32+
import java.io.BufferedReader;
33+
import java.io.IOException;
34+
import java.io.InputStreamReader;
35+
import java.nio.charset.StandardCharsets;
36+
import java.nio.file.Files;
37+
import java.nio.file.Path;
38+
import java.util.ArrayList;
39+
import java.util.List;
40+
import java.util.Set;
41+
42+
/**
43+
* Utility class for finding changed metrics files using git integration.
44+
* This class provides methods to identify metrics files that have changed between different git references.
45+
*/
46+
public final class GitMetricsFileFinder {
47+
private static final Logger logger = LoggerFactory.getLogger(GitMetricsFileFinder.class);
48+
49+
private GitMetricsFileFinder() {
50+
// Utility class
51+
}
52+
53+
/**
54+
* Finds all metrics YAML files that have changed between two git references.
55+
*
56+
* @param baseRef the base git reference (e.g., "main", "HEAD~1", commit SHA)
57+
* @param headRef the target git reference (e.g., "main", "HEAD~1", commit SHA)
58+
* @param repositoryRoot the root directory of the git repository
59+
* @return set of paths to changed metrics YAML files
60+
* @throws RelationalException if git command fails or repository is not valid
61+
*/
62+
@Nonnull
63+
public static Set<Path> findChangedMetricsYamlFiles(@Nonnull final String baseRef,
64+
@Nonnull final String headRef,
65+
@Nonnull final Path repositoryRoot) throws RelationalException {
66+
final List<String> changedFiles = getChangedFiles(baseRef, headRef, repositoryRoot);
67+
final ImmutableSet.Builder<Path> metricsFiles = ImmutableSet.builder();
68+
69+
for (final var filePath : changedFiles) {
70+
if (isMetricsYamlFile(filePath)) {
71+
metricsFiles.add(repositoryRoot.resolve(filePath));
72+
}
73+
}
74+
75+
return metricsFiles.build();
76+
}
77+
78+
/**
79+
* Gets a list of all files changed between two git references.
80+
*
81+
* @param baseRef the base reference
82+
* @param headRef the target reference
83+
* @param repositoryRoot the repository root directory
84+
* @return list of relative file paths that have changed
85+
* @throws RelationalException if git command fails
86+
*/
87+
@Nonnull
88+
private static List<String> getChangedFiles(@Nonnull final String baseRef,
89+
@Nonnull final String headRef,
90+
@Nonnull final Path repositoryRoot) throws RelationalException {
91+
final var command = List.of("git", "diff", "--name-only", baseRef + "..." + headRef);
92+
try {
93+
if (logger.isDebugEnabled()) {
94+
logger.debug(KeyValueLogMessage.of("Executing git command",
95+
"root", repositoryRoot,
96+
"command", String.join(" ", command)));
97+
}
98+
final var processBuilder = new ProcessBuilder(command)
99+
.directory(repositoryRoot.toFile())
100+
.redirectErrorStream(true);
101+
102+
final var process = processBuilder.start();
103+
final List<String> result = new ArrayList<>();
104+
105+
try (BufferedReader reader = new BufferedReader(new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
106+
for (String line = reader.readLine(); line != null; line = reader.readLine()) {
107+
final String stripped = line.strip();
108+
if (!stripped.isEmpty()) {
109+
result.add(stripped);
110+
}
111+
}
112+
}
113+
114+
final var exitCode = process.waitFor();
115+
if (exitCode != 0) {
116+
throw new RelationalException(
117+
"Git command failed with exit code " + exitCode + ": " + String.join(" ", command),
118+
ErrorCode.INTERNAL_ERROR);
119+
}
120+
121+
return result;
122+
} catch (final IOException | InterruptedException e) {
123+
throw new RelationalException("Failed to execute git command: " + String.join(" ", command), ErrorCode.INTERNAL_ERROR, e);
124+
}
125+
}
126+
127+
/**
128+
* Checks if a file path represents a metrics YAML file.
129+
*
130+
* @param filePath the file path to check
131+
* @return true if the file is a metrics file
132+
*/
133+
private static boolean isMetricsYamlFile(@Nonnull final String filePath) {
134+
return filePath.endsWith(".metrics.yaml");
135+
}
136+
137+
/**
138+
* Gets the file contents for a specific git reference (commit).
139+
* This method checks out the file content at the specified reference and
140+
* saves it into a temporary file located at the returned {@link Path}.
141+
* If the file does not exist at the given reference, it will return a
142+
* {@code null} path.
143+
*
144+
* @param filePath the relative file path
145+
* @param gitRef the git reference
146+
* @param repositoryRoot the repository root
147+
* @return path to a temporary file containing the content at the specified reference
148+
* or {@code null} if it didn't exist at the given ref
149+
* @throws RelationalException if git command fails
150+
*/
151+
@Nullable
152+
public static Path getFileAtReference(@Nonnull final String filePath,
153+
@Nonnull final String gitRef,
154+
@Nonnull final Path repositoryRoot) throws RelationalException {
155+
try {
156+
final var tempFile = Files.createTempFile("metrics-diff-", gitRef + ".metrics.yaml");
157+
final var command = List.of("git", "show", gitRef + ":" + filePath);
158+
final var processBuilder = new ProcessBuilder(command)
159+
.directory(repositoryRoot.toFile())
160+
.redirectOutput(tempFile.toFile())
161+
.redirectErrorStream(false);
162+
163+
final var process = processBuilder.start();
164+
final var exitCode = process.waitFor();
165+
166+
if (exitCode != 0) {
167+
// File could not be loaded. It was probably not present at that version, so return null
168+
if (logger.isDebugEnabled()) {
169+
logger.debug(KeyValueLogMessage.of("Unable to get file contents at reference",
170+
"path", filePath,
171+
"ref", gitRef));
172+
}
173+
Files.deleteIfExists(tempFile);
174+
return null;
175+
}
176+
177+
return tempFile;
178+
} catch (final IOException | InterruptedException e) {
179+
throw new RelationalException(
180+
"Failed to get file content at reference " + gitRef + ":" + filePath, ErrorCode.INTERNAL_ERROR, e);
181+
}
182+
}
183+
}

0 commit comments

Comments
 (0)