Skip to content

Conversation

@anujmodi2021
Copy link
Contributor

@anujmodi2021 anujmodi2021 commented Oct 21, 2025

Description of PR

Jira: https://issues.apache.org/jira/browse/HADOOP-19729

It has been observed that certain requests taking more time than expected to complete hinders the performance of whole workload. Such requests are known as tailing requests. They can be taking more time due to a number of reasons and the prominent among them is a bad network connection. In Abfs driver we cache network connections and keeping such bad connections in cache and reusing them can be bad for perf.

In this effort we try to identify such connections and close them so that new good connetions can be established and perf can be improved. There are two parts of this effort.

  1. Identifying Tailing Requests: This involves profiling all the network calls and getting percentiles value optimally. By default we consider p99 as the tail latency and all the future requests taking more than tail latency will be considere as Tailing requests.

  2. Proactively Killing Socket Connections: With Apache client, we can now kill the socket connection and fail the tailing request. Such failures will not be thrown back to user and retried immediately without any sleep but from another socket connection.

How was this patch tested?

New tests around both profiling and connection killing added.

@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 8m 54s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 1s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 xmllint 0m 0s xmllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
-1 ❌ test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
_ trunk Compile Tests _
+1 💚 mvninstall 21m 29s trunk passed
+1 💚 compile 0m 24s trunk passed with JDK Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04
+1 💚 compile 0m 27s trunk passed with JDK Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04
+1 💚 checkstyle 0m 19s trunk passed
+1 💚 mvnsite 0m 25s trunk passed
+1 💚 javadoc 0m 23s trunk passed with JDK Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 23s trunk passed with JDK Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04
-1 ❌ spotbugs 0m 44s /branch-spotbugs-hadoop-tools_hadoop-azure-warnings.html hadoop-tools/hadoop-azure in trunk has 178 extant spotbugs warnings.
+1 💚 shadedclient 14m 3s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 20s the patch passed
+1 💚 compile 0m 18s the patch passed with JDK Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 18s the patch passed
+1 💚 compile 0m 20s the patch passed with JDK Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 20s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 10s /results-checkstyle-hadoop-tools_hadoop-azure.txt hadoop-tools/hadoop-azure: The patch generated 11 new + 3 unchanged - 0 fixed = 14 total (was 3)
+1 💚 mvnsite 0m 21s the patch passed
-1 ❌ javadoc 0m 18s /results-javadoc-javadoc-hadoop-tools_hadoop-azure-jdkUbuntu-21.0.7+6-Ubuntu-0ubuntu120.04.txt hadoop-tools_hadoop-azure-jdkUbuntu-21.0.7+6-Ubuntu-0ubuntu120.04 with JDK Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04 generated 36 new + 1472 unchanged - 0 fixed = 1508 total (was 1472)
-1 ❌ javadoc 0m 15s /results-javadoc-javadoc-hadoop-tools_hadoop-azure-jdkUbuntu-17.0.15+6-Ubuntu-0ubuntu120.04.txt hadoop-tools_hadoop-azure-jdkUbuntu-17.0.15+6-Ubuntu-0ubuntu120.04 with JDK Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04 generated 34 new + 1413 unchanged - 0 fixed = 1447 total (was 1413)
-1 ❌ spotbugs 0m 46s /new-spotbugs-hadoop-tools_hadoop-azure.html hadoop-tools/hadoop-azure generated 4 new + 177 unchanged - 1 fixed = 181 total (was 178)
+1 💚 shadedclient 14m 7s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 8s hadoop-azure in the patch passed.
-1 ❌ asflicense 0m 20s /results-asflicense.txt The patch generated 1 ASF License warnings.
67m 50s
Reason Tests
SpotBugs module:hadoop-tools/hadoop-azure
Unknown bug pattern CT_CONSTRUCTOR_THROW in new org.apache.hadoop.fs.azurebfs.services.AbfsAHCHttpOperation(URL, String, List, Duration, Duration, long, AbfsApacheHttpClient, AbfsClient) At AbfsAHCHttpOperation.java:new org.apache.hadoop.fs.azurebfs.services.AbfsAHCHttpOperation(URL, String, List, Duration, Duration, long, AbfsApacheHttpClient, AbfsClient) At AbfsAHCHttpOperation.java:[line 118]
Unknown bug pattern AT_STALE_THREAD_WRITE_OF_PRIMITIVE in org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.executeHttpOperation(int, TracingContext) At AbfsRestOperation.java:org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.executeHttpOperation(int, TracingContext) At AbfsRestOperation.java:[line 528]
new org.apache.hadoop.fs.azurebfs.services.AbfsTailLatencyTracker(AbfsConfiguration) may expose internal representation by storing an externally mutable object into AbfsTailLatencyTracker.configuration At AbfsTailLatencyTracker.java:representation by storing an externally mutable object into AbfsTailLatencyTracker.configuration At AbfsTailLatencyTracker.java:[line 55]
Unknown bug pattern CT_CONSTRUCTOR_THROW in new org.apache.hadoop.fs.azurebfs.services.SlidingWindowHdrHistogram(long, int, int, int, int, long, int, AbfsRestOperationType) At SlidingWindowHdrHistogram.java:new org.apache.hadoop.fs.azurebfs.services.SlidingWindowHdrHistogram(long, int, int, int, int, long, int, AbfsRestOperationType) At SlidingWindowHdrHistogram.java:[line 79]
Subsystem Report/Notes
Docker ClientAPI=1.51 ServerAPI=1.51 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8043/1/artifact/out/Dockerfile
GITHUB PR #8043
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient codespell detsecrets xmllint spotbugs checkstyle
uname Linux 420ed6d29123 5.15.0-156-generic #166-Ubuntu SMP Sat Aug 9 00:02:46 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 6da7267
Default Java Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04
Multi-JDK versions /usr/lib/jvm/java-21-openjdk-amd64:Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-17-openjdk-amd64:Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8043/1/testReport/
Max. process+thread count 773 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8043/1/console
versions git=2.25.1 maven=3.9.11 spotbugs=4.9.7
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@anujmodi2021 anujmodi2021 changed the title Hadoop 19729. [ABFS][Perf] Network Profiling for Tailing Requests and Killing Bad Connections Proactively HADOOP-19729. [ABFS][Perf] Network Profiling for Tailing Requests and Killing Bad Connections Proactively Oct 21, 2025
@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 11m 8s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 xmllint 0m 0s xmllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 21m 31s trunk passed
+1 💚 compile 0m 23s trunk passed with JDK Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04
+1 💚 compile 0m 25s trunk passed with JDK Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04
+1 💚 checkstyle 0m 18s trunk passed
+1 💚 mvnsite 0m 26s trunk passed
+1 💚 javadoc 0m 24s trunk passed with JDK Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 22s trunk passed with JDK Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04
-1 ❌ spotbugs 0m 43s /branch-spotbugs-hadoop-tools_hadoop-azure-warnings.html hadoop-tools/hadoop-azure in trunk has 178 extant spotbugs warnings.
+1 💚 shadedclient 13m 53s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 22s the patch passed
+1 💚 compile 0m 18s the patch passed with JDK Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 18s the patch passed
+1 💚 compile 0m 21s the patch passed with JDK Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 21s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 10s /results-checkstyle-hadoop-tools_hadoop-azure.txt hadoop-tools/hadoop-azure: The patch generated 43 new + 3 unchanged - 0 fixed = 46 total (was 3)
+1 💚 mvnsite 0m 20s the patch passed
-1 ❌ javadoc 0m 18s /results-javadoc-javadoc-hadoop-tools_hadoop-azure-jdkUbuntu-21.0.7+6-Ubuntu-0ubuntu120.04.txt hadoop-tools_hadoop-azure-jdkUbuntu-21.0.7+6-Ubuntu-0ubuntu120.04 with JDK Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04 generated 34 new + 1472 unchanged - 0 fixed = 1506 total (was 1472)
-1 ❌ javadoc 0m 16s /results-javadoc-javadoc-hadoop-tools_hadoop-azure-jdkUbuntu-17.0.15+6-Ubuntu-0ubuntu120.04.txt hadoop-tools_hadoop-azure-jdkUbuntu-17.0.15+6-Ubuntu-0ubuntu120.04 with JDK Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04 generated 33 new + 1413 unchanged - 0 fixed = 1446 total (was 1413)
-1 ❌ spotbugs 0m 46s /new-spotbugs-hadoop-tools_hadoop-azure.html hadoop-tools/hadoop-azure generated 4 new + 177 unchanged - 1 fixed = 181 total (was 178)
+1 💚 shadedclient 14m 12s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 8s hadoop-azure in the patch passed.
+1 💚 asflicense 0m 19s The patch does not generate ASF License warnings.
70m 0s
Reason Tests
SpotBugs module:hadoop-tools/hadoop-azure
Unknown bug pattern CT_CONSTRUCTOR_THROW in new org.apache.hadoop.fs.azurebfs.services.AbfsAHCHttpOperation(URL, String, List, Duration, Duration, long, AbfsApacheHttpClient, AbfsClient) At AbfsAHCHttpOperation.java:new org.apache.hadoop.fs.azurebfs.services.AbfsAHCHttpOperation(URL, String, List, Duration, Duration, long, AbfsApacheHttpClient, AbfsClient) At AbfsAHCHttpOperation.java:[line 123]
Unknown bug pattern AT_STALE_THREAD_WRITE_OF_PRIMITIVE in org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.executeHttpOperation(int, TracingContext) At AbfsRestOperation.java:org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.executeHttpOperation(int, TracingContext) At AbfsRestOperation.java:[line 533]
new org.apache.hadoop.fs.azurebfs.services.AbfsTailLatencyTracker(AbfsConfiguration) may expose internal representation by storing an externally mutable object into AbfsTailLatencyTracker.configuration At AbfsTailLatencyTracker.java:representation by storing an externally mutable object into AbfsTailLatencyTracker.configuration At AbfsTailLatencyTracker.java:[line 55]
Unknown bug pattern CT_CONSTRUCTOR_THROW in new org.apache.hadoop.fs.azurebfs.services.SlidingWindowHdrHistogram(long, int, int, int, int, long, int, AbfsRestOperationType) At SlidingWindowHdrHistogram.java:new org.apache.hadoop.fs.azurebfs.services.SlidingWindowHdrHistogram(long, int, int, int, int, long, int, AbfsRestOperationType) At SlidingWindowHdrHistogram.java:[line 81]
Subsystem Report/Notes
Docker ClientAPI=1.51 ServerAPI=1.51 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8043/2/artifact/out/Dockerfile
GITHUB PR #8043
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient codespell detsecrets xmllint spotbugs checkstyle
uname Linux d98c9c1604f2 5.15.0-156-generic #166-Ubuntu SMP Sat Aug 9 00:02:46 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 7dcac93
Default Java Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04
Multi-JDK versions /usr/lib/jvm/java-21-openjdk-amd64:Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-17-openjdk-amd64:Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8043/2/testReport/
Max. process+thread count 614 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8043/2/console
versions git=2.25.1 maven=3.9.11 spotbugs=4.9.7
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@anujmodi2021 anujmodi2021 marked this pull request as ready for review October 27, 2025 04:46
@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 21s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 xmllint 0m 0s xmllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 3 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 22m 11s trunk passed
+1 💚 compile 0m 23s trunk passed with JDK Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04
+1 💚 compile 0m 23s trunk passed with JDK Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04
+1 💚 checkstyle 0m 17s trunk passed
+1 💚 mvnsite 0m 25s trunk passed
+1 💚 javadoc 0m 24s trunk passed with JDK Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 23s trunk passed with JDK Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04
-1 ❌ spotbugs 0m 46s /branch-spotbugs-hadoop-tools_hadoop-azure-warnings.html hadoop-tools/hadoop-azure in trunk has 178 extant spotbugs warnings.
+1 💚 shadedclient 14m 13s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 22s the patch passed
+1 💚 compile 0m 19s the patch passed with JDK Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 19s the patch passed
+1 💚 compile 0m 20s the patch passed with JDK Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 20s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 11s /results-checkstyle-hadoop-tools_hadoop-azure.txt hadoop-tools/hadoop-azure: The patch generated 43 new + 3 unchanged - 0 fixed = 46 total (was 3)
+1 💚 mvnsite 0m 21s the patch passed
-1 ❌ javadoc 0m 17s /results-javadoc-javadoc-hadoop-tools_hadoop-azure-jdkUbuntu-21.0.7+6-Ubuntu-0ubuntu120.04.txt hadoop-tools_hadoop-azure-jdkUbuntu-21.0.7+6-Ubuntu-0ubuntu120.04 with JDK Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04 generated 35 new + 1472 unchanged - 0 fixed = 1507 total (was 1472)
-1 ❌ javadoc 0m 16s /results-javadoc-javadoc-hadoop-tools_hadoop-azure-jdkUbuntu-17.0.15+6-Ubuntu-0ubuntu120.04.txt hadoop-tools_hadoop-azure-jdkUbuntu-17.0.15+6-Ubuntu-0ubuntu120.04 with JDK Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04 generated 34 new + 1413 unchanged - 0 fixed = 1447 total (was 1413)
-1 ❌ spotbugs 0m 45s /new-spotbugs-hadoop-tools_hadoop-azure.html hadoop-tools/hadoop-azure generated 4 new + 177 unchanged - 1 fixed = 181 total (was 178)
+1 💚 shadedclient 14m 10s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 8s hadoop-azure in the patch passed.
+1 💚 asflicense 0m 18s The patch does not generate ASF License warnings.
60m 12s
Reason Tests
SpotBugs module:hadoop-tools/hadoop-azure
Unknown bug pattern CT_CONSTRUCTOR_THROW in new org.apache.hadoop.fs.azurebfs.services.AbfsAHCHttpOperation(URL, String, List, Duration, Duration, long, AbfsApacheHttpClient, AbfsClient) At AbfsAHCHttpOperation.java:new org.apache.hadoop.fs.azurebfs.services.AbfsAHCHttpOperation(URL, String, List, Duration, Duration, long, AbfsApacheHttpClient, AbfsClient) At AbfsAHCHttpOperation.java:[line 123]
Unknown bug pattern AT_STALE_THREAD_WRITE_OF_PRIMITIVE in org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.executeHttpOperation(int, TracingContext) At AbfsRestOperation.java:org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.executeHttpOperation(int, TracingContext) At AbfsRestOperation.java:[line 533]
new org.apache.hadoop.fs.azurebfs.services.AbfsTailLatencyTracker(AbfsConfiguration) may expose internal representation by storing an externally mutable object into AbfsTailLatencyTracker.configuration At AbfsTailLatencyTracker.java:representation by storing an externally mutable object into AbfsTailLatencyTracker.configuration At AbfsTailLatencyTracker.java:[line 55]
Unknown bug pattern CT_CONSTRUCTOR_THROW in new org.apache.hadoop.fs.azurebfs.services.SlidingWindowHdrHistogram(long, int, int, int, int, long, int, AbfsRestOperationType) At SlidingWindowHdrHistogram.java:new org.apache.hadoop.fs.azurebfs.services.SlidingWindowHdrHistogram(long, int, int, int, int, long, int, AbfsRestOperationType) At SlidingWindowHdrHistogram.java:[line 81]
Subsystem Report/Notes
Docker ClientAPI=1.51 ServerAPI=1.51 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8043/3/artifact/out/Dockerfile
GITHUB PR #8043
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient codespell detsecrets xmllint spotbugs checkstyle
uname Linux 9b5c6baa74d5 5.15.0-156-generic #166-Ubuntu SMP Sat Aug 9 00:02:46 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / de244d2
Default Java Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04
Multi-JDK versions /usr/lib/jvm/java-21-openjdk-amd64:Ubuntu-21.0.7+6-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-17-openjdk-amd64:Ubuntu-17.0.15+6-Ubuntu-0ubuntu120.04
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8043/3/testReport/
Max. process+thread count 637 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8043/3/console
versions git=2.25.1 maven=3.9.11 spotbugs=4.9.7
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

}

public boolean isTailLatencyRequestTimeoutEnabled() {
return isTailLatencyRequestTimeoutEnabled && isTailLatencyTrackerEnabled
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

first check should be for isTailLatencyTrackerEnabled

return tailLatencyAnalysisWindowInMillis;
}

public int getTailLatencyPercentileComputationIntervalInMillis() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Name should be shortened


public static final boolean DEFAULT_FS_AZURE_ENABLE_CREATE_BLOB_IDEMPOTENCY = true;

public static final boolean DEFAULT_FS_AZURE_ENABLE_TAIL_LATENCY_TRACKER = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we not want this feature to be enabled by default ?

} catch (TimeoutException e) {
/* Deadline exceeded, abort the request.
* This will also kill the underlying socket exception in the HttpClient.
* Connection will be marker stale and won't be returned back to KAC for reuse.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: typo marked

/**
* Constant for Static Retry Policy Abbreviation. {@value}
*/
public static final String TAIL_LATENCY_TIMEOUT_RETRY_POLICY_ABBREVIATION = "T";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make it TL ?

int significantFigures,
final AbfsRestOperationType operationType) {
if (windowSizeMillis <= 0) throw new IllegalArgumentException("windowSizeMillis > 0");
if (numberOfSegments <= 0) throw new IllegalArgumentException("bucketDurationMillis > 0");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be numberOfSegments in exception

/** Get any percentile over the current sliding window. */
public void computeLatency() {
if (getCurrentTotalCount() < minSampleSize) {
LOG.debug("[{}] Not enough data to report percentiles. Current total count: {}",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can return here itself

p50 = tmpForMerge.getValueAtPercentile(50);
p90 = tmpForMerge.getValueAtPercentile(90);
p99 = tmpForMerge.getValueAtPercentile(99);
deviation = (int) ((tailLatency - p50)/p50 * 100);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Chances of division by zero error

activeSegmentRecorder.getIntervalHistogramInto(tmpForDelta);
currentSegmentAccumulation.add(tmpForDelta);

if (currentSegmentAccumulation.getTotalCount() <= 0) {
Copy link
Contributor

@anmolanmol1234 anmolanmol1234 Oct 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is less than 0 possible for total count? It is incremented always right

// Next slot is now going to be eradicated. Remove its count from total.
currentTotalCount.set(currentTotalCount.get() - (completedSegments[currentIdx] == null ? 0 : completedSegments[currentIdx].getTotalCount()));
// Store an immutable snapshot (make sure we don't mutate the instance after storing)
completedSegments[currentIdx] = currentSegmentAccumulation;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how are we making sure that this is immutable after this point ? completedSegments[currentIdx] = currentSegmentAccumulation.copy(); ideally we should create a deep copy of the histogram data so that even if currentSegmentAccumulation is reused or reset for the next segment, the data in completedSegments remains unchanged.

/** Ensure active bucket is aligned to current time; rotate if we've crossed a boundary. */
public void rotateIfNeeded() {
LOG.debug("[{}] Triggering Histogram Rotation", operationType);
long now = System.currentTimeMillis();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the use of the variable now ? We can directly use System.currentTimeMillis(); in expectedStart as we are doing later in line 195

*/
private String failureReason;
private AbfsRetryPolicy retryPolicy;
private boolean shouldTailLatencyTimeout = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be renamed to enableTailLatencyTimeout

}

// Update Tail Latency Tracker only for successful requests.
if (tailLatencyTracker != null && statusCode < HttpURLConnection.HTTP_MULT_CHOICE) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will get updated for -1 status code as well, should be checked between 200 to 300


@Test
public void testSlidingWindowHdrHistogram() throws Exception {
SlidingWindowHdrHistogram histogram = new SlidingWindowHdrHistogram(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add comment for which value represents what

// Verify that analysis window is full after full rotation.
assertThat(histogram.isAnalysisWindowFilled()).isTrue();

// Verify that percentiles are not computed due to low deviation
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: should be computed ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants