Skip to content

Commit f03f45e

Browse files
[SPARK-51011][CORE] Add logging for whether a task is going to be interrupted when killed
### What changes were proposed in this pull request? We now log the value of `interruptThread` when a `TaskRunner`'s `kill` method is killed. This should help with debugging when potential zombie Spark tasks do not seem to be exiting. ### Why are the changes needed? Today, it's tricky to debug why a task is not exiting (and thus, why executors might be getting lost) without knowing for sure if it was issued a Java interrupt. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Ran `org.apache.spark.executor.ExecutorSuite` and verified the log looked as expected. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #49699 from neilramaswamy/spark-51011. Lead-authored-by: Neil Ramaswamy <neil.ramaswamy@databricks.com> Co-authored-by: Hyukjin Kwon <gurwls223@gmail.com> Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
1 parent 8fc6a20 commit f03f45e

File tree

2 files changed

+4
-2
lines changed

2 files changed

+4
-2
lines changed

common/utils/src/main/scala/org/apache/spark/internal/LogKey.scala

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -331,6 +331,7 @@ private[spark] object LogKeys {
331331
case object INPUT extends LogKey
332332
case object INPUT_SPLIT extends LogKey
333333
case object INTEGRAL extends LogKey
334+
case object INTERRUPT_THREAD extends LogKey
334335
case object INTERVAL extends LogKey
335336
case object INVALID_PARAMS extends LogKey
336337
case object ISOLATION_LEVEL extends LogKey

core/src/main/scala/org/apache/spark/executor/Executor.scala

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -504,8 +504,9 @@ private[spark] class Executor(
504504
@volatile var task: Task[Any] = _
505505

506506
def kill(interruptThread: Boolean, reason: String): Unit = {
507-
logInfo(log"Executor is trying to kill ${LogMDC(TASK_NAME, taskName)}," +
508-
log" reason: ${LogMDC(REASON, reason)}")
507+
logInfo(log"Executor is trying to kill ${LogMDC(TASK_NAME, taskName)}, " +
508+
log"interruptThread: ${LogMDC(INTERRUPT_THREAD, interruptThread)}, " +
509+
log"reason: ${LogMDC(REASON, reason)}")
509510
reasonIfKilled = Some(reason)
510511
if (task != null) {
511512
synchronized {

0 commit comments

Comments
 (0)