ci: Improve error handling for Python backend model initialization failures #8303

Status: Open, 3 commits to merge into base: main

pskiran1 (Member) commented Jul 21, 2025

What does the PR do?

Checklist

  • PR title reflects the change and is of format <commit_type>: <Title>
  • Changes are described in the pull request.
  • Related issues are referenced.
  • Populated github labels field
  • Added test plan and verified test passes.
  • Verified that the PR passes existing CI.
  • Verified copyright is correct on all changed files.
  • Added succinct git squash message before merging ref.
  • All template sections are filled out.
  • Optional: Additional screenshots for behavior/output changes with before/after.

Commit Type:

Check the conventional commit type box here and add the label to the GitHub PR.

  • build
  • ci
  • docs
  • feat
  • fix
  • perf
  • refactor
  • revert
  • style
  • test

Related PRs:

Where should the reviewer start?

Test plan:

  • CI Pipeline ID: 32161262

Caveats:

Background

triton-inference-server/python_backend#408

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • closes GitHub issue: #xxx

@pskiran1 added the "PR: ci" label (Changes to our CI configuration files and scripts) Jul 21, 2025
@pskiran1 marked this pull request as ready for review July 21, 2025 14:55
@pskiran1 requested review from krishung5 and yinggeh July 21, 2025 17:06
yinggeh (Contributor) commented Jul 23, 2025

Please add the related PR to the description: triton-inference-server/python_backend#408

@yinggeh self-requested a review July 23, 2025 18:26
@@ -445,16 +445,6 @@ else
fi
fi

current_num_pages=`get_shm_pages`

Contributor (review comment):

Could you please clarify why we need to remove the SHM check?

Member Author (reply):

This test case uncovered a pre-existing bug: the internal SHM in the Python backend is not destroyed when model initialization fails. I have created a follow-up ticket, DLIS-8432, to address this issue. To unblock the current issue, I removed the SHM check that I added earlier. I still need to decide whether to address DLIS-8432 separately or as part of this PR.
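
For readers unfamiliar with this kind of check, a shared-memory leak test of the sort discussed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from this PR: `count_shm_entries` and `check_shm_unchanged` are hypothetical helpers, and counting the entries under /dev/shm is an assumed way of detecting a leaked region.

```shell
#!/bin/bash

# Hypothetical helper (not from the PR): count the entries directly inside a
# shared-memory directory. Defaults to /dev/shm, which is an assumption.
count_shm_entries() {
    local dir="${1:-/dev/shm}"
    find "$dir" -mindepth 1 -maxdepth 1 | wc -l | tr -d ' '
}

# Hypothetical helper (not from the PR): return 0 only when the entry count
# recorded before an operation equals the count recorded after it.
check_shm_unchanged() {
    local before="$1" after="$2"
    if [ "$before" -ne "$after" ]; then
        echo "shm entries changed: before=$before after=$after" >&2
        return 1
    fi
    return 0
}
```

A test would record the count before loading the model, trigger the initialization failure, record the count again, and pass both values to `check_shm_unchanged`. With the bug described above, the second count would be higher and the check would fail.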

Labels: PR: ci (Changes to our CI configuration files and scripts)
3 participants