Skip to content

gh-134604: Fix tracemalloc crash with subinterpreters #134667

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

emmatyping
Copy link
Member

@emmatyping emmatyping commented May 25, 2025

This is an attempt at resolving a crash when mixing tracemalloc and sub-interpreters I came across recently. My investigation is recorded in the issue #134604 (comment)

This PR makes a few changes:

  • modifies tracemalloc to copy the filename of a frame instead of referencing it
  • delays sub-interpreter finalization until after tracemalloc finalizes (optional, but probably a good idea)
  • fixes a typo/duplicated assignment on a line in the tracemalloc_get_frame function
  • adds tests to test__interpreters to ensure that this functions correctly.

Should this be backported to 3.13? I suppose subinterpreters can be used there through C APIs, so it might be good to fix this there too.

With subinterpreters, frames can be destroyed at any point, meaning
tracemalloc needs to copy instead of reference the frame filenames.
Copy link
Member

@ZeroIntensity ZeroIntensity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's more to be fixed than just the filename. Doesn't tracemalloc have mutable state accessible by all interpreters?

@emmatyping
Copy link
Member Author

I think there's more to be fixed than just the filename. Doesn't tracemalloc have mutable state accessible by all interpreters?

If you mean the allocator/tracemalloc runtime state, that is guarded by a lock.

@ZeroIntensity
Copy link
Member

Right, but some of the state that it stores looks like it could be problematic across interpreters. I'll investigate.

@emmatyping
Copy link
Member Author

Yeah the following test hangs:

def test_tracemalloc_multiple_subinterpreters(self):
    num_threads = 8
    threads = []

    def run_interp():
        interpid = _interpreters.create()
        _interpreters.run_string(
                interpid,
                "def foo(a, b): return a + b; foo(1, 5)")
        _interpreters.destroy(interpid)

    for i in range(num_threads):
        thread = threading.Thread(target=run_interp)

        threads.append(thread)

    with threading_helper.start_threads(threads):
        pass
    
    stats = tracemalloc.take_snapshot().statistics("filename")
    count = sum(trace.traceback._frames[0][0] == "<string>"
                for trace in stats)
    self.assertEqual(count, num_threads)

@vstinner
Copy link
Member

modifies tracemalloc to copy the filename of a frame instead of referencing it

tracemalloc_filenames stores strong references to Unicode strings. What is the problem with sub-interpreters? I would prefer to keep Unicode objects. For example, _Py_DumpASCII(fd, filename) has a different behavior than PUTS(fd, filename).

@vstinner
Copy link
Member

Would it be possible to modify memory allocation hooks (malloc, calloc, realloc, free) to detect which interpreter is used, and do nothing if it's called from an interpreter different than the one which called tracemalloc.start()? Limit tracemalloc to a single interpreter at the same time.

The problem is that tracemalloc_raw_malloc(), tracemalloc_raw_calloc(), tracemalloc_raw_realloc() and tracemalloc_free() are called without the GIL. The first 3 functions call PyGILState_Ensure() / PyGILState_Release() to acquire the GIL, but these functions are known to not work with sub-interpreters... We would need PEP 788 API here.

@emmatyping
Copy link
Member Author

Would it be possible to modify memory allocation hooks (malloc, calloc, realloc, free) to detect which interpreter is used, and do nothing if it's called from an interpreter different than the one which called tracemalloc.start()? Limit tracemalloc to a single interpreter at the same time.

Yeah I think something like that could be done. I think it might be better to shard the state it tracks to be per-interpreter however. I expect users may wish to call tracemalloc.start() within a sub-interpreter. Also what would the behavior of -X tracemalloc be if it only traced one interpreter? Tracing the main one I presume?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants