Skip to content

Conversation

@brandonistfan
Copy link
Contributor

@brandonistfan brandonistfan commented Nov 2, 2025

GitHub Issues addressed

  • This PR closes

What I did

Stopped the leaking fetch_enrollment.py script: replaced the global session shared by the threads with thread-local sessions via threading.local(), closed all threads after updating, and added connection cleanup on interrupt. Tests and a run on 9,887 sections showed no thread or memory leaks.

Screenshots

  • Before
  • After

Testing

  • A brief explanation of tests done/written or how reviewers can test your work

Run tests

docker exec tcf_django python manage.py test tcf_website.tests.test_fetch_enrollment

Test on dataset

Before running, click on a few different courses and run the commands to see the stats below and ensure that the threads are being terminated.
The command below is for all sections.

docker exec tcf_django python manage.py fetch_enrollment

Verify no orphans

docker exec tcf_django bash -c "pgrep -a python"

Monitor memory

docker stats tcf_django

Questions/Discussions/Notes

Summary by CodeRabbit

  • New Features

    • Network requests now use isolated per-thread sessions for more reliable background fetching.
  • Bug Fixes

    • Improved interruption handling: background work cleanly stops and releases DB connections to prevent leaks.
  • Performance

    • Reduced concurrent workers (20→5) and ensured section counts are computed before processing for steadier resource use.
  • Tests

    • Updated tests to validate thread-local session behavior and cleanup.

@coderabbitai
Copy link

coderabbitai bot commented Nov 2, 2025

Walkthrough

Replaces a global requests Session with per-thread sessions via get_session() backed by thread-local storage; materializes the Section queryset before parallel processing; reduces max workers from 20 to 5; ensures executor shutdown and calls django.db.connections.close_all() on KeyboardInterrupt; updates tests to mock get_session() and verify thread-local sessions; switches a single connection.close() call to connections.close_all() in the API background update.

Changes

Cohort / File(s) Summary
Thread-Local Session Management
tcf_website/management/commands/fetch_enrollment.py
Adds _session_local thread-local storage and a public get_session() to create/return per-thread requests.Session instances with mounted HTTPAdapter and Retry configuration; replaces previous global session usage.
Fetch & Concurrency Adjustments
tcf_website/management/commands/fetch_enrollment.py
Updates fetch_section_data() and caller sites to use get_session(); materializes the Sections queryset by calling count() then list(...) before parallel processing; reduces ThreadPoolExecutor max_workers from 20 to 5; on KeyboardInterrupt shuts down the executor and calls django.db.connections.close_all().
Error Handling & Retry Integration
tcf_website/management/commands/fetch_enrollment.py
Adapts existing retry/backoff logic to work with per-thread sessions and preserves request/error handling around network calls.
API: DB Connection Finalization
tcf_website/api/views.py
Replaces import/use of a single connection.close() with django.db.connections.close_all() inside the background update thread.
Tests: Session Mocking & Thread Safety
tcf_website/tests/test_fetch_enrollment.py
Changes tests to patch get_session() and use the returned mock session for .get(), .raise_for_status() and .json() behavior; adds test_thread_local_sessions asserting distinct session instances per thread; updates fetch tests accordingly.

Sequence Diagram(s)

sequenceDiagram
    participant Main as Main Thread
    participant Worker as Worker Thread
    participant get_session as get_session()
    participant TLS as _session_local (thread-local)
    participant Adapter as HTTPAdapter/Retry
    participant Remote as Remote Service

    rect rgb(220,235,255)
    Note over Worker,Main: First call in a thread
    Worker->>get_session: request session
    get_session->>TLS: check for session
    alt none present
        get_session->>Adapter: create HTTPAdapter + Retry
        get_session->>TLS: store session
    end
    TLS-->>get_session: session
    get_session-->>Worker: session
    end

    rect rgb(235,245,220)
    Note over Worker,Remote: Use session for requests
    Worker->>Remote: session.get(url)
    Remote-->>Worker: response
    Worker->>Worker: handle response / retry logic
    end

    rect rgb(245,230,230)
    Note over Main,Worker: Shutdown on KeyboardInterrupt
    Main->>Worker: KeyboardInterrupt
    Main->>Main: executor.shutdown()
    Main->>Main: django.db.connections.close_all()
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Review thread-local session initialization and adapter/retry configuration.
  • Verify tests correctly simulate per-thread sessions and that test_thread_local_sessions is reliable.
  • Check queryset materialization for memory implications on large datasets.
  • Confirm connections.close_all() placement inside background threads doesn't introduce race conditions.

Poem

🐰 A hop, a thread, each burrow keeps a brew,
Sessions snug and separate, not one but a few.
Five busy paws fetch, no jostle or din,
Connections tucked away when the cleanup begins.
The rabbit smiles: quiet fetches win.

Pre-merge checks and finishing touches

✅ Passed checks (2 passed)
Check name Status Explanation
Title Check ✅ Passed The title "fix: enrollment thread and temp storage issue" is directly related to the main changes in the changeset. The PR addresses threading issues by replacing a global session with thread-local sessions and adding connection cleanup on interrupt—both core concerns mentioned in the title. The title is concise and specific enough that a teammate scanning the history would understand this fixes threading and resource cleanup issues in the enrollment functionality. While "temp storage issue" could arguably be more specific, the title adequately summarizes the primary changes without being vague or generic.
Description Check ✅ Passed The pull request description follows the required template structure and provides substantive content in the critical sections. The "What I did" section clearly describes the changes made (thread-local sessions, connection cleanup, and memory leak prevention) with validation results from testing. The "Testing" section is comprehensive, providing specific Docker commands for unit tests, dataset validation, orphan process verification, and memory monitoring. However, two sections are incomplete: the "GitHub Issues addressed" section contains only the template placeholder without an actual issue reference, and the "Screenshots" section has "Before" and "After" labels but no actual images are included.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch enrollment-fix

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 12e0ca1 and 5ad8636.

📒 Files selected for processing (2)
  • tcf_website/management/commands/fetch_enrollment.py (6 hunks)
  • tcf_website/tests/test_fetch_enrollment.py (5 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
tcf_website/management/commands/fetch_enrollment.py (1)
tcf_website/models/models.py (1)
  • Section (942-992)
tcf_website/tests/test_fetch_enrollment.py (2)
tcf_website/management/commands/fetch_enrollment.py (2)
  • fetch_section_data (76-137)
  • get_session (49-65)
tcf_website/models/models.py (1)
  • SectionEnrollment (1055-1099)
🪛 GitHub Actions: Continuous Integration
tcf_website/management/commands/fetch_enrollment.py

[error] 168-168: Pylint: Trailing whitespace. (C0303)

tcf_website/tests/test_fetch_enrollment.py

[error] 116-116: Pylint: Line too long (101/100). (C0301)


[error] 125-125: Pylint: Line too long (105/100). (C0301)


[error] 132-132: Pylint: Line too long (104/100). (C0301)

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
tcf_website/management/commands/fetch_enrollment.py (1)

168-168: Remove trailing whitespace on the blank line.

Pylint flags trailing whitespace on this blank line (C0303). Remove the trailing spaces to pass CI.

 print(f"Found {total_sections} sections")
-        
+
 sections = list(sections_queryset)
🧹 Nitpick comments (1)
tcf_website/tests/test_fetch_enrollment.py (1)

107-118: Enhance the test to verify concurrent access.

The test correctly uses the real get_session() implementation (not patched), but it runs threads sequentially (start-join-start-join) rather than concurrently. This verifies that different threads get different sessions, but doesn't test the primary concern: that concurrent threads can safely use thread-local sessions without contention.

Consider this enhanced version that spawns all threads before joining:

 def test_thread_local_sessions(self):
     """Test that each thread gets its own session instance."""
     sessions = []
+    threads = []
+
+    def capture_session():
+        sessions.append(get_session())
+
     for _ in range(5):
-        thread = threading.Thread(target=lambda: sessions.append(get_session()))
-        thread.start()
-        thread.join()
+        thread = threading.Thread(target=capture_session)
+        threads.append(thread)
+
+    for thread in threads:
+        thread.start()
+    for thread in threads:
+        thread.join()
+
     self.assertEqual(
         len(set(id(s) for s in sessions)),
         5,
         "Each thread should have its own session"
     )

This change also eliminates the lambda closure and tests true concurrent access.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5ad8636 and f6903db.

📒 Files selected for processing (2)
  • tcf_website/management/commands/fetch_enrollment.py (6 hunks)
  • tcf_website/tests/test_fetch_enrollment.py (5 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
tcf_website/management/commands/fetch_enrollment.py (3)
tcf_website/models/models.py (1)
  • Section (942-992)
tcf_website/api/enrollment.py (2)
  • process_section (93-102)
  • update_enrollment_data (68-108)
tcf_website/api/views.py (1)
  • SectionEnrollmentViewSet (187-221)
tcf_website/tests/test_fetch_enrollment.py (1)
tcf_website/management/commands/fetch_enrollment.py (2)
  • fetch_section_data (76-137)
  • get_session (49-65)
🔇 Additional comments (13)
tcf_website/management/commands/fetch_enrollment.py (7)

10-10: LGTM!

The new imports are used appropriately: threading for thread-local storage and connections for cleanup on interrupt.

Also applies to: 17-17


27-27: LGTM!

Reducing MAX_WORKERS from 20 to 5 aligns with the PR objectives to prevent thread and memory leaks.


35-36: LGTM!

Using threading.local() is the correct pattern for thread-local storage to avoid session contention across threads.


49-66: LGTM!

The thread-local session implementation is correct. Each thread gets its own requests.Session with appropriate retry logic and connection pooling. The hasattr check ensures lazy initialization per thread.


93-93: LGTM!

Correctly obtains the thread-local session via get_session(), eliminating the shared global session that caused thread contention.


165-169: LGTM! Materializing the queryset prevents lazy evaluation issues.

Converting the queryset to a list before parallel processing is good practice and prevents each thread from independently evaluating the queryset, which could cause database connection issues.


183-187: LGTM! Proper cleanup on interrupt.

Calling connections.close_all() after shutting down the executor ensures Django database connections are cleaned up when the script is interrupted, preventing connection leaks.

tcf_website/tests/test_fetch_enrollment.py (6)

3-3: LGTM!

The threading import is used appropriately in the new thread-local session test.


8-8: LGTM!

The get_session import is necessary for testing the new thread-local session functionality.


23-28: LGTM!

The test correctly patches get_session and configures the mock session. This properly isolates the HTTP layer during testing.


48-62: LGTM!

The test correctly patches get_session and configures the mock session, consistent with the other tests.


82-88: LGTM!

The test correctly patches get_session and configures the mock session.


95-100: LGTM!

The test correctly patches get_session and configures the mock session with error behavior.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
tcf_website/management/commands/fetch_enrollment.py (1)

49-66: Consider reducing per-thread pool sizes.

Each thread now creates its own session with pool_connections=MAX_POOL_SIZE (100) and pool_maxsize=MAX_POOL_SIZE (100). With MAX_WORKERS=5, this allocates potentially large connection pools per thread. Since each thread typically connects to a single host (sisuva.admin.virginia.edu), consider reducing these to more modest values like:

-        adapter = HTTPAdapter(
-            pool_connections=MAX_POOL_SIZE,
-            pool_maxsize=MAX_POOL_SIZE,
-            max_retries=retry_strategy,
-        )
+        adapter = HTTPAdapter(
+            pool_connections=10,
+            pool_maxsize=20,
+            max_retries=retry_strategy,
+        )

This would reduce memory overhead while maintaining adequate connection reuse.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f6903db and 50d6f71.

📒 Files selected for processing (3)
  • tcf_website/api/views.py (2 hunks)
  • tcf_website/management/commands/fetch_enrollment.py (6 hunks)
  • tcf_website/tests/test_fetch_enrollment.py (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • tcf_website/tests/test_fetch_enrollment.py
🧰 Additional context used
🧬 Code graph analysis (1)
tcf_website/management/commands/fetch_enrollment.py (1)
tcf_website/models/models.py (1)
  • Section (942-992)
🔇 Additional comments (4)
tcf_website/api/views.py (1)

5-5: LGTM! Thread-local connection cleanup is appropriate.

The change from connection.close() to connections.close_all() ensures all thread-local database connections are properly cleaned up in the background thread, preventing connection leaks. This aligns well with the thread-local session pattern introduced in fetch_enrollment.py.

Also applies to: 200-200

tcf_website/management/commands/fetch_enrollment.py (3)

93-93: LGTM! Thread-local session usage fixes the concurrency issue.

Switching from a global session to get_session() ensures each worker thread uses its own requests session, eliminating race conditions and thread contention that could occur with a shared global session.


165-169: LGTM! Materializing the queryset is appropriate for thread safety.

Converting the queryset to a list before parallel processing ensures all section objects are fetched upfront, avoiding potential issues with lazy evaluation and database access patterns in the threaded context.


186-186: LGTM! Connection cleanup on interrupt prevents leaks.

Calling connections.close_all() after shutting down the executor ensures all thread-local database connections are properly closed when the user interrupts the process, preventing connection leaks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants