perf: Optimize RoutesForAgencyID lookup to O(1) and fix concurrency issues#372
Conversation
aaronbrethorst
left a comment
There was a problem hiding this comment.
Hey Aditya, nice work on this performance optimization. The map-based lookup for RoutesForAgencyID is a clean and effective improvement, and I appreciate that you also caught and fixed the data race in AdvancedDirectionCalculator along the way. The test coverage — especially the concurrent access test — shows good attention to thread safety. Before we can merge this, I will need you to make a couple changes:
Important
- Squash into a single commit. The branch has 3 commits (
fa8412f,f9d0652,0beb643) that should be squashed into one clean commit. If you're not sure how to do this, check out this blog post I wrote: https://www.brethorsting.com/blog/2026/01/git-rebase-for-the-terrified/
Fit and Finish
- Changelog-style comments on the mutex additions (
advanced_direction_calculator.go:56,72). The comments// Mutex protection added for race safetydescribe why you changed the code, not what the code does. Future readers won't have the PR context. Consider changing to something like// Protect cache assignment from concurrent readersor simply removing them — thecacheMutex.Lock()call is self-documenting.
Thanks again, and I look forward to merging this change.
5e4d07a to
7438533
Compare
aaronbrethorst
left a comment
There was a problem hiding this comment.
Hey Aditya, the core optimization and the data race fix both look solid — the buildRouteIndex approach is clean, both setStaticGTFS and ForceUpdate rebuild the index correctly, and the concurrent access test is a nice addition. The squash and rebase is done too. One item from the first round still needs a quick pass, though.
Previous Feedback Status
| # | Item | Status |
|---|---|---|
| 1 | Squash into a single commit | Addressed |
| 2 | Changelog-style comments on mutex additions | Not addressed |
Important
-
Changelog-style comments are still present — and multiplied (
advanced_direction_calculator.go). The first round asked you to rephrase or remove// Mutex protection added for race safetybecause it describes why you changed the code, not what the code does. That exact comment is still on lines 56 and 72 in the diff. Two new instances of the same pattern were also added:// Mutex protection: Use RLock (Read Lock) for checking the cache(incomputeFromShapes)// Mutex protection: Use RLock for checking the cache(incalculateOrientationAtStop)
All four should either be removed (the
Lock()/RLock()calls are self-documenting) or reworded to describe what the code does. For example:// Protect cache from concurrent readers during assignment adc.cacheMutex.Lock()
Or simply delete all four comments —
cacheMutex.Lock()andcacheMutex.RLock()speak for themselves.
Fit and Finish
-
Self-explanatory comment in
RoutesForAgencyID(gtfs_manager.go):// Return empty slice if no routes found for this agencyrestates thereturn []*gtfs.Route{}line. Safe to remove. -
Comments in
buildRouteIndex(static.go):// Pre-allocate based on agency count to avoid resizingand// Use pointer to avoid duplicationare accurate but slightly over-explain idiomatic Go patterns. Consider removing for brevity.
Strengths
- The
buildRouteIndexfunction follows the establishedbuildLookupMapspattern perfectly — same style, same location, easy to maintain. - Both data paths (
setStaticGTFSfor initial load andForceUpdatefor hot-swap) correctly rebuild the index. - The logging addition (
slog.Int("route_index_agencies", ...)) is a nice operational touch. TestRoutesForAgencyID_ConcurrentAccessproperly exercises reader/writer contention withsetStaticGTFS.- Adding
defer manager.Shutdown()across existing tests is good housekeeping. - The benchmark is a welcome addition for tracking regression.
Recommended Action
- Remove or reword the four changelog-style comments (item 1)
- Consider the fit-and-finish items (items 2–3)
- Squash again after changes
54dbd08 to
79a19d3
Compare
aaronbrethorst
left a comment
There was a problem hiding this comment.
Hey Aditya, thanks for taking the time to build this performance optimization for RoutesForAgencyID. The approach of pre-computing a routesByAgencyID index at load time is clean, and I especially appreciate that you integrated it into both setStaticGTFS and ForceUpdate so the index stays consistent across hot-swaps. The test hygiene improvements — adding defer manager.Shutdown(), proper RLock()/RUnlock(), and cleaning up stale comments — are a nice bonus. Before we can merge this, I will need you to make a couple changes:
Important
-
Benchmark uses a hardcoded relative path instead of a computed absolute path. In
BenchmarkRoutesForAgencyID_MapLookup(internal/gtfs/gtfs_manager_test.go:536), you use"../../testdata/raba.zip"while every other test in the file usesmodels.GetFixturePath(t, "raba.zip"). The hardcoded path is fragile — it breaks if the test is run from a different working directory. SinceGetFixturePathcurrently takes*testing.Tand benchmarks use*testing.B, you'll need to updateGetFixturePathininternal/models/test_helpers.goto accepttesting.TB(which is satisfied by both*testing.Tand*testing.B). Then you can usemodels.GetFixturePath(b, "raba.zip")in the benchmark. -
Concurrent access test uses
manager.staticMutexdirectly instead of the public API. InTestRoutesForAgencyID_ConcurrentAccess(internal/gtfs/gtfs_manager_test.go:491-492), the reader goroutines callmanager.staticMutex.RLock()/manager.staticMutex.RUnlock()directly. This is inconsistent with the rest of the test file — for example,TestRoutesForAgencyID_MapOptimizationusesmanager.RLock()/manager.RUnlock(). Please switch to the public API for consistency.
Fit and Finish
- Unnecessary locking in
TestRoutesForAgencyID_MapOptimization. The test acquires and releasesRLockfour separate times in sequence, but this is a single-threaded test with no concurrent writers. You could consolidate these into fewer lock regions. In particular, block 3 (lines 451-456) acquiresRLockjust to iterate overpublicRoutes, which is a local variable — it doesn't access shared manager state, so no lock is needed there at all.
Thanks again, and I look forward to merging this change.
- Add routesByAgencyID map for O(1) route lookups - Fix data race condition in AdvancedDirectionCalculator - Add concurrent access and hot-swap map rebuild tests - Add performance benchmarks
79a19d3 to
6522ec7
Compare
aaronbrethorst
left a comment
There was a problem hiding this comment.
Hey Aditya, every piece of feedback from the previous three rounds has been addressed — the changelog-style comments are gone, the benchmark uses GetFixturePath with the testing.TB interface, the concurrent test uses the public manager.RLock() API, and the test locking is properly consolidated. Clean work across the board.
There's nothing left to change—this PR is ready to merge. The routesByAgencyID index is a well-executed optimization that follows the existing buildLookupMaps pattern perfectly, and the AdvancedDirectionCalculator race fix adds real safety with a minimal footprint.

Description
This PR optimizes the
RoutesForAgencyIDmethod in the GTFS Manager. Previously, this method performed a linear O(N) scan over all routes in the system to filter by agency ID. This created a performance bottleneck for large regional feeds and high-traffic endpoints.Additionally, this PR fixes a data race condition discovered in the
AdvancedDirectionCalculatorduring testing.closes #371
Changes
Performance Optimization
internal/gtfs/gtfs_manager.go:Added
routesByAgencyIDmap to theManagerstruct for O(1) lookups.Refactored
RoutesForAgencyIDto look up routes directly from the map instead of iterating.internal/gtfs/static.go:Implemented
buildRouteIndexhelper to pre-compute the route map.Updated
setStaticGTFS(initialization) andForceUpdate(hot-swap) to ensure the index is rebuilt whenever GTFS data changes.Bug Fixes
internal/gtfs/advanced_direction_calculator.go:sync.RWMutexto protect the internalcontextCache.SetContextCache(write) andcomputeFromShapes(read) could occur simultaneously.Testing
internal/gtfs/gtfs_manager_test.go:TestRoutesForAgencyID_MapOptimizationto verify correct map initialization.TestRoutesForAgencyID_ConcurrentAccessto ensure thread safety.Checklist
ForceUpdate) maintain data integrity.