Description
The NBD device pool in the e2b orchestrator fails with a "no free slots" error even when multiple NBD devices are available and confirmed to be free (size=0).
Environment
OS: Ubuntu 24.04
Kernel: 6.8.0-87-generic
NBD Devices: 16 available (/dev/nbd0 to /dev/nbd15)
NBD Module: Loaded with max_dev=16
e2b Version: 2025.46
Steps to Reproduce
- Load NBD module: sudo modprobe nbd max_dev=16
- Verify device creation: ls /dev/nbd* (shows 16 devices)
- Confirm devices are free: All show size=0 in /sys/block/nbd*/size
- Run e2b orchestrator: ENVIRONMENT=local ./orchestrator
- Observe "no free slots" error in logs
Expected Behavior
The NBD device pool should successfully detect and utilize all available free NBD devices.
Actual Behavior
The device pool fails with a "no free slots" error even though:
- 16 NBD devices are available
- All devices show size=0 (confirmed free)
- Device files have proper permissions (666)
Error Logs
2025-11-20T09:05:00.576+0800 ERROR [nbd pool]: failed to create network {"service": "orchestrator_template-manager", "internal": true, "pid": 19940, "error": "no free slots", "failed_count": 0}
github.com/e2b-dev/infra/packages/orchestrator/internal/sandbox/nbd.(*DevicePool).Populate
/gopath/src/github.com/e2b-dev/infra/packages/orchestrator/internal/sandbox/nbd/pool.go:132
main.run.func10
/gopath/src/github.com/e2b-dev/infra/packages/orchestrator/main.go:329
main.run.run.func7.func20
/gopath/src/github.com/e2b-dev/infra/packages/orchestrator/main.go:211
golang.org/x/sync/errgroup.(*Group).Go.func1
/gopath/pkg/mod/golang.org/x/sync@v0.17.0/errgroup/errgroup.go:93
Root Cause Analysis
After adding detailed debugging, the issue appears to be in the Populate() method logic:
- Device discovery works: Successfully finds all 16 NBD devices (slots 0-15)
- Device status correct: All devices show size=0 (confirmed free)
- Bitset management works: Bitset correctly tracks used slots
- Logic error: Populate() runs in an unbounded loop that does not exit once all slots are occupied, so it keeps requesting slots and eventually fails with "no free slots"
The specific issue is in /packages/orchestrator/internal/sandbox/nbd/pool.go in the Populate() method, which doesn't have a proper termination condition.
Proposed Solutions
// Current: Infinite loop causing failure when slots exhausted
for {
	device, err := d.getFreeDeviceSlot()
	// ...
}

// Suggested: Finite loop based on channel capacity
maxSlots := cap(d.slots)
for slotCount := 0; slotCount < maxSlots; slotCount++ {
	device, err := d.getFreeDeviceSlot()
	// ...
}