NBD Device Pool Reports "no free slots" Despite Available Devices #1516

@tanzhe-chinamobile-it

Description

The NBD device pool in the e2b orchestrator fails with a "no free slots" error even when multiple NBD devices are available and confirmed to be free (size=0).

Environment

OS: Ubuntu 24.04
Kernel: 6.8.0-87-generic
NBD Devices: 16 available (/dev/nbd0 to /dev/nbd15)
NBD Module: Loaded with max_dev=16
e2b Version: 2025.46

Steps to Reproduce

  1. Load NBD module: sudo modprobe nbd max_dev=16
  2. Verify device creation: ls /dev/nbd* (shows 16 devices)
  3. Confirm devices are free: all show size=0 in /sys/block/nbd*/size
  4. Run e2b orchestrator: ENVIRONMENT=local ./orchestrator
  5. Observe "no free slots" error in logs

Expected Behavior
The NBD device pool should successfully detect and utilize all available free NBD devices.

Actual Behavior
The device pool fails with a "no free slots" error even though:

  • 16 NBD devices are available
  • All devices show size=0 (confirmed free)
  • Device files have proper permissions (666)

Error Logs

2025-11-20T09:05:00.576+0800  ERROR  [nbd pool]: failed to create network  {"service": "orchestrator_template-manager", "internal": true, "pid": 19940, "error": "no free slots", "failed_count": 0}
github.com/e2b-dev/infra/packages/orchestrator/internal/sandbox/nbd.(*DevicePool).Populate
        /gopath/src/github.com/e2b-dev/infra/packages/orchestrator/internal/sandbox/nbd/pool.go:132
main.run.func10
        /gopath/src/github.com/e2b-dev/infra/packages/orchestrator/main.go:329
main.run.run.func7.func20
        /gopath/src/github.com/e2b-dev/infra/packages/orchestrator/main.go:211
golang.org/x/sync/errgroup.(*Group).Go.func1
        /gopath/pkg/mod/golang.org/x/sync@v0.17.0/errgroup/errgroup.go:93

Root Cause Analysis
After adding detailed debugging, the issue appears to be in the Populate() method logic:

  1. Device discovery works: Successfully finds all 16 NBD devices (slots 0-15)
  2. Device status correct: All devices show size=0 (confirmed free)
  3. Bitset management works: Bitset correctly tracks used slots
  4. Logic error: Populate() loops without a termination condition. It does not exit once all slots are occupied, so it keeps searching and eventually fails with "no free slots"

The specific issue is in the Populate() method in /packages/orchestrator/internal/sandbox/nbd/pool.go, which lacks a proper termination condition.

Proposed Solutions

// Current: Infinite loop causing failure when slots exhausted
for {
    device, err := d.getFreeDeviceSlot()
    // ...
}

// Suggested: Finite loop based on channel capacity
maxSlots := cap(d.slots)
for slotCount := 0; slotCount < maxSlots; slotCount++ {
    device, err := d.getFreeDeviceSlot()
    // ...
}
