Storm Rebalance Broken #226

@JessicaLHartog

After #200 and #213 were merged, rebalancing a topology no longer does anything. This is because there are no offers from which slots can be made when a rebalance happens, unless other topologies happen to need assignments at the same time.

This is a result of the way that Nimbus handles the TopologiesMissingAssignments component. A quick rundown of what now happens:

  • storm-mesos schedules topologies until no topologies need assignments
      • since no topologies need assignments, offers are suppressed
  • storm-mesos doesn't do anything in MesosNimbus because no topologies need assignments (and offers are already suppressed)
  • a rebalance command comes in and is registered by Nimbus, and a :do-rebalance event is scheduled some number of seconds in the future
  • that number of seconds later there is finally a topology that needs assignment (i.e. the one that was just rebalanced), but there are no offers buffered
  • since there are no offers buffered and there are topologies needing assignments, offers are revived
  • allSlotsAvailableForScheduling returns after reviving offers
  • Nimbus wants slots for the rebalancing topology immediately, and there's no time for offers to come in and be used in the next allSlotsAvailableForScheduling call
  • since there are no slots available for the workers to be rescheduled onto, they don't get rescheduled and rebalance therefore does nothing
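
To make the timing problem easier to see, here is a toy Python model of the sequence above. It is not the storm-mesos code: the class `ToyMesosNimbus`, its fields, and `mesos_delivers_offers` are hypothetical names made up for illustration, and the only behavior modeled is "suppress when nothing needs assignment, revive and return immediately when something does but no offers are buffered".

```python
class ToyMesosNimbus:
    """Toy model of the offer handling described above (hypothetical, not real storm-mesos code)."""

    def __init__(self, offers_mesos_could_send):
        self.offers_mesos_could_send = list(offers_mesos_could_send)
        self.buffered_offers = []       # offers already received and usable for slots
        self.offers_suppressed = False  # True once we told Mesos to stop sending offers

    def mesos_delivers_offers(self):
        """Simulate asynchronous offer delivery; nothing arrives while suppressed."""
        if not self.offers_suppressed:
            self.buffered_offers.extend(self.offers_mesos_could_send)
            self.offers_mesos_could_send = []

    def all_slots_available_for_scheduling(self, topologies_missing_assignments):
        """Return the slots that can be built from the offers buffered right now."""
        if not topologies_missing_assignments:
            # Nothing needs scheduling, so offers are suppressed.
            self.offers_suppressed = True
            return []
        if not self.buffered_offers:
            # Something needs scheduling but nothing is buffered:
            # revive offers, then return immediately with no slots.
            self.offers_suppressed = False
            return []
        slots = [f"slot-from-{offer}" for offer in self.buffered_offers]
        self.buffered_offers = []
        return slots


nimbus = ToyMesosNimbus(offers_mesos_could_send=["offer-1"])

# Steady state: no topology needs assignment, so offers get suppressed.
slots_idle = nimbus.all_slots_available_for_scheduling([])
nimbus.mesos_delivers_offers()  # suppressed, so nothing is buffered

# The :do-rebalance event fires: one topology now needs assignment.
slots_at_rebalance = nimbus.all_slots_available_for_scheduling(["rebalanced-topology"])

# Offers only arrive after the call above has already returned.
nimbus.mesos_delivers_offers()

# A later call would see slots, but only if something still needs assignment then.
slots_later = nimbus.all_slots_available_for_scheduling(["some-topology"])

print(slots_idle, slots_at_rebalance, slots_later)
```

In this model the call made at rebalance time comes back empty even though Mesos has an offer it could send. The offer only turns into a slot on a later call, which matches the observation that rebalance works when other topologies need assignments around the same time.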

Notably, if there are other topologies needing assignments at the same time as the :do-rebalance is executed, then the rebalance should work as expected.

Note that this refers only to the Storm UI "Rebalance" action and its associated command. I have not tested this with the type of rebalance mentioned in the Storm documentation:

## Reconfigure the topology "mytopology" to use 5 worker processes,
## the spout "blue-spout" to use 3 executors and
## the bolt "yellow-bolt" to use 10 executors.

$ storm rebalance mytopology -n 5 -e blue-spout=3 -e yellow-bolt=10

However, I fully expect that it hits the same logic in Nimbus, so the same behavior (or something similar) happens that way too.
