Return random peers in announce requests instead of the first 74 peers #1604

@josecelano

We always return the same first 74 peers because peers_excluding uses take to cut the peer list at the limit.

torrust_tracker_swarm_coordination_registry::swarm::coordinator::Coordinator

#[must_use]
pub fn peers_excluding(&self, peer_addr: &SocketAddr, limit: Option<usize>) -> Vec<Arc<peer::Peer>> {
    match limit {
        Some(limit) => self
            .peers
            .values()
            // Take peers which are not the client peer
            .filter(|peer| peer::ReadInfo::get_address(peer.as_ref()) != *peer_addr)
            // Limit the number of peers on the result
            .take(limit)
            .cloned()
            .collect(),
        None => self
            .peers
            .values()
            // Take peers which are not the client peer
            .filter(|peer| peer::ReadInfo::get_address(peer.as_ref()) != *peer_addr)
            .cloned()
            .collect(),
    }
}

For torrents with few peers this is not a problem, since we return all of them anyway; for popular torrents, however, it causes several problems.

cc @da2ce7


Model: GPT‑4.1

Problems with Always Returning the First 74 Peers

When the tracker always returns the first 74 peers (sorted by socket address), several issues arise:

  1. Peer Exposure is Biased

    • The same peers (those with the smallest socket addresses) are always returned.
    • Other peers are rarely or never exposed to new connections.
  2. Swarm Connectivity is Suboptimal

    • Peers in the swarm do not discover each other efficiently.
    • Slower peer discovery can lead to longer download times and fragmented swarms.
  3. Hotspot Problem

    • The same few peers are likely to receive a disproportionate number of incoming connections.
    • This can overload certain peers, especially those on slower connections.
  4. New or Late-Joining Peers Are Ignored

    • If a peer joins late and its socket address sorts after those of the first 74 peers, it may never be returned to other clients.
    • This peer will have a harder time connecting to the swarm.
  5. Lack of Randomness Reduces Robustness

    • Predictable peer selection makes the swarm less resilient to network churn or failures.
    • Public trackers typically use randomness to ensure fair exposure and redundancy.
  6. Stale Peer Risk

    • If older peers in the first 74 disconnect or become unreachable, they may still be returned until explicitly pruned.
    • This leads to more failed connection attempts for clients.

Summary:
Always returning the first 74 peers creates an uneven, predictable, and potentially fragile swarm.
A better approach is to return peers randomly, possibly with a bias toward recently active peers to improve reliability.


Alternatives for Peer Selection in Tracker Announce Responses

When responding to an announce request, there are several strategies for selecting which peers to return.
Below are the main alternatives, with their pros and cons:

1. Deterministic Ordering (Current Approach)

  • Description: Always return the first N peers (e.g., 74) sorted by socket address.
  • Pros:
    • Simple to implement.
    • Cache-friendly and predictable.
  • Cons:
    • Peer exposure is biased: the same peers are always returned.
    • New peers may never be exposed to the swarm.
    • Leads to hotspots and suboptimal swarm connectivity.

2. Random Selection

  • Description: Return N peers chosen uniformly at random from the pool of active peers.
  • Pros:
    • Fair peer exposure.
    • Improves swarm connectivity and robustness.
    • Avoids hotspots.
  • Cons:
    • Requires random sampling on every request.
    • Harder to reproduce/debug exact responses.
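
As a rough illustration of option 2, the sketch below picks peers with a partial Fisher-Yates shuffle. It is not the tracker's code: the Peer struct, the BTreeMap layout, and the function name random_peers_excluding are assumptions made for the example, and the small SplitMix64 generator only stands in for a real RNG (a production version would more likely use the rand crate).

```rust
use std::collections::BTreeMap;
use std::net::SocketAddr;
use std::sync::Arc;

/// Hypothetical stand-in for the tracker's peer type.
#[derive(Debug)]
struct Peer {
    addr: SocketAddr,
}

/// Tiny SplitMix64 generator, used only to keep the sketch dependency-free.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Returns a value in `0..bound` (slightly biased, acceptable for a sketch).
    fn below(&mut self, bound: usize) -> usize {
        (self.next_u64() % bound as u64) as usize
    }
}

/// Returns up to `limit` peers chosen at random, never including `client`.
fn random_peers_excluding(
    peers: &BTreeMap<SocketAddr, Arc<Peer>>,
    client: &SocketAddr,
    limit: usize,
    rng: &mut SplitMix64,
) -> Vec<Arc<Peer>> {
    let mut candidates: Vec<Arc<Peer>> = peers
        .values()
        .filter(|peer| peer.addr != *client)
        .cloned()
        .collect();

    // Partial Fisher-Yates shuffle: only the first `n` slots get randomized.
    let n = limit.min(candidates.len());
    for i in 0..n {
        let j = i + rng.below(candidates.len() - i);
        candidates.swap(i, j);
    }
    candidates.truncate(n);
    candidates
}
```

Note that this version still clones one Arc per peer in the swarm on every request, which is the sampling cost listed under the cons above; a real implementation might sample indices instead to avoid that.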

3. Random Subset with Partial Rotation

  • Description: Return a mostly random subset, but keep part of the list stable for a short period.
  • Pros:
    • Balances randomness with some stability.
    • Helps with NAT traversal because some peers stay visible for longer.
  • Cons:
    • More complex to implement than pure random selection.
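
One possible way to get the "stable for a short period" behaviour of option 3 is to seed part of the selection from the current time window, so every request inside that window sees the same sticky peers. The sketch below assumes this approach; the function names, the window_secs parameter, and the SplitMix64 stand-in generator are all hypothetical and not taken from the tracker.

```rust
use std::net::SocketAddr;

/// Tiny SplitMix64 generator, used only to keep the sketch dependency-free.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    fn below(&mut self, bound: usize) -> usize {
        (self.next_u64() % bound as u64) as usize
    }
}

/// Randomizes up to `n` slots of `items`, starting at index `start`.
fn shuffle_prefix<T>(items: &mut [T], start: usize, n: usize, rng: &mut SplitMix64) {
    let end = (start + n).min(items.len());
    for i in start..end {
        let j = i + rng.below(items.len() - i);
        items.swap(i, j);
    }
}

/// Returns up to `limit` peers. The first `sticky` entries depend only on the
/// time window, the rest are re-drawn on every request.
/// `window_secs` must be non-zero.
fn select_with_sticky_part(
    mut candidates: Vec<SocketAddr>,
    limit: usize,
    sticky: usize,
    now_secs: u64,
    window_secs: u64,
    request_rng: &mut SplitMix64,
) -> Vec<SocketAddr> {
    let limit = limit.min(candidates.len());
    let sticky = sticky.min(limit);

    // Same seed for every request that falls inside one time window.
    let mut window_rng = SplitMix64(now_secs / window_secs);
    shuffle_prefix(&mut candidates, 0, sticky, &mut window_rng);

    // The remaining slots change from request to request.
    shuffle_prefix(&mut candidates, sticky, limit - sticky, request_rng);

    candidates.truncate(limit);
    candidates
}
```

The sticky part is only stable while the candidate list itself is unchanged; when peers join or leave, the same seed can map to different peers, which is one source of the extra complexity mentioned above.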

4. Recent Peers First

  • Description: Prioritize peers that announced recently, optionally with a cutoff for stale peers.
  • Pros:
    • Likely to return reachable and active peers.
    • Reduces the number of dead or inactive peers sent to clients.
  • Cons:
    • Can create hotspots if the swarm is small.
    • Older but still active peers may get ignored.
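
Option 4 could look roughly like the following sketch, which drops peers older than a cutoff and then sorts by last announce time. The Peer struct with its updated field and the function name most_recent_peers are assumptions for illustration only.

```rust
use std::net::SocketAddr;
use std::time::Duration;

/// Hypothetical stand-in for the tracker's peer type.
#[derive(Debug, Clone)]
struct Peer {
    addr: SocketAddr,
    /// Time of the last announce, as a duration since the Unix epoch.
    updated: Duration,
}

/// Returns up to `limit` peers, newest announce first, skipping the client
/// and any peer whose last announce is older than `max_age`.
fn most_recent_peers(
    peers: &[Peer],
    client: &SocketAddr,
    now: Duration,
    max_age: Duration,
    limit: usize,
) -> Vec<Peer> {
    let mut fresh: Vec<Peer> = peers
        .iter()
        .filter(|peer| peer.addr != *client)
        .filter(|peer| now.saturating_sub(peer.updated) <= max_age)
        .cloned()
        .collect();

    // Newest announce first.
    fresh.sort_by(|a, b| b.updated.cmp(&a.updated));
    fresh.truncate(limit);
    fresh
}
```

Because the result is fully determined by announce times, the most recent announcers are returned to everyone until newer ones arrive, which matches the hotspot concern listed in the cons.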

5. Round-Robin / Rotating Window

  • Description: Iterate over the peer list in a circular manner, returning the next N peers each time.
  • Pros:
    • Guarantees all peers are eventually exposed.
    • Avoids permanent hotspots.
  • Cons:
    • Still somewhat predictable.
    • May return stale peers if rotation is slow.
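
A minimal sketch of the rotating window in option 5, assuming a per-swarm cursor that remembers where the previous response stopped. RotatingWindow and next_window are hypothetical names; excluding the requesting client is left out to keep the example short.

```rust
/// Remembers where the previous response stopped in the peer list.
struct RotatingWindow {
    cursor: usize,
}

impl RotatingWindow {
    /// Returns the next `limit` peers, wrapping around at the end of the list.
    fn next_window<T: Clone>(&mut self, peers: &[T], limit: usize) -> Vec<T> {
        if peers.is_empty() {
            return Vec::new();
        }
        let n = limit.min(peers.len());
        // The list may have shrunk since the last call, so clamp the cursor.
        let start = self.cursor % peers.len();
        let window: Vec<T> = peers.iter().cycle().skip(start).take(n).cloned().collect();
        self.cursor = (start + n) % peers.len();
        window
    }
}
```

With five peers and a limit of two, consecutive calls return peers 1-2, 3-4, 5-1, 2-3, and so on. In a concurrent tracker the cursor would need shared mutable state (for example an atomic counter), which this sketch does not cover.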

6. Geographically or Network-Localized Selection

  • Description: Prefer peers in the same ASN, subnet, or geographic region as the requesting client.
  • Pros:
    • Can improve performance by reducing cross-network traffic.
  • Cons:
    • Requires IP-to-ASN or geo mapping.
    • Less privacy-friendly and uncommon in public trackers.

7. Hybrid Strategies (Recommended)

  • Description: Combine two or more of the above strategies, such as:
    • 50% random recent peers + 50% random older peers
    • Random peers but rotate part of the list per request
  • Pros:
    • Fair peer distribution with bias toward active peers.
    • Improves swarm health and reduces connection failures.
  • Cons:
    • Slightly more complex logic and bookkeeping required.

Recommended Solution

To improve swarm health, peer discovery, and fairness, we should replace the current
deterministic first-74-peers approach with a hybrid random peer selection policy.

Proposed Algorithm

  1. Maintain last announce timestamps for all peers (ALREADY IMPLEMENTED).
  2. Filter out stale peers, e.g., those inactive for more than 30 minutes (ALREADY IMPLEMENTED).
  3. Split active peers into two groups:
    • Recent peers: Announced in the last 5 minutes.
    • Other active peers: Announced in the last 30 minutes but not the last 5.
  4. Select peers:
    • Take 50% from recent peers (randomly).
    • Take the remaining 50% from other active peers (randomly).
  5. Return the combined list (max 74 peers) in the announce response.

This approach ensures:

  • Fresh, reachable peers are prioritized.
  • Peer selection is random, reducing hotspots and bias.
  • All active peers have a chance of being returned to clients.
  • The swarm remains more robust and well-connected.
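
The proposed algorithm can be sketched as follows. This is an illustration under assumptions, not the tracker's implementation: the Peer struct, the function names, and the SplitMix64 stand-in generator are hypothetical, and the backfill rule (when one group is too small, the other group fills the remaining slots) is an addition the steps above do not specify.

```rust
use std::net::SocketAddr;
use std::time::Duration;

/// Hypothetical stand-in for the tracker's peer type.
#[derive(Debug, Clone)]
struct Peer {
    addr: SocketAddr,
    /// Time of the last announce, as a duration since the Unix epoch.
    updated: Duration,
}

/// Tiny SplitMix64 generator, used only to keep the sketch dependency-free.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    fn below(&mut self, bound: usize) -> usize {
        (self.next_u64() % bound as u64) as usize
    }
}

/// Picks `n` random peers from `pool`, or all of them if it holds fewer.
fn sample(mut pool: Vec<Peer>, n: usize, rng: &mut SplitMix64) -> Vec<Peer> {
    let n = n.min(pool.len());
    for i in 0..n {
        let j = i + rng.below(pool.len() - i);
        pool.swap(i, j);
    }
    pool.truncate(n);
    pool
}

fn hybrid_selection(
    peers: &[Peer],
    client: &SocketAddr,
    now: Duration,
    limit: usize,
    rng: &mut SplitMix64,
) -> Vec<Peer> {
    const RECENT: Duration = Duration::from_secs(5 * 60);
    const ACTIVE: Duration = Duration::from_secs(30 * 60);

    // Steps 2 and 3: drop stale peers and the client, then split by age.
    let (recent, older): (Vec<Peer>, Vec<Peer>) = peers
        .iter()
        .filter(|p| p.addr != *client && now.saturating_sub(p.updated) <= ACTIVE)
        .cloned()
        .partition(|p| now.saturating_sub(p.updated) <= RECENT);

    // Step 4: aim for half from each group; a short group is backfilled
    // from the other one (assumption, see the note above).
    let half = limit / 2;
    let from_recent = half
        .max(limit.saturating_sub(older.len()))
        .min(recent.len());
    let from_older = (limit - from_recent).min(older.len());

    // Step 5: combine both samples into one response list.
    let mut selected = sample(recent, from_recent, rng);
    selected.extend(sample(older, from_older, rng));
    selected
}
```

For a swarm with at least 37 peers in each group and a limit of 74, this returns 37 recent and 37 older peers; for smaller swarms it returns every active peer except the client.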

Optional Enhancements

  • Round-robin rotation: To guarantee all peers eventually get returned, even in huge swarms.
  • Seeding/Leeching bias: Return more seeds to leechers and more leechers to seeds.
  • Sticky peers: Keep some peers consistent across consecutive responses to help NAT traversal.

Summary:
Switching to a hybrid peer selection strategy (random, with a bias toward recent peers) will make Torrust Tracker’s
responses fairer, more efficient, and more reliable, improving the health of all swarms.
