[feature] pytorch DDP with C++ (utilizing CUDA / NCCL)  #122

@feevos

Description

Is your feature request related to a problem? Please describe.
In my experience, the same training code runs faster in pure C++ than in Python. My workflows rely on distributed compute for training, so a C++ solution for this would be awesome.

Describe the solution you'd like
Something similar to torch.distributed, but for C++.
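
To make the request a bit more concrete, here is a rough sketch of the core step I would expect such a library to handle: averaging gradients across ranks, which is what an all-reduce in a data-parallel setup amounts to. This is only an illustration in plain C++. The function name `all_reduce_mean` is hypothetical and is not an existing libtorch or NCCL call, the ranks are simulated as vectors inside one process, and all buffers are assumed to have the same length.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration only (not a libtorch or NCCL API).
// Each inner vector stands in for the gradient buffer held by one rank.
// After the call, every rank holds the element-wise mean of all buffers,
// which is the result an all-reduce(sum) followed by a division by the
// world size would produce.
// Assumption: all buffers have the same length.
void all_reduce_mean(std::vector<std::vector<float>>& grads_per_rank) {
    if (grads_per_rank.empty()) return;
    const std::size_t world_size = grads_per_rank.size();
    const std::size_t n = grads_per_rank[0].size();

    // Reduce: sum the gradients from every simulated rank.
    std::vector<float> sum(n, 0.0f);
    for (const auto& g : grads_per_rank) {
        for (std::size_t i = 0; i < n; ++i) sum[i] += g[i];
    }

    // Broadcast: write the averaged values back to every rank.
    for (auto& g : grads_per_rank) {
        for (std::size_t i = 0; i < n; ++i) {
            g[i] = sum[i] / static_cast<float>(world_size);
        }
    }
}
```

With two simulated ranks holding `{1, 2}` and `{3, 6}`, both end up with `{2, 4}`. A real implementation would presumably do this on GPU buffers through NCCL rather than in host memory.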

Labels: enhancement (New feature or request)
