
torch.distributed.elastic.multiprocessing.api:failed (exitcode: -9) #17

@1787648106

@youngwanLEE Thanks for your excellent work. When I train the code on two GPUs, I get the error torch.distributed.elastic.multiprocessing.api:failed (exitcode: -9).
When I train on a single GPU with 'tools/dist_train.sh configs/... 1', the error does not appear.
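
A note for anyone debugging this: Python's process APIs report a child that was terminated by a signal as a negative exit code, so -9 corresponds to signal 9, which is SIGKILL on Linux. One common cause of SIGKILL during multi-GPU training is the kernel's out-of-memory killer, but nothing in this report confirms that here. The snippet below is only a minimal sketch for decoding such exit codes; the helper name `describe_exitcode` is hypothetical and is not part of PyTorch or this repository.

```python
import signal


def describe_exitcode(exitcode: int) -> str:
    """Describe a worker exit code (hypothetical helper, not a PyTorch API)."""
    if exitcode < 0:
        # A negative value means the process was terminated by signal -exitcode.
        signum = -exitcode
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # The signal number is not defined on this platform.
            name = "unknown signal"
        return f"killed by signal {signum} ({name})"
    if exitcode == 0:
        return "exited normally"
    return f"exited with error status {exitcode}"


if __name__ == "__main__":
    # The exit code reported in this issue.
    print(describe_exitcode(-9))
```

If the signal does turn out to be SIGKILL, checking the kernel log (for example with `dmesg`) and host memory usage while both workers start up is a reasonable first step.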
