GPUs have processes assigned, but training takes as long as on CPU

Using PyTorch v1.0.1, I was initially getting this error:

RuntimeError: binary_op(): expected both inputs to be on same device, but input a is on cuda:1 and input b is on cuda:0

After applying the `register_buffer` fix described here (https://discuss.pytorch.org/t/tensors-are-on-different-gpus/1450/28) in the custom_layers.py file, I was able to get the program to run. GPU memory is being used, but each iteration takes just as long as it does on CPU only.
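For context, the kind of change that fix involves looks roughly like the sketch below. This is a minimal illustration, not the actual code in custom_layers.py: the `ScaledLinear` layer and its `scale` buffer are hypothetical names, and the assumption is that the original error came from a tensor stored as a plain attribute, which stays on one device when the module is moved or replicated.

```python
import torch
import torch.nn as nn


class ScaledLinear(nn.Module):
    """Hypothetical layer: a linear map followed by a fixed, non-trainable scale."""

    def __init__(self, in_features, out_features, scale=0.5):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        # A buffer is part of the module's state, so .to(device), .cuda(),
        # and nn.DataParallel replication move it together with the parameters.
        # A plain attribute (self.scale = torch.tensor(...)) would remain on
        # the device where it was created.
        self.register_buffer("scale", torch.tensor(float(scale)))

    def forward(self, x):
        return self.linear(x) * self.scale


# Falls back to CPU so the sketch also runs on a machine without a GPU.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
layer = ScaledLinear(4, 2).to(device)
out = layer(torch.randn(3, 4, device=device))
```

After `.to(device)`, the buffer and the weights report the same device, which is what removes the "expected both inputs to be on same device" error in this sketch.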

![Screen Shot 2019-04-04 at 9 30 16 AM](https://user-images.githubusercontent.com/10223653/55565029-b4497a00-56be-11e9-8280-225b12da910b.png)


Do you have any idea why this would be?

GPUs have processes assigned, but training takes as long as on CPU #4
