Get device index from local rank if multi-client, otherwise use the current device. #6405

Merged
oneflow-ci-bot merged 4 commits into master from dev_fix_random_generator
Sep 26, 2021

@hjchen2 hjchen2 commented Sep 26, 2021

No description provided.

@hjchen2 hjchen2 changed the title from "Fix random generator" to "Get device index from local rank if multi-client, otherwise use the current device." Sep 26, 2021

#ifdef WITH_CUDA

int GetCudaDeviceIndex() {

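For now, the case where no device index is specified is handled as follows:

  • multi-client: use the local rank as the default device index.
  • single-client: use the current device as the device index. This assumes that under single-client a Generator is only ever created in the create kernel state interface, by which point the current thread's device has already been set via cudaSetDevice.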
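
The fallback described above can be sketched roughly as follows. This is not the code from the PR: `DeviceEnv`, `DefaultDeviceIndex`, and the `current_device` callback are hypothetical names introduced for illustration, and the callback stands in for a query such as `cudaGetDevice` so that the sketch compiles without a CUDA toolchain.

```cpp
#include <cassert>
#include <functional>

// Hypothetical description of the runtime state the fallback depends on.
// None of these names come from the OneFlow source.
struct DeviceEnv {
  bool is_multi_client;                  // true when running in multi-client mode
  int local_rank;                        // rank of this process on the local machine
  std::function<int()> current_device;   // stand-in for a cudaGetDevice-style query
};

// Picks the default device index when the caller did not specify one.
int DefaultDeviceIndex(const DeviceEnv& env) {
  if (env.is_multi_client) {
    // Multi-client: the local rank is used as the default device index.
    assert(env.local_rank >= 0);
    return env.local_rank;
  }
  // Single-client: rely on the device already selected for the current thread
  // (assumed to have been set earlier, e.g. via cudaSetDevice).
  return env.current_device();
}
```

In a CUDA build, `current_device` would presumably wrap `cudaGetDevice` and check its return status. The single-client branch is only correct under the assumption stated in the comment above, namely that the thread's device has been set before the Generator is created.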

@hjchen2 hjchen2 requested review from oneflow-ci-bot and removed request for oneflow-ci-bot September 26, 2021 13:52
@oneflow-ci-bot oneflow-ci-bot requested review from oneflow-ci-bot and removed request for oneflow-ci-bot September 26, 2021 14:07
@github-actions

Speed stats:
GPU Name: GeForce GTX 1080

OneFlow resnet50 time: 136.1ms (= 6806.9ms / 50, input_shape=[16, 3, 224, 224])
PyTorch resnet50 time: 138.8ms (= 6942.5ms / 50, input_shape=[16, 3, 224, 224])
❌ Relative speed: 1.02 (= 138.8ms / 136.1ms)

OneFlow resnet50 time: 78.0ms (= 3901.2ms / 50, input_shape=[8, 3, 224, 224])
PyTorch resnet50 time: 83.3ms (= 4167.2ms / 50, input_shape=[8, 3, 224, 224])
✔️ Relative speed: 1.07 (= 83.3ms / 78.0ms)

OneFlow resnet50 time: 51.0ms (= 2548.1ms / 50, input_shape=[4, 3, 224, 224])
PyTorch resnet50 time: 57.5ms (= 2877.4ms / 50, input_shape=[4, 3, 224, 224])
✔️ Relative speed: 1.13 (= 57.5ms / 51.0ms)

OneFlow resnet50 time: 52.7ms (= 2635.1ms / 50, input_shape=[2, 3, 224, 224])
PyTorch resnet50 time: 44.8ms (= 2238.0ms / 50, input_shape=[2, 3, 224, 224])
❌ Relative speed: 0.85 (= 44.8ms / 52.7ms)

OneFlow resnet50 time: 43.9ms (= 2197.3ms / 50, input_shape=[1, 3, 224, 224])
PyTorch resnet50 time: 42.3ms (= 2114.3ms / 50, input_shape=[1, 3, 224, 224])
✔️ Relative speed: 0.96 (= 42.3ms / 43.9ms)

OneFlow resnet50 time: 156.6ms (= 7829.5ms / 50, input_shape=[16, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 159.4ms (= 7969.2ms / 50, input_shape=[16, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.02 (= 159.4ms / 156.6ms)

OneFlow resnet50 time: 103.6ms (= 5180.0ms / 50, input_shape=[8, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 112.9ms (= 5644.7ms / 50, input_shape=[8, 3, 224, 224], ddp, world size=2)
✔️ Relative speed: 1.09 (= 112.9ms / 103.6ms)

OneFlow resnet50 time: 78.6ms (= 3930.8ms / 50, input_shape=[4, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 70.5ms (= 3524.3ms / 50, input_shape=[4, 3, 224, 224], ddp, world size=2)
❌ Relative speed: 0.90 (= 70.5ms / 78.6ms)

OneFlow resnet50 time: 75.8ms (= 3792.1ms / 50, input_shape=[2, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 62.5ms (= 3125.6ms / 50, input_shape=[2, 3, 224, 224], ddp, world size=2)
❌ Relative speed: 0.82 (= 62.5ms / 75.8ms)

OneFlow resnet50 time: 79.8ms (= 3991.9ms / 50, input_shape=[1, 3, 224, 224], ddp, world size=2)
PyTorch resnet50 time: 57.5ms (= 2873.5ms / 50, input_shape=[1, 3, 224, 224], ddp, world size=2)
❌ Relative speed: 0.72 (= 57.5ms / 79.8ms)

@oneflow-ci-bot oneflow-ci-bot removed their request for review September 26, 2021 14:55
@oneflow-ci-bot oneflow-ci-bot merged commit 89bbc5b into master Sep 26, 2021
@oneflow-ci-bot oneflow-ci-bot deleted the dev_fix_random_generator branch September 26, 2021 14:57
3 participants