Test failure: timeout #237

@noisychannel

Possibly related to #216, except that I do have CUDA. After installation, my tests fail with a timeout error.

$ DEVICE=cuda1 make test
Running tests...
Test project /export/a16/gkumar/code/libgpuarray/Build
    Start 1: test_types
1/6 Test #1: test_types .......................   Passed    0.04 sec
    Start 2: test_util
2/6 Test #2: test_util ........................   Passed    0.03 sec
    Start 3: test_array
3/6 Test #3: test_array .......................***Failed    8.14 sec
    Start 4: test_elemwise
4/6 Test #4: test_elemwise ....................***Failed   89.57 sec
    Start 5: test_error
5/6 Test #5: test_error .......................   Passed    1.98 sec
    Start 6: test_buffer
6/6 Test #6: test_buffer ......................   Passed    1.88 sec

67% tests passed, 2 tests failed out of 6

Total Test time (real) = 101.71 sec

The following tests FAILED:
          3 - test_array (Failed)
          4 - test_elemwise (Failed)
Errors while running CTest
make: *** [test] Error 8

Here's the output from an individual run:

$ DEVICE='cuda1' tests/check_elemwise
Running suite(s): elemwise
0%: Checks: 11, Failures: 0, Errors: 11
/export/a16/gkumar/code/libgpuarray/tests/check_elemwise.c:43:E:contig:test_contig_simple:0: (after this point) Test timeout expired
/export/a16/gkumar/code/libgpuarray/tests/check_elemwise.c:105:E:contig:test_contig_f16:0: (after this point) Test timeout expired
/export/a16/gkumar/code/libgpuarray/tests/check_elemwise.c:155:E:contig:test_contig_0:0: (after this point) Test timeout expired
/export/a16/gkumar/code/libgpuarray/tests/check_elemwise.c:206:E:basic:test_basic_simple:0: (after this point) Test timeout expired
/export/a16/gkumar/code/libgpuarray/tests/check_elemwise.c:269:E:basic:test_basic_f16:0: (after this point) Test timeout expired
/export/a16/gkumar/code/libgpuarray/tests/check_elemwise.c:333:E:basic:test_basic_offset:0: (after this point) Test timeout expired
/export/a16/gkumar/code/libgpuarray/tests/check_elemwise.c:392:E:basic:test_basic_remove1:0: (after this point) Test timeout expired
/export/a16/gkumar/code/libgpuarray/tests/check_elemwise.c:458:E:basic:test_basic_broadcast:0: (after this point) Test timeout expired
/export/a16/gkumar/code/libgpuarray/tests/check_elemwise.c:520:E:basic:test_basic_collapse:0: (after this point) Test timeout expired
/export/a16/gkumar/code/libgpuarray/tests/check_elemwise.c:589:E:basic:test_basic_neg_strides:0: (after this point) Test timeout expired
/export/a16/gkumar/code/libgpuarray/tests/check_elemwise.c:643:E:basic:test_basic_0:0: (after this point) Test timeout expired
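
In case it helps with triage, here is a minimal sketch for tallying a log like the one above by test case. It assumes each error line follows the `file:line:kind:test_case:test_name:iteration: message` layout seen in this output. The `summarize` helper and the `SAMPLE` text are hypothetical names for illustration, not part of libgpuarray or Check.

```python
import re
from collections import Counter

# Assumed layout, taken from the log lines in this report:
#   <file>:<line>:<kind>:<test case>:<test name>:<iteration>: <message>
LINE_RE = re.compile(
    r"^(?P<file>.+?):(?P<line>\d+):(?P<kind>[A-Z]):"
    r"(?P<tcase>[^:]+):(?P<test>[^:]+):(?P<iter>\d+): (?P<msg>.*)$"
)


def summarize(log_text):
    """Return (entries per test case, names of tests whose message mentions a timeout)."""
    by_case = Counter()
    timeouts = []
    for raw in log_text.splitlines():
        m = LINE_RE.match(raw.strip())
        if m is None:
            continue  # header/summary lines do not match the assumed layout
        by_case[m.group("tcase")] += 1
        if "timeout" in m.group("msg").lower():
            timeouts.append(m.group("test"))
    return by_case, timeouts


# Hypothetical sample: two lines copied from the run above, plus the headers.
SAMPLE = """Running suite(s): elemwise
0%: Checks: 11, Failures: 0, Errors: 11
/export/a16/gkumar/code/libgpuarray/tests/check_elemwise.c:43:E:contig:test_contig_simple:0: (after this point) Test timeout expired
/export/a16/gkumar/code/libgpuarray/tests/check_elemwise.c:206:E:basic:test_basic_simple:0: (after this point) Test timeout expired
"""

if __name__ == "__main__":
    cases, timed_out = summarize(SAMPLE)
    for name, count in sorted(cases.items()):
        print(f"{name}: {count} error line(s)")
    print("timed out:", ", ".join(timed_out))
```

Applied to the full log above, this would show whether every error is a timeout or whether some tests fail for a different reason.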

Any thoughts on what could be wrong?

Here's the output of nvidia-smi:

$ nvidia-smi
Mon Aug  8 00:39:30 2016
+------------------------------------------------------+
| NVIDIA-SMI 346.46     Driver Version: 346.46         |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K20m          On   | 0000:05:00.0     Off |                    0 |
| N/A   54C    P0   106W / 225W |   3185MiB /  4799MiB |     51%   E. Process |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K20m          On   | 0000:42:00.0     Off |                    0 |
| N/A   47C    P8    17W / 225W |     14MiB /  4799MiB |      0%   E. Process |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0     33698    C   nnet3-chain-train                             3169MiB |
+-----------------------------------------------------------------------------+
