
Conversation

@gabrieldemarmiesse (Contributor)

Summary

@taehoonlee
Follow-up on #11037
Still trying to make those tests look younger and prettier.

Related Issues

PR Overview

  • This PR requires new unit tests [y/n] (make sure tests are included)
  • This PR requires to update the documentation [y/n] (make sure the docs are up-to-date)
  • This PR is backwards compatible [y/n]
  • This PR changes the current API [y/n] (all API changes need to be approved by fchollet)

@gabrieldemarmiesse (Contributor, Author)

An unrelated test failed. It seems flaky, with a very low probability of failing. For the record, and in case we need it later (since it seems strange), here is the stack trace:

_________________________ test_stateful_metrics[list] __________________________
[gw1] linux -- Python 3.6.6 /home/travis/miniconda/envs/test-environment/bin/python
metrics_mode = 'list'
    @keras_test
    @pytest.mark.parametrize('metrics_mode', ['list', 'dict'])
    def test_stateful_metrics(metrics_mode):
        np.random.seed(1334)
    
        class BinaryTruePositives(keras.layers.Layer):
            """Stateful Metric to count the total true positives over all batches.
    
            Assumes predictions and targets of shape `(samples, 1)`.
    
            # Arguments
                name: String, name for the metric.
            """
    
            def __init__(self, name='true_positives', **kwargs):
                super(BinaryTruePositives, self).__init__(name=name, **kwargs)
                self.stateful = True
                self.true_positives = K.variable(value=0, dtype='int32')
    
            def reset_states(self):
                K.set_value(self.true_positives, 0)
    
            def __call__(self, y_true, y_pred):
                """Computes the number of true positives in a batch.
    
                # Arguments
                    y_true: Tensor, batch_wise labels
                    y_pred: Tensor, batch_wise predictions
    
                # Returns
                    The total number of true positives seen this epoch at the
                        completion of the batch.
                """
                y_true = K.cast(y_true, 'int32')
                y_pred = K.cast(K.round(y_pred), 'int32')
                correct_preds = K.cast(K.equal(y_pred, y_true), 'int32')
                true_pos = K.cast(K.sum(correct_preds * y_true), 'int32')
                current_true_pos = self.true_positives * 1
                self.add_update(K.update_add(self.true_positives,
                                             true_pos),
                                inputs=[y_true, y_pred])
                return current_true_pos + true_pos
    
        metric_fn = BinaryTruePositives()
        config = metrics.serialize(metric_fn)
        metric_fn = metrics.deserialize(
            config, custom_objects={'BinaryTruePositives': BinaryTruePositives})
    
        # Test on simple model
        inputs = keras.Input(shape=(2,))
        outputs = keras.layers.Dense(1, activation='sigmoid', name='out')(inputs)
        model = keras.Model(inputs, outputs)
    
        if metrics_mode == 'list':
            model.compile(optimizer='sgd',
                          loss='binary_crossentropy',
                          metrics=['acc', metric_fn])
        elif metrics_mode == 'dict':
            model.compile(optimizer='sgd',
                          loss='binary_crossentropy',
                          metrics={'out': ['acc', metric_fn]})
    
        samples = 1000
        x = np.random.random((samples, 2))
        y = np.random.randint(2, size=(samples, 1))
    
        val_samples = 10
        val_x = np.random.random((val_samples, 2))
        val_y = np.random.randint(2, size=(val_samples, 1))
    
        # Test fit and evaluate
        history = model.fit(x, y, validation_data=(val_x, val_y),
                            epochs=1, batch_size=10)
        outs = model.evaluate(x, y, batch_size=10)
        preds = model.predict(x)
    
        def ref_true_pos(y_true, y_pred):
            return np.sum(np.logical_and(y_pred > 0.5, y_true == 1))
    
        # Test correctness (e.g. updates should have been run)
        np.testing.assert_allclose(outs[2], ref_true_pos(y, preds), atol=1e-5)
    
        # Test correctness of the validation metric computation
        val_preds = model.predict(val_x)
        val_outs = model.evaluate(val_x, val_y, batch_size=10)
        assert_allclose(val_outs[2], ref_true_pos(val_y, val_preds), atol=1e-5)
        assert_allclose(val_outs[2], history.history['val_true_positives'][-1],
                        atol=1e-5)
    
        # Test with generators
        gen = [(np.array([x0]), np.array([y0])) for x0, y0 in zip(x, y)]
        val_gen = [(np.array([x0]), np.array([y0])) for x0, y0 in zip(val_x, val_y)]
        history = model.fit_generator(iter(gen), epochs=1, steps_per_epoch=samples,
                                      validation_data=iter(val_gen),
                                      validation_steps=val_samples)
        outs = model.evaluate_generator(iter(gen), steps=samples, workers=0)
        preds = model.predict_generator(iter(gen), steps=samples, workers=0)
    
        # Test correctness of the metric re ref_true_pos()
        np.testing.assert_allclose(outs[2], ref_true_pos(y, preds),
                                   atol=1e-5)
    
        # Test correctness of the validation metric computation
        val_preds = model.predict_generator(iter(val_gen), steps=val_samples, workers=0)
        val_outs = model.evaluate_generator(iter(val_gen), steps=val_samples, workers=0)
        np.testing.assert_allclose(val_outs[2], ref_true_pos(val_y, val_preds),
                                   atol=1e-5)
        np.testing.assert_allclose(val_outs[2],
                                   history.history['val_true_positives'][-1],
>                                  atol=1e-5)
E       AssertionError: 
E       Not equal to tolerance rtol=1e-07, atol=1e-05
E       
E       (mismatch 100.0%)
E        x: array(2.)
E        y: array(3.)
tests/keras/metrics_test.py:219: AssertionError
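For context, the quantity the failing assertion checks can be reproduced with plain NumPy. This is a minimal sketch of the `ref_true_pos` reference logic from the trace above (the example labels and predictions are made up for illustration); the batch loop mirrors how the stateful metric is expected to accumulate the same total across batches:

```python
import numpy as np

def ref_true_pos(y_true, y_pred):
    # A sample counts as a true positive when the prediction rounds
    # to 1 (i.e. > 0.5) and the label is 1.
    return np.sum(np.logical_and(y_pred > 0.5, y_true == 1))

# Toy data: three true positives (indices 0, 3 and 5).
y_true = np.array([1, 0, 1, 1, 0, 1])
y_pred = np.array([0.9, 0.8, 0.3, 0.7, 0.2, 0.6])

# Accumulating per batch of 2 should equal the single-pass count,
# which is the invariant the stateful metric test relies on.
total = sum(ref_true_pos(y_true[i:i + 2], y_pred[i:i + 2])
            for i in range(0, len(y_true), 2))
assert total == ref_true_pos(y_true, y_pred)  # both equal 3
```

The failure above (`x: array(2.)` vs `y: array(3.)`) shows the metric value returned by `evaluate_generator` disagreeing with the last `val_true_positives` recorded in `history`, i.e. exactly this invariant breaking by one count.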

@fchollet (Collaborator) left a comment

LGTM, thanks

@fchollet fchollet merged commit c7f4ad5 into keras-team:master Sep 1, 2018
jlherren added a commit to jlherren/keras that referenced this pull request Sep 3, 2018
* keras/master: (327 commits)
  Added in_train_phase and in_test_phase in the numpy backend. (keras-team#11061)
  Make sure the data_format argument defaults to ‘chanels_last’ for all 1D sequence layers.
  Speed up backend tests (keras-team#11051)
  Skipped some duplicated tests. (keras-team#11049)
  Used decorators and WITH_NP to avoid tests duplication. (keras-team#11050)
  Cached the theano compilation directory. (keras-team#11048)
  Removing duplicated backend tests. (keras-team#11037)
  [P, RELNOTES] Conv2DTranspose supports dilation (keras-team#11029)
  Doc Change: Change in shape for CIFAR Datasets (keras-team#11043)
  Fix line too long in mnist_acgan (keras-team#11040)
  Enable using last incomplete minibatch (keras-team#8344)
  Better UX (keras-team#11039)
  Update lstm text generation example (keras-team#11038)
  fix a bug, load_weights doesn't return anything (keras-team#11031)
  Speeding up the tests by reducing the number of K.eval(). (keras-team#11036)
  [P] Expose monitor value getter for easier subclass (keras-team#11002)
  [RELNOTES] Added the mode "bilinear" in the upscaling2D layer. (keras-team#10994)
  Separate pooling test from convolutional test and parameterize test case (keras-team#10975)
  Fix issue with non-canonical TF version name format.
  Allow TB callback to display float values.
  ...
@gabrieldemarmiesse gabrieldemarmiesse deleted the duplicated_tests_in_the_backend_again branch September 18, 2018 06:35