|
| 1 | +## A sample Pytest module for a Scikit-learn model training function |
| 2 | + |
| 3 | +### How to run Pytest |
| 4 | + |
| 5 | +- Install pytest: `pip install pytest` |
| 6 | + |
| 7 | +- Copy or clone the two Python scripts from this directory. |
| 8 | +- `linear_model.py` has a single function that trains a simple linear regression model using scikit-learn. It uses basic `assert` checks and a `try-except` construct to handle potential input errors. |
| 9 | +- `test_linear_model.py` is the test module, which serves as the input to Pytest. |
| 10 | +- Run `pytest test_linear_model.py -v` in your terminal to run the tests. You should see output like the following: |
| 11 | + |
| 12 | +``` |
| 13 | +======================================================================================================= test session starts ======================================================================================================== |
| 14 | +platform win32 -- Python 3.9.1, pytest-6.2.2, py-1.10.0, pluggy-0.13.1 -- c:\program files\python39\python.exe |
| 15 | +cachedir: .pytest_cache |
| 16 | +rootdir: C:\Users\TirthajyotiSarkar\Documents\Python Notebooks\Pytest |
| 17 | +plugins: anyio-2.0.2 |
| 18 | +collected 7 items |
| 19 | +
|
| 20 | +test_linear_model.py::test_model_return_object PASSED [ 14%] |
| 21 | +test_linear_model.py::test_model_return_vals PASSED [ 28%] |
| 22 | +test_linear_model.py::test_model_save_load PASSED [ 42%] |
| 23 | +test_linear_model.py::test_loaded_model_works PASSED [ 57%] |
| 24 | +test_linear_model.py::test_model_works_data_range PASSED [ 71%] |
| 25 | +test_linear_model.py::test_noise_impact PASSED [ 85%] |
| 26 | +test_linear_model.py::test_wrong_input_raises_assertion PASSED [100%] |
| 27 | +
|
| 28 | +========================================================================================================= warnings summary ========================================================================================================= |
| 29 | +..\..\..\..\..\..\program files\python39\lib\site-packages\win32\lib\pywintypes.py:2 |
| 30 | + c:\program files\python39\lib\site-packages\win32\lib\pywintypes.py:2: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses |
| 31 | + import imp, sys, os |
| 32 | +
|
| 33 | +-- Docs: https://docs.pytest.org/en/stable/warnings.html |
| 34 | +=================================================================================================== 7 passed, 1 warning in 1.03s =================================================================================================== |
| 35 | +``` |
| 36 | + |
| 37 | +### What does it mean? |
| 38 | + |
| 39 | +- The terminal message (above) indicates that 7 tests were run (corresponding to the 7 functions in the `test_linear_model.py` module) and all of them passed. |
| 40 | + |
| 41 | +- It also lists each test by name in the order it ran, because the `-v` (verbose) flag was passed to the `pytest` command. The test order can also be randomized, typically with a plugin such as `pytest-randomly`, but that discussion is for another day. |
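
The count of 7 items follows from name-based collection: by default Pytest treats functions whose names start with `test` as tests. The sketch below is only a rough imitation of that naming rule in plain Python, not Pytest's actual implementation. The helper `collect_test_names` and the `demo` module are hypothetical names made up for illustration.

```python
import types


def collect_test_names(module):
    """Return names of callables in `module` whose names start with 'test'.

    Hypothetical helper: it imitates only the default naming rule and is
    not how Pytest collects tests internally.
    """
    return [
        name
        for name, obj in vars(module).items()
        if name.startswith("test") and callable(obj)
    ]


# Build a throwaway module object that holds two "tests" and one helper.
demo = types.ModuleType("demo_test_module")
demo.test_model_return_object = lambda: None
demo.fixed_data_constructor = lambda: None  # helper, not collected
demo.test_noise_impact = lambda: None

print(collect_test_names(demo))
# → ['test_model_return_object', 'test_noise_impact']
```

The helper function is skipped purely because of its name, which is why data constructors can live in the same file as the tests.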
| 42 | + |
| 43 | +### Notes on the test module |
| 44 | + |
| 45 | +- Note that `test_linear_model.py` contains 7 functions whose names start with `test`. These contain the actual test code. The module also has a couple of data constructor functions whose names do not start with `test`, so Pytest does not collect them as tests. |
| 46 | + |
| 47 | +- Note that the test module imports several libraries to cover the different checks: `joblib`, `os`, `sklearn`, `numpy`, and, of course, the `train_linear_model` function from the `linear_model` module. |
| 48 | +- Note the clear and distinctive names of the test functions, e.g. `test_model_return_object()`, which only checks the object returned by the `train_linear_model` function, or `test_model_save_load()`, which checks whether the saved model can be loaded properly (but does not try to make predictions). |
| 49 | +- To check the predictions, i.e. whether the trained model really works, we have the `test_loaded_model_works()` function. It uses a fixed data generator with no noise (in other cases, we can use a random data generator with random noise). It passes in the fixed `X` and `y` data, loads the trained model, checks that the $R^2$ score equals 1.0 (which holds for a fixed dataset with no noise), and then compares the model predictions with the original ground-truth `y` vector. Note that it uses the NumPy testing function `np.testing.assert_allclose` instead of a plain `assert` statement with `==`. This avoids spurious failures from floating-point precision, since the data are NumPy arrays and the prediction algorithm involves linear algebra operations. |
| 50 | +- Take a look at the `random_data_constructor` and `fixed_data_constructor` functions too, to see how they are designed and used in the test code. `random_data_constructor` also takes a `noise_mag` argument, which controls the magnitude of the noise so that the expected behavior of a linear regression algorithm under noise can be tested. Refer to the `test_noise_impact` function for this. |
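
The ideas in the notes above can be sketched without the actual scripts. The block below is an illustration under stated assumptions, not the code from `test_linear_model.py`: the bodies of the two data constructors, the coefficients `3.0` and `2.0`, and the helpers `fit_linear`, `predict`, and `r2` are all made up here, and a NumPy least-squares fit stands in for the scikit-learn model so that the example depends only on NumPy.

```python
import numpy as np


def fixed_data_constructor(n_samples=100):
    """Noise-free data: y is an exact linear function of X (assumed form)."""
    X = np.linspace(0.0, 10.0, n_samples).reshape(-1, 1)
    y = 3.0 * X.ravel() + 2.0
    return X, y


def random_data_constructor(noise_mag=1.0, n_samples=200, seed=0):
    """Random X, with Gaussian noise scaled by noise_mag added to y."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 10.0, size=(n_samples, 1))
    y = 3.0 * X.ravel() + 2.0 + noise_mag * rng.normal(size=n_samples)
    return X, y


def fit_linear(X, y):
    """Least-squares fit; a stand-in for the scikit-learn model."""
    A = np.column_stack([X, np.ones(len(X))])  # add an intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    return np.column_stack([X, np.ones(len(X))]) @ coef


def r2(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot


def test_fixed_data_predictions():
    X, y = fixed_data_constructor()
    y_pred = predict(fit_linear(X, y), X)
    # Tolerance-based comparison: exact == on floats could fail on
    # round-off even though the fit is correct.
    np.testing.assert_allclose(y_pred, y)
    np.testing.assert_allclose(r2(y, y_pred), 1.0)


def test_noise_lowers_score():
    scores = []
    for noise_mag in (0.1, 5.0):
        X, y = random_data_constructor(noise_mag=noise_mag)
        scores.append(r2(y, predict(fit_linear(X, y), X)))
    # More noise should leave less variance explained by the line.
    assert scores[0] > scores[1]
```

Saved in a file named, for example, `test_sketch.py`, these two functions would be collected and run by `pytest test_sketch.py -v` like the tests described above.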