Commit 675c3bd

Create readme.md
1 parent 113c7f3 commit 675c3bd

1 file changed

+50
-0
lines changed

Pytest/readme.md

Lines changed: 50 additions & 0 deletions
@@ -0,0 +1,50 @@
## A sample Pytest module for a Scikit-learn model training function

### How to run Pytest

- Install pytest: `pip install pytest`

- Copy or clone the two Python scripts from this directory.
- `linear_model.py` has a single function that trains a simple linear regression model using scikit-learn. Note that it has basic assertion checks and a `try-except` construct to handle potential input errors.
- `test_linear_model.py` is the test module that serves as the input to the Pytest program.
- Run `pytest test_linear_model.py -v` in your terminal to run the tests. You should see something like the following:

```
======================================================================================================= test session starts ========================================================================================================
platform win32 -- Python 3.9.1, pytest-6.2.2, py-1.10.0, pluggy-0.13.1 -- c:\program files\python39\python.exe
cachedir: .pytest_cache
rootdir: C:\Users\TirthajyotiSarkar\Documents\Python Notebooks\Pytest
plugins: anyio-2.0.2
collected 7 items

test_linear_model.py::test_model_return_object PASSED [ 14%]
test_linear_model.py::test_model_return_vals PASSED [ 28%]
test_linear_model.py::test_model_save_load PASSED [ 42%]
test_linear_model.py::test_loaded_model_works PASSED [ 57%]
test_linear_model.py::test_model_works_data_range PASSED [ 71%]
test_linear_model.py::test_noise_impact PASSED [ 85%]
test_linear_model.py::test_wrong_input_raises_assertion PASSED [100%]

========================================================================================================= warnings summary =========================================================================================================
..\..\..\..\..\..\program files\python39\lib\site-packages\win32\lib\pywintypes.py:2
c:\program files\python39\lib\site-packages\win32\lib\pywintypes.py:2: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
import imp, sys, os

-- Docs: https://docs.pytest.org/en/stable/warnings.html
=================================================================================================== 7 passed, 1 warning in 1.03s ===================================================================================================
```

### What does it mean?

- The terminal output above indicates that 7 tests were run (one for each of the 7 test functions in the `test_linear_model.py` module) and that all of them passed.

- It also lists each test by name, in the order it ran, because the `-v` (verbose) argument was passed to the `pytest` command. The test order can also be randomized (for example, via a plugin such as `pytest-randomly`), but that discussion is for another day.

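The count of 7 follows from Pytest's name-based collection: by default, module-level functions whose names start with `test` are collected, and other functions are left alone. The sketch below only imitates that naming rule with plain Python so you can see the idea in isolation. It is not how Pytest is implemented, and `collect_test_names`, `demo_namespace`, and the two tiny tests are hypothetical names made up for this illustration.

```python
import types


def collect_test_names(namespace):
    """Return names of plain functions whose names start with 'test'.

    This only imitates the naming rule; real Pytest collection does much more
    (test classes, fixtures, configuration, plugins, and so on).
    """
    return [
        name
        for name, obj in namespace.items()
        if name.startswith("test") and isinstance(obj, types.FunctionType)
    ]


# Hypothetical stand-ins for the contents of a test module.
def fixed_data_constructor():
    # Helper: name does not start with "test", so it is not collected.
    return [1.0, 2.0, 3.0]


def test_data_has_three_points():
    assert len(fixed_data_constructor()) == 3


def test_data_is_sorted():
    data = fixed_data_constructor()
    assert data == sorted(data)


demo_namespace = {
    "fixed_data_constructor": fixed_data_constructor,
    "test_data_has_three_points": test_data_has_three_points,
    "test_data_is_sorted": test_data_is_sorted,
}

print(collect_test_names(demo_namespace))
# prints ['test_data_has_three_points', 'test_data_is_sorted']
```

The helper is skipped and the two `test...` functions are listed in definition order, which mirrors why the run above reports 7 items even though the module defines more than 7 functions.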
### Notes on the test module

- Note how `test_linear_model.py` contains 7 functions whose names start with `test...`. Those contain the actual test code. It also has a couple of data constructor functions whose names do not start with `test...`; Pytest does not collect them as tests.

- Note that we need to import several libraries to test all kinds of things, e.g. `joblib`, `os`, `sklearn`, `numpy`, and of course the `train_linear_model` function from the `linear_model` module.
- Note the clear and distinctive names of the test functions, e.g. `test_model_return_object()`, which only checks the object returned by the `train_linear_model` function, or `test_model_save_load()`, which checks whether the saved model can be loaded properly (but does not try to make predictions or anything else).
- To check the predictions, i.e. whether the trained model really works, we have the `test_loaded_model_works()` function, which uses a fixed data generator with no noise (in contrast to other cases, where we can use a random data generator with random noise). It passes in the fixed `X` and `y` data, loads the trained model, checks that the $R^2$ scores are equal to 1.0 (true for a fixed dataset with no noise), and then compares the model predictions with the original ground-truth `y` vector. Note how it uses a special NumPy testing function, `np.testing.assert_allclose`, instead of the regular `assert` statement. This avoids potential numerical precision issues associated with the model data, i.e. NumPy arrays, and with the prediction algorithm, which involves linear algebra operations.
- Take a look at the `random_data_constructor` and `fixed_data_constructor` functions too, to see how they are designed and used in the test code. `random_data_constructor` even takes a `noise_mag` argument, which controls the magnitude of the noise so that the expected behavior of a linear regression algorithm under noise can be tested. Refer to the `test_noise_impact` function for this.
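
To make the notes above concrete, here is a minimal, self-contained sketch of the same testing ideas. It is not the code from `linear_model.py` or `test_linear_model.py`: `train_linear_model_sketch` is a hypothetical stand-in for `train_linear_model`, and the signatures and bodies of the two data constructors (including the `n` and `seed` parameters) are assumptions made for illustration. It needs `numpy` and `scikit-learn` installed.

```python
import numpy as np
from sklearn.linear_model import LinearRegression


def fixed_data_constructor():
    """Noise-free data: y is an exact linear function of X (assumed shape)."""
    X = np.linspace(0.0, 10.0, 50).reshape(-1, 1)
    y = 3.0 * X.ravel() + 2.0
    return X, y


def random_data_constructor(noise_mag=1.0, n=200, seed=0):
    """Same linear trend plus Gaussian noise scaled by noise_mag."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 10.0, size=(n, 1))
    y = 3.0 * X.ravel() + 2.0 + noise_mag * rng.normal(size=n)
    return X, y


def train_linear_model_sketch(X, y):
    """Hypothetical stand-in for train_linear_model: check inputs, then fit."""
    assert isinstance(X, np.ndarray), "X must be a NumPy array"
    assert isinstance(y, np.ndarray), "y must be a NumPy array"
    assert X.ndim == 2 and len(X) == len(y), "X must be 2-D, one row per y"
    return LinearRegression().fit(X, y)


def test_fixed_data_fit_is_exact():
    X, y = fixed_data_constructor()
    model = train_linear_model_sketch(X, y)
    # Compare floats with a tolerance instead of ==, since the fit goes
    # through floating-point linear algebra.
    np.testing.assert_allclose(model.score(X, y), 1.0)
    np.testing.assert_allclose(model.predict(X), y)


def test_noise_lowers_score():
    X_lo, y_lo = random_data_constructor(noise_mag=0.5)
    X_hi, y_hi = random_data_constructor(noise_mag=5.0)
    score_lo = train_linear_model_sketch(X_lo, y_lo).score(X_lo, y_lo)
    score_hi = train_linear_model_sketch(X_hi, y_hi).score(X_hi, y_hi)
    # More noise should explain less variance, i.e. give a lower R^2.
    assert score_hi < score_lo


def test_wrong_input_raises():
    X, y = fixed_data_constructor()
    try:
        train_linear_model_sketch(X.tolist(), y)  # a list, not an array
    except AssertionError:
        pass
    else:
        raise AssertionError("expected an AssertionError for list input")
```

Saved as a file whose name starts with `test_`, this can be run with `pytest <filename> -v`. The last test is written with plain `try`/`except` so the sketch has no Pytest import; in a real test module the usual idiom is `with pytest.raises(AssertionError):`.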
