diff --git a/README.md b/README.md index 8f4e38a..1827f4a 100644 --- a/README.md +++ b/README.md @@ -478,3 +478,114 @@ class Encoder: ... ``` Now, if some user irresponsibly does a wild import with `parts`, they can only import Encoder, Decoder, Loss. Personally, I also find `__all__` helpful as it gives me an overview of the module. + +## 6. Decorator to time your functions +It's often useful to know how long it takes a function to run, e.g. when you need to compare the performance of two algorithms that do the same thing. One naive way is to call `time.time()` at the beginning and end of each function and print out the difference. + +For example, compare two algorithms that calculate the n-th Fibonacci number: one uses memoization and one doesn't. + +```python +def fib_helper(n): + if n < 2: + return n + return fib_helper(n - 1) + fib_helper(n - 2) + +def fib(n): + """ fib is a wrapper function so that later we can change its behavior + at the top level without affecting the behavior at every recursion step. + """ + return fib_helper(n) + +def fib_m_helper(n, computed): + if n in computed: + return computed[n] + computed[n] = fib_m_helper(n - 1, computed) + fib_m_helper(n - 2, computed) + return computed[n] + +def fib_m(n): + return fib_m_helper(n, {0: 0, 1: 1}) +``` + +Let's make sure that `fib` and `fib_m` are functionally equivalent. + +```python +for n in range(20): + assert fib(n) == fib_m(n) +``` + +```python +import time + +start = time.time() +fib(30) +print(f'Without memoization, it takes {time.time() - start:7f} seconds.') + +==> Without memoization, it takes 0.267569 seconds. + +start = time.time() +fib_m(30) +print(f'With memoization, it takes {time.time() - start:.7f} seconds.') + +==> With memoization, it takes 0.0000713 seconds. +``` + +If you want to time multiple functions, it can be a drag having to write the same code over and over again. It'd be nice to have a way to specify how to change any function in the same way.
In this case, the change is to call `time.time()` at the beginning and end of each function and print out the time difference. + +This is exactly what decorators do. They allow programmers to change the behavior of a function or class. Here's an example that creates a decorator named `timeit`. + +```python +def timeit(fn): + # *args and **kwargs are to support positional and named arguments of fn + def get_time(*args, **kwargs): + start = time.time() + output = fn(*args, **kwargs) + print(f"Time taken in {fn.__name__}: {time.time() - start:.7f}") + return output # make sure that the decorator returns the output of fn + return get_time +``` + +Add the decorator `@timeit` to your functions. + +```python +@timeit +def fib(n): + return fib_helper(n) + +@timeit +def fib_m(n): + return fib_m_helper(n, {0: 0, 1: 1}) + +fib(30) +fib_m(30) + +==> Time taken in fib: 0.2787242 +==> Time taken in fib_m: 0.0000138 +``` + +## 7. Caching with @functools.lru_cache +Memoization is a form of caching: we cache the previously calculated Fibonacci numbers so that we don't have to calculate them again. + +Caching is such an important technique that Python provides a built-in decorator to give your function caching capability. If you want `fib_helper` to reuse the previously calculated Fibonacci numbers, you can just add the decorator `lru_cache` from `functools`. `lru` stands for "least recently used". For more information on caching, see the [functools documentation](https://docs.python.org/3/library/functools.html). + +```python +import functools + +@functools.lru_cache() +def fib_helper(n): + if n < 2: + return n + return fib_helper(n - 1) + fib_helper(n - 2) + +@timeit +def fib(n): + """ fib is a wrapper function so that later we can change its behavior + at the top level without affecting the behavior at every recursion step.
+ """ + return fib_helper(n) + +fib(50) +fib_m(50) + +==> Time taken in fib: 0.0000412 +==> Time taken in fib_m: 0.0000281 +``` diff --git a/cool-python-tips.ipynb b/cool-python-tips.ipynb index 549c3cc..6c06cf3 100644 --- a/cool-python-tips.ipynb +++ b/cool-python-tips.ipynb @@ -562,7 +562,7 @@ "name": "stdout", "output_type": "stream", "text": [ - "\n", + "\n", "['i', 'want', 'to']\n", "['want', 'to', 'go']\n", "['to', 'go', 'to']\n", @@ -598,7 +598,7 @@ "name": "stdout", "output_type": "stream", "text": [ - "\n", + "\n", "('i', 'want', 'to')\n", "('want', 'to', 'go')\n", "('to', 'go', 'to')\n", @@ -665,7 +665,7 @@ "name": "stdout", "output_type": "stream", "text": [ - "<__main__.Node object at 0x10aeb84e0>\n" + "<__main__.Node object at 0x10e1a8890>\n" ] } ], @@ -819,7 +819,7 @@ "name": "stdout", "output_type": "stream", "text": [ - "{'learning_rate': 0.0003, 'num_layers': 3, 'hidden_size': 100, 'self': <__main__.Model1 object at 0x10aea3d68>}\n" + "{'self': <__main__.Model1 object at 0x10f210590>, 'hidden_size': 100, 'num_layers': 3, 'learning_rate': 0.0003}\n" ] } ], @@ -876,7 +876,7 @@ { "data": { "text/plain": [ - "{'learning_rate': 0.0003, 'num_layers': 3, 'hidden_size': 100}" + "{'hidden_size': 100, 'num_layers': 3, 'learning_rate': 0.0003}" ] }, "execution_count": 28, @@ -975,6 +975,224 @@ "\n", "Now, if some user irresponsibly does a wildcard import with `parts`, they can only import Encoder, Decoder, Loss. Personally, I also find `__all__` helpful as it gives me an overview of the module." ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# 6. Decorator to time your functions\n", + "\n", + "It's often useful to know how long it takes a function to run, e.g. when you need to compare the performance of two algorithms that do the same thing. 
One naive way is to call `time.time()` at the beginning and end of each function and print out the difference.\n", + "\n", + "For example, compare two algorithms that calculate the n-th Fibonacci number: one uses memoization and one doesn't." + ] + }, + { + "cell_type": "code", + "execution_count": 30, + "metadata": {}, + "outputs": [], + "source": [ + "def fib_helper(n):\n", + " if n < 2:\n", + " return n\n", + " return fib_helper(n - 1) + fib_helper(n - 2)\n", + "\n", + "def fib(n):\n", + " \"\"\" fib is a wrapper function so that later we can change its behavior\n", + " at the top level without affecting the behavior at every recursion step.\n", + " \"\"\"\n", + " return fib_helper(n)\n", + "\n", + "def fib_m_helper(n, computed):\n", + " if n in computed:\n", + " return computed[n]\n", + " computed[n] = fib_m_helper(n - 1, computed) + fib_m_helper(n - 2, computed)\n", + " return computed[n]\n", + "\n", + "def fib_m(n):\n", + " return fib_m_helper(n, {0: 0, 1: 1})" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's make sure that `fib` and `fib_m` are functionally equivalent."
+ ] + }, + { + "cell_type": "code", + "execution_count": 31, + "metadata": {}, + "outputs": [], + "source": [ + "for n in range(20):\n", + " assert fib(n) == fib_m(n)" + ] + }, + { + "cell_type": "code", + "execution_count": 32, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Without memoization, it takes 0.267569 seconds.\n", + "With memoization, it takes 0.0000713 seconds.\n" + ] + } + ], + "source": [ + "import time\n", + "\n", + "start = time.time()\n", + "fib(30)\n", + "print(f'Without memoization, it takes {time.time() - start:7f} seconds.')\n", + "\n", + "start = time.time()\n", + "fib_m(30)\n", + "print(f'With memoization, it takes {time.time() - start:.7f} seconds.')\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If you want to time multiple functions, it can be a drag having to write the same code over and over again. It'd be nice to have a way to specify how to change any function in the same way. In this case, the change is to call `time.time()` at the beginning and end of each function and print out the time difference.\n", + "\n", + "This is exactly what decorators do. They allow programmers to change the behavior of a function or class. Here's an example that creates a decorator named `timeit`." + ] + }, + { + "cell_type": "code", + "execution_count": 33, + "metadata": {}, + "outputs": [], + "source": [ + "def timeit(fn): \n", + " # *args and **kwargs are to support positional and named arguments of fn\n", + " def get_time(*args, **kwargs): \n", + " start = time.time() \n", + " output = fn(*args, **kwargs)\n", + " print(f\"Time taken in {fn.__name__}: {time.time() - start:.7f}\")\n", + " return output # make sure that the decorator returns the output of fn\n", + " return get_time " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Add the decorator `@timeit` to your functions."
+ ] + }, + { + "cell_type": "code", + "execution_count": 34, + "metadata": {}, + "outputs": [], + "source": [ + "@timeit\n", + "def fib(n):\n", + " return fib_helper(n)\n", + "\n", + "@timeit\n", + "def fib_m(n):\n", + " return fib_m_helper(n, {0: 0, 1: 1})" + ] + }, + { + "cell_type": "code", + "execution_count": 35, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Time taken in fib: 0.2787242\n", + "Time taken in fib_m: 0.0000138\n" + ] + }, + { + "data": { + "text/plain": [ + "832040" + ] + }, + "execution_count": 35, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "fib(30)\n", + "fib_m(30)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# 7. Caching with @functools.lru_cache\n", + "Memoization is a form of caching: we cache the previously calculated Fibonacci numbers so that we don't have to calculate them again.\n", + "\n", + "Caching is such an important technique that Python provides a built-in decorator to give your function caching capability. If you want `fib_helper` to reuse the previously calculated Fibonacci numbers, you can just add the decorator `lru_cache` from `functools`. `lru` stands for \"least recently used\". For more information on caching, see the [functools documentation](https://docs.python.org/3/library/functools.html)."
+ ] + }, + { + "cell_type": "code", + "execution_count": 38, + "metadata": {}, + "outputs": [], + "source": [ + "import functools\n", + "\n", + "@functools.lru_cache()\n", + "def fib_helper(n):\n", + " if n < 2:\n", + " return n\n", + " return fib_helper(n - 1) + fib_helper(n - 2)\n", + "\n", + "@timeit\n", + "def fib(n):\n", + " \"\"\" fib is a wrapper function so that later we can change its behavior\n", + " at the top level without affecting the behavior at every recursion step.\n", + " \"\"\"\n", + " return fib_helper(n)" + ] + }, + { + "cell_type": "code", + "execution_count": 39, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Time taken in fib: 0.0000412\n", + "Time taken in fib_m: 0.0000281\n" + ] + }, + { + "data": { + "text/plain": [ + "12586269025" + ] + }, + "execution_count": 39, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "fib(50)\n", + "fib_m(50)" + ] } ], "metadata": { @@ -993,7 +1211,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.6.8" + "version": "3.7.4" } }, "nbformat": 4,