You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
### Dr. Tirthajyoti Sarkar, Fremont, California ([Please feel free to connect on LinkedIn here](https://www.linkedin.com/in/tirthajyoti-sarkar-2127aa7))
6
9
7
-
### Dr. Tirthajyoti Sarkar, Fremont, CA ([Please feel free to add me on LinkedIn here](https://www.linkedin.com/in/tirthajyoti-sarkar-2127aa7))
8
10
---
9
11
10
-
### Requirements
11
-
***Python 3.5+**
12
-
***NumPy (`$ pip install numpy`)**
13
-
***Pandas (`$ pip install pandas`)**
14
-
***Scikit-learn (`$ pip install scikit-learn`)**
15
-
***SciPy (`$ pip install scipy`)**
16
-
***Statsmodels (`$ pip install statsmodels`)**
17
-
***MatplotLib (`$ pip install matplotlib`)**
18
-
***Seaborn (`$ pip install seaborn`)**
19
-
***Sympy (`$ pip install sympy`)**
12
+
## Also check out these super-useful Repos that I curated
13
+
14
+
[Highly cited and useful papers related to machine learning, deep learning, AI, game theory, reinforcement learning](https://github.com/tirthajyoti/Papers-Literature-ML-DL-RL-AI)
15
+
16
+
[Carefully curated resource links for data science in one place](https://github.com/tirthajyoti/Data-science-best-resources)
17
+
18
+
## Requirements
19
+
***Python 3.6+**
20
+
***NumPy (`pip install numpy`)**
21
+
***Pandas (`pip install pandas`)**
22
+
***Scikit-learn (`pip install scikit-learn`)**
23
+
***SciPy (`pip install scipy`)**
24
+
***Statsmodels (`pip install statsmodels`)**
25
+
***MatplotLib (`pip install matplotlib`)**
26
+
***Seaborn (`pip install seaborn`)**
27
+
***Sympy (`pip install sympy`)**
28
+
***Flask (`pip install flask`)**
29
+
***WTForms (`pip install wtforms`)**
30
+
***Tensorflow (`pip install tensorflow>=1.15`)**
31
+
***Keras (`pip install keras`)**
32
+
***pdpipe (`pip install pdpipe`)**
33
+
20
34
---
21
35
22
36
You can start with this article that I wrote in Heartbeat magazine (on Medium platform):
23
37
### ["Some Essential Hacks and Tricks for Machine Learning with Python"](https://heartbeat.fritz.ai/some-essential-hacks-and-tricks-for-machine-learning-with-python-5478bc6593f2)
### [Basics of visualization with Matplotlib and descriptive stats](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Pandas%20and%20Numpy/Basics%20of%20Matplotlib%20and%20Descriptive%20Statistics.ipynb)
### [How to read various data sources](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Pandas%20and%20Numpy/Read_data_various_sources/How%20to%20read%20various%20sources%20in%20a%20DataFrame.ipynb)
34
-
### [PDF reading and table processing demo](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Pandas%20and%20Numpy/Read_data_various_sources/PDF%20table%20reading%20and%20processing%20demo.ipynb)
35
-
### [How fast are Numpy operations compared to pure Python code?](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Pandas%20and%20Numpy/How%20fast%20are%20NumPy%20ops.ipynb) (Read my [article](https://towardsdatascience.com/why-you-should-forget-for-loop-for-data-science-code-and-embrace-vectorization-696632622d5f) on Medium related to this topic)
36
-
### [Fast reading from Numpy using .npy file format](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Pandas%20and%20Numpy/Numpy_Reading.ipynb) (Read my [article](https://towardsdatascience.com/why-you-should-start-using-npy-file-more-often-df2a13cc0161) on Medium on this topic)
41
+
Jupyter notebooks covering a wide range of functions and operations on the topics of NumPy, Pandans, Seaborn, Matplotlib etc.
*[Numpy and Pandas quick basics](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Pandas%20and%20Numpy/Numpy_Pandas_Quick.ipynb)
46
+
*[Matplotlib and Seaborn quick basics](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Pandas%20and%20Numpy/Matplotlib_Seaborn_basics.ipynb)
*[How to read various data sources](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Pandas%20and%20Numpy/Read_data_various_sources/How%20to%20read%20various%20sources%20in%20a%20DataFrame.ipynb)
49
+
*[PDF reading and table processing demo](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Pandas%20and%20Numpy/Read_data_various_sources/PDF%20table%20reading%20and%20processing%20demo.ipynb)
50
+
*[How fast are Numpy operations compared to pure Python code?](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Pandas%20and%20Numpy/How%20fast%20are%20NumPy%20ops.ipynb) (Read my [article](https://towardsdatascience.com/why-you-should-forget-for-loop-for-data-science-code-and-embrace-vectorization-696632622d5f) on Medium related to this topic)
51
+
*[Fast reading from Numpy using .npy file format](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Pandas%20and%20Numpy/Numpy_Reading.ipynb) (Read my [article](https://towardsdatascience.com/why-you-should-start-using-npy-file-more-often-df2a13cc0161) on Medium on this topic)
37
52
38
53
## Tutorial-type notebooks covering regression, classification, clustering, dimensionality reduction, and some basic neural network algorithms
39
54
@@ -47,34 +62,44 @@ Jupyter notebooks covering a wide range of functions and operations on the topic
* Polynomial regression using ***scikit-learn pipeline feature*** ([check the article I wrote on *Towards Data Science*](https://towardsdatascience.com/machine-learning-with-python-easy-and-robust-method-to-fit-nonlinear-data-19e8a1ddbd49))
50
-
* Decision trees and Random Forest regression (showing how the Random Forest works as a robust/regularized meta-estimator rejecting overfitting)
65
+
66
+
*[Decision trees and Random Forest regression](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Regression/Random_Forest_Regression.ipynb) (showing how the Random Forest works as a robust/regularized meta-estimator rejecting overfitting)
51
67
52
68
*[Detailed visual analytics and goodness-of-fit diagnostic tests for a linear regression problem](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Regression/Regression_Diagnostics.ipynb)
53
69
70
+
*[Robust linear regression using `HuberRegressor` from Scikit-learn](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Regression/Robust%20Linear%20Regression.ipynb)
71
+
54
72
-----
55
73
56
74
### Classification
57
-
* Logistic regression/classification
75
+
* Logistic regression/classification ([Here is the Notebook](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Classification/Logistic_Regression_Classification.ipynb))
* Support vector machine classification (**[check the article I wrote in Towards Data Science on SVM and sorting algorithm](https://towardsdatascience.com/how-the-good-old-sorting-algorithm-helps-a-great-machine-learning-technique-9e744020254b))**
78
+
*_k_-nearest neighbor classification ([Here is the Notebook](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Classification/KNN_Classification.ipynb))
79
+
80
+
* Decision trees and Random Forest Classification ([Here is the Notebook](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Classification/DecisionTrees_RandomForest_Classification.ipynb))
81
+
82
+
* Support vector machine classification ([Here is the Notebook](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Classification/Support_Vector_Machine_Classification.ipynb)) (**[check the article I wrote in Towards Data Science on SVM and sorting algorithm](https://towardsdatascience.com/how-the-good-old-sorting-algorithm-helps-a-great-machine-learning-technique-9e744020254b))**
* Naive Bayes classification ([Here is the Notebook](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Classification/Naive_Bayes_Classification.ipynb))
* Affinity propagation (showing its time complexity and the effect of damping factor)
75
-
* Mean-shift technique (showing its time complexity and the effect of noise on cluster discovery)
76
-
* DBSCAN (showing how it can generically detect areas of high density irrespective of cluster shapes, which the k-means fails to do)
77
-
* Hierarchical clustering with Dendograms showing how to choose optimal number of clusters
93
+
*_K_-means clustering ([Here is the Notebook](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Clustering-Dimensionality-Reduction/K_Means_Clustering_Practice.ipynb))
94
+
95
+
* Affinity propagation (showing its time complexity and the effect of damping factor) ([Here is the Notebook](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Clustering-Dimensionality-Reduction/Affinity_Propagation.ipynb))
96
+
97
+
* Mean-shift technique (showing its time complexity and the effect of noise on cluster discovery) ([Here is the Notebook](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Clustering-Dimensionality-Reduction/Mean_Shift_Clustering.ipynb))
98
+
99
+
* DBSCAN (showing how it can generically detect areas of high density irrespective of cluster shapes, which the k-means fails to do) ([Here is the Notebook](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Clustering-Dimensionality-Reduction/DBScan_Clustering.ipynb))
100
+
101
+
* Hierarchical clustering with Dendograms showing how to choose optimal number of clusters ([Here is the Notebook](https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Clustering-Dimensionality-Reduction/Hierarchical_Clustering.ipynb))
@@ -96,8 +121,12 @@ Jupyter notebooks covering a wide range of functions and operations on the topic
96
121
* How to use [Sympy package](https://www.sympy.org/en/index.html) to generate random datasets using symbolic mathematical expressions.
97
122
98
123
* Here is my article on Medium on this topic: [Random regression and classification problem generation with symbolic expression](https://towardsdatascience.com/random-regression-and-classification-problem-generation-with-symbolic-expression-a4e190e37b8d)
### Simple deployment examples (serving ML models on web API)
102
131
*[Serving a linear regression model through a simple HTTP server interface](https://github.com/tirthajyoti/Machine-Learning-with-Python/tree/master/Deployment/Linear_regression). User needs to request predictions by executing a Python script. Uses `Flask` and `Gunicorn`.
103
132
@@ -113,3 +142,6 @@ See my articles on Medium on this topic.
113
142
*[Object-oriented programming for data scientists: Build your ML estimator](https://towardsdatascience.com/object-oriented-programming-for-data-scientists-build-your-ml-estimator-7da416751f64)
114
143
*[How a simple mix of object-oriented programming can sharpen your deep learning prototype](https://towardsdatascience.com/how-a-simple-mix-of-object-oriented-programming-can-sharpen-your-deep-learning-prototype-19893bd969bd)
115
144
145
+
---
146
+
### Unit testing ML code with Pytest
147
+
Check the files and detailed instructions in the [Pytest](https://github.com/tirthajyoti/Machine-Learning-with-Python/tree/master/Pytest) directory to understand how one should write unit testing code/module for machine learning models
0 commit comments