# ml-from-scratch

A project implementing machine learning algorithms from scratch using only PyTorch's autograd.
This project reorganizes learning materials from University of Chicago STAT 37710 (Machine Learning) into a modern structure.
## Features

- From-scratch implementation: built without `torch.nn.Linear`, `torch.optim.SGD`, etc.
- PyTorch autograd only: backpropagation implemented using only automatic differentiation (see the sketch below)
- Learning-friendly: numbered notebooks for step-by-step learning
- sklearn-style API: `fit()`, `predict()`, `transform()` interface
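To make the "autograd only" rule concrete, here is a minimal sketch (illustrative, not code from this package) of training a linear classifier with raw tensors, a hand-written loss, and a manual update step:

```python
import torch

# Parameters are plain tensors with requires_grad=True; no torch.nn.Linear anywhere.
weight = torch.randn(784, 10, requires_grad=True)
bias = torch.zeros(10, requires_grad=True)

x = torch.randn(32, 784)              # dummy input batch
target = torch.randint(0, 10, (32,))  # dummy labels

logits = x @ weight + bias            # forward pass written by hand
probs = torch.softmax(logits, dim=1)
loss = -torch.log(probs[torch.arange(32), target]).mean()  # cross-entropy by hand

loss.backward()                       # autograd computes all gradients

with torch.no_grad():                 # manual SGD step; no torch.optim
    for p in (weight, bias):
        p -= 0.01 * p.grad
        p.grad.zero_()
```

Everything above uses only tensor operations and `loss.backward()`; that is the entire toolbox the implementations in `mlfs/` rely on.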
## Quick Start

```bash
# Install uv (if not installed)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Clone and set up the project
git clone https://github.com/username/ml-from-scratch.git
cd ml-from-scratch
uv sync

# Launch the notebooks
uv run jupyter notebook notebooks/
```

- Datasets like MNIST and CIFAR-10 are downloaded automatically on first run
- No additional setup required!
## Project Structure

```
ml-from-scratch/
├── mlfs/                      # Core package
│   ├── nn/                    # Neural network modules
│   │   ├── layers.py          # Linear, Conv2d, MaxPool2d
│   │   ├── activations.py     # ReLU, Sigmoid, Softmax
│   │   ├── losses.py          # CrossEntropyLoss, MSELoss
│   │   ├── optim.py           # SGD, Adam
│   │   └── models.py          # MLP, CNN
│   ├── classical/             # Traditional ML
│   │   ├── clustering.py      # KMeans, SpectralClustering
│   │   ├── reduction.py       # PCA, LLE, Isomap
│   │   └── ensemble.py        # AdaBoost
│   ├── probabilistic/         # Probabilistic models
│   │   ├── em.py              # EM Algorithm, GMM
│   │   └── sparse.py          # Sparse Coding
│   └── utils/                 # Utilities
│       ├── data.py            # Data loading
│       └── viz.py             # Visualization
└── notebooks/                 # Tutorial notebooks
```
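The package layout mirrors the import paths. Assuming the classes are named as in the comments above (an assumption; check the source), imports look like:

```python
# Hypothetical imports inferred from the tree above; names may differ in the source
from mlfs.nn.layers import Linear, Conv2d
from mlfs.classical.clustering import KMeans
from mlfs.probabilistic.em import GMM
```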
## Notebooks

| # | Notebook | Topic | Key Concepts |
|---|----------|-------|--------------|
| 01 | `perceptron` | Perceptron | Single neuron, gradient descent |
| 02 | `mlp` | Multi-layer Perceptron | Backpropagation, hidden layers |
| 03 | `cnn_mnist` | CNN (MNIST) | Convolution, pooling |
| 04 | `cnn_cifar` | CNN (CIFAR) | Color images, deep networks |
| 05 | `kmeans` | K-Means | Clustering, K-Means++ |
| 06 | `spectral_clustering` | Spectral Clustering | Graph Laplacian |
| 07 | `pca` | PCA | Dimensionality reduction, eigendecomposition |
| 08 | `lle` | LLE | Locally Linear Embedding |
| 09 | `isomap` | Isomap | Geodesic distance, manifolds |
| 10 | `ensemble` | Ensemble | AdaBoost, weak classifiers |
| 11 | `em_algorithm` | EM Algorithm | GMM, Expectation-Maximization |
| 12 | `face_detection` | Face Detection | Haar-like features, Viola-Jones |
| 13 | `sparse_coding` | Sparse Coding | Sparse representation, dictionary learning |
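To give a flavor of what the notebooks build, here is a hedged sketch of plain K-Means (the topic of notebook 05) using only tensor operations; the notebook also covers K-Means++ initialization, and its actual code may differ:

```python
import torch

def kmeans(X, k, iters=100):
    # Random initialization: pick k distinct points as starting centroids
    centroids = X[torch.randperm(X.shape[0])[:k]]
    for _ in range(iters):
        # Assignment step: label each point with its nearest centroid
        labels = torch.cdist(X, centroids).argmin(dim=1)
        # Update step: move each centroid to the mean of its assigned points
        new_centroids = torch.stack([
            X[labels == j].mean(dim=0) if (labels == j).any() else centroids[j]
            for j in range(k)
        ])
        if torch.allclose(new_centroids, centroids):
            break  # converged: the update no longer moves the centroids
        centroids = new_centroids
    return centroids, labels

X = torch.randn(300, 2)            # toy data
centroids, labels = kmeans(X, k=3)
```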
## PyTorch Usage Rules

### What you can use

```python
import torch
import torch.nn.functional as F
# Tensor operations
x = torch.randn(32, 784)
y = x @ weight + bias
# Automatic differentiation
loss.backward()
param.grad
# Basic functions
F.relu(x), F.softmax(x, dim=1)
```

### What you must implement yourself

```python
# Layers - must implement yourself
torch.nn.Linear # → mlfs.nn.layers.Linear
torch.nn.Conv2d # → mlfs.nn.layers.Conv2d
# Optimizers - must implement yourself
torch.optim.SGD # → mlfs.nn.optim.SGD
torch.optim.Adam # → mlfs.nn.optim.Adam
```
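For intuition, a from-scratch layer of the kind mapped above might look like the following sketch (illustrative only; the real `mlfs.nn.layers.Linear` may differ):

```python
import math
import torch

class Linear:
    """Linear layer built from raw tensors; autograd handles the backward pass."""
    def __init__(self, in_features, out_features):
        # 1/sqrt(fan_in) init keeps the activation scale reasonable (a common choice)
        scale = 1.0 / math.sqrt(in_features)
        self.weight = (torch.randn(in_features, out_features) * scale).requires_grad_()
        self.bias = torch.zeros(out_features, requires_grad=True)

    def forward(self, x):
        return x @ self.weight + self.bias

    def parameters(self):
        return [self.weight, self.bias]
```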
## Usage Example

```python
from mlfs.nn.layers import Linear
from mlfs.nn.optim import SGD
from mlfs.utils.data import load_mnist
# Load data
X_train, y_train = load_mnist(train=True)
# Create model (custom implemented layers)
layer = Linear(784, 10)
# Forward pass
output = layer.forward(X_train[:32])
# Optimizer (custom implemented)
optimizer = SGD(layer.parameters(), lr=0.01)
```
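A minimal optimizer matching the `SGD` used above could be as small as this (a sketch under the same caveat; not necessarily the package's implementation):

```python
import torch

class SGD:
    def __init__(self, params, lr=0.01):
        self.params = list(params)
        self.lr = lr

    def step(self):
        # Update outside the autograd graph so the step itself isn't differentiated
        with torch.no_grad():
            for p in self.params:
                if p.grad is not None:
                    p -= self.lr * p.grad

    def zero_grad(self):
        # Reset accumulated gradients before the next backward pass
        for p in self.params:
            if p.grad is not None:
                p.grad.zero_()
```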
## License

MIT License.

## Acknowledgments

- This project was created for educational purposes.
- Based on University of Chicago STAT 37710 (Machine Learning) course materials.