This is a scratchpad project that implements different LLM components from scratch and also builds and trains small model variants of popular LLM architectures.
Implemented so far:

- Attention
 
References:

- Sebastian Raschka's amazing book Build a Large Language Model From Scratch
- Transformer paper: Attention Is All You Need (arxiv.org/abs/1706.03762)
- MQA paper: Fast Transformer Decoding: One Write-Head is All You Need (arxiv.org/abs/1911.02150)
- GQA paper: GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints (arxiv.org/abs/2305.13245) (the MQA/GQA idea is sketched right after this list)
- DeepSeek-V2 paper (proposed MLA): DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model (arxiv.org/abs/2405.04434)
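
The MQA and GQA papers above change standard multi-head attention only in how many key/value heads are kept: MQA shares a single K/V head across all query heads, and GQA shares each K/V head across a group of query heads. Below is a minimal PyTorch sketch of that idea for illustration only; it is not this repo's actual code, and the class and parameter names (`GroupedQueryAttention`, `num_kv_heads`, ...) are placeholders.

```python
# Minimal sketch of grouped-query attention; num_kv_heads == 1 gives MQA,
# num_kv_heads == num_heads gives ordinary MHA. Illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GroupedQueryAttention(nn.Module):
    def __init__(self, d_model: int, num_heads: int, num_kv_heads: int):
        super().__init__()
        assert num_heads % num_kv_heads == 0 and d_model % num_heads == 0
        self.num_heads = num_heads
        self.num_kv_heads = num_kv_heads
        self.head_dim = d_model // num_heads
        self.q_proj = nn.Linear(d_model, num_heads * self.head_dim, bias=False)
        self.k_proj = nn.Linear(d_model, num_kv_heads * self.head_dim, bias=False)
        self.v_proj = nn.Linear(d_model, num_kv_heads * self.head_dim, bias=False)
        self.o_proj = nn.Linear(num_heads * self.head_dim, d_model, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, _ = x.shape
        q = self.q_proj(x).view(b, t, self.num_heads, self.head_dim).transpose(1, 2)
        k = self.k_proj(x).view(b, t, self.num_kv_heads, self.head_dim).transpose(1, 2)
        v = self.v_proj(x).view(b, t, self.num_kv_heads, self.head_dim).transpose(1, 2)
        # Each group of query heads shares one K/V head: repeat K/V to match the query heads.
        group_size = self.num_heads // self.num_kv_heads
        k = k.repeat_interleave(group_size, dim=1)
        v = v.repeat_interleave(group_size, dim=1)
        out = F.scaled_dot_product_attention(q, k, v, is_causal=True)  # causal self-attention
        out = out.transpose(1, 2).reshape(b, t, self.num_heads * self.head_dim)
        return self.o_proj(out)


# Example: 8 query heads sharing 2 K/V heads (GQA); num_kv_heads=1 would be MQA.
x = torch.randn(2, 16, 512)
attn = GroupedQueryAttention(d_model=512, num_heads=8, num_kv_heads=2)
print(attn(x).shape)  # torch.Size([2, 16, 512])
```

Setting `num_kv_heads=1` recovers MQA and `num_kv_heads=num_heads` recovers ordinary MHA; the smaller K/V head count is what shrinks the KV cache and memory traffic at decode time.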
 
Changelog:

- 2025/06/26 Project start
- 2025/06/27
- 2025/06/29
  - Add multi-query attention (MQA)
  - Add attention and MHA variants explanation in the attention README
- 2025/07/24
- 2025/08/05
  - Update MLA implementation to follow the DeepSeek-V2 official formula (core idea sketched below)
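
For context on the 2025/08/05 entry: in the DeepSeek-V2 formulation, MLA down-projects each hidden state into a small shared latent `c_kv` (which is what gets cached), then up-projects that latent back into per-head keys and values; the official formula additionally carries a decoupled RoPE key/query path and a query-side compression. The PyTorch sketch below shows only the low-rank compression core under those simplifications; it is not this repo's implementation, and names like `MLACore`, `kv_down`, and `d_latent` are placeholders.

```python
# Minimal sketch of MLA's low-rank KV compression (the core of the DeepSeek-V2 formula).
# Simplified: the decoupled RoPE key/query path and query-side compression are omitted.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MLACore(nn.Module):
    def __init__(self, d_model: int, num_heads: int, head_dim: int, d_latent: int):
        super().__init__()
        self.num_heads, self.head_dim = num_heads, head_dim
        self.q_proj = nn.Linear(d_model, num_heads * head_dim, bias=False)
        # Down-projection to the shared KV latent: this small vector is all that is cached.
        self.kv_down = nn.Linear(d_model, d_latent, bias=False)            # W^{DKV}
        # Up-projections reconstruct per-head keys and values from the latent.
        self.k_up = nn.Linear(d_latent, num_heads * head_dim, bias=False)  # W^{UK}
        self.v_up = nn.Linear(d_latent, num_heads * head_dim, bias=False)  # W^{UV}
        self.o_proj = nn.Linear(num_heads * head_dim, d_model, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, _ = x.shape
        c_kv = self.kv_down(x)  # (b, t, d_latent): the compressed KV representation
        q = self.q_proj(x).view(b, t, self.num_heads, self.head_dim).transpose(1, 2)
        k = self.k_up(c_kv).view(b, t, self.num_heads, self.head_dim).transpose(1, 2)
        v = self.v_up(c_kv).view(b, t, self.num_heads, self.head_dim).transpose(1, 2)
        out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.o_proj(out.transpose(1, 2).reshape(b, t, self.num_heads * self.head_dim))
```

In this simplified form, only `c_kv` would need to be cached per token (size `d_latent` instead of `2 * num_heads * head_dim`), which is where MLA's KV-cache saving relative to MHA/GQA comes from.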