gtdb

gtdb is a toy relational database system built from scratch for educational purposes.
It implements core database system components such as external sorting, logging and recovery, concurrency control, and query optimization — motivated by BuzzDB.

Features

External Sort

An external merge sort algorithm for sorting large datasets that cannot fit in memory.

The sorting is done in two phases:
First, the input is read in blocks, sorted in-memory, and flushed to temporary files.
Second, the temporary sorted runs are merged together using a priority queue.

Logging and Recovery

Inspired by ARIES, gtdb supports Write-Ahead Logging (WAL) to ensure durability and atomicity in the event of a crash.

Log records are written before the actual data pages are flushed (WAL protocol).
On startup, gtdb performs recovery in two phases:
- REDO: reapplies committed changes to ensure all effects are persisted.
- UNDO: rolls back incomplete transactions using the log history.

Concurrency Control

gtdb implements concurrency control using two-phase locking (2PL) and deadlock detection.

Each transaction acquires shared or exclusive locks on records or pages as needed.
The system enforces lock compatibility and supports lock upgrades.
Deadlocks are detected via a timeout, and resolved by aborting victim transactions.

Query Optimizer

A cost-based query optimizer motivated by Selinger-style Optimizer, using selectivity estimation and dynamic programming.

Table stats allow to estimate predicate selectivities and cost of scans, built on bucket-based histograms.
Join cardinality and cost, for equality joins or range scan / inequality joins.
Join order optimization using dynamic programming and cost estimation.

Resources

BuzzDB
Database System Concepts by Silberschatz, Korth, and Sudarshan.
CMU 15-445 Database Systems

License

MIT License

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
concurrency-control		concurrency-control
external-sort		external-sort
logging-and-recovery		logging-and-recovery
query-optimizer		query-optimizer
.gitignore		.gitignore
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

gtdb

Features

External Sort

Logging and Recovery

Concurrency Control

Query Optimizer

Resources

License

About

Uh oh!

Releases

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

gtdb

Features

External Sort

Logging and Recovery

Concurrency Control

Query Optimizer

Resources

License

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages