A batteries-included local data engineering platform running on Minikube. Start with Airflow, add Spark, and scale to a full data stack - all on your laptop.
KLDP provides a production-like Kubernetes data platform for local development, testing, and learning. No cloud costs, no complex setup - just run a script and start building pipelines.
# Prerequisites: Docker, Minikube, Helm, kubectl
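Before cloning, it can help to confirm that the four prerequisites are actually on your `PATH`. The following is a minimal sketch, not part of the KLDP scripts; `missing_tools` is a hypothetical helper name, and the binary names (`docker`, `minikube`, `helm`, `kubectl`) are assumed to be the usual ones for those tools.

```shell
# Sketch only: report which of the given command-line tools are not on PATH.
# `missing_tools` is a hypothetical helper, not something shipped with KLDP.
missing_tools() {
  missing=""
  for tool in "$@"; do
    # `command -v` is the POSIX way to test whether a command can be resolved.
    command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
  done
  if [ -n "$missing" ]; then
    echo "missing:$missing"
    return 1
  fi
  return 0
}
```

Running `missing_tools docker minikube helm kubectl` prints nothing and returns 0 when all four tools are found; otherwise it prints the missing names and returns 1. It only checks that the binaries exist, not their versions or whether the Docker daemon is running.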
# Clone and setup
git clone https://github.com/gridatek/kldp.git
cd kldp
# Start with Airflow only
./scripts/install-airflow.sh
# Or install everything
./scripts/install-all.sh
# Access Airflow UI
minikube service airflow-webserver -n airflow
- ✅ Apache Airflow - Workflow orchestration with KubernetesExecutor
- ✅ PostgreSQL - Metadata database
- ✅ MinIO - S3-compatible object storage
- 🔄 Spark Operator - Distributed data processing
- 🔄 Sample Pipelines - Airflow + Spark integration examples
- 📋 Prometheus + Grafana - Monitoring and observability
- 📋 Kafka - Streaming data platform
- 📋 JupyterHub - Interactive notebooks
- 📋 Data Catalog - Metadata management
kldp/
├── core/
│   └── airflow/              # Airflow Helm configs
├── compute/
│   └── spark/                # Spark operator configs
├── storage/
│   ├── minio/                # Object storage
│   └── postgresql/           # Shared database
├── monitoring/
│   └── observability/        # Prometheus, Grafana
├── scripts/
│   ├── init-cluster.sh       # Initialize minikube
│   ├── install-airflow.sh    # Install Airflow
│   ├── install-spark.sh      # Install Spark operator
│   ├── install-all.sh        # Full stack installation
│   └── destroy.sh            # Cleanup everything
├── examples/
│   ├── airflow-basics/       # Basic Airflow DAGs
│   └── spark-pipeline/       # Airflow + Spark examples
└── docs/
    ├── GETTING_STARTED.md
    ├── ARCHITECTURE.md
    └── TROUBLESHOOTING.md
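If a script fails with "permission denied" or "no such file" after cloning, a quick check of the `scripts/` directory can narrow it down. The sketch below assumes the five script names shown in the tree above and that it is given the repository root; `kldp_check_scripts` is a hypothetical helper, not part of the repository.

```shell
# Sketch only: report scripts from the tree above that are missing or not executable.
# `kldp_check_scripts` is a hypothetical name; the script list is copied from the layout.
kldp_check_scripts() {
  root="${1:-.}"   # repository root; defaults to the current directory
  status=0
  for name in init-cluster.sh install-airflow.sh install-spark.sh install-all.sh destroy.sh; do
    if [ ! -x "$root/scripts/$name" ]; then
      echo "not executable or missing: scripts/$name"
      status=1
    fi
  done
  return "$status"
}
```

From inside the cloned `kldp` directory, `kldp_check_scripts .` prints one line per problem script and returns non-zero if any were found. A missing executable bit can usually be restored with `chmod +x` on the reported file.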
Minimum:
- 4 CPU cores
- 8 GB RAM
- 20 GB disk space
Recommended:
- 6 CPU cores
- 12 GB RAM
- 40 GB disk space
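The two sets of figures listed above map directly onto `minikube start` flags (`--cpus`, `--memory`, `--disk-size`). The sketch below only prints the corresponding command so you can inspect it first; the tier names `small` and `large` and the function name `kldp_minikube_cmd` are hypothetical, and the repository's own `scripts/init-cluster.sh` may already choose its own values.

```shell
# Sketch only: print a `minikube start` command sized for one of the two tiers above.
# "small" = 4 CPU / 8 GB / 20 GB, "large" = 6 CPU / 12 GB / 40 GB (labels are assumptions).
kldp_minikube_cmd() {
  case "$1" in
    small) cpus=4; mem_mb=8192;  disk=20g ;;
    large) cpus=6; mem_mb=12288; disk=40g ;;
    *) echo "usage: kldp_minikube_cmd small|large" >&2; return 2 ;;
  esac
  # --memory is given in megabytes here; minikube also accepts a unit suffix.
  echo "minikube start --cpus=$cpus --memory=$mem_mb --disk-size=$disk"
}
```

For example, `kldp_minikube_cmd large` prints the command for the larger tier; pipe it to `sh` only after confirming that your machine (and, with the Docker driver, Docker's own resource limits) can actually provide those resources.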
- Learning: Hands-on experience with production data tools
- Development: Test pipelines locally before deploying to prod
- Prototyping: Experiment with data architectures risk-free
- Teaching: Workshop and training material
Contributions welcome! See CONTRIBUTING.md for guidelines.
MIT License - See LICENSE for details.
Built on top of:
Note: KLDP is optimized for local development. For production deployments, use managed services or properly configured production clusters.