Stars
Open source training courses about distributed databases and distributed systems
Automatically identify anti-patterns in SQL queries
A fast type checker and language server for Python
A reactive notebook for Python: run reproducible experiments, query with SQL, execute as a script, deploy as an app, and version with git. Stored as pure Python. All in a modern, AI-native editor.
Unofficial Rust implementation of Apache Iceberg with integration for DataFusion
A collection of RBIR projects and posts for anyone interested in joining this journey.
Data Agent Ready Warehouse: One for Analytics, Search, AI, Python Sandbox. Rebuilt from scratch. Unified architecture on your S3.
A curated awesome list of lists of interview questions. Feel free to contribute!
A pure-Python rules engine. Packed with components to build rules and a rule parser.
This is a repo with links to everything you'd ever want to learn about data engineering
A cross-platform way to express data transformation, relational algebra, standardized record expression and plans.
data load tool (dlt) is an open source Python library that makes data loading easy
Apache Spark - A unified analytics engine for large-scale data processing
The Auron accelerator for distributed computing framework (e.g., Spark) leverages native vectorized execution to accelerate query processing
Algorithm and data structure articles for https://cp-algorithms.com (based on http://e-maxx.ru)
Free & OSS PostgreSQL RDS / DBaaS, Self-Host PG like a Pro
This repository contains the source code for the paper First Order Motion Model for Image Animation
Parameterize, execute, and analyze notebooks
