Stars
DuckDB is an analytical in-process SQL database management system
Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics
The Auron accelerator for distributed computing frameworks (e.g., Spark) leverages native vectorized execution to accelerate query processing
Apache DataFusion Comet Spark Accelerator
The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance …
JuiceFS is a distributed POSIX file system built on top of Redis and S3.
Page Cache stat: get page cache stats for files on Linux
ClickHouse® is a real-time analytics database management system
A composable and fully extensible C++ execution engine library for data management systems.
Apache Celeborn is an elastic and high-performance service for shuffle and spilled data.
Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
Alibaba Java Diagnostic Tool Arthas
Apache Spark - A unified analytics engine for large-scale data processing