View vimagick's full-sized avatar
🐰
🐰🐰🐰🐰🐰🐰🐰🐰🐰

Organizations

@EasyPi

Stars

BigData

🐘
31 repositories

Distributed Big Data Orchestration Service

Java 1,763 371 Updated Jan 31, 2026

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

Python 44,616 16,683 Updated Mar 13, 2026

OpenRefine is a free, open source power tool for working with messy data and improving it

Java 11,776 2,130 Updated Mar 12, 2026

A data visualization and analytics component, especially well-suited for large and/or streaming datasets.

C++ 10,388 1,281 Updated Mar 11, 2026

Web-based SQL editor

JavaScript 5,181 818 Updated Aug 23, 2025

Pentaho Data Integration ( ETL ) a.k.a Kettle

Java 8,313 3,588 Updated Mar 13, 2026

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

Python 20,876 5,082 Updated Mar 13, 2026

Build, Manage and Deploy AI/ML Systems

Python 9,935 1,155 Updated Mar 13, 2026

Open-Source Web UI for Apache Kafka Management

Java 11,950 1,373 Updated Jul 26, 2024

The official repository for ROOT: analyzing, storing and visualizing big data, scientifically

C++ 3,128 1,500 Updated Mar 12, 2026

Cap'n Proto serialization/RPC system - core tools and C++ library

C++ 12,903 1,033 Updated Mar 12, 2026

An interactive ZooKeeper client.

Go 148 28 Updated May 18, 2025

Generic command line non-JVM Apache Kafka producer and consumer

C 5,732 500 Updated Jul 9, 2024

gRPC/REST proxy for Kafka

Go 789 114 Updated Apr 23, 2024

dCache - a system for storing and retrieving huge amounts of data, distributed among a large number of heterogeneous server nodes, under a single virtual filesystem tree with a variety of standard a…

Java 314 144 Updated Mar 12, 2026

The Metadata Platform for your Data and AI Stack

Java 11,654 3,392 Updated Mar 13, 2026

Command-line tool (+ C library) for converting SAS, Stata, and SPSS files 💾

C 303 79 Updated Feb 2, 2026

Apache StreamPipes - A self-service (Industrial) IoT toolbox to enable non-technical users to connect, analyze and explore IoT data streams.

Java 715 225 Updated Mar 12, 2026

Quickly search, compare, and analyze genomic and metagenomic data sets.

Python 540 90 Updated Mar 9, 2026

An R package for the Quantitative Analysis of Textual Data

R 875 191 Updated Mar 5, 2026

Visualizer for pandas data structures

TypeScript 5,070 432 Updated Mar 12, 2026

Apache DataFusion SQL Query Engine

Rust 8,495 1,997 Updated Mar 13, 2026
Java 247 129 Updated Mar 13, 2026

An Open Standard for lineage metadata collection

Java 2,348 434 Updated Mar 12, 2026

Collect, aggregate, and visualize a data ecosystem's metadata

Java 2,137 385 Updated Mar 12, 2026

Lightning fast data version control system for structured and unstructured machine learning datasets. We aim to make versioning datasets as easy as versioning code.

Rust 1,116 23 Updated Mar 13, 2026

A tool for data sampling, data generation, and data diffing

Scala 346 53 Updated Jan 8, 2026

Open Food Facts database, API server and web interface - 🐪🦋 Perl, CSS and JS coders welcome 😊 For helping in Python, see Robotoff or taxonomy-editor

HTML 949 578 Updated Mar 12, 2026

Parsing, Manipulation, and Visualization of Metabarcoding/Taxonomic data

R 144 29 Updated Feb 6, 2026

Python Client for OpenSearch

Python 457 216 Updated Mar 13, 2026