NVIDIA/data-federation-mesh

NVIDIA Data Federation Mesh

Data Federation Mesh (DFM) is a Python-based framework designed to facilitate the creation and orchestration of complex workflows that process data from various distributed sources and stream those data into applications. Our mission is to create a smart system that determines, for the user, where to run each operation of a data processing pipeline and whether data need to be moved for each operation to function.

- DFM Documentation -

Install | User-Guide | Tutorials | API

Project Structure

This is a monorepo containing multiple Python packages:

Package              Description
-------------------  ------------------------------------------------------------------------
nv-dfm-core          Core DFM package containing API, execution and generation engines and CLI
nv-dfm-lib-common    Common utilities shared across adapter libraries
nv-dfm-lib-weather   Weather and climate data adapters (GFS, ECMWF, HRRR, SFNO, cBottle)

data-federation-mesh/
├── packages/
│   ├── nv-dfm-core/             # Core framework package
│   ├── nv-dfm-lib-common/       # Common utilities
│   └── nv-dfm-lib-weather/      # Weather adapters
├── ci/                          # CI/CD infrastructure
├── docs/                        # Documentation
├── tutorials/                   # Tutorials, examples and startup folder 
└── tests/                       # Unit tests

Quick Start

Installation from PyPI

# Install core framework only
pip install nv-dfm-core

# Install weather adapters library (see note below)
pip install nv-dfm-lib-weather

# Install weather adapters with AI model support (requires GPU, see below)
pip install "nv-dfm-lib-weather[cbottle]"   # cBottle model adapters
pip install "nv-dfm-lib-weather[sfno]"      # SFNO model adapters
pip install "nv-dfm-lib-weather[all]"       # all AI model adapters

Note: nv-dfm-lib-weather depends on earth2studio, which may require additional dependencies depending on your environment. The SFNO and cBottle AI model adapters additionally require a CUDA-capable NVIDIA GPU and model-specific setup. See the installation guide for full prerequisites.
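
After installation, it can help to confirm which DFM packages are actually present in the active environment. The sketch below relies only on the standard library's importlib.metadata; the distribution names are the ones listed under Project Structure, while the helper name installed_versions is illustrative and not part of DFM.

```python
from importlib import metadata

# Distribution names as listed in the Project Structure section.
DFM_PACKAGES = ("nv-dfm-core", "nv-dfm-lib-common", "nv-dfm-lib-weather")


def installed_versions(names=DFM_PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

This only reports what is installed; it does not check GPU availability or the model-specific setup that the SFNO and cBottle adapters require.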

Development Setup

To work with the source, run tutorials, or contribute, clone the repository and use uv to manage the workspace.

Note: If you don't have uv installed, follow the uv installation instructions.

git clone https://github.com/NVIDIA/data-federation-mesh.git
cd data-federation-mesh

This is a multi-package workspace. Use uv sync to install packages into the local .venv:

# Install all workspace packages and their dependencies
uv sync --all-packages

# Install a single package (for example core only)
uv sync --package nv-dfm-core

# Install with tutorial extras (adds JupyterLab, ipywidgets, leafmap)
uv sync --all-packages --extra tutorials

Important: Each uv sync invocation reconfigures the virtual environment to match exactly the requested set of packages. Syncing for a single package will remove dependencies that are not required by that package. Use --all-packages when you need the full workspace available.
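
The "match exactly" behavior described above can be pictured as a set difference between what is installed and what the requested packages need. The following is a toy model only, not how uv is implemented: the function plan_sync and the dependency sets are hypothetical and chosen purely for illustration.

```python
def plan_sync(installed, required):
    """Return (to_install, to_remove) so the environment matches `required` exactly."""
    installed, required = set(installed), set(required)
    return sorted(required - installed), sorted(installed - required)


# Hypothetical dependency sets, for illustration only.
full_workspace = {
    "nv-dfm-core",
    "nv-dfm-lib-common",
    "nv-dfm-lib-weather",
    "earth2studio",
}
core_only = {"nv-dfm-core"}

# Syncing for a single package after a full-workspace sync:
# nothing new is needed, and everything outside `core_only` is removed.
to_install, to_remove = plan_sync(installed=full_workspace, required=core_only)
print("install:", to_install)
print("remove:", to_remove)
```

In this model, syncing for core only after a full sync installs nothing and removes the other three entries, which is why --all-packages is the safer choice when the full workspace is needed.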

Examples

  • For a basic introduction to federation setup and adapter development, see the zero-to-thirty tutorial.

  • To start your own federation from scratch, use the cookiecutter startup folder.

  • For a tutorial on creating pipelines and using adapters to load and process weather data, see weather-fed.

Overview

Watch the video

DFM is a programmable framework for managing and orchestrating services, distributed across potentially numerous sites, so that they collaborate to implement common functionality. It is engineered to deliver "glue code as a service" that facilitates the creation of complex data processing pipelines and workflows.

DFM consists of multiple sites, which are groups of collocated services that are deployed together. Multiple sites that communicate with each other in a peer-to-peer way form a federation. DFM can be approached from the perspective of developers and of users. Developers implement the functionality that each site provides through a plugin-like mechanism called adapters. Adapters are not exposed directly to users; instead, they are assigned within the federation to a public interface called operations. Users create and submit data processing pipelines to the federation using the provided operations API. DFM ensures that each operation executes on a dedicated site and that data are transferred between sites.
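
The relationship between sites, adapters, operations, and pipelines can be sketched with a small toy model. This is not DFM's API: the classes Site and Federation, the method names, and the "first site that provides the operation" routing rule are all hypothetical simplifications meant only to illustrate the concepts described above.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class Site:
    """A group of collocated services; adapters are keyed by operation name."""
    name: str
    adapters: Dict[str, Callable[[Any], Any]] = field(default_factory=dict)


@dataclass
class Federation:
    """A set of sites that together expose operations to users."""
    sites: List[Site]

    def site_for(self, operation: str) -> Site:
        """Pick the first site whose adapters implement the operation."""
        for site in self.sites:
            if operation in site.adapters:
                return site
        raise LookupError(f"no site provides operation {operation!r}")

    def run(self, pipeline: List[str], data: Any = None) -> Tuple[Any, List[str]]:
        """Run operations in order; log each step and each cross-site transfer."""
        log: List[str] = []
        current = None  # site currently holding the data
        for operation in pipeline:
            site = self.site_for(operation)
            if current is not None and current is not site:
                log.append(f"transfer {current.name} -> {site.name}")
            data = site.adapters[operation](data)
            log.append(f"{operation} @ {site.name}")
            current = site
        return data, log


# Two hypothetical sites: one loads data, the other processes it.
storage = Site("storage", {"load": lambda _: [1.0, 2.0, 3.0]})
compute = Site("compute", {"mean": lambda xs: sum(xs) / len(xs)})
federation = Federation([storage, compute])

result, log = federation.run(["load", "mean"])
print(result)
print(log)
```

The user-facing pipeline names only operations ("load", "mean"); which adapter runs, on which site, and when data must move between sites is decided by the federation, which mirrors the division of responsibilities between users and developers.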

NVIDIA Flare

DFM is built on top of NVIDIA Flare, which provides runtime services such as distributed messaging, job management, security, deployment, and a simulation framework.

DFM Command Line Interface

The DFM CLI is a command-line tool that facilitates management of DFM and the underlying NVIDIA Flare, and provides a convenient way to perform many DFM-related tasks, including development tasks such as testing and linting. See the DFM CLI Documentation for details.

Contributors

This project is currently not accepting contributions.

License

DFM is provided under the Apache License 2.0. Refer to the LICENSE file for the full license text.
