# FlowLog

Composable Datalog engine that compiles programs into efficient and scalable Differential Dataflow executables.

Quick Start · Architecture · Compiler CLI · FlowLog Paper

Status: FlowLog is under active development; interfaces may change without notice.

## Architecture

A `.dl` program flows through the following pipeline:

```
.dl source → parser → stratifier → planner → compiler → executable
                                     ▲  ▲
                                catalog  optimizer
```
| Crate | Role |
| --- | --- |
| `parser` | Pest grammar → AST |
| `stratifier` | Dependency analysis and SCC-based rule scheduling |
| `catalog` | Per-rule metadata used during planning |
| `optimizer` | Heuristic cost model for join ordering |
| `planner` | Lowers strata into dataflow transformation plans |
| `compiler` | Generates and builds a Timely / Differential Dataflow executable |
| `common` | Shared CLI and utility helpers |
| `profiler` | Optional execution statistics |
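The stratifier's SCC-based scheduling can be pictured with a small sketch. This is plain Python, not the `stratifier` crate's actual code: rules induce a dependency graph over predicates, mutually recursive predicates collapse into one stratum, and Tarjan's algorithm happens to emit the components dependencies-first, which is already a valid evaluation schedule.

```python
# Illustrative sketch of SCC-based stratification (not FlowLog's code).
from collections import defaultdict

def strata(rules):
    """rules: list of (head, [body predicates]).
    Returns a list of strata (sets of predicates) in evaluation order."""
    deps = defaultdict(set)   # head predicate -> predicates it reads
    preds = set()
    for head, body in rules:
        preds.add(head)
        preds.update(body)
        deps[head].update(body)

    # Tarjan's strongly connected components over the dependency graph.
    index, low, on_stack = {}, {}, set()
    stack, sccs, counter = [], [], [0]

    def visit(v):
        index[v] = low[v] = counter[0]; counter[0] += 1
        stack.append(v); on_stack.add(v)
        for w in deps[v]:
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:
            scc = set()
            while True:
                w = stack.pop(); on_stack.discard(w); scc.add(w)
                if w == v:
                    break
            # An SCC is popped only after every SCC it depends on,
            # so `sccs` accumulates in dependencies-first order.
            sccs.append(scc)

    for p in sorted(preds):
        if p not in index:
            visit(p)
    return sccs
```

For the `Reach` program later in this README, `strata([("Reach", ["Source"]), ("Reach", ["Reach", "Arc"])])` places `Source` and `Arc` before the recursive `Reach` stratum.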

## Getting Started

### Prerequisites

```sh
$ bash tools/env.sh
```

The bootstrap script installs a stable Rust toolchain and a few helper utilities. At a minimum you need rustup, cargo, and a compiler capable of building Timely/Differential (Rust 1.80+ recommended).

### Build the Workspace

```sh
$ cargo build --release
```

## Compiler CLI

Compile a FlowLog program into a Timely/Differential Dataflow executable:

```sh
$ flowlog <PROGRAM> [OPTIONS]
```

| Flag | Description | Required | Notes |
| --- | --- | --- | --- |
| `PROGRAM` | Path to a `.dl` file; accepts `all` or `--all` to iterate over every program in `example/`. | Yes | Resolved relative to the workspace unless absolute. |
| `-F, --fact-dir <DIR>` | Directory containing input CSVs referenced by `.input` directives. | When `.input` uses relative filenames | Prepends `<DIR>` to each `filename=` parameter; omit to use paths embedded in the program. |
| `-o <PATH>` | Path for the generated executable. | No | Defaults to the program stem (e.g., `reach.dl` → `./reach`). |
| `-D, --output-dir <DIR>` | Location for materializing `.output` relations. | Required when any relation uses `.output` | Pass `-` to print tuples to stderr instead of writing files. |
| `--mode <MODE>` | Execution semantics: `datalog-batch` (default), `datalog-inc`, `extend-batch`, or `extend-inc`. | No | `datalog-batch` uses `Present` diffs; the other modes use `i32`. Extended modes enable explicit loop blocks. |
| `-P, --profile` | Enable profiling (collect execution statistics). | No | Writes profiler logs. |
| `-h, --help` | Show full Clap help text. | No | Includes additional examples and environment variables. |
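The diff types mentioned for `--mode` reflect a core Differential Dataflow idea: every tuple carries a multiplicity. Batch Datalog only needs presence (set semantics), while incremental modes need signed counts so retractions can cancel earlier insertions. A minimal sketch of the signed-count behavior, in plain Python rather than FlowLog's actual types:

```python
# Illustrative sketch of signed-diff update semantics (not FlowLog code):
# each update is (tuple, signed count); a tuple whose net count reaches
# zero disappears from the collection, which is how a retraction cancels
# a prior insertion in incremental mode.
from collections import Counter

def apply_updates(state: Counter, updates):
    """updates: iterable of (row, signed_diff) pairs."""
    for row, diff in updates:
        state[row] += diff
        if state[row] == 0:
            del state[row]      # net count zero -> tuple is gone
    return state

db = Counter()
apply_updates(db, [((1, 2), +1), ((2, 3), +1)])  # insert two arcs
apply_updates(db, [((1, 2), -1)])                # retract one arc
```

After these updates only the arc `(2, 3)` remains, since the `(1, 2)` insertion and retraction cancel.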

## End-to-End Example

The example/reach.dl program computes nodes reachable from a small seed set. Below is the same program for reference.

Note: The example commands below only show batch-mode parameters. For incremental mode and profiler usage, please refer to the official website: https://www.flowlog-rs.com/

```
.decl Source(id: number)
.input Source(IO="file", filename="Source.csv", delimiter=",")

.decl Arc(x: number, y: number)
.input Arc(IO="file", filename="Arc.csv", delimiter=",")

.decl Reach(id: number)
.printsize Reach

Reach(y) :- Source(y).
Reach(y) :- Reach(x), Arc(x, y).
```

### 1. Prepare a Tiny Dataset

```sh
$ mkdir -p reach
$ cat <<'EOF' > reach/Source.csv
1
EOF

$ cat <<'EOF' > reach/Arc.csv
1,2
2,3
EOF
```
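As a sanity check, the fixpoint this tiny dataset should produce can be worked out with a short semi-naive evaluation sketch in plain Python (this is not FlowLog code; semi-naive evaluation joins only the newly derived facts each round, the same idea Differential Dataflow generalizes incrementally):

```python
# Semi-naive evaluation of the Reach program on the tiny dataset:
# Source = {1}, Arc = {(1,2), (2,3)}.
source = {1}
arc = {(1, 2), (2, 3)}

reach = set(source)      # Reach(y) :- Source(y).
delta = set(reach)       # facts derived in the latest round
while delta:
    # Reach(y) :- Reach(x), Arc(x, y), joining only the new x's.
    new = {y for (x, y) in arc if x in delta} - reach
    reach |= new
    delta = new

print(sorted(reach))     # [1, 2, 3]
```

So the compiled binary should report a `Reach` relation of size 3 for this input.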

### 2. Compile and Run

```sh
# Compile the .dl program into a binary executable
$ flowlog example/reach.dl -F reach -o reach_bin -D -

# Run the generated executable
$ ./reach_bin -w 4
```

Key flags:

- `-F reach` points the compiler at the directory holding `Source.csv` and `Arc.csv`.
- `-o reach_bin` names the output executable.
- `-D -` prints IDB tuples and sizes to stderr; pass a directory path to materialize CSV output files instead.
- `-w 4` tells the generated executable to use 4 worker threads.

## End-to-End Tests

End-to-end tests live in tests/, organized by evaluation semantics:

| Directory | Mode |
| --- | --- |
| `tests/datalog-batch/` | Standard batch Datalog (default) |
| `tests/datalog-inc/` | Incremental Datalog |
| `tests/extend-batch/` | Extended batch (loops) |
| `tests/extend-inc/` | Extended incremental |

Run the full suite with:

```sh
$ bash tests/run.sh
```

Or run specific tests by name:

```sh
$ bash tests/run.sh loop_fixpoint negation
```

Each test is a directory containing:

- `program.dl` — Datalog source (must use `.output` directives).
- `data/` — Optional CSV input facts copied into the test working directory.
- `expected/` — Expected output files (one per output relation).
- `commands.txt` — Optional incremental transcript (enables incremental mode).
- `runtime_flags` — Optional runtime flags (e.g., `-w 4` for multi-worker).

## Background Reading

FlowLog: Efficient and Extensible Datalog via Incrementality
Hangdong Zhao, Zhenghong Yu, Srinag Rao, Simon Frisk, Zhiwei Fan, Paraschos Koutris
VLDB 2026 (Boston) — pVLDB · VLDB 2026 Artifacts

## Contributing

Contributions and bug reports are welcome. Please open an issue, or submit a pull request once you have verified the change with `cargo test` and `bash tests/run.sh`.

## About

FlowLog is an actively developed Datalog-to-Timely compiler that turns Souffle-compatible programs into standalone Differential Dataflow executables.
