Spacker is a unified framework for state migration in distributed stream processing engines (SPEs) such as Apache Flink. It supports configurable state migration strategies to meet different performance goals, such as minimizing latency spikes, shortening completion time, and reducing system overhead. Spacker separates the logical planning of state migration from its physical execution, which allows fine-grained control over migration at key-level granularity.
Spacker decouples the state migration process into two main components:
- Planning: Defines the migration strategy (key prioritization, progressiveness, and replication).
- Execution: Implements the non-disruptive protocol to physically move the state, ensuring minimal disruption to ongoing data processing.
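The split between planning and execution can be illustrated with a small, self-contained sketch. It is not Spacker's actual API: `MigrationPlan`, `plan_migration`, and `execute` are hypothetical names, state is modeled as plain dictionaries, "progressiveness" is assumed to mean moving keys in fixed-size batches, and "replication" is assumed to mean that the source keeps a copy of the migrated state.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List


@dataclass
class MigrationPlan:
    """Planning side (hypothetical): what to move and in which order."""
    keys: List[str]          # keys in priority order, highest priority first
    batch_size: int          # assumed meaning of progressiveness: keys per round
    replicate: bool = False  # assumed meaning of replication: source keeps a copy

    def batches(self) -> Iterator[List[str]]:
        # Split the ordered keys into consecutive rounds of batch_size keys.
        for i in range(0, len(self.keys), self.batch_size):
            yield self.keys[i:i + self.batch_size]


def plan_migration(key_load: Dict[str, int], batch_size: int,
                   replicate: bool = False) -> MigrationPlan:
    """Illustrative strategy: migrate the most heavily loaded keys first."""
    ordered = sorted(key_load, key=lambda k: (-key_load[k], k))
    return MigrationPlan(ordered, batch_size, replicate)


def execute(plan: MigrationPlan, source: Dict[str, int],
            target: Dict[str, int]) -> int:
    """Execution side (hypothetical): move state round by round.

    Returns the number of rounds that were needed.
    """
    rounds = 0
    for batch in plan.batches():
        for key in batch:
            if key not in source:
                continue  # nothing to move for this key
            target[key] = source[key]
            if not plan.replicate:
                del source[key]
        rounds += 1
    return rounds


# Per-key state held by the source and target operator instances.
source = {"a": 10, "b": 20, "c": 30}
target: Dict[str, int] = {}

# Observed load per key drives the priority order.
plan = plan_migration({"a": 5, "b": 50, "c": 1}, batch_size=2)
rounds = execute(plan, source, target)
print(plan.keys, rounds, source, target)
```

Because the plan is plain data, a different strategy (another key order, batch size, or replication choice) can be swapped in without touching the execution loop.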
- Python3
- Zookeeper
- Kafka
- Java 1.8
The source code of Spacker lives in Spacker-on-Flink, because Flink already provides a network stack that we use for RPC among Spacker's components.
The main source code entrypoint is in flink-runtime/spector/.
- Compile Spacker-on-Flink with: mvn clean install -DskipTests -Dcheckstyle.skip -Drat.skip=true
- Compile experiments with: mvn clean package
- Try Spacker: change to Spacker-on-Flink/build-target and start a standalone cluster with: ./bin/start-cluster.sh
- Launch the example StatefulDemo from the experiments folder with: ./bin/flink run -c flinkapp.StatefulDemo experiments/target/testbed-1.0-SNAPSHOT.jar
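The build and launch steps above can be chained in a small Python 3 helper. This is only a sketch: the commands are copied from the steps, but the working directories are assumptions about the repository layout, and the jar path is kept exactly as written, so it may need to be made absolute on your machine. The helper defaults to a dry run that only prints what it would execute.

```python
import subprocess
from typing import List, Tuple

# (working directory, command) pairs taken from the steps above.
# The working directories are assumptions about the repository layout.
STEPS: List[Tuple[str, List[str]]] = [
    ("Spacker-on-Flink",
     ["mvn", "clean", "install", "-DskipTests", "-Dcheckstyle.skip", "-Drat.skip=true"]),
    ("experiments",
     ["mvn", "clean", "package"]),
    ("Spacker-on-Flink/build-target",
     ["./bin/start-cluster.sh"]),
    ("Spacker-on-Flink/build-target",
     ["./bin/flink", "run", "-c", "flinkapp.StatefulDemo",
      "experiments/target/testbed-1.0-SNAPSHOT.jar"]),  # path as written; may need adjusting
]


def run_steps(dry_run: bool = True) -> List[str]:
    """Print each step; execute it only when dry_run is False."""
    shown = []
    for cwd, cmd in STEPS:
        line = "(cd {} && {})".format(cwd, " ".join(cmd))
        shown.append(line)
        print(line)
        if not dry_run:
            # check=True stops at the first failing step.
            subprocess.run(cmd, cwd=cwd, check=True)
    return shown


lines = run_steps(dry_run=True)
```

Call `run_steps(dry_run=False)` from the repository root to actually run the commands once the printed paths look right.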
Spacker is integrated into Flink and uses Flink's configuration tools for its parameters. Spacker-on-Flink options are set in flink-conf.yaml.
In this project, we have mainly run experiments for three workloads:
- Stock experiment
- Nexmark experiment
- Micro-Benchmark experiment
The scripts to run the experiments are in the experiments/exp_scripts folder, which has three main sub-folders:
- spector_reconfig contains scripts to run the corresponding experiments.
- flink-conf contains experiment cluster setup configurations.
- analysis contains the analysis scripts to process raw data and draw the figures shown in our paper.
After configuring the local environment, every experiment supports a one-click run. For example, use spector_reconfig/micro-bench.sh to reproduce the micro-benchmark experiment.