[ENHANCEMENT]: Improve the behavior of replication since_seq parameter #5867

@nickva

Description

Provide a brief overview of what the new feature is all about

Currently the since_seq replication parameter forces a replication job to start from the given sequence, ignoring any checkpoints. That's its intended behavior, but it only works well until the replication first checkpoints. After the job checkpoints and restarts, those later checkpoints are ignored and the job starts from since_seq again. So a job with a since_seq parameter replays the whole changes feed from the original since_seq value on every restart and, what's worse, clobbers and rewinds the replication checkpoints in the process.
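To make the problem concrete, here is a toy model of the behavior described above. It is only a sketch: sequences are modeled as plain integers (real CouchDB sequences are opaque), and `run_job_current` is a hypothetical name for illustration, not a function in the replicator.

```python
def run_job_current(since_seq, checkpoint, changes_processed):
    """Toy model of one job run under the current behavior.

    Returns (start_seq, new_checkpoint). All names are hypothetical.
    """
    # When since_seq is set it always wins, whatever the checkpoint says.
    start = since_seq if since_seq is not None else checkpoint
    # The job writes a checkpoint after processing some changes.
    new_checkpoint = start + changes_processed
    return start, new_checkpoint


# First run: no checkpoint yet, since_seq=500, job processes 400 changes.
start1, checkpoint1 = run_job_current(500, 0, 400)

# Restart: a checkpoint at 900 exists, but the job starts at 500 again
# and its next checkpoint (600) overwrites the later one.
start2, checkpoint2 = run_job_current(500, checkpoint1, 100)
```

In this model the second run starts at 500 again and its checkpoint ends up behind the one written by the first run, which is the rewind described above.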

Tell us how the new feature should work. Be specific

A better behavior might be to compare the pending changes count for the checkpointed sequence (or for 0 if there is no checkpoint) with the pending changes count for since_seq, and start from whichever has fewer pending changes. This way, if a user sets and forgets a since_seq, the replication will start from that value if it's later than the last checkpointed sequence. Replication will then make some progress and write new checkpoints, and on later restarts it will use those newer checkpoints instead.
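The selection rule proposed above could be sketched roughly as follows. This is an illustration under assumptions, not the replicator's implementation: sequences are plain integers, `pick_start_seq` and `pending` are hypothetical names, how the pending count is actually obtained is left abstract, and the tie-break in favor of the checkpoint is my own choice.

```python
def pick_start_seq(since_seq, checkpoint_seq, pending):
    """Choose the starting sequence with the fewest pending changes.

    `pending` is a callable returning the number of changes still
    pending after a given sequence (hypothetical helper).
    """
    # No checkpoint means starting from the beginning of the feed.
    if checkpoint_seq is None:
        checkpoint_seq = 0
    if since_seq is None:
        return checkpoint_seq
    # Assumption: on a tie, prefer the checkpoint.
    if pending(since_seq) < pending(checkpoint_seq):
        return since_seq
    return checkpoint_seq


# Toy source whose latest sequence is 1000.
def toy_pending(seq):
    return max(1000 - seq, 0)


first_start = pick_start_seq(500, None, toy_pending)  # no checkpoint yet
later_start = pick_start_seq(500, 900, toy_pending)   # checkpoint is ahead
```

With no checkpoint the job starts from since_seq; once a checkpoint has moved past since_seq, the checkpoint is used instead, so the stale parameter no longer rewinds the job.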

If a user really wants to rewind the replication, they can still do it by deleting the replication checkpoints on both source and target and setting an arbitrary since_seq. On startup, since the checkpoints were deleted, the checkpoint side falls back to 0 and has the largest pending count, so since_seq will be used.

Not required. Suggest how to implement the addition or change

No response

Additional Context

No response
