checkpoint: support checkpoint create command#4484

Merged
AkihiroSuda merged 10 commits into
containerd:mainfrom
ChengyuZhu6:checkpoint
Oct 27, 2025
@ChengyuZhu6
Member

Fixes: #4483

@ChengyuZhu6 ChengyuZhu6 mentioned this pull request Aug 20, 2025
4 tasks
@ChengyuZhu6 ChengyuZhu6 added this to the v2.1.5 (?) milestone Aug 29, 2025
@ChengyuZhu6 ChengyuZhu6 force-pushed the checkpoint branch 8 times, most recently from b6e106e to fee41f4 on September 1, 2025 11:44
@ChengyuZhu6 ChengyuZhu6 marked this pull request as ready for review September 14, 2025 04:09
@ChengyuZhu6 ChengyuZhu6 force-pushed the checkpoint branch 7 times, most recently from 292ceb4 to c4ed970 on September 14, 2025 08:36
@ChengyuZhu6
Member Author

ChengyuZhu6 commented Sep 14, 2025

# nerdctl run -d --name test-container ghcr.io/stargz-containers/alpine:3.13-org sleep infinity
b2a7213fd10f898f5af65493692aa2d1c18b9abd26fc477cc2c302170398987d

# nerdctl checkpoint create --leave-running=true --checkpoint-dir /tmp/test-checkpoints test-container test-checkpoint
test-checkpoint

# ls /tmp/test-checkpoints/test-checkpoint/
cgroup.img        files.img      ipcns-var-11.img    pages-1.img   tmpfs-dev-49.tar.gz.img  tmpfs-dev-55.tar.gz.img
core-1.img        fs-1.img       mm-1.img            pstree.img    tmpfs-dev-52.tar.gz.img  utsns-12.img
descriptors.json  ids-1.img      mountpoints-13.img  seccomp.img   tmpfs-dev-53.tar.gz.img
fdinfo-2.img      inventory.img  pagemap-1.img       timens-0.img  tmpfs-dev-54.tar.gz.img

@ChengyuZhu6 ChengyuZhu6 force-pushed the checkpoint branch 10 times, most recently from 05cc852 to 61e9bac on September 15, 2025 02:35
ExitCode: 0,
Output: expect.All(
func(_ string, t tig.T) {
// Validate state continuity only: counter should not reset and must keep increasing
Member

Shouldn't this check the in-memory state?
e.g., redis

Member Author

I have changed the test to mount /state as tmpfs to check the in-memory state.
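
The continuity check discussed in this thread (the counter must not reset and must keep increasing) can be sketched roughly as below. This is not the test code from the PR: the function name `checkContinuity` and its three sample points are assumptions made for illustration, standing in for values read from a counter kept under the tmpfs-mounted `/state`.

```go
package main

import "fmt"

// checkContinuity reports an error if a counter sampled around a
// checkpoint/restore cycle was reset or stopped advancing.
//
//	before:   last value read before the checkpoint was taken
//	restored: first value read after the container was restored
//	later:    a value read some time after restored
//
// Hypothetical helper; the real test in the PR may be structured differently.
func checkContinuity(before, restored, later int) error {
	if restored < before {
		return fmt.Errorf("counter reset: %d before checkpoint, %d after restore", before, restored)
	}
	if later <= restored {
		return fmt.Errorf("counter stalled: %d after restore, %d later", restored, later)
	}
	return nil
}

func main() {
	// In-memory state survived: the counter carries on from where it was.
	fmt.Println(checkContinuity(41, 42, 45))
	// In-memory state was lost: the counter started again from a low value.
	fmt.Println(checkContinuity(41, 1, 4))
}
```

Because the counter lives on tmpfs, a restore that lost in-memory state would restart it from a low value, which the first comparison catches.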

@ChengyuZhu6 ChengyuZhu6 force-pushed the checkpoint branch 4 times, most recently from c888a8f to 951b250 on October 17, 2025 06:02
@AkihiroSuda
Member

CI can't be green after restarting several times 😢

@ChengyuZhu6
Member Author

> CI can't be green after restarting several times 😢

That's so sad. I think the CI failures are not related to this PR.

@AkihiroSuda
Member

Could you try rebasing after merging:

@ChengyuZhu6
Member Author

> Could you try rebasing after merging:

Sure.

@ChengyuZhu6
Member Author

Rebased onto main; no code change.

- Create checkpoints from running containers using containerd APIs
- Support both leave-running and exit modes via --leave-running flag
- Configurable checkpoint directory via --checkpoint-dir flag

Signed-off-by: ChengyuZhu6 <hudson@cyzhu.com>
add unit tests for checkpoint create command.

Signed-off-by: ChengyuZhu6 <hudson@cyzhu.com>
add checkpoint create command reference.

Signed-off-by: ChengyuZhu6 <hudson@cyzhu.com>
add checkpoint restore support to container start.
e.g.:
$ nerdctl run --name cr -d busybox sleep infinity
$ nerdctl checkpoint create cr checkpoint1
$ nerdctl start --checkpoint checkpoint1 cr

Signed-off-by: ChengyuZhu6 <hudson@cyzhu.com>
add unit test for container start with checkpoint.

Signed-off-by: ChengyuZhu6 <hudson@cyzhu.com>
add nerdctl start with checkpoint command reference.

Signed-off-by: ChengyuZhu6 <hudson@cyzhu.com>
install criu in ci to test checkpoint.

Signed-off-by: ChengyuZhu6 <hudson@cyzhu.com>
introduce taskoptions to reduce the number of arguments.
Otherwise, CI fails with:
```
Error: pkg/taskutil/taskutil.go:51:1: argument-limit:
maximum number of arguments per function exceeded; max 12 but got 13
func NewTask(ctx context.Context, client *containerd.Client,
container containerd.Container, attachStreamOpt []string,
isInteractive, isTerminal, isDetach bool, con console.Console,
logURI, detachKeys, namespace string, detachC chan<- struct{}, checkpointDir string)
```

Signed-off-by: ChengyuZhu6 <hudson@cyzhu.com>
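
The refactoring described in this commit follows the common options-struct pattern: related parameters are grouped into one struct so the function stays under the linter's argument limit. The sketch below illustrates the idea only. The `TaskOptions` fields are guessed from the parameter names in the linter output above, and `describeTask` is a hypothetical stand-in for `NewTask`; neither is the actual definition in `pkg/taskutil`.

```go
package main

import "fmt"

// TaskOptions groups what used to be separate positional arguments.
// Field names mirror the parameters shown in the linter error; the real
// struct in the PR may differ (assumption).
type TaskOptions struct {
	AttachStreamOpt []string
	IsInteractive   bool
	IsTerminal      bool
	IsDetach        bool
	LogURI          string
	DetachKeys      string
	Namespace       string
	CheckpointDir   string // empty means start normally, without restoring
}

// describeTask is a hypothetical stand-in for NewTask. It takes two
// arguments instead of thirteen and reads everything else from opts.
func describeTask(containerID string, opts TaskOptions) string {
	mode := "fresh start"
	if opts.CheckpointDir != "" {
		mode = "restore from " + opts.CheckpointDir
	}
	return fmt.Sprintf("%s: %s (detach=%t)", containerID, mode, opts.IsDetach)
}

func main() {
	// Callers set only the fields they need; the rest keep zero values.
	opts := TaskOptions{
		IsDetach:      true,
		Namespace:     "default",
		CheckpointDir: "/tmp/test-checkpoints/test-checkpoint",
	}
	fmt.Println(describeTask("cr", opts))
}
```

A side benefit of this shape is that adding another option later does not change the function signature or any existing call site.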
@AkihiroSuda
Member

Currently, nerdctl CI uses Docker 28.0.4, and Docker 28.x
has a known regression that breaks checkpoint/restore functionality.
The issue is tracked in the moby/moby project as moby/moby#50750.

Signed-off-by: ChengyuZhu6 <hudson@cyzhu.com>
@ChengyuZhu6
Member Author

Re-push to trigger ci, no code change.

Signed-off-by: ChengyuZhu6 <hudson@cyzhu.com>
Member

@AkihiroSuda AkihiroSuda left a comment

Thanks

Development

Successfully merging this pull request may close these issues.

Track development: nerdctl checkpoint create

2 participants