features/snapshots/snapshots.md
66 additions & 3 deletions
@@ -29,10 +29,10 @@ conform to an api (the SMAPI) which has operations including
 - vdi_snapshot: create a snapshot of a disk
 
 
-Example vhd implementation
-==========================
+File-based vhd implementation
+=============================
 
-The existing "EXT" and "NFS" Xapi SM plugins store disk data in
+The existing "EXT" and "NFS" file-based Xapi SM plugins store disk data in
 trees of .vhd files as in the following diagram:
 
 
@@ -41,6 +41,32 @@ From the XenAPI point of view, we have one current VDI and a set of snapshots,
 each taken at a different point in time. These VDIs correspond to leaf vhds in
 a tree stored on disk, where the non-leaf nodes contain all the shared blocks.
 
+The vhd files are always thinly provisioned, which means they only allocate new
+blocks on an as-needed basis. The snapshot leaf vhd files contain only vhd
+metadata and are therefore very small (a few KiB). The parent nodes contain
+only the shared blocks. The current leaf initially contains only the vhd
+metadata, so it is also very small (a few KiB), and it grows only when the VM
+writes blocks.
+
+File-based vhd implementations are a good choice if a "gold image" snapshot
+is going to be cloned many times.
+
+Block-based vhd implementation
+==============================
+
+The existing "LVM", "LVMoISCSI" and "LVMoHBA" block-based Xapi SM plugins store
+disk data in trees of .vhd files contained within LVM logical volumes:
+
+
+
+Non-snapshot VDIs are always stored full size (a.k.a. thickly provisioned).
+When parent nodes are created, they are automatically shrunk to the minimum size
+needed to store the shared blocks. The LVs corresponding to snapshot VDIs
+contain only vhd metadata and by default consume 8 MiB. Note: this is different
+from VDI.clones, which are stored full size.
+
+Block-based vhd implementations are not a good choice if a "gold image" snapshot
+is going to be cloned many times, since each clone will be stored full size.
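The difference between the two implementations for the "gold image" case can be illustrated with rough arithmetic. The sketch below is not part of the change itself: the function names are hypothetical, "a few KiB" of vhd metadata is assumed to be 4 KiB, and vhd and LVM allocation overheads are ignored.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

# Assumption: "a few KiB" of vhd metadata is modelled as exactly 4 KiB.
ASSUMED_VHD_METADATA = 4 * KIB
# From the text: an LV backing a snapshot VDI consumes 8 MiB by default.
SNAPSHOT_LV_SIZE = 8 * MIB


def file_based_usage(shared_bytes, clones, written_per_clone=0):
    """Thin provisioning: each clone leaf holds metadata plus its own writes."""
    return shared_bytes + clones * (ASSUMED_VHD_METADATA + written_per_clone)


def block_based_usage(shared_bytes, virtual_size, clones, snapshots=0):
    """Thick provisioning: clones are full size, snapshots are small LVs."""
    return shared_bytes + clones * virtual_size + snapshots * SNAPSHOT_LV_SIZE


# Hypothetical gold image: 10 GiB virtual size, 4 GiB of shared blocks,
# cloned 100 times before any clone has written data.
file_total = file_based_usage(4 * GIB, clones=100)
block_total = block_based_usage(4 * GIB, virtual_size=10 * GIB, clones=100)
print(file_total / GIB, block_total / GIB)
```

Under these assumptions the file-based SR needs about 4 GiB (the shared blocks plus 400 KiB of metadata), while the block-based SR needs 1004 GiB, because every clone is stored at full size.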
 
 Hypothetical LUN implementation
 ===============================
@@ -85,9 +111,46 @@ We have fields that help navigate the new objects: ```VM.snapshot_of```,
 and ```VDI.snapshot_of```. These, like you would expect, point to the
 relevant other objects.
 
+Deleting VM snapshots
+=====================
+
+When a snapshot is deleted, Xapi calls the SM API `vdi_delete`. The Xapi SM
+plugins which use vhd format data do not reclaim space immediately; instead
+they mark the corresponding vhd leaf node as "hidden" and, at some point later,
+run a garbage collector process.
+
+The garbage collector will first determine whether a "coalesce" should happen, i.e.
+whether any parent node has only one child, meaning its "shared" blocks are only
+"shared" with one other node. In the following example the snapshot delete leaves
+such a parent node and the coalesce process copies blocks from the redundant
+parent's only child into the parent:
+
+
+
+Note that if the vhd data is being stored in LVM, then the parent node must
+first be expanded to full size to accommodate the writes. Unfortunately
+this means the act of reclaiming space actually consumes space itself, which
+means it is important to never completely run out of space in such an SR.
+
+Once the blocks have been copied, we can cut one of the parents out of the
+tree by relinking its children into their grandparent:
+
+
+
+Finally the garbage collector can remove unused vhd files / LVM LVs:
+
+
+
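The delete, coalesce, relink and remove steps described above can be sketched on an in-memory tree. This is a toy model and not the real SM garbage collector: `Node`, `read_block` and `garbage_collect` are hypothetical names, blocks are plain dictionary entries, and the sketch assumes that a coalesce only happens when the single remaining child is itself a parent, as in the example in the text.

```python
class Node:
    """One vhd in the tree; leaves are VDIs, inner nodes hold shared blocks."""

    def __init__(self, name, parent=None, blocks=None):
        self.name = name
        self.parent = parent
        self.blocks = dict(blocks or {})  # block index -> data
        self.children = []
        self.hidden = False
        if parent is not None:
            parent.children.append(self)


def read_block(node, index):
    """Read a block by walking up the chain until some node holds it."""
    while node is not None:
        if index in node.blocks:
            return node.blocks[index]
        node = node.parent
    return None  # block was never written


def vdi_delete(leaf):
    """Deleting only marks the leaf as hidden; space is reclaimed later."""
    leaf.hidden = True


def is_garbage(node):
    return node.hidden and not node.children


def garbage_collect(root):
    """Coalesce, relink and remove; returns the names of removed nodes."""
    removed = []
    pending = [root]
    while pending:
        node = pending.pop()
        live = [c for c in node.children if not is_garbage(c)]
        if len(live) == 1 and live[0].children:
            only = live[0]
            # Coalesce: the child's blocks are newer, so they overwrite.
            node.blocks.update(only.blocks)
            # Relink: the child's children now hang off their grandparent.
            for grandchild in only.children:
                grandchild.parent = node
            node.children = [c for c in node.children if c is not only]
            node.children += only.children
            only.children = []
            removed.append(only.name)
            pending.append(node)  # the node may be coalescable again
            continue
        # Remove hidden leaves, then descend into what is left.
        removed += [c.name for c in node.children if is_garbage(c)]
        node.children = live
        pending.extend(live)
    return removed


base = Node("base", blocks={0: "a", 1: "b"})
snap1 = Node("snap1", parent=base)
mid = Node("mid", parent=base, blocks={1: "c"})
snap2 = Node("snap2", parent=mid)
current = Node("current", parent=mid, blocks={2: "d"})

vdi_delete(snap1)
removed = garbage_collect(base)
```

Deleting `snap1` leaves `base` with `mid` as its only live child, so the sketch copies `mid`'s blocks into `base`, relinks `snap2` and `current` to `base`, and then drops `mid` and the hidden `snap1`. Reads through `current` and `snap2` return the same data before and after the pass, which is the property the coalesce has to preserve.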
 Reverting VM snapshots
 ======================
 
+The XenAPI call `VM.revert` overwrites the VM metadata with the snapshot VM
+metadata, deletes the current VDIs and replaces them with clones of the
+snapshot VDIs. Note there is no "vdi_revert" in the SMAPI.
+
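The three effects of `VM.revert` listed above can be sketched against a fake storage backend. Everything here is a hypothetical stand-in and not Xapi code: `FakeSM` is an in-memory dictionary, VMs are plain dicts, and `vdi_clone` is assumed as the storage-side counterpart of the clone operation the text mentions. The point is only that revert can be composed from delete and clone, with no dedicated "vdi_revert".

```python
import itertools


class FakeSM:
    """Hypothetical in-memory stand-in for a storage backend."""

    def __init__(self):
        self.vdis = {}  # VDI id -> id of the VDI it was cloned from, or None
        self._ids = itertools.count(1)

    def vdi_create(self):
        vdi = f"vdi-{next(self._ids)}"
        self.vdis[vdi] = None
        return vdi

    def vdi_clone(self, source):
        vdi = f"vdi-{next(self._ids)}"
        self.vdis[vdi] = source
        return vdi

    def vdi_delete(self, vdi):
        del self.vdis[vdi]


def revert(vm, snapshot, sm):
    """Revert `vm` to `snapshot` using only delete and clone operations."""
    # 1. Overwrite the VM metadata with the snapshot VM metadata.
    vm["metadata"] = dict(snapshot["metadata"])
    # 2. Delete the current VDIs.
    for vdi in vm["vdis"]:
        sm.vdi_delete(vdi)
    # 3. Replace them with clones of the snapshot VDIs.
    vm["vdis"] = [sm.vdi_clone(vdi) for vdi in snapshot["vdis"]]


sm = FakeSM()
disk = sm.vdi_create()
snap_disk = sm.vdi_clone(disk)  # stands in for the snapshot's VDI
vm = {"metadata": {"memory_mib": 2048}, "vdis": [disk]}
snapshot = {"metadata": {"memory_mib": 1024}, "vdis": [snap_disk]}

revert(vm, snapshot, sm)
```

After the call the snapshot's VDI is untouched, the VM's old disk is gone, and the VM points at a fresh clone of the snapshot disk, so the snapshot can be reverted to again later.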
+Revert implementation details
+-----------------------------
+
 This is the process by which we revert a VM to a snapshot. The
 first thing to notice is that there is some logic that is called
 from [message_forwarding.ml](https://github.com/xapi-project/xen-api/blob/ce6d3f276f0a56ef57ebcf10f45b0f478fd70322/ocaml/xapi/message_forwarding.ml#L1528),