Commit 3dc7521

Merge branch 'release-1.8' into patch-51
2 parents 4ebd0e3 + 6dcb9d4 commit 3dc7521

12 files changed: +207 -65 lines

_data/tasks.yml

Lines changed: 1 addition & 1 deletion
@@ -124,6 +124,7 @@ toc:
 - docs/tasks/administer-cluster/quota-pod-namespace.md
 - docs/tasks/administer-cluster/quota-api-object.md
 - docs/tasks/administer-cluster/opaque-integer-resource-node.md
+- docs/tasks/administer-cluster/cpu-management-policies.md
 - docs/tasks/administer-cluster/access-cluster-api.md
 - docs/tasks/administer-cluster/access-cluster-services.md
 - docs/tasks/administer-cluster/securing-a-cluster.md
@@ -140,7 +141,6 @@ toc:
 - docs/tasks/administer-cluster/cpu-memory-limit.md
 - docs/tasks/administer-cluster/out-of-resource.md
 - docs/tasks/administer-cluster/reserve-compute-resources.md
-- docs/tasks/administer-cluster/cpu-management-policies.md
 - docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods.md
 - docs/tasks/administer-cluster/declare-network-policy.md
 - title: Install Network Policy Provider

docs/concepts/architecture/nodes.md

Lines changed: 24 additions & 1 deletion
@@ -65,7 +65,27 @@ The node condition is represented as a JSON object. For example, the following r
 
 If the Status of the Ready condition is "Unknown" or "False" for longer than the `pod-eviction-timeout`, an argument is passed to the [kube-controller-manager](/docs/admin/kube-controller-manager) and all of the Pods on the node are scheduled for deletion by the Node Controller. The default eviction timeout duration is **five minutes**. In some cases when the node is unreachable, the apiserver is unable to communicate with the kubelet on it. The decision to delete the pods cannot be communicated to the kubelet until it re-establishes communication with the apiserver. In the meantime, the pods which are scheduled for deletion may continue to run on the partitioned node.
 
-In versions of Kubernetes prior to 1.5, the node controller would [force delete](/docs/concepts/workloads/pods/pod/#force-deletion-of-pods) these unreachable pods from the apiserver. However, in 1.5 and higher, the node controller does not force delete pods until it is confirmed that they have stopped running in the cluster. One can see these pods which may be running on an unreachable node as being in the "Terminating" or "Unknown" states. In cases where Kubernetes cannot deduce from the underlying infrastructure if a node has permanently left a cluster, the cluster administrator may need to delete the node object by hand. Deleting the node object from Kubernetes causes all the Pod objects running on it to be deleted from the apiserver, freeing up their names.
+In versions of Kubernetes prior to 1.5, the node controller would [force delete](/docs/concepts/workloads/pods/pod/#force-deletion-of-pods)
+these unreachable pods from the apiserver. However, in 1.5 and higher, the node controller does not force delete pods until it is
+confirmed that they have stopped running in the cluster. One can see these pods which may be running on an unreachable node as being in
+the "Terminating" or "Unknown" states. In cases where Kubernetes cannot deduce from the underlying infrastructure if a node has
+permanently left a cluster, the cluster administrator may need to delete the node object by hand. Deleting the node object from
+Kubernetes causes all the Pod objects running on it to be deleted from the apiserver, freeing up their names.
+
+Version 1.8 introduces an alpha feature that automatically creates
+[taints](/docs/concepts/configuration/taint-and-toleration) that represent conditions.
+To enable this behavior, pass an additional feature gate flag `--feature-gates=...,TaintNodesByCondition=true`
+to the API server, controller manager, and scheduler.
+When `TaintNodesByCondition` is enabled, the scheduler ignores conditions when considering a Node; instead
+it looks at the Node's taints and a Pod's tolerations.
+
+Now users can choose between the old scheduling model and a new, more flexible scheduling model.
+A Pod that does not have any tolerations gets scheduled according to the old model. But a Pod that
+tolerates the taints of a particular Node can be scheduled on that Node.
+
+Note that because of the small delay, usually less than one second, between the time a condition is observed and a taint
+is created, it's possible that enabling this feature will slightly increase the number of Pods that are successfully
+scheduled but rejected by the kubelet.
 
 ### Capacity
 
@@ -174,6 +194,9 @@ NodeController is responsible for adding taints corresponding to node problems l
 node unreachable or not ready. See [this documentation](/docs/concepts/configuration/taint-and-toleration)
 for details about `NoExecute` taints and the alpha feature.
 
+Starting in version 1.8, the node controller can be made responsible for creating taints that represent
+Node conditions. This is an alpha feature of version 1.8.
+
 ### Self-Registration of Nodes
 
 When the kubelet flag `--register-node` is true (the default), the kubelet will attempt to
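The taint-based scheduling model described in the diff above can be sketched as a toy predicate: a Pod fits a Node only if it tolerates every taint on that Node. This is an illustration of mine, not the Kubernetes scheduler's actual code; toleration matching is simplified to exact key equality.

```python
# Toy sketch of taint-based scheduling (simplified, not the real scheduler).
# Tolerations are modeled as a set of taint keys matched by exact equality.

def can_schedule(node_taints, pod_tolerations):
    """The Pod fits the Node only if it tolerates every taint on the Node."""
    return all(taint in pod_tolerations for taint in node_taints)

# A node whose MemoryPressure condition has been turned into a taint:
taints = ["node.kubernetes.io/memory-pressure"]

print(can_schedule(taints, set()))                                   # False: no tolerations
print(can_schedule(taints, {"node.kubernetes.io/memory-pressure"}))  # True: taint tolerated
```

A Pod with no tolerations is rejected by any tainted Node, which matches the text: such Pods fall back to the old model, while a Pod that tolerates a Node's taints can land on it.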

docs/concepts/configuration/manage-compute-resources-container.md

Lines changed: 97 additions & 19 deletions
@@ -305,6 +305,8 @@ where `OOM` stands for Out Of Memory.
 
 ## Opaque integer resources (Alpha feature)
 
+{% include feature-state-deprecated.md %}
+
 Kubernetes version 1.5 introduces Opaque integer resources. Opaque
 integer resources allow cluster operators to advertise new node-level
 resources that would be otherwise unknown to the system.
@@ -313,9 +315,12 @@ Users can consume these resources in Pod specs just like CPU and memory.
 The scheduler takes care of the resource accounting so that no more than the
 available amount is simultaneously allocated to Pods.
 
-**Note:** Opaque integer resources are Alpha in Kubernetes version 1.5.
-Only resource accounting is implemented; node-level isolation is still
-under active development.
+**Note:** Opaque Integer Resources will be removed in version 1.9.
+[Extended Resources](#extended-resources) are a replacement for Opaque Integer
+Resources. Users can use any domain name prefix outside of the `kubernetes.io/`
+domain instead of the previous `pod.alpha.kubernetes.io/opaque-int-resource-`
+prefix.
+{: .note}
 
 Opaque integer resources are resources that begin with the prefix
 `pod.alpha.kubernetes.io/opaque-int-resource-`. The API server
@@ -339,22 +344,9 @@ first pod that requests the resource to be scheduled on that node.
 
 **Example:**
 
-Here is an HTTP request that advertises five "foo" resources on node `k8s-node-1` whose master is `k8s-master`.
-
-```http
-PATCH /api/v1/nodes/k8s-node-1/status HTTP/1.1
-Accept: application/json
-Content-Type: application/json-patch+json
-Host: k8s-master:8080
-
-[
-  {
-    "op": "add",
-    "path": "/status/capacity/pod.alpha.kubernetes.io~1opaque-int-resource-foo",
-    "value": "5"
-  }
-]
-```
+Here is an example showing how to use `curl` to form an HTTP request that
+advertises five "foo" resources on node `k8s-node-1` whose master is
+`k8s-master`.
 
 ```shell
 curl --header "Content-Type: application/json-patch+json" \
@@ -395,6 +387,92 @@ spec:
       pod.alpha.kubernetes.io/opaque-int-resource-foo: 1
 ```
 
+## Extended Resources
+
+Kubernetes version 1.8 introduces Extended Resources. Extended Resources are
+fully-qualified resource names outside the `kubernetes.io` domain. Extended
+Resources allow cluster operators to advertise new node-level resources that
+would be otherwise unknown to the system. Extended Resource quantities must be
+integers and cannot be overcommitted.
+
+Users can consume Extended Resources in Pod specs just like CPU and memory.
+The scheduler takes care of the resource accounting so that no more than the
+available amount is simultaneously allocated to Pods.
+
+The API server restricts quantities of Extended Resources to whole numbers.
+Examples of _valid_ quantities are `3`, `3000m` and `3Ki`. Examples of
+_invalid_ quantities are `0.5` and `1500m`.
+
+**Note:** Extended Resources replace [Opaque Integer
+Resources](#opaque-integer-resources-alpha-feature). Users can use any domain
+name prefix outside of the `kubernetes.io/` domain instead of the previous
+`pod.alpha.kubernetes.io/opaque-int-resource-` prefix.
+{: .note}
+
+There are two steps required to use Extended Resources. First, the
+cluster operator must advertise a per-node Extended Resource on one or more
+nodes. Second, users must request the Extended Resource in Pods.
+
+To advertise a new Extended Resource, the cluster operator should
+submit a `PATCH` HTTP request to the API server to specify the available
+quantity in the `status.capacity` for a node in the cluster. After this
+operation, the node's `status.capacity` will include a new resource. The
+`status.allocatable` field is updated automatically with the new resource
+asynchronously by the kubelet. Note that because the scheduler uses the
+node `status.allocatable` value when evaluating Pod fitness, there may
+be a short delay between patching the node capacity with a new resource and the
+first pod that requests the resource to be scheduled on that node.
+
+**Example:**
+
+Here is an example showing how to use `curl` to form an HTTP request that
+advertises five "example.com/foo" resources on node `k8s-node-1` whose master
+is `k8s-master`.
+
+```shell
+curl --header "Content-Type: application/json-patch+json" \
+--request PATCH \
+--data '[{"op": "add", "path": "/status/capacity/example.com~1foo", "value": "5"}]' \
+http://k8s-master:8080/api/v1/nodes/k8s-node-1/status
+```
+
+**Note**: In the preceding request, `~1` is the encoding for the character `/`
+in the patch path. The operation path value in JSON-Patch is interpreted as a
+JSON-Pointer. For more details, see
+[IETF RFC 6901, section 3](https://tools.ietf.org/html/rfc6901#section-3).
+{: .note}
+
+To consume an Extended Resource in a Pod, include the resource name as a key
+in the `spec.containers[].resources.requests` map.
+
+**Note:** Extended resources cannot be overcommitted, so request and limit
+must be equal if both are present in a container spec.
+{: .note}
+
+The Pod is scheduled only if all of the resource requests are
+satisfied, including cpu, memory and any Extended Resources. The Pod will
+remain in the `PENDING` state as long as the resource request cannot be met by
+any node.
+
+**Example:**
+
+The Pod below requests 2 cpus and 1 "example.com/foo" (an Extended Resource).
+
+```yaml
+apiVersion: v1
+kind: Pod
+metadata:
+  name: my-pod
+spec:
+  containers:
+  - name: my-container
+    image: myimage
+    resources:
+      requests:
+        cpu: 2
+        example.com/foo: 1
+```
+
 ## Planned Improvements
 
 Kubernetes version 1.5 only allows resource quantities to be specified on a
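The whole-number rule the new Extended Resources text states (valid: `3`, `3000m`, `3Ki`; invalid: `0.5`, `1500m`) can be checked with a small sketch. This is my illustration, not the API server's actual validator, and it only understands the two suffixes the text mentions: `m` (1/1000) and `Ki` (1024).

```python
from fractions import Fraction

# Sketch (not the API server's real validation code) of the rule that
# Extended Resource quantities must be whole numbers. Only the suffixes
# mentioned in the text are handled: "m" = 1/1000 and "Ki" = 1024.
SUFFIXES = {"Ki": Fraction(1024), "m": Fraction(1, 1000)}

def is_whole_quantity(quantity):
    for suffix, multiplier in SUFFIXES.items():
        if quantity.endswith(suffix):
            value = Fraction(quantity[: -len(suffix)]) * multiplier
            break
    else:
        value = Fraction(quantity)  # plain number, e.g. "3" or "0.5"
    return value.denominator == 1  # whole numbers have denominator 1

for q in ["3", "3000m", "3Ki", "0.5", "1500m"]:
    print(q, "valid" if is_whole_quantity(q) else "invalid")
```

`3000m` passes because 3000 × 1/1000 = 3, while `1500m` fails because 1500 × 1/1000 = 1.5, matching the examples in the diff.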

docs/concepts/configuration/taint-and-toleration.md

Lines changed: 15 additions & 4 deletions
@@ -188,7 +188,7 @@ running on the node as follows
 
 The above behavior is a beta feature. In addition, Kubernetes 1.6 has alpha
 support for representing node problems. In other words, the node controller
-automatically taints a node when certain condition is true. The builtin taints
+automatically taints a node when certain condition is true. The built-in taints
 currently include:
 
 * `node.alpha.kubernetes.io/notReady`: Node is not ready. This corresponds to
@@ -249,9 +249,20 @@ admission controller](https://git.k8s.io/kubernetes/plugin/pkg/admission/default
 
 * `node.alpha.kubernetes.io/unreachable`
 * `node.alpha.kubernetes.io/notReady`
-* `node.kubernetes.io/memoryPressure`
-* `node.kubernetes.io/diskPressure`
-* `node.kubernetes.io/outOfDisk` (*only for critical pods*)
 
 This ensures that DaemonSet pods are never evicted due to these problems,
 which matches the behavior when this feature is disabled.
+
+## Taint Nodes by Condition
+
+Version 1.8 introduces an alpha feature that causes the node controller to create taints corresponding to
+Node conditions. When this feature is enabled, the scheduler does not check conditions; instead the scheduler checks taints. This assures that conditions don't affect what's scheduled onto the Node. The user can choose to ignore some of the Node's problems (represented as conditions) by adding appropriate Pod tolerations.
+
+To make sure that turning on this feature doesn't break DaemonSets, starting in version 1.8, the DaemonSet controller automatically adds the following `NoSchedule` tolerations to all daemons:
+
+* `node.kubernetes.io/memory-pressure`
+* `node.kubernetes.io/disk-pressure`
+* `node.kubernetes.io/out-of-disk` (*only for critical pods*)
+
+The above settings ensure backward compatibility, but we understand they may not fit all users' needs, which is why
+a cluster admin may choose to add arbitrary tolerations to DaemonSets.
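The default DaemonSet tolerations the new section lists can be sketched as data. This is a hypothetical helper, not the DaemonSet controller's source; the dict shape (`key` / `operator: Exists` / `effect: NoSchedule`) mirrors the Kubernetes toleration API as I recall it, so treat the field names as assumptions.

```python
# Hypothetical sketch (not the DaemonSet controller's code) of the default
# NoSchedule tolerations listed above. The toleration dict shape is an
# assumption mirroring the Kubernetes toleration API: key / operator / effect.

def default_daemon_tolerations(critical=False):
    keys = [
        "node.kubernetes.io/memory-pressure",
        "node.kubernetes.io/disk-pressure",
    ]
    if critical:  # out-of-disk is tolerated only by critical pods
        keys.append("node.kubernetes.io/out-of-disk")
    return [{"key": k, "operator": "Exists", "effect": "NoSchedule"} for k in keys]

print([t["key"] for t in default_daemon_tolerations(critical=True)])
```

Keeping the list in one place makes the backward-compatibility intent explicit: every daemon gets the pressure tolerations, and only critical pods additionally tolerate out-of-disk.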

docs/concepts/workloads/controllers/daemonset.md

Lines changed: 10 additions & 7 deletions
@@ -103,19 +103,22 @@ but they are created with `NoExecute` tolerations for the following taints with
 
 - `node.alpha.kubernetes.io/notReady`
 - `node.alpha.kubernetes.io/unreachable`
-- `node.alpha.kubernetes.io/memoryPressure`
-- `node.alpha.kubernetes.io/diskPressure`
-
-When the support to critical pods is enabled and the pods in a DaemonSet are
-labelled as critical, the Daemon pods are created with an additional
-`NoExecute` toleration for the `node.alpha.kubernetes.io/outOfDisk` taint with
-no `tolerationSeconds`.
 
 This ensures that when the `TaintBasedEvictions` alpha feature is enabled,
 they will not be evicted when there are node problems such as a network partition. (When the
 `TaintBasedEvictions` feature is not enabled, they are also not evicted in these scenarios, but
 due to hard-coded behavior of the NodeController rather than due to tolerations).
 
+They also tolerate the following `NoSchedule` taints:
+
+- `node.kubernetes.io/memory-pressure`
+- `node.kubernetes.io/disk-pressure`
+
+When support for critical pods is enabled and the pods in a DaemonSet are
+labelled as critical, the Daemon pods are created with an additional
+`NoSchedule` toleration for the `node.kubernetes.io/out-of-disk` taint.
+
+Note that all of the above `NoSchedule` taints are created only in version 1.8 or later if the alpha feature `TaintNodesByCondition` is enabled.
 
 ## Communicating with Daemon Pods
 

docs/concepts/workloads/pods/init-containers.md

Lines changed: 8 additions & 3 deletions
@@ -14,7 +14,8 @@ scripts not present in an app image.
 
 This feature has exited beta in 1.6. Init Containers can be specified in the PodSpec
 alongside the app `containers` array. The beta annotation value will still be respected
-and overrides the PodSpec field value.
+and overrides the PodSpec field value; however, the annotations are deprecated in 1.6 and 1.7.
+In 1.8, the annotations are no longer supported and must be converted to the PodSpec field.
 
 {% capture body %}
 ## Understanding Init Containers
@@ -123,7 +124,7 @@
     command: ['sh', '-c', 'echo The app is running! && sleep 3600']
 ```
 
-There is a new syntax in Kubernetes 1.6, although the old annotation syntax still works. We have moved the declaration of init containers to `spec`:
+There is a new syntax in Kubernetes 1.6, although the old annotation syntax still works for 1.6 and 1.7. The new syntax must be used for 1.8 or greater. We have moved the declaration of init containers to `spec`:
 
 ```yaml
 apiVersion: v1
@@ -146,7 +147,7 @@
     command: ['sh', '-c', 'until nslookup mydb; do echo waiting for mydb; sleep 2; done;']
 ```
 
-1.5 syntax still works on 1.6, but we recommend using 1.6 syntax. In Kubernetes 1.6, Init Containers were made a field in the API. The beta annotation is still respected but will be deprecated in future releases.
+1.5 syntax still works on 1.6, but we recommend using 1.6 syntax. In Kubernetes 1.6, Init Containers were made a field in the API. The beta annotation is still respected in 1.6 and 1.7, but is not supported in 1.8 or greater.
 
 Yaml file below outlines the `mydb` and `myservice` services:
 
@@ -311,6 +312,10 @@ into alpha and beta annotations so that Kubelets version 1.3.0 or greater can ex
 Init Containers, and so that a version 1.6 apiserver can safely be rolled back to version
 1.5.x without losing Init Container functionality for existing created pods.
 
+In Apiserver and Kubelet versions 1.8.0 or greater, support for the alpha and beta annotations
+is removed, requiring a conversion from the deprecated annotations to the
+`spec.initContainers` field.
+
 {% endcapture %}
 
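The 1.8 requirement the diff adds (convert the deprecated annotations to `spec.initContainers`) could be automated along these lines. This is a hedged sketch of mine, not an official migration tool; in particular, the annotation key used below, `pod.beta.kubernetes.io/init-containers`, is my recollection of the beta key and is an assumption you should verify against your manifests.

```python
import json

# Hedged sketch of converting the deprecated init-container annotation into
# the spec.initContainers field, as 1.8 requires. The annotation key below is
# an assumption (my recollection of the beta key); verify it in your manifests.
INIT_ANNOTATION = "pod.beta.kubernetes.io/init-containers"

def convert_init_annotation(pod):
    annotations = pod.get("metadata", {}).get("annotations", {})
    if INIT_ANNOTATION in annotations:
        # The annotation value is a JSON array of container specs.
        pod.setdefault("spec", {})["initContainers"] = json.loads(
            annotations.pop(INIT_ANNOTATION)
        )
    return pod

pod = {
    "metadata": {"annotations": {INIT_ANNOTATION: '[{"name": "init-db", "image": "busybox"}]'}},
    "spec": {"containers": [{"name": "app", "image": "myimage"}]},
}
converted = convert_init_annotation(pod)
print(converted["spec"]["initContainers"][0]["name"])  # init-db
```

Dropping the annotation after copying it mirrors the migration direction stated above: the field becomes the single source of truth once the annotations stop being honored.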
