k8s_drain new module (#141)

k8s_drain new module

SUMMARY

new module to drain, cordon or uncordon node from k8s cluster.
#141

ISSUE TYPE


New Module Pull Request

COMPONENT NAME

k8s_drain

Reviewed-by: Abhijeet Kasurde <None>
Reviewed-by: None <None>
Reviewed-by: Mike Graves <mgraves@redhat.com>
Reviewed-by: None <None>
This commit is contained in:
abikouo
2021-08-09 10:00:34 +02:00
committed by GitHub
parent 76a77aff1f
commit b875531c8a
8 changed files with 1208 additions and 50 deletions

View File

@@ -59,6 +59,7 @@ Name | Description
[kubernetes.core.k8s](https://github.com/ansible-collections/kubernetes.core/blob/main/docs/kubernetes.core.k8s_module.rst)|Manage Kubernetes (K8s) objects
[kubernetes.core.k8s_cluster_info](https://github.com/ansible-collections/kubernetes.core/blob/main/docs/kubernetes.core.k8s_cluster_info_module.rst)|Describe Kubernetes (K8s) cluster, APIs available and their respective versions
[kubernetes.core.k8s_cp](https://github.com/ansible-collections/kubernetes.core/blob/main/docs/kubernetes.core.k8s_cp_module.rst)|Copy files and directories to and from pod.
[kubernetes.core.k8s_drain](https://github.com/ansible-collections/kubernetes.core/blob/main/docs/kubernetes.core.k8s_drain_module.rst)|Drain, Cordon, or Uncordon node in k8s cluster
[kubernetes.core.k8s_exec](https://github.com/ansible-collections/kubernetes.core/blob/main/docs/kubernetes.core.k8s_exec_module.rst)|Execute command in Pod
[kubernetes.core.k8s_info](https://github.com/ansible-collections/kubernetes.core/blob/main/docs/kubernetes.core.k8s_info_module.rst)|Describe Kubernetes (K8s) objects
[kubernetes.core.k8s_json_patch](https://github.com/ansible-collections/kubernetes.core/blob/main/docs/kubernetes.core.k8s_json_patch_module.rst)|Apply JSON patch operations to existing objects

View File

@@ -8,7 +8,7 @@ kubernetes.core.k8s_cp
**Copy files and directories to and from pod.**
Version added: 2.1.0
Version added: 2.2.0
.. contents::
:local:
@@ -221,11 +221,14 @@ Parameters
<b>no_preserve</b>
<a class="ansibleOptionLink" href="#parameter-" title="Permalink to this option"></a>
<div style="font-size: small">
<span style="color: purple">string</span>
<span style="color: purple">boolean</span>
</div>
</td>
<td>
<b>Default:</b><br/><div style="color: blue">"no"</div>
<ul style="margin: 0; padding: 0"><b>Choices:</b>
<li><div style="color: blue"><b>no</b>&nbsp;&larr;</div></li>
<li>yes</li>
</ul>
</td>
<td>
<div>The copied file/directory&#x27;s ownership and permissions will not be preserved in the container.</div>
@@ -452,6 +455,7 @@ Notes
-----
.. note::
- the tar binary is required on the container when copying from local filesystem to pod.
- To avoid SSL certificate validation errors when ``validate_certs`` is *True*, the full certificate chain for the API server must be provided via ``ca_cert`` or in the kubeconfig file.
@@ -463,7 +467,7 @@ Examples
# kubectl cp /tmp/foo some-namespace/some-pod:/tmp/bar
- name: Copy /tmp/foo local file to /tmp/bar in a remote pod
kubernetes.core.k8s:
kubernetes.core.k8s_cp:
namespace: some-namespace
pod: some-pod
remote_path: /tmp/bar
@@ -471,7 +475,7 @@ Examples
# kubectl cp /tmp/foo_dir some-namespace/some-pod:/tmp/bar_dir
- name: Copy /tmp/foo_dir local directory to /tmp/bar_dir in a remote pod
kubernetes.core.k8s:
kubernetes.core.k8s_cp:
namespace: some-namespace
pod: some-pod
remote_path: /tmp/bar_dir
@@ -479,7 +483,7 @@ Examples
# kubectl cp /tmp/foo some-namespace/some-pod:/tmp/bar -c some-container
- name: Copy /tmp/foo local file to /tmp/bar in a remote pod in a specific container
kubernetes.core.k8s:
kubernetes.core.k8s_cp:
namespace: some-namespace
pod: some-pod
container: some-container
@@ -490,7 +494,7 @@ Examples
# kubectl cp some-namespace/some-pod:/tmp/foo /tmp/bar
- name: Copy /tmp/foo from a remote pod to /tmp/bar locally
kubernetes.core.k8s:
kubernetes.core.k8s_cp:
namespace: some-namespace
pod: some-pod
remote_path: /tmp/foo
@@ -499,7 +503,7 @@ Examples
# copy content into a file in the remote pod
- name: Copy /tmp/foo from a remote pod to /tmp/bar locally
kubernetes.core.k8s:
kubernetes.core.k8s_cp:
state: to_pod
namespace: some-namespace
pod: some-pod

View File

@@ -0,0 +1,559 @@
.. _kubernetes.core.k8s_drain_module:
*************************
kubernetes.core.k8s_drain
*************************
**Drain, Cordon, or Uncordon node in k8s cluster**
.. contents::
:local:
:depth: 1
Synopsis
--------
- Drain node in preparation for maintenance same as kubectl drain.
- Cordon will mark the node as unschedulable.
- Uncordon will mark the node as schedulable.
- The given node will be marked unschedulable to prevent new pods from arriving.
- Then drain deletes all pods except mirror pods (which cannot be deleted through the API server).
Requirements
------------
The below requirements are needed on the host that executes this module.
- python >= 3.6
- kubernetes >= 12.0.0
Parameters
----------
.. raw:: html
<table border=0 cellpadding=0 class="documentation-table">
<tr>
<th colspan="2">Parameter</th>
<th>Choices/<font color="blue">Defaults</font></th>
<th width="100%">Comments</th>
</tr>
<tr>
<td colspan="2">
<div class="ansibleOptionAnchor" id="parameter-"></div>
<b>api_key</b>
<a class="ansibleOptionLink" href="#parameter-" title="Permalink to this option"></a>
<div style="font-size: small">
<span style="color: purple">string</span>
</div>
</td>
<td>
</td>
<td>
<div>Token used to authenticate with the API. Can also be specified via K8S_AUTH_API_KEY environment variable.</div>
</td>
</tr>
<tr>
<td colspan="2">
<div class="ansibleOptionAnchor" id="parameter-"></div>
<b>ca_cert</b>
<a class="ansibleOptionLink" href="#parameter-" title="Permalink to this option"></a>
<div style="font-size: small">
<span style="color: purple">path</span>
</div>
</td>
<td>
</td>
<td>
<div>Path to a CA certificate used to authenticate with the API. The full certificate chain must be provided to avoid certificate validation errors. Can also be specified via K8S_AUTH_SSL_CA_CERT environment variable.</div>
<div style="font-size: small; color: darkgreen"><br/>aliases: ssl_ca_cert</div>
</td>
</tr>
<tr>
<td colspan="2">
<div class="ansibleOptionAnchor" id="parameter-"></div>
<b>client_cert</b>
<a class="ansibleOptionLink" href="#parameter-" title="Permalink to this option"></a>
<div style="font-size: small">
<span style="color: purple">path</span>
</div>
</td>
<td>
</td>
<td>
<div>Path to a certificate used to authenticate with the API. Can also be specified via K8S_AUTH_CERT_FILE environment variable.</div>
<div style="font-size: small; color: darkgreen"><br/>aliases: cert_file</div>
</td>
</tr>
<tr>
<td colspan="2">
<div class="ansibleOptionAnchor" id="parameter-"></div>
<b>client_key</b>
<a class="ansibleOptionLink" href="#parameter-" title="Permalink to this option"></a>
<div style="font-size: small">
<span style="color: purple">path</span>
</div>
</td>
<td>
</td>
<td>
<div>Path to a key file used to authenticate with the API. Can also be specified via K8S_AUTH_KEY_FILE environment variable.</div>
<div style="font-size: small; color: darkgreen"><br/>aliases: key_file</div>
</td>
</tr>
<tr>
<td colspan="2">
<div class="ansibleOptionAnchor" id="parameter-"></div>
<b>context</b>
<a class="ansibleOptionLink" href="#parameter-" title="Permalink to this option"></a>
<div style="font-size: small">
<span style="color: purple">string</span>
</div>
</td>
<td>
</td>
<td>
<div>The name of a context found in the config file. Can also be specified via K8S_AUTH_CONTEXT environment variable.</div>
</td>
</tr>
<tr>
<td colspan="2">
<div class="ansibleOptionAnchor" id="parameter-"></div>
<b>delete_options</b>
<a class="ansibleOptionLink" href="#parameter-" title="Permalink to this option"></a>
<div style="font-size: small">
<span style="color: purple">dictionary</span>
</div>
</td>
<td>
</td>
<td>
<div>Specify options to delete pods.</div>
<div>This option has effect only when <code>state</code> is set to <em>drain</em>.</div>
</td>
</tr>
<tr>
<td class="elbow-placeholder"></td>
<td colspan="1">
<div class="ansibleOptionAnchor" id="parameter-"></div>
<b>disable_eviction</b>
<a class="ansibleOptionLink" href="#parameter-" title="Permalink to this option"></a>
<div style="font-size: small">
<span style="color: purple">boolean</span>
</div>
</td>
<td>
<ul style="margin: 0; padding: 0"><b>Choices:</b>
<li><div style="color: blue"><b>no</b>&nbsp;&larr;</div></li>
<li>yes</li>
</ul>
</td>
<td>
<div>Forces drain to use delete rather than evict.</div>
</td>
</tr>
<tr>
<td class="elbow-placeholder"></td>
<td colspan="1">
<div class="ansibleOptionAnchor" id="parameter-"></div>
<b>force</b>
<a class="ansibleOptionLink" href="#parameter-" title="Permalink to this option"></a>
<div style="font-size: small">
<span style="color: purple">boolean</span>
</div>
</td>
<td>
<ul style="margin: 0; padding: 0"><b>Choices:</b>
<li><div style="color: blue"><b>no</b>&nbsp;&larr;</div></li>
<li>yes</li>
</ul>
</td>
<td>
<div>Continue even if there are pods not managed by a ReplicationController, Job, or DaemonSet.</div>
</td>
</tr>
<tr>
<td class="elbow-placeholder"></td>
<td colspan="1">
<div class="ansibleOptionAnchor" id="parameter-"></div>
<b>ignore_daemonsets</b>
<a class="ansibleOptionLink" href="#parameter-" title="Permalink to this option"></a>
<div style="font-size: small">
<span style="color: purple">boolean</span>
</div>
</td>
<td>
<ul style="margin: 0; padding: 0"><b>Choices:</b>
<li><div style="color: blue"><b>no</b>&nbsp;&larr;</div></li>
<li>yes</li>
</ul>
</td>
<td>
<div>Ignore DaemonSet-managed pods.</div>
</td>
</tr>
<tr>
<td class="elbow-placeholder"></td>
<td colspan="1">
<div class="ansibleOptionAnchor" id="parameter-"></div>
<b>terminate_grace_period</b>
<a class="ansibleOptionLink" href="#parameter-" title="Permalink to this option"></a>
<div style="font-size: small">
<span style="color: purple">integer</span>
</div>
</td>
<td>
</td>
<td>
<div>Specify how many seconds to wait before forcefully terminating.</div>
<div>If not specified, the default grace period for the object type will be used.</div>
<div>The value zero indicates delete immediately.</div>
</td>
</tr>
<tr>
<td class="elbow-placeholder"></td>
<td colspan="1">
<div class="ansibleOptionAnchor" id="parameter-"></div>
<b>wait_sleep</b>
<a class="ansibleOptionLink" href="#parameter-" title="Permalink to this option"></a>
<div style="font-size: small">
<span style="color: purple">integer</span>
</div>
</td>
<td>
<b>Default:</b><br/><div style="color: blue">5</div>
</td>
<td>
<div>Number of seconds to sleep between checks.</div>
<div>Ignored if <code>wait_timeout</code> is not set.</div>
</td>
</tr>
<tr>
<td class="elbow-placeholder"></td>
<td colspan="1">
<div class="ansibleOptionAnchor" id="parameter-"></div>
<b>wait_timeout</b>
<a class="ansibleOptionLink" href="#parameter-" title="Permalink to this option"></a>
<div style="font-size: small">
<span style="color: purple">integer</span>
</div>
</td>
<td>
</td>
<td>
<div>The length of time to wait in seconds for pod to be deleted before giving up, zero means infinite.</div>
</td>
</tr>
<tr>
<td colspan="2">
<div class="ansibleOptionAnchor" id="parameter-"></div>
<b>host</b>
<a class="ansibleOptionLink" href="#parameter-" title="Permalink to this option"></a>
<div style="font-size: small">
<span style="color: purple">string</span>
</div>
</td>
<td>
</td>
<td>
<div>Provide a URL for accessing the API. Can also be specified via K8S_AUTH_HOST environment variable.</div>
</td>
</tr>
<tr>
<td colspan="2">
<div class="ansibleOptionAnchor" id="parameter-"></div>
<b>kubeconfig</b>
<a class="ansibleOptionLink" href="#parameter-" title="Permalink to this option"></a>
<div style="font-size: small">
<span style="color: purple">path</span>
</div>
</td>
<td>
</td>
<td>
<div>Path to an existing Kubernetes config file. If not provided, and no other connection options are provided, the Kubernetes client will attempt to load the default configuration file from <em>~/.kube/config</em>. Can also be specified via K8S_AUTH_KUBECONFIG environment variable.</div>
</td>
</tr>
<tr>
<td colspan="2">
<div class="ansibleOptionAnchor" id="parameter-"></div>
<b>name</b>
<a class="ansibleOptionLink" href="#parameter-" title="Permalink to this option"></a>
<div style="font-size: small">
<span style="color: purple">string</span>
/ <span style="color: red">required</span>
</div>
</td>
<td>
</td>
<td>
<div>The name of the node.</div>
</td>
</tr>
<tr>
<td colspan="2">
<div class="ansibleOptionAnchor" id="parameter-"></div>
<b>password</b>
<a class="ansibleOptionLink" href="#parameter-" title="Permalink to this option"></a>
<div style="font-size: small">
<span style="color: purple">string</span>
</div>
</td>
<td>
</td>
<td>
<div>Provide a password for authenticating with the API. Can also be specified via K8S_AUTH_PASSWORD environment variable.</div>
<div>Please read the description of the <code>username</code> option for a discussion of when this option is applicable.</div>
</td>
</tr>
<tr>
<td colspan="2">
<div class="ansibleOptionAnchor" id="parameter-"></div>
<b>persist_config</b>
<a class="ansibleOptionLink" href="#parameter-" title="Permalink to this option"></a>
<div style="font-size: small">
<span style="color: purple">boolean</span>
</div>
</td>
<td>
<ul style="margin: 0; padding: 0"><b>Choices:</b>
<li>no</li>
<li>yes</li>
</ul>
</td>
<td>
<div>Whether or not to save the kube config refresh tokens. Can also be specified via K8S_AUTH_PERSIST_CONFIG environment variable.</div>
<div>When the k8s context is using a user credentials with refresh tokens (like oidc or gke/gcloud auth), the token is refreshed by the k8s python client library but not saved by default. So the old refresh token can expire and the next auth might fail. Setting this flag to true will tell the k8s python client to save the new refresh token to the kube config file.</div>
<div>Default to false.</div>
<div>Please note that the current version of the k8s python client library does not support setting this flag to True yet.</div>
<div>The fix for this k8s python library is here: https://github.com/kubernetes-client/python-base/pull/169</div>
</td>
</tr>
<tr>
<td colspan="2">
<div class="ansibleOptionAnchor" id="parameter-"></div>
<b>proxy</b>
<a class="ansibleOptionLink" href="#parameter-" title="Permalink to this option"></a>
<div style="font-size: small">
<span style="color: purple">string</span>
</div>
</td>
<td>
</td>
<td>
<div>The URL of an HTTP proxy to use for the connection. Can also be specified via K8S_AUTH_PROXY environment variable.</div>
<div>Please note that this module does not pick up typical proxy settings from the environment (e.g. HTTP_PROXY).</div>
</td>
</tr>
<tr>
<td colspan="2">
<div class="ansibleOptionAnchor" id="parameter-"></div>
<b>proxy_headers</b>
<a class="ansibleOptionLink" href="#parameter-" title="Permalink to this option"></a>
<div style="font-size: small">
<span style="color: purple">dictionary</span>
</div>
<div style="font-style: italic; font-size: small; color: darkgreen">added in 2.0.0</div>
</td>
<td>
</td>
<td>
<div>The Header used for the HTTP proxy.</div>
<div>Documentation can be found here <a href='https://urllib3.readthedocs.io/en/latest/reference/urllib3.util.html?highlight=proxy_headers#urllib3.util.make_headers'>https://urllib3.readthedocs.io/en/latest/reference/urllib3.util.html?highlight=proxy_headers#urllib3.util.make_headers</a>.</div>
</td>
</tr>
<tr>
<td class="elbow-placeholder"></td>
<td colspan="1">
<div class="ansibleOptionAnchor" id="parameter-"></div>
<b>basic_auth</b>
<a class="ansibleOptionLink" href="#parameter-" title="Permalink to this option"></a>
<div style="font-size: small">
<span style="color: purple">string</span>
</div>
</td>
<td>
</td>
<td>
<div>Colon-separated username:password for basic authentication header.</div>
<div>Can also be specified via K8S_AUTH_PROXY_HEADERS_BASIC_AUTH environment.</div>
</td>
</tr>
<tr>
<td class="elbow-placeholder"></td>
<td colspan="1">
<div class="ansibleOptionAnchor" id="parameter-"></div>
<b>proxy_basic_auth</b>
<a class="ansibleOptionLink" href="#parameter-" title="Permalink to this option"></a>
<div style="font-size: small">
<span style="color: purple">string</span>
</div>
</td>
<td>
</td>
<td>
<div>Colon-separated username:password for proxy basic authentication header.</div>
<div>Can also be specified via K8S_AUTH_PROXY_HEADERS_PROXY_BASIC_AUTH environment.</div>
</td>
</tr>
<tr>
<td class="elbow-placeholder"></td>
<td colspan="1">
<div class="ansibleOptionAnchor" id="parameter-"></div>
<b>user_agent</b>
<a class="ansibleOptionLink" href="#parameter-" title="Permalink to this option"></a>
<div style="font-size: small">
<span style="color: purple">string</span>
</div>
</td>
<td>
</td>
<td>
<div>String representing the user-agent you want, such as foo/1.0.</div>
<div>Can also be specified via K8S_AUTH_PROXY_HEADERS_USER_AGENT environment.</div>
</td>
</tr>
<tr>
<td colspan="2">
<div class="ansibleOptionAnchor" id="parameter-"></div>
<b>state</b>
<a class="ansibleOptionLink" href="#parameter-" title="Permalink to this option"></a>
<div style="font-size: small">
<span style="color: purple">string</span>
</div>
</td>
<td>
<ul style="margin: 0; padding: 0"><b>Choices:</b>
<li>cordon</li>
<li><div style="color: blue"><b>drain</b>&nbsp;&larr;</div></li>
<li>uncordon</li>
</ul>
</td>
<td>
<div>Determines whether to drain, cordon, or uncordon node.</div>
</td>
</tr>
<tr>
<td colspan="2">
<div class="ansibleOptionAnchor" id="parameter-"></div>
<b>username</b>
<a class="ansibleOptionLink" href="#parameter-" title="Permalink to this option"></a>
<div style="font-size: small">
<span style="color: purple">string</span>
</div>
</td>
<td>
</td>
<td>
<div>Provide a username for authenticating with the API. Can also be specified via K8S_AUTH_USERNAME environment variable.</div>
<div>Please note that this only works with clusters configured to use HTTP Basic Auth. If your cluster has a different form of authentication (e.g. OAuth2 in OpenShift), this option will not work as expected and you should look into the <span class='module'>community.okd.k8s_auth</span> module, as that might do what you need.</div>
</td>
</tr>
<tr>
<td colspan="2">
<div class="ansibleOptionAnchor" id="parameter-"></div>
<b>validate_certs</b>
<a class="ansibleOptionLink" href="#parameter-" title="Permalink to this option"></a>
<div style="font-size: small">
<span style="color: purple">boolean</span>
</div>
</td>
<td>
<ul style="margin: 0; padding: 0"><b>Choices:</b>
<li>no</li>
<li>yes</li>
</ul>
</td>
<td>
<div>Whether or not to verify the API server&#x27;s SSL certificates. Can also be specified via K8S_AUTH_VERIFY_SSL environment variable.</div>
<div style="font-size: small; color: darkgreen"><br/>aliases: verify_ssl</div>
</td>
</tr>
</table>
<br/>
Notes
-----
.. note::
- To avoid SSL certificate validation errors when ``validate_certs`` is *True*, the full certificate chain for the API server must be provided via ``ca_cert`` or in the kubeconfig file.
Examples
--------
.. code-block:: yaml
- name: Drain node "foo", even if there are pods not managed by a ReplicationController, Job, or DaemonSet on it.
kubernetes.core.k8s_drain:
state: drain
name: foo
force: yes
- name: Drain node "foo", but abort if there are pods not managed by a ReplicationController, Job, or DaemonSet, and use a grace period of 15 minutes.
kubernetes.core.k8s_drain:
state: drain
name: foo
delete_options:
terminate_grace_period: 900
- name: Mark node "foo" as schedulable.
kubernetes.core.k8s_drain:
state: uncordon
name: foo
- name: Mark node "foo" as unschedulable.
kubernetes.core.k8s_drain:
state: cordon
name: foo
Return Values
-------------
Common return values are documented `here <https://docs.ansible.com/ansible/latest/reference_appendices/common_return_values.html#common-return-values>`_, the following are the fields unique to this module:
.. raw:: html
<table border=0 cellpadding=0 class="documentation-table">
<tr>
<th colspan="1">Key</th>
<th>Returned</th>
<th width="100%">Description</th>
</tr>
<tr>
<td colspan="1">
<div class="ansibleOptionAnchor" id="return-"></div>
<b>result</b>
<a class="ansibleOptionLink" href="#return-" title="Permalink to this return value"></a>
<div style="font-size: small">
<span style="color: purple">string</span>
</div>
</td>
<td>success</td>
<td>
<div>The node status and the number of pods deleted.</div>
<br/>
</td>
</tr>
</table>
<br/><br/>
Status
------
Authors
~~~~~~~
- Aubin Bikouo (@abikouo)

View File

@@ -13,7 +13,7 @@ provisioner:
log: true
config_options:
inventory:
enable_plugins: kubernetes.core.k8s
enable_plugins: kubernetes.core.k8s,yaml
lint: {}
inventory:
hosts:

View File

@@ -6,45 +6,7 @@
collections:
- kubernetes.core
vars:
node_taints:
- "node.kubernetes.io/not-ready"
- "node.kubernetes.io/unreachable"
- "node.kubernetes.io/unschedulable"
kind_bin_path: "/usr/local/bin/kind"
tasks:
# We are spawning k8s cluster using kind executable and we ensure that the cluster is up
# and node is ready, if this is not validated we may face issue later on when running tests
- name: Check if kind binary is available or not
stat:
path: "{{ kind_bin_path }}"
register: r
- name: Download kind if not available
get_url:
url: https://kind.sigs.k8s.io/dl/v0.11.1/kind-linux-amd64
dest: "{{ kind_bin_path }}"
when: not r.stat.exists
- name: Make kind executable
file:
path: "{{ kind_bin_path }}"
mode: '0755'
- name: Check if Kind cluster exists
command: "{{ kind_bin_path }} get clusters"
register: r
ignore_errors: true
- name: Create cluster
command: "{{ kind_bin_path }} create cluster"
when: r.stdout == ''
- name: Assert that nodes are ready
k8s_info:
kind: Node
retries: 10
delay: 30
register: nodes
until: nodes.resources | selectattr("spec.taints", "defined") | map(attribute="spec.taints") | list | length == 0
- name: Include drain.yml
include_tasks:
file: tasks/drain.yml

View File

@@ -0,0 +1,219 @@
---
- block:
- name: Set common facts
set_fact:
drain_namespace: "drain"
drain_daemonset_name: "promotheus-dset"
drain_pod_name: "pod-drain"
- name: Create {{ drain_namespace }} namespace
k8s:
kind: Namespace
name: '{{ drain_namespace }}'
- name: list cluster nodes
k8s_info:
kind: node
register: nodes
- name: Select uncordoned nodes
set_fact:
uncordoned_nodes: "{{ nodes.resources | selectattr('spec.unschedulable', 'undefined') | map(attribute='metadata.name') | list}}"
- name: Assert that at least one node is schedulable
assert:
that:
- uncordoned_nodes | length > 0
- name: select node to drain
set_fact:
node_to_drain: '{{ uncordoned_nodes[0] }}'
- name: Deploy daemonset on cluster
k8s:
namespace: '{{ drain_namespace }}'
definition:
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: '{{ drain_daemonset_name }}'
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchFields:
- key: metadata.name
operator: In
values:
- '{{ node_to_drain }}'
selector:
matchLabels:
name: prometheus-exporter
template:
metadata:
labels:
name: prometheus-exporter
spec:
containers:
- name: prometheus
image: prom/node-exporter
ports:
- containerPort: 80
- name: Create Pods not managed by ReplicationController, ReplicaSet, Job, DaemonSet or StatefulSet.
k8s:
namespace: '{{ drain_namespace }}'
wait: yes
definition:
apiVersion: v1
kind: Pod
metadata:
name: '{{ drain_pod_name }}'
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchFields:
- key: metadata.name
operator: In
values:
- '{{ node_to_drain }}'
containers:
- name: c0
image: busybox
command:
- /bin/sh
- -c
- while true;do date;sleep 5; done
- name: Cordon node
k8s_drain:
state: cordon
name: '{{ node_to_drain }}'
register: cordon
- name: assert that cordon is changed
assert:
that:
- cordon is changed
- name: Test cordon idempotency
k8s_drain:
state: cordon
name: '{{ node_to_drain }}'
register: cordon
- name: assert that cordon is not changed
assert:
that:
- cordon is not changed
- name: Get pods
k8s_info:
kind: Pod
namespace: '{{ drain_namespace }}'
register: Pod
- name: assert that pods are running on cordoned node
assert:
that:
- "{{ Pod.resources | selectattr('status.phase', 'equalto', 'Running') | selectattr('spec.nodeName', 'equalto', node_to_drain) | list | length > 0 }}"
- name: Uncordon node
k8s_drain:
state: uncordon
name: '{{ node_to_drain }}'
register: uncordon
- name: assert that uncordon is changed
assert:
that:
- uncordon is changed
- name: Test uncordon idempotency
k8s_drain:
state: uncordon
name: '{{ node_to_drain }}'
register: uncordon
- name: assert that uncordon is not changed
assert:
that:
- uncordon is not changed
- name: Drain node
k8s_drain:
state: drain
name: '{{ node_to_drain }}'
ignore_errors: true
register: drain_result
- name: assert that drain failed due to DaemonSet managed Pods
assert:
that:
- drain_result is failed
- '"cannot delete DaemonSet-managed Pods" in drain_result.msg'
- '"cannot delete Pods not managed by ReplicationController, ReplicaSet, Job, DaemonSet or StatefulSet" in drain_result.msg'
- name: Drain node using ignore_daemonsets and force options
k8s_drain:
state: drain
name: '{{ node_to_drain }}'
delete_options:
force: true
ignore_daemonsets: true
wait_timeout: 0
register: drain_result
- name: assert that node has been drained
assert:
that:
- drain_result is changed
- '"node {{ node_to_drain }} marked unschedulable." in drain_result.result'
- name: assert that unmanaged pod were deleted
k8s_info:
namespace: '{{ drain_namespace }}'
kind: Pod
register: _result
failed_when: _result.resources | selectattr("metadata.ownerReferences", "undefined") | length > 0
- name: Test drain idempotency
k8s_drain:
state: drain
name: '{{ node_to_drain }}'
delete_options:
force: true
ignore_daemonsets: true
register: drain_result
- name: Check idempotency
assert:
that:
- drain_result is not changed
- name: Get DaemonSet
k8s_info:
kind: DaemonSet
namespace: '{{ drain_namespace }}'
name: '{{ drain_daemonset_name }}'
register: dset_result
- name: assert that daemonset managed pods were not removed
assert:
that:
- dset_result.resources | list | length > 0
- name: Uncordon node
k8s_drain:
state: uncordon
name: '{{ node_to_drain }}'
always:
- name: delete namespace
k8s:
state: absent
kind: namespace
name: '{{ drain_namespace }}'

View File

@@ -0,0 +1,409 @@
#!/usr/bin/python
# -*- coding: utf-8 -*-
# Copyright (c) 2021, Aubin Bikouo <@abikouo>
# GNU General Public License v3.0+ (see COPYING or https://www.gnu.org/licenses/gpl-3.0.txt)
from __future__ import absolute_import, division, print_function
__metaclass__ = type
DOCUMENTATION = r'''
module: k8s_drain
short_description: Drain, Cordon, or Uncordon node in k8s cluster
author: Aubin Bikouo (@abikouo)
description:
- Drain node in preparation for maintenance same as kubectl drain.
- Cordon will mark the node as unschedulable.
- Uncordon will mark the node as schedulable.
- The given node will be marked unschedulable to prevent new pods from arriving.
- Then drain deletes all pods except mirror pods (which cannot be deleted through the API server).
extends_documentation_fragment:
- kubernetes.core.k8s_auth_options
options:
state:
description:
- Determines whether to drain, cordon, or uncordon node.
type: str
default: drain
choices: [ cordon, drain, uncordon ]
name:
description:
- The name of the node.
required: true
type: str
delete_options:
type: dict
description:
- Specify options to delete pods.
- This option has effect only when C(state) is set to I(drain).
suboptions:
terminate_grace_period:
description:
- Specify how many seconds to wait before forcefully terminating.
- If not specified, the default grace period for the object type will be used.
- The value zero indicates delete immediately.
required: false
type: int
force:
description:
- Continue even if there are pods not managed by a ReplicationController, Job, or DaemonSet.
type: bool
default: False
ignore_daemonsets:
description:
- Ignore DaemonSet-managed pods.
type: bool
default: False
disable_eviction:
description:
- Forces drain to use delete rather than evict.
type: bool
default: False
wait_timeout:
description:
- The length of time to wait in seconds for pod to be deleted before giving up, zero means infinite.
type: int
wait_sleep:
description:
- Number of seconds to sleep between checks.
- Ignored if C(wait_timeout) is not set.
default: 5
type: int
requirements:
- python >= 3.6
- kubernetes >= 12.0.0
'''
EXAMPLES = r'''
- name: Drain node "foo", even if there are pods not managed by a ReplicationController, Job, or DaemonSet on it.
kubernetes.core.k8s_drain:
state: drain
name: foo
force: yes
- name: Drain node "foo", but abort if there are pods not managed by a ReplicationController, Job, or DaemonSet, and use a grace period of 15 minutes.
kubernetes.core.k8s_drain:
state: drain
name: foo
delete_options:
terminate_grace_period: 900
- name: Mark node "foo" as schedulable.
kubernetes.core.k8s_drain:
state: uncordon
name: foo
- name: Mark node "foo" as unschedulable.
kubernetes.core.k8s_drain:
state: cordon
name: foo
'''
RETURN = r'''
result:
description:
- The node status and the number of pods deleted.
returned: success
type: str
'''
import copy
from datetime import datetime
import time
from ansible_collections.kubernetes.core.plugins.module_utils.ansiblemodule import AnsibleModule
from ansible_collections.kubernetes.core.plugins.module_utils.args_common import AUTH_ARG_SPEC
from ansible.module_utils._text import to_native
try:
from kubernetes.client.api import core_v1_api
from kubernetes.client.models import V1beta1Eviction, V1DeleteOptions
from kubernetes.client.exceptions import ApiException
except ImportError:
# ImportError are managed by the common module already.
pass
def filter_pods(pods, force, ignore_daemonset):
k8s_kind_mirror = "kubernetes.io/config.mirror"
daemonSet, unmanaged, mirror, localStorage, to_delete = [], [], [], [], []
for pod in pods:
# check mirror pod: cannot be delete using API Server
if pod.metadata.annotations and k8s_kind_mirror in pod.metadata.annotations:
mirror.append((pod.metadata.namespace, pod.metadata.name))
continue
# Any finished pod can be deleted
if pod.status.phase in ('Succeeded', 'Failed'):
to_delete.append((pod.metadata.namespace, pod.metadata.name))
continue
# Pod with local storage cannot be deleted
# TODO: support new option delete-emptydatadir in order to allow deletion of such pod
if pod.spec.volumes and any([vol.empty_dir for vol in pod.spec.volumes]):
localStorage.append((pod.metadata.namespace, pod.metadata.name))
continue
# Check replicated Pod
owner_ref = pod.metadata.owner_references
if not owner_ref:
unmanaged.append((pod.metadata.namespace, pod.metadata.name))
else:
for owner in owner_ref:
if owner.kind == "DaemonSet":
daemonSet.append((pod.metadata.namespace, pod.metadata.name))
else:
to_delete.append((pod.metadata.namespace, pod.metadata.name))
warnings, errors = [], []
if unmanaged:
pod_names = ','.join([pod[0] + "/" + pod[1] for pod in unmanaged])
if not force:
errors.append("cannot delete Pods not managed by ReplicationController, ReplicaSet, Job,"
" DaemonSet or StatefulSet (use option force set to yes): {0}.".format(pod_names))
else:
# Pod not managed will be deleted as 'force' is true
warnings.append("Deleting Pods not managed by ReplicationController, ReplicaSet, Job, DaemonSet or StatefulSet: {0}.".format(pod_names))
to_delete += unmanaged
# mirror pods warning
if mirror:
pod_names = ','.join([pod[0] + "/" + pod[1] for pod in mirror])
warnings.append("cannot delete mirror Pods using API server: {0}.".format(pod_names))
# local storage
if localStorage:
errors.append("cannot delete Pods with local storage: {0}.".format(pod_names))
# DaemonSet managed Pods
if daemonSet:
pod_names = ','.join([pod[0] + "/" + pod[1] for pod in daemonSet])
if not ignore_daemonset:
errors.append("cannot delete DaemonSet-managed Pods (use option ignore_daemonset set to yes): {0}.".format(pod_names))
else:
warnings.append("Ignoring DaemonSet-managed Pods: {0}.".format(pod_names))
return to_delete, warnings, errors
class K8sDrainAnsible(object):
def __init__(self, module):
from ansible_collections.kubernetes.core.plugins.module_utils.common import (
K8sAnsibleMixin, get_api_client)
self._module = module
self._k8s_ansible_mixin = K8sAnsibleMixin(module)
self._k8s_ansible_mixin.client = get_api_client(module=self._module)
self._k8s_ansible_mixin.module = self._module
self._k8s_ansible_mixin.argspec = self._module.argument_spec
self._k8s_ansible_mixin.check_mode = self._module.check_mode
self._k8s_ansible_mixin.params = self._module.params
self._k8s_ansible_mixin.fail_json = self._module.fail_json
self._k8s_ansible_mixin.fail = self._module.fail_json
self._k8s_ansible_mixin.exit_json = self._module.exit_json
self._k8s_ansible_mixin.warn = self._module.warn
self._k8s_ansible_mixin.warnings = []
self._api_instance = core_v1_api.CoreV1Api(self._k8s_ansible_mixin.client.client)
self._k8s_ansible_mixin.check_library_version()
# delete options
self._drain_options = module.params.get('delete_options', {})
self._delete_options = None
if self._drain_options.get('terminate_grace_period'):
self._delete_options = {}
self._delete_options.update({'apiVersion': 'v1'})
self._delete_options.update({'kind': 'DeleteOptions'})
self._delete_options.update({'gracePeriodSeconds': self._drain_options.get('terminate_grace_period')})
self._changed = False
def wait_for_pod_deletion(self, pods, wait_timeout, wait_sleep):
start = datetime.now()
def _elapsed_time():
return (datetime.now() - start).seconds
response = None
pod = pods.pop()
while (_elapsed_time() < wait_timeout or wait_timeout == 0) and pods:
if not pod:
pod = pods.pop()
try:
response = self._api_instance.read_namespaced_pod(namespace=pod[0], name=pod[1])
if not response:
pod = None
time.sleep(wait_sleep)
except ApiException as exc:
if exc.reason != "Not Found":
self._module.fail_json(msg="Exception raised: {0}".format(exc.reason))
pod = None
except Exception as e:
self._module.fail_json(msg="Exception raised: {0}".format(to_native(e)))
if not pods:
return None
return "timeout reached while pods were still running."
def evict_pods(self, pods):
for namespace, name in pods:
definition = {
'metadata': {
'name': name,
'namespace': namespace
}
}
if self._delete_options:
definition.update({'delete_options': self._delete_options})
try:
if self._drain_options.get('disable_eviction'):
body = V1DeleteOptions(**definition)
self._api_instance.delete_namespaced_pod(name=name, namespace=namespace, body=body)
else:
body = V1beta1Eviction(**definition)
self._api_instance.create_namespaced_pod_eviction(name=name, namespace=namespace, body=body)
self._changed = True
except ApiException as exc:
if exc.reason != "Not Found":
self._module.fail_json(msg="Failed to delete pod {0}/{1} due to: {2}".format(namespace, name, exc.reason))
except Exception as exc:
self._module.fail_json(msg="Failed to delete pod {0}/{1} due to: {2}".format(namespace, name, to_native(exc)))
def delete_or_evict_pods(self, node_unschedulable):
# Mark node as unschedulable
result = []
if not node_unschedulable:
self.patch_node(unschedulable=True)
result.append("node {0} marked unschedulable.".format(self._module.params.get('name')))
self._changed = True
else:
result.append("node {0} already marked unschedulable.".format(self._module.params.get('name')))
def _revert_node_patch():
if self._changed:
self._changed = False
self.patch_node(unschedulable=False)
try:
field_selector = "spec.nodeName={name}".format(name=self._module.params.get('name'))
pod_list = self._api_instance.list_pod_for_all_namespaces(field_selector=field_selector)
# Filter pods
force = self._drain_options.get('force', False)
ignore_daemonset = self._drain_options.get('ignore_daemonsets', False)
pods, warnings, errors = filter_pods(pod_list.items, force, ignore_daemonset)
if errors:
_revert_node_patch()
self._module.fail_json(msg="Pod deletion errors: {0}".format(" ".join(errors)))
except ApiException as exc:
if exc.reason != "Not Found":
_revert_node_patch()
self._module.fail_json(msg="Failed to list pod from node {name} due to: {reason}".format(
name=self._module.params.get('name'), reason=exc.reason), status=exc.status)
pods = []
except Exception as exc:
_revert_node_patch()
self._module.fail_json(msg="Failed to list pod from node {name} due to: {error}".format(
name=self._module.params.get('name'), error=to_native(exc)))
# Delete Pods
if pods:
self.evict_pods(pods)
number_pod = len(pods)
if self._drain_options.get('wait_timeout') is not None:
warn = self.wait_for_pod_deletion(pods,
self._drain_options.get('wait_timeout'),
self._drain_options.get('wait_sleep'))
if warn:
warnings.append(warn)
result.append("{0} Pod(s) deleted from node.".format(number_pod))
if warnings:
return dict(result=' '.join(result), warnings=warnings)
return dict(result=' '.join(result))
def patch_node(self, unschedulable):
body = {
'spec': {'unschedulable': unschedulable}
}
try:
self._api_instance.patch_node(name=self._module.params.get('name'), body=body)
except Exception as exc:
self._module.fail_json(msg="Failed to patch node due to: {0}".format(to_native(exc)))
def execute_module(self):
state = self._module.params.get('state')
name = self._module.params.get('name')
try:
node = self._api_instance.read_node(name=name)
except ApiException as exc:
if exc.reason == "Not Found":
self._module.fail_json(msg="Node {0} not found.".format(name))
self._module.fail_json(msg="Failed to retrieve node '{0}' due to: {1}".format(name, exc.reason), status=exc.status)
except Exception as exc:
self._module.fail_json(msg="Failed to retrieve node '{0}' due to: {1}".format(name, to_native(exc)))
result = {}
if state == "cordon":
if node.spec.unschedulable:
self._module.exit_json(result="node {0} already marked unschedulable.".format(name))
self.patch_node(unschedulable=True)
result['result'] = "node {0} marked unschedulable.".format(name)
self._changed = True
elif state == "uncordon":
if not node.spec.unschedulable:
self._module.exit_json(result="node {0} already marked schedulable.".format(name))
self.patch_node(unschedulable=False)
result['result'] = "node {0} marked schedulable.".format(name)
self._changed = True
else:
# drain node
# Delete or Evict Pods
ret = self.delete_or_evict_pods(node_unschedulable=node.spec.unschedulable)
result.update(ret)
self._module.exit_json(changed=self._changed, **result)
def argspec():
argument_spec = copy.deepcopy(AUTH_ARG_SPEC)
argument_spec.update(
dict(
state=dict(default="drain", choices=["cordon", "drain", "uncordon"]),
name=dict(required=True),
delete_options=dict(
type='dict',
default={},
options=dict(
terminate_grace_period=dict(type='int'),
force=dict(type='bool', default=False),
ignore_daemonsets=dict(type='bool', default=False),
disable_eviction=dict(type='bool', default=False),
wait_timeout=dict(type='int'),
wait_sleep=dict(type='int', default=5),
)
),
)
)
return argument_spec
def main():
module = AnsibleModule(argument_spec=argspec())
k8s_drain = K8sDrainAnsible(module)
k8s_drain.execute_module()
if __name__ == '__main__':
main()

View File

@@ -30,6 +30,10 @@ passenv =
commands=
ansible-test integration --docker -v --color --retry-on-error --diff --coverage --continue-on-error --python {posargs}
[testenv:add_docs]
deps = git+https://github.com/ansible-network/collection_prep
commands = collection_prep_add_docs -p .
[testenv:linters]
deps = yamllint
flake8