We can let the web container terminate as usual: there is no reason to
keep it running, since it does not participate in job control.
Additionally, it stops receiving traffic as soon as termination begins
> At the same time as the kubelet is starting graceful shutdown, the
> control plane removes that shutting-down Pod from EndpointSlice (and
> Endpoints) objects where these represent a Service with a configured
> selector
@ https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination
- previously, there was no way to auto-assign a port by default,
which at times led to conflicts with other deployments
- the `nodeport_port` param can still be used to specify a port if
desired (see the sketch below)
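A minimal sketch of the behavior, assuming a NodePort Service template (the service name and port numbers are illustrative):

```yaml
# With spec.ports[].nodePort omitted, the control plane auto-assigns a
# free port from the node-port range (30000-32767 by default), which
# avoids conflicts between deployments.
apiVersion: v1
kind: Service
metadata:
  name: awx-service            # illustrative name
spec:
  type: NodePort
  ports:
    - port: 80
      targetPort: 8052
      # nodePort: 30080        # set via nodeport_port to pin a specific port
```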
With the previous approach, not every change to the associated
(mounted) ConfigMaps/Secrets caused the Deployment to be rolled out,
while the Deployment could also be rolled out unnecessarily during e.g.
Ingress or Service changes (which do not require Pod restarts).
The previously existing Pod removal (state: absent) was incomplete, as
other Pods continued to exist; it is also no longer needed with this
change, thanks to the added Pod annotations.
The added Deployment Pod annotations now cause a new ReplicaSet
version to be rolled out, effectively replacing the previously existing
Pods in accordance with the Deployment `strategy`
(https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.25/#deploymentstrategy-v1-apps,
`RollingUpdate`) whenever there is a change in the associated ConfigMaps
or Secrets referenced in the annotations. This approach is standard and
widely used in Helm workflows -
https://helm.sh/docs/howto/charts_tips_and_tricks/#automatically-roll-deployments
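A minimal sketch of the pattern, assuming a Jinja2-templated Deployment (the template paths and annotation keys are illustrative):

```yaml
# Pod template annotations carry a hash of the rendered ConfigMap/Secret
# contents; any change to them changes the Pod template, which triggers
# a new ReplicaSet rollout per the Deployment strategy.
spec:
  template:
    metadata:
      annotations:
        checksum-configmap-app: "{{ lookup('template', 'configmap.yaml.j2') | sha1 }}"
        checksum-secret-app: "{{ lookup('template', 'secret.yaml.j2') | sha1 }}"
```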
Do not consider Pods marked for deletion when calculating tower_pod.
This addresses the replica scale-down case, where the most recently
spawned Pods are normally the ones selected for removal, as well as the
case where the operator kicks off while some old replicas are still
terminating.
Respect `creationTimestamp` to make sure the newest Pod is selected
after the Deployment is applied, since Pods from multiple ReplicaSets
(the old RS and the new RS) can be running simultaneously while the
rollout is happening.
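A sketch of the resulting selection logic, assuming the Pod list comes from a prior `k8s_info` query (the variable names are illustrative):

```yaml
# Skip Pods already marked for deletion, then take the newest remaining
# Pod by creationTimestamp.
- name: Select the newest Pod that is not terminating
  set_fact:
    tower_pod: >-
      {{ pod_query.resources
         | rejectattr('metadata.deletionTimestamp', 'defined')
         | sort(attribute='metadata.creationTimestamp')
         | last }}
```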
Proper waiting is already performed earlier during Deployment{apply: yes, wait: yes} -
e6ac874098/plugins/module_utils/k8s/waiter.py (L27).
Also, not every Deployment change produces a new RS/Pods. For example,
changing Deployment labels won't cause a new rollout, but will cause the
`until` loop to be invoked unnecessarily (when replicas=1).
There are cases where rolling out a new Deployment may take longer than
the default timeout of 120s.
For instance, when a Deployment has multiple replicas, each replica
starts on a separate node, and the Deployment specifies new images, just
pulling the new images for each replica may already exceed the default
120s timeout.
Multiplying the default timeout by the number of replicas should
generally provide enough time for all replicas to start (see the sketch
below)
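A hedged sketch of that scaling with `kubernetes.core.k8s` (the variable names are illustrative):

```yaml
- name: Apply the Deployment and wait for the rollout
  kubernetes.core.k8s:
    apply: yes
    definition: "{{ lookup('template', 'deployment.yaml.j2') }}"
    wait: yes
    # Scale the default 120s timeout by the replica count so image pulls
    # on multiple nodes have time to complete.
    wait_timeout: "{{ 120 * (replicas | int) }}"
```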
* Move label templates into `common` role
So that there is a single source of labels management, and labels are
unified across the other roles
* Introduce `additional_labels` (see the example after this list)
* Fix paths for labels templates
* Return `additional_labels_items` as list
* Add molecule tests
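A hedged example of the intended usage (label key names are illustrative): `additional_labels` lists which of the custom resource's own labels should be propagated to the resources the operator manages.

```yaml
apiVersion: awx.ansible.com/v1beta1
kind: AWX
metadata:
  name: awx
  labels:
    my-company/environment: production
    my-company/team: platform
spec:
  # Only the listed label keys are copied to managed resources.
  additional_labels:
    - my-company/environment
    - my-company/team
```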
* Add an option to specify affinity rules for the awx pod
In some cases, you may want to use affinity rules instead of a
node selector so you can have more flexibility, for example to express
"soft" rules, i.e. "run my pod on this node if possible, otherwise run
it anywhere" (see the sketch after this list).
* Rename `node_affinity` to `affinity`
* Maintain defaults and CSV
* Add fields validation
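A hedged sketch of such a "soft" rule through the new `affinity` field (the node name and weight are illustrative; the structure follows the standard Kubernetes affinity API):

```yaml
spec:
  affinity:
    nodeAffinity:
      # Preferred (soft) rule: schedule on the named node if possible,
      # otherwise run anywhere.
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 1
          preference:
            matchExpressions:
              - key: kubernetes.io/hostname
                operator: In
                values:
                  - preferred-node
```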
Co-authored-by: Olivier <oliverf1ca@yahoo.com>
Support external execution nodes
- Allow receptor.conf to be editable at runtime
- Create CA cert and key as a k8s secret
- Create work signing RSA keypair as a k8s secret
- Set up volume mounts so containers have access to the needed
  Receptor keys / certs to facilitate generating the install bundle
  for a new execution node
- Add firewall rule, work signing, and TLS cert configuration to the default receptor.conf
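A hedged sketch of the TLS and work-signing stanzas this adds (file paths and the server name are illustrative; the stanza keys follow the Receptor config format):

```yaml
# receptor.conf excerpt
- tls-server:
    name: mutual-tls
    cert: /etc/receptor/tls/receptor.crt
    key: /etc/receptor/tls/receptor.key
    clientcas: /etc/receptor/tls/ca/receptor-ca.crt

- work-signing:
    privatekey: /etc/receptor/work_private_key.pem
    tokenexpiration: 1m
```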
The volume mount changes in this PR fulfill the following (see the sketch after the list):
- `receptor.conf` needs to be shared between the task container and the ee container
  - the **task** container writes the `receptor.conf`
  - the **ee** container consumes the `receptor.conf`
- the Receptor CA cert/key need to be mounted by the ee container and the web container
  - the **ee** container needs the CA cert
  - the **web** container needs the CA key to sign client certs for remote execution nodes
  - the **web** container needs the CA cert to generate the install bundle for a remote execution node
- the Receptor work signing private/public keys need to be mounted by the ee, web, and task containers
  - the **ee** container needs the private key to sign the work
  - the **web** container needs the public key to generate the install bundle for a remote execution node
  - the **task** container needs the private key to sign the work
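A hedged sketch of the resulting mount layout (volume, Secret, and path names are illustrative):

```yaml
volumes:
  - name: receptor-config
    emptyDir: {}                   # task writes receptor.conf here; ee reads it
  - name: receptor-ca
    secret:
      secretName: awx-receptor-ca  # CA cert/key stored as a k8s Secret
containers:
  - name: task
    volumeMounts:
      - name: receptor-config
        mountPath: /etc/receptor
  - name: ee
    volumeMounts:
      - name: receptor-config
        mountPath: /etc/receptor
        readOnly: true
      - name: receptor-ca
        mountPath: /etc/receptor/tls/ca
        readOnly: true
```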
Signed-off-by: Hao Liu <haoli@redhat.com>
Co-authored-by: Shane McDonald <me@shanemcd.com>
Co-authored-by: Seth Foster <fosterbseth@gmail.com>