Commit Graph

147 Commits

Author SHA1 Message Date
Christian Adams
7bf49c207a Remove the ability to customize the postgres_data_dir (#1798)
* in the sclorg Postgresql 15 image, the PGDATA directory is hardcoded
* if users were to modify this directory, they would only change the
  directory the pvc is mounted to, not the directory PostgreSQL uses.
  This would result in loss of data.
* switch from /var/lib/pgsql/data/pgdata to /var/lib/pgsql/data/userdata
2024-03-31 21:58:33 -04:00
Dimitri Savineau
80a9e8c156 postgresql: Cast sorted_old_postgres_pods as list (#1791)
With ansible 2.9.27 (operator-sdk v1.27.0) then the reverse filter
returns an iterator so we need to cast it to list.
The behavior doesn't exist when using a more recent operator-sdk
version like v1.34.0 (ansible-core 2.15.8) but using the list
filter on that version works too (even if not needed)

"sorted_old_postgres_pods": "<list_reverseiterator object at 0x7f539eaa5610>"

Signed-off-by: Dimitri Savineau <dsavinea@redhat.com>
2024-03-27 14:31:53 -04:00
kurokobo
07b8120788 fix: add retries to find running web pod (#1787) 2024-03-27 14:25:10 -04:00
aknochow
c6fe038fe4 Adding support for ansible metrics-utility (#1754)
- Adding metadata, storage_class, and pullsecret for metrics-utility
- Updating crd, csv and defaults
- Adding metrics-utility cronjob
2024-03-20 11:05:13 -04:00
Seth Foster
154b801cfc Change default value for postgres_data_path (#1766)
* Change default value for postgres_data_path

/var/lib/postgresql/data/pgdata
to
/var/lib/pgsql/data/pgdata

postgres 15 uses a different location for
postgres data directory.

Fixes issue were database was not being written
to the mounted in volume, and if the postgres
container restarted, data would be lost.

Signed-off-by: Seth Foster <fosterbseth@gmail.com>
---------

Signed-off-by: Seth Foster <fosterbseth@gmail.com>
Co-authored-by: Hao Liu <44379968+TheRealHaoLiu@users.noreply.github.com>
2024-03-13 16:17:49 -04:00
Hao Liu
a8acae4af5 Don't delete old postgres 13 volume automatically (#1767)
Leave old postgres-13 volume alone in case of unforseen upgrade failure for restore purposes

User can manually delete old PVC after verifying upgrade is completed
2024-03-13 15:23:10 -04:00
Hao Liu
b5d81b8e5d Fix awx_kube_devel (#1759)
* Fix awx_kube_devel
* Sanitize version name for kube_dev

When in development mode, awx version may look
like 23.9.1.dev18+gee9eac15dc.d20240311

k8s job to the migration can only have
a name with alphanumeric, and '.', '-'

so we can just drop off the +

Signed-off-by: Seth Foster <fosterbseth@gmail.com>

---------

Signed-off-by: Seth Foster <fosterbseth@gmail.com>
Co-authored-by: Seth Foster <fosterbseth@gmail.com>
2024-03-11 19:01:00 +00:00
bartowl
3abeec518a Bind EE images version with DEFAULT_AWX_VERSION (#1740)
* bind ee_images, control_plane_ee_image and init_container_image with DEFAULT_AWX_VERSION instead of "latest"

* fix when condition on init_container_image_version check

* Use DEFAULT_AWX_VERSION for AWXMeshIngress

* Add back AWX EE latest for backward compatibility

---------

Co-authored-by: Hao Liu <44379968+TheRealHaoLiu@users.noreply.github.com>
2024-03-11 14:12:10 -04:00
Christian Adams
d2c4b9c8a4 The pg service label_selector now uses the deployment_type variable (#1755) 2024-03-08 09:02:31 -05:00
Christian Adams
2ad1d25120 Update PostgreSQL docs about finding default version (#1747) 2024-03-07 21:47:18 -05:00
David Hageman
ffba1b4712 Add -ness checks and refactor migrations (#1674) 2024-03-05 19:54:22 -05:00
kurokobo
dba934daa0 fix: revert type of status.upgradedPostgresVersion to string (#1745) 2024-03-04 15:55:16 -05:00
aknochow
d0827ba426 Fixing postgres upgrade conditional (#1741) 2024-03-01 17:09:15 -05:00
John Westcott IV
607a7ca58c Upgrading to PostgreSQL 15 and moving to sclorg images (#1486)
* Upgrading to postgres:15
* Changing image from postgres to sclorg
* Handle scenario where upgrade status is not defined & correct pg tag
* Rework the upgrade logic to be more resiliant for multiple upgrades

---------

Co-authored-by: john-westcott-iv <john-westcott-iv@users.noreply.github.com>
Co-authored-by: Christian M. Adams <chadams@redhat.com>
2024-02-29 17:02:11 -05:00
Guillaume Lefevre
07427be0b7 Allow multiple ingress hosts to be defined when using ingress (#1377)
* Replace api version for deployment kind to apps/v1

* Add new multiple ingress spec and deprecate hostname and ingress_tls_secret

* Manage new ingress_hosts.tls_secret backup separately

* Fix ci molecule lint warnings and error

* Fix documentation

* Fix ingress_hosts tls_secret key being optional

* Remove fieldDependency:ingress_type:Ingress for Ingress Hosts

* Fix scenario when neither hostname or ingress_hosts is defined

---------

Co-authored-by: Guillaume Lefevre <guillaume.lefevre@agoda.com>
Co-authored-by: Seth Foster <fosterseth@users.noreply.github.com>
Co-authored-by: Christian Adams <chadams@redhat.com>
2024-01-05 10:15:04 -05:00
Christian Adams
582701d949 Refactor to resolve the linter warnings on PRs (#1668) 2023-12-14 09:29:35 -05:00
Christian Adams
a61ed18147 Always check and wait for a restore pg_restore to finish (#1652) 2023-12-01 16:18:23 -05:00
Seth Foster
15ed13dd8d Fix supported_pg_version (#1614)
Signed-off-by: Seth Foster <fosterbseth@gmail.com>
2023-10-25 12:47:24 -04:00
Hao Liu
019fa3d356 Add background keepalive to awx-manage migrate (#1589) 2023-10-13 09:33:27 -04:00
Christian Adams
8d91a67078 Ensure that web and task deployments scale down for upgrades (#1522) 2023-09-06 18:44:49 +00:00
Christian Adams
4c5429190c Timeout stream keep alive for Upgrades and Restores (#1542)
Signed-off-by: Christian M. Adams <chadams@redhat.com>
2023-08-29 15:36:48 -04:00
Christian Adams
7012a6acfc Modify how pg password is set in postgres pod (#1540)
Signed-off-by: Christian M. Adams <chadams@redhat.com>
2023-08-29 15:28:54 +00:00
Christian Adams
1dc64b551c Add keepalive to migrate data script (#1538)
Signed-off-by: Christian M. Adams <chadams@redhat.com>
2023-08-29 11:05:11 -04:00
Hao Liu
c949d6e58d Wait for termination grace period when scaling down the deployments (#1537) 2023-08-28 18:37:45 -04:00
Rick Elrod
c9ab99385a Allow {web_,task_,}replicas to be 0 and split out molecule tests (#1468)
Signed-off-by: Rick Elrod <rick@elrod.me>
2023-07-18 17:07:55 -04:00
Jake Jackson
7218e42771 [web/task split] fix scale down bug (#1295)
- rename scale_down vars to the new deployments since the old one no longer exists
- rename postgres.yml scale down vars as it references the old ones as well
2023-03-29 22:00:52 -04:00
Hao Liu
c2f0c214eb rename tower_pod to awx_task_pod 2023-03-29 22:00:52 -04:00
thedoubl3j
5894a4ad25 remove old deployment during upgrade 2023-03-29 22:00:52 -04:00
Hao Liu
84b766ac40 update auto_upgrade logic (#1241)
update logic for determining if install.yml task should be run
to respect the auto_upgrade field in awx resource

conditions and expected behavior
```
  auto_upgrade   awx   awx-web   awx-task   run install.yml
 -------------- ----- --------- ---------- -----------------
  T              -     -         -          T
  F              T     -         -          F
  F              -     T         T          F
  F              -     T         F          T
  F              -     F         T          T
  F              -     F         F          T
```
2023-03-29 22:00:52 -04:00
Jake Jackson
d9f3a428d4 [web/task split] split web and task deployment + a few supporting bits (#1189)
* first pass, still WIP, need tolerations etc

* add tolerations that don't work bc idk

* bug hunting

* local push, still a WIP

* affinity still needs testfor to_nice_yaml, tolerations logic is working

* fixed task deployment and affinity for both
2023-03-29 22:00:52 -04:00
Shane McDonald
19461fa86c Split web and task containers into separate deployments 2023-03-29 21:59:57 -04:00
Guillaume Lefevre
c76ad2cff1 Change ansible k8s_info tasks api_version for Deployment kind to apps/v1 (#1299)
Co-authored-by: Guillaume Lefevre <guillaume.lefevre@agoda.com>
2023-03-29 15:39:41 -04:00
Hao Liu
46da413585 Merge pull request #1193 from stanislav-zaprudskiy/add_termination_grace_period_seconds
AWX: Add `termination_grace_period_seconds`
2023-02-28 15:37:51 -05:00
Maxence Button
f328b0adb6 Customization of the init_projects_container_image is now possible (#1248) 2023-02-22 15:05:23 -05:00
Stanislav Zaprudskiy
336ea58a0a AWX: Add termination_grace_period_seconds 2023-02-07 16:33:00 +01:00
Stanislav Zaprudskiy
f042cb3d00 Fix lint warnings 2023-02-07 16:31:26 +01:00
Stanislav Zaprudskiy
94d68bf382 Make Deployment to be rolled out on CM and Secrets changes
With the previous approach, not all associated (mounted) CM/Secrets
changes caused the Deployment to be rolled out, but also the Deployment
could have been rolled out unnecessary during e.g. Ingress or Service
changes (which do not require Pod restarts).

Previously existing Pod removal (state: absent) was not complete as
other pods continued to exist, but also is not needed with this commit
change due to added Pods annotations.

The added Deployment Pod annotations now cause the new ReplicaSet
version to be rolled out, effectively causing replacement of the
previously existing Pods in accordance with the deployment `strategy`
(https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.25/#deploymentstrategy-v1-apps,
`RollingUpdate`) whenever there is a change in the associated CMs or
Secrets referenced in annotations. This implementation is quite standard
and widely used for Helm workflows -
https://helm.sh/docs/howto/charts_tips_and_tricks/#automatically-roll-deployments
2023-02-07 11:58:47 +01:00
Stanislav Zaprudskiy
b3a74362af Make AWX Pod variable to be calculated respecting creationTimestamp and deletionTimestamp
Do not consider Pods marked for deletion when calculating tower_pod to
address replicas scale down case - where normally Pods spawned recently
are being taken for removal. As well as the case when operator kicked
off but some old replicas are still terminating.

Respect `creationTimestamp` so to make sure that the newest Pod is taken
after Deployment application, in which case multiple RS Pods (from old
RS and new RS) could be running simultaneously while the rollout is
happening.
2023-02-07 11:47:49 +01:00
Stanislav Zaprudskiy
ad531c8dce Do not wait for a new Pod name after Deployment change
Proper waiting is already performed earlier during Deplyment{apply: yes, wait: yes} -
e6ac874098/plugins/module_utils/k8s/waiter.py (L27).

And also not every Deployment change produces new RS/Pods. For example,
changing Deployment labels won't cause new rollout, but will cause
`until` loop to be invoked unnecessarily (when replicas=1).
2023-02-07 11:43:34 +01:00
Stanislav Zaprudskiy
e589ceb661 When applying Deployment wait up to (timeout * replicas)
There are cases when having a new Deployment may be taking above the
default timeout of 120s.
For instance, when a Deployment has multiple replicas, and each replica
starts on a separate node, and the Deployment specifies new images, then
just pulling these new images for each replica may be taking above the
default timeout of 120s.

Having the default time multiplied by the number of replicas should
provide generally enough time for all replicas to start
2023-02-07 11:41:32 +01:00
Stanislav Zaprudskiy
5a856eeba8 Add additional_labels parameter (#1160)
* Move label templates into `common` role

So that there is single source of labels management, and labels are
unified across the other roles

* Introduce `additional_labels`
* Fix paths for labels templates
* Return `additional_labels_items` as list
* Add molecule tests
2023-01-30 18:51:08 -05:00
kurokobo
b1a547d2a6 fix: add quotes for PGPASSWORD in upgrade_postgres.yml (fixes #1166) (#1167) 2023-01-18 11:59:03 -05:00
David Hageman
21eb83b052 Correct admin password updating (#1179)
Corrects an issue with admin passwords failing to be updated due to shell escaping. This aligns the operator with the logic in the normal installer.
2023-01-11 11:41:35 -05:00
Christian Adams
a5e21b56ae Backup and restore receptor tls secret with expected generated name (#1107) 2022-11-07 11:04:22 -05:00
Hao Liu
0611f3efaa add migration code for receptor ca secret
Signed-off-by: Hao Liu <haoli@redhat.com>
2022-09-28 16:22:20 -04:00
Hao Liu
d64c34f8a4 Add receptor firewall rules to control nodes (#1012)
Support external execution nodes

- Allow receptor.conf to be editable at runtime
- Create CA cert and key as a k8s secret
- Create work signing RSA keypair as a k8s secret
- Setup volume mounts for containers to have access to the needed
  Receptor keys / certs to facilitate generating the install bundle
  for a new execution node
- added firewall rule, work signing and tls cert configuration to default receptor.conf

The volume mount changes in this PR fulfill the following:
- `receptor.conf` need to be shared between task container and ee container
  - **task** container writes the `receptor.conf`
  - **ee** consume the `receptor.conf`
- receptor ca cert/key need to be mounted by both ee container and web container
  - **ee** container need the ca cert
  - **web** container will need the ca key to sign client cert for remote execution node
  - **web** container will need the ca cert to generate install bundle for remote execution node
- receptor work private/public key need to be mounted by both ee container and web container
  - **ee** container need to private key to sign the work
  - **web** container need the public key to generate install bundle  for remote execution node
  - **task** container need the private key to sign the work

Signed-off-by: Hao Liu <haoli@redhat.com>
Co-Authored-By: Seth Foster <fosterbseth@gmail.com>
Co-Authored-By: Shane McDonald <me@shanemcd.com>

Signed-off-by: Hao Liu <haoli@redhat.com>
Co-authored-by: Shane McDonald <me@shanemcd.com>
Co-authored-by: Seth Foster <fosterbseth@gmail.com>
2022-09-09 15:13:05 -04:00
Shane McDonald
edecf4d2fe Move labels into reusable templates 2022-08-30 11:00:43 -04:00
Mateusz Drab
f2a9e967cc Remove reference to cluster.local 2022-08-24 20:07:11 +01:00
Christian Adams
7d2d1b3c5e Upgrade to Operator SDK v1.22.2 (#1001)
* Upgrade to Operator SDK 1.16.0

* Upgrade Operator SDK to v1.22.2 & bump base image version
2022-08-22 18:54:56 -04:00
Shane McDonald
60386bc928 Organize installer templates into subdirectories 2022-08-05 10:45:15 -04:00