Current Series Release Notes
19.0.0.0rc1-99
New Features
Adds support for external account binding (EAB) in Let’s Encrypt.
Adds new variables to be used by the letsencrypt role, letsencrypt_external_cert_server and letsencrypt_internal_cert_server, which allow configuring the ACME server for internal and external certificate generation.
Performance: Don’t notify handlers during config
This patch builds upon the genconfig optimisation and takes it further by not having genconfig ever touch the handlers. Calling the handlers and skipping them created an unnecessary slowdown if only config was run. It also depends on the config checking fix.
This gets us closer to the single responsibility principle - config only generates the config, container checks only validate whether container restart is needed.
This also means that we will have a single place where containers are restarted, where, in the following patches, we can fix the Ansible quirk of restarting the whole group even when only one container changed.
The only exception is the loadbalancer role, because the loadbalancer services have their config altered by other roles registering their services using loadbalancer-config. This is in contrast to typical roles, which do config in one step and can then run check-containers in the next step.
It is now possible to use a custom cron-logrotate-global.conf file.
With the enable_ironic_dnsmasq parameter it is possible to explicitly disable the ironic-dnsmasq service. By default, the parameter is set to the value of enable_ironic.
Improved the handling of multiple Ceph clusters in Kolla-Ansible by allowing explicit configuration of users, pools, and cluster names, following the official Ceph keyring format $cluster.client.$user.keyring.
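As a rough illustration of the naming convention above, the following Python sketch composes a keyring file name from a cluster name and a user. The helper keyring_filename is hypothetical and not part of Kolla-Ansible; only the $cluster.client.$user.keyring pattern comes from the note.

```python
def keyring_filename(cluster: str, user: str) -> str:
    """Build a keyring file name following $cluster.client.$user.keyring.

    Hypothetical helper for illustration only; not a Kolla-Ansible API.
    """
    if not cluster or not user:
        raise ValueError("cluster and user must be non-empty")
    return f"{cluster}.client.{user}.keyring"


# Example values are assumptions chosen for illustration.
print(keyring_filename("ceph", "cinder"))  # ceph.client.cinder.keyring
```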
Implement Layer 7 health checks for HAProxy. This should fix traffic being sent to unhealthy servers in some scenarios. Adds a Prometheus haproxy user for handling authenticated L7 health checks.
The kolla-ansible install-deps subcommand will now retry on Ansible Galaxy collection installation failures.
The nova-metadata service has been split into its own container in preparation for uWSGI support.
Generates a system-scoped public-openrc-system.sh file. This allows running Ironic commands against the public API, which is useful when access to the internal API is unavailable. LP#2051837
Refactor services’ check-containers and optimise
This might fix some hidden bugs where the check tasks forgot to include params important for the service.
We also get a nice optimisation by using a filtered loop instead of task skipping per service with ‘when’, as shown in https://review.opendev.org/c/openstack/kolla-ansible/+/914997
This refactoring allows further optimisation and fixing work to proceed with much less hassle, including getting rid of many notify statements, as the restarts are now safely handled by check-containers. Some notifies had to stay because of special edge cases, e.g. in rolling upgrades and loadbalancer config.
One downside is we remove the little optimisation for Zun that ignored config change for copying loopback but this is an acceptable tradeoff considering the benefits above.
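The difference between skipping per service and looping over a pre-filtered list can be sketched in Python. This is only an analogy under stated assumptions: the service map and the enabled flag are made up for illustration, and the code does not reproduce how Ansible evaluates tasks.

```python
# Hypothetical service map; names and flags are illustrative assumptions.
services = {
    "nova-api": {"enabled": True},
    "nova-scheduler": {"enabled": True},
    "nova-novncproxy": {"enabled": False},
}


def check_with_skips(services):
    """Visit every service and skip disabled ones, like a per-item 'when'."""
    visited, checked = 0, []
    for name, svc in services.items():
        visited += 1
        if not svc["enabled"]:
            continue  # the item is still evaluated, then skipped
        checked.append(name)
    return visited, checked


def check_with_filter(services):
    """Filter first, so the loop only ever sees enabled services."""
    enabled = {n: s for n, s in services.items() if s["enabled"]}
    visited, checked = 0, []
    for name in enabled:
        visited += 1
        checked.append(name)
    return visited, checked


print(check_with_skips(services))   # (3, ['nova-api', 'nova-scheduler'])
print(check_with_filter(services))  # (2, ['nova-api', 'nova-scheduler'])
```

Both variants check the same services; the filtered version simply never iterates over the ones that would be skipped.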
Reintroduce kolla-ansible check. This allows operators to quickly diagnose all containers across all hosts by running kolla-ansible check. It returns a list of containers that are missing, not running or in unhealthy state for each OpenStack service. Blueprint check-containers
The service-uwsgi-config role has been introduced for configuring uWSGI.
Set up Skyline to show resources from an external object store in the dashboard. If you run an external Swift-compatible object store, you have to tell Skyline about it, as it cannot use Keystone’s endpoint API to determine that at runtime. See the reference documentation for details.
In the Keystone role, files in the keystone_host_federation_oidc_metadata_folder and keystone_host_federation_oidc_attribute_mappings_folder directories are now handled as templates. This relates to the OpenID Identity Providers metadata and the OpenStack Identity Providers attribute mappings as part of the identity federation with OIDC.
User role assignments can now customise domain and system scopes.
Adds support for running the following services using uWSGI (without using Apache+mod_wsgi), which is enabled by default. To disable it, set <service>_wsgi_provider to apache (default is uwsgi):
Service    Variable
Nova       nova_wsgi_provider
Placement  placement_wsgi_provider
Upgrade Notes
Users who have previously used the letsencrypt role for external certificate generation need to migrate the value of the variable letsencrypt_cert_server (either the previous default or their override) to letsencrypt_external_cert_server. The default value was https://acme-v02.api.letsencrypt.org/directory
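The variable migration described above could be sketched as follows. This is a minimal illustration, assuming the overrides are available as a plain dictionary (for example, parsed from a variables file); the helper migrate_letsencrypt_vars is hypothetical and not shipped with Kolla-Ansible.

```python
OLD_KEY = "letsencrypt_cert_server"
NEW_KEY = "letsencrypt_external_cert_server"
# Previous default, as stated in the upgrade note.
OLD_DEFAULT = "https://acme-v02.api.letsencrypt.org/directory"


def migrate_letsencrypt_vars(overrides: dict) -> dict:
    """Return a copy of overrides with the old variable moved to the new name.

    Hypothetical helper for illustration. If the old variable was never
    overridden, its previous default is carried over. A value already set
    for the new variable is never overwritten.
    """
    migrated = dict(overrides)
    value = migrated.pop(OLD_KEY, OLD_DEFAULT)
    migrated.setdefault(NEW_KEY, value)
    return migrated
```

The input dictionary is left unmodified, so the result can be reviewed before it is written back anywhere.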
enable_ironic_inspector is set to no by default, due to Ironic Inspector project deprecation (and plans for removal).
The variables ceph_cinder_keyring, ceph_cinder_backup_keyring, ceph_glance_keyring, ceph_gnocchi_keyring, ceph_manila_keyring, and ceph_nova_keyring have been removed, and their values are now automatically derived from the configurable Ceph users. Users who have relied on completely different keyrings or custom user configurations should ensure their setups are correctly aligned with the new convention as per documentation.
The ironic-inspector service user is now assigned the system scope all. If you have overridden the default list of role assignments, you should make this change too.
The following proxysql variables have been reverted to upstream defaults (unless overridden using Ansible variables). In the case of shun_on_failures, connect_retries_delay and connect_retries_on_failure we still keep our own defaults for all-in-one deployments, but rely on upstream defaults for multinode ones.
Variable name                                 Kolla-Ansible default  ProxySQL (upstream) default
connect_retries_delay                         1000 (aio) / 1         1
connect_retries_on_failure                    20 (aio) / 10          10
connect_timeout_client                        100000                 10000
connect_timeout_server                        30000                  1000
connect_timeout_server_max                    100000                 10000
monitor_connect_timeout                       6000                   1000
monitor_galera_healthcheck_interval           4000                   5000
monitor_galera_healthcheck_max_timeout_count  2                      3
monitor_galera_healthcheck_timeout            1000                   800
monitor_ping_interval                         3000                   8000
monitor_ping_timeout                          2000                   1000
monitor_ping_max_failures                     2                      3
shun_on_failures                              10 (aio) / 5           5
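For the three variables that still differ between deployment types, the selection can be sketched in Python. The sketch assumes that an entry such as "1000 (aio) / 1" means 1000 for all-in-one and 1 for multinode deployments; the helper default_for is hypothetical and only mirrors the table values.

```python
# (all-in-one default, multinode default), copied from the table above.
SPLIT_DEFAULTS = {
    "connect_retries_delay": (1000, 1),
    "connect_retries_on_failure": (20, 10),
    "shun_on_failures": (10, 5),
}


def default_for(variable: str, all_in_one: bool) -> int:
    """Pick the default for a variable based on the deployment type.

    Hypothetical helper for illustration; it does not read any real
    Kolla-Ansible or ProxySQL configuration.
    """
    aio_value, multinode_value = SPLIT_DEFAULTS[variable]
    return aio_value if all_in_one else multinode_value


print(default_for("shun_on_failures", all_in_one=True))   # 10
print(default_for("shun_on_failures", all_in_one=False))  # 5
```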
Deprecation Notes
The ironic-inspector deployment is deprecated for removal once the implementation in Ironic reaches feature parity. See Ironic Inspector deprecation notice.
Support for swift deployment has been deprecated for removal in 2025.2 due to failing CI and no volunteers to maintain this role.
Bug Fixes
Add an option to set OIDCX forwarded headers in keystone. This is useful when keystone is behind a proxy and the proxy is adding headers to the request. The new option is keystone_federation_oidc_forwarded_headers. The default value is empty, to preserve the current behavior. LP#2080402
Fix unintentional trigger of ansible handlers. Due to an Ansible quirk, when one container of a group changes, all containers in that group are restarted. This can cause problems with some services. LP#1863510
Fixes cases when fluentd parser fails on Python traceback. OpenStack services regex has been reworked to include both global_request_id and handling cases with Python traceback. LP#2044370
Fixes copying of custom certificates when Let’s Encrypt is turned on. LP#2076331
Fixes proxysql-config’s TLS DB configuration. LP#2086466
Fixes the rabbitmq_enable_tls variable in the main.yaml file not being of boolean type. LP#2093335
Fixes Apache and placement writing to the same log file. Apache placement VirtualHost ErrorLog has been renamed to placement-api-error.log (similar to other services). LP#2095607
Set retry_tag in ElasticSearch/OpenSearch Fluentd output plugins. This is to prevent log messages from being re-processed by non-idempotent Fluentd pipeline configuration. See LP#2064104
Fixes some handlers that were missing the necessary guard, making genconfig actually able to restart some containers.
Fixes unwanted restarts during copying of certificates. After conditional statements were removed from role handlers in #745164, copying certificates caused containers to restart, which is unwanted during the genconfig process. However, if we removed the handler notifiers from copying certificates, the container would never restart, since from #745164 onwards containers restart only if any of the files specified in config.json change. So this adds the certificate folder to the config.json file for containers. Certificates are copied to an intermediary location inside the container, from which the script kolla_copy_cacerts will install them in the system’s trust store.
Fixes the external Ceph Cinder keyring not being imported into libvirt if templated. Previously, ansible/roles/nova-cell/tasks/external_ceph.yml looked up cinder_cephx_raw_key as a file from cinder_cephx_keyring_file.stat.path. To allow templated Cinder keyrings, the lookup is changed to “template”. LP#2089229
Fixes a bug where the etcd3gw backend_url in cinder.conf would be invalid when openstack_cacert was set. LP#2085908
Fixes cyborg deployment, which was missing variables in order to configure the haproxy listener. LP#2020088
Fixes a bug where fluentd (and the rest of the “common” services) were restarted if kolla-ansible is called from a different location. LP#2091703
Reduce the size of the fluentd buffers to avoid getting HTTP 413 errors when sending logs to opensearch/elasticsearch. The values chosen were based on what seemed a sensible size. These can be customised by editing the fluentd_bulk_message_request_threshold and fluentd_buffer_chunk_limit_size variables. LP#2079988
The ironic-inspector service user is now assigned the system scope all. This allows it to create baremetal ports during node inspection again. LP#2064655
Fixes internal endpoint for the heat-cfn (CloudFormation) service. LP#2087537
Removes the Nova configuration option [api] use_forwarded_for. This option was deleted from Nova in the 2024.1 release.
Adds a check to stop deploying/upgrading the RabbitMQ containers if it will result in downgrading the version of RabbitMQ running.
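The idea behind such a downgrade check can be sketched in Python. This is a minimal illustration, assuming purely numeric dotted versions with the same number of components; the function names and version strings are hypothetical, and the actual check in Kolla-Ansible may be implemented differently.

```python
def parse_version(version: str) -> tuple:
    """Turn a dotted numeric version such as '3.13.7' into (3, 13, 7)."""
    return tuple(int(part) for part in version.split("."))


def would_downgrade(running: str, target: str) -> bool:
    """Return True when the target version is older than the running one."""
    return parse_version(target) < parse_version(running)


# Example version strings are assumptions chosen for illustration.
print(would_downgrade("3.13.7", "3.12.14"))  # True
print(would_downgrade("3.12.14", "3.13.7"))  # False
```

Comparing integer tuples avoids the pitfall of string comparison, where "3.9" would sort after "3.13".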
Fixes a bug where the RabbitMQ version check would fail to pull the new image due to lack of auth. LP#2086171
Other Notes
Services and endpoints can now be removed by setting the option state: absent.
The www_authenticate_uri parameter, which is used to indicate to clients where they should get a token from in order to authenticate against a service, is switched from the internal identity endpoint to the public endpoint, see also this note.
19.0.0.0rc1
Upgrade Notes
Rewrite kolla-ansible CLI in Python. Moving the CLI to Python allows for easier maintenance and a larger feature set. The CLI was built using the cliff package that is used in the openstack and kayobe commands.
This patch introduces a few breaking changes stemming from the nature of the cliff package:
the order of parameters must be kolla-ansible <action> <arguments>
mariadb_backup and mariadb_recovery are now mariadb-backup and mariadb-recovery
The --key parameter has also been dropped as it was duplicating --vault-password-file.
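Scripts that still call the old action names could translate them with a small mapping. The sketch below is an illustration only: the helper modern_action is hypothetical, and the mapping contains just the two renames listed in this note.

```python
# Old action name -> new action name, as listed in the upgrade note.
RENAMED_ACTIONS = {
    "mariadb_backup": "mariadb-backup",
    "mariadb_recovery": "mariadb-recovery",
}


def modern_action(action: str) -> str:
    """Return the new name for a renamed action, or the action unchanged."""
    return RENAMED_ACTIONS.get(action, action)


print(modern_action("mariadb_backup"))  # mariadb-backup
```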