Victoria Series (15.1.0 - 16.0.x) Release Notes¶
16.0.5-20¶
Upgrade Notes¶
When upgrading Ironic to address the qemu-img image conversion security issues, the ironic-python-agent ramdisks will also need to be upgraded.
As a result of security fixes to address qemu-img image conversion security issues, a new configuration parameter has been added to Ironic, [conductor]permitted_image_formats, with a default value of “raw,qcow2,iso”. Raw and qcow2 format disk images are the image formats the Ironic community has consistently stated as what is supported and expected for use with Ironic. These formats also match the formats which the Ironic community tests. Operators who leverage other disk image formats may need to modify this setting further.
On the Victoria release, to use a certificate file for HTTPS connections, the iRMC driver requires python-scciclient version >=0.8.2,<0.9.0 or >=0.9.5,<0.10.0, and packaging >=16.5.
Security Issues¶
Ironic now checks the supplied image format value against the detected format of the image file, and will prevent deployments should the values mismatch. If being used with Glance and a mismatch in metadata is identified, it will require images to be re-uploaded with a new image ID to represent corrected metadata. This is the result of CVE-2024-44082 tracked as bug 2071740.
Ironic always inspects the supplied user image content for safety prior to deployment of a node should the image pass through the conductor, even if the image is supplied in raw format. This is utilized to identify the format of the image and the overall safety of the image, such that source images with unknown or unsafe feature usage are explicitly rejected. This can be disabled by setting [conductor]disable_deep_image_inspection to True. This is the result of CVE-2024-44082 tracked as bug 2071740.
Ironic also inspects images which would normally be provided as a URL for direct download by the ironic-python-agent ramdisk. This is enabled by default and increases the overall network traffic and disk space utilization of the conductor. This level of inspection can be disabled by setting [conductor]conductor_always_validates_images to False. Doing so is not advisable as Zed release and earlier ironic-python-agent ramdisks will not be made available due to backport regression risk. This is the result of CVE-2024-44082 tracked as bug 2071740.
Ironic now explicitly enforces a list of permitted image types for deployment via the [conductor]permitted_image_formats setting, which defaults to “raw”, “qcow2”, and “iso”. While the project has classically always declared permissible images as “qcow2” and “raw”, it was previously possible to supply other image formats known to qemu-img, and the utility would attempt to convert the images. The “iso” support is required for “boot from ISO” ramdisk support.
Ironic now explicitly passes the source input format to executions of qemu-img to limit the permitted qemu disk image drivers which may evaluate an image to prevent any mismatched format attacks against qemu-img.
The ansible deploy interface example playbooks now supply an input format to execution of qemu-img. If you are using customized playbooks, please add “-f {{ ironic.image.disk_format }}” to your invocations of qemu-img. If you do not do so, qemu-img will automatically try to guess the format, which can lead to known security issues with the incorrect source format driver.
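As a rough illustration of the format handling described in the notes above (an allow-list of formats, a declared-versus-detected format comparison, and an explicit input format passed to qemu-img), the following Python sketch builds a conversion command. The function name, the PERMITTED_FORMATS constant, and the file paths are hypothetical and are not Ironic's implementation; only the qemu-img convert -f and -O options are taken from the real tool.

```python
# Hypothetical allow-list mirroring the documented default of
# [conductor]permitted_image_formats; it is not read from a real config file.
PERMITTED_FORMATS = ("raw", "qcow2", "iso")


def build_convert_command(source, dest, declared_format, detected_format):
    """Return an argv list for converting an image to raw.

    Illustrative sketch only: the image is rejected when the declared and
    detected formats differ, or when the format is not on the allow-list.
    """
    if declared_format != detected_format:
        raise ValueError(
            "declared format %r does not match detected format %r"
            % (declared_format, detected_format))
    if detected_format not in PERMITTED_FORMATS:
        raise ValueError("image format %r is not permitted" % detected_format)
    # "-f" pins the input format so that qemu-img does not guess it.
    return ["qemu-img", "convert", "-f", detected_format, "-O", "raw",
            source, dest]
```

The returned list can be handed to subprocess.run() by a caller; keeping command construction separate from execution makes the explicit -f argument easy to check in downstream code.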
Operators who have implemented any custom deployment drivers or additional functionality like machine snapshot, should review their downstream code to ensure they are properly invoking qemu-img. If there are any questions or concerns, please reach out to the Ironic project developers.
Operators are reminded that they should utilize cleaning in their environments. Disabling any security features such as cleaning or image inspection is at your own risk. Should you have any issues with security related features, please don’t hesitate to open a bug with the project.
The [conductor]disable_deep_image_inspection setting is conveyed to the ironic-python-agent ramdisks automatically, and will prevent those operating ramdisks from performing deep inspection of images before they are written.
The [conductor]permitted_image_formats setting is conveyed to the ironic-python-agent ramdisks automatically. Should a need arise to explicitly permit an additional format, that should take place in the Ironic service configuration.
An issue in Ironic has been resolved where image checksums would not be checked prior to the conversion of an image to a raw format image from another image format.
With default settings, this normally would not take place; however, the image_download_source option can be set to local at a node level for a single deployment, by default for that baremetal node in all cases, or via the [agent]image_download_source configuration option. By default, this setting is http.
This was in concert with the [DEFAULT]force_raw_images option when set to True, which caused Ironic to download and convert the file.
In a fully integrated context of Ironic’s use in a larger OpenStack deployment, where images are coming from the Glance image service, the previous pattern was not problematic. The overall issue was introduced as a result of the capability to supply, cache, and convert a disk image provided as a URL by an authenticated user.
Ironic will now validate the user supplied checksum prior to image conversion on the conductor. This can be disabled using the [conductor]disable_file_checksum configuration option.
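The ordering described above (verify the checksum first, convert afterwards) can be sketched in Python as follows. This is a minimal illustration using the standard hashlib module; the function name and its parameters are hypothetical and do not reflect Ironic's internal code.

```python
import hashlib


def verify_checksum(path, expected, algorithm="sha256",
                    chunk_size=1024 * 1024):
    """Raise ValueError unless the file at ``path`` matches ``expected``.

    Hypothetical helper: a caller would run this on the downloaded image
    before any format conversion is attempted.
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as image:
        # Read in chunks so that large images are not loaded into memory.
        for chunk in iter(lambda: image.read(chunk_size), b""):
            digest.update(chunk)
    actual = digest.hexdigest()
    if actual != expected.lower():
        raise ValueError(
            "checksum mismatch: expected %s, got %s" % (expected, actual))
    return actual
```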
Modifies the irmc hardware type to include a capability to control enforcement of HTTPS certificate verification. By default this is enforced. The python-scciclient version must be one of >=0.8.2,<0.9.0 or >=0.9.5,<0.10.0, or certificate verification will not occur.
Bug Fixes¶
Fixes multiple issues in the handling of images as it relates to the execution of the qemu-img utility, which is used for image format conversion, where a malicious user could craft a disk image to potentially extract information from an ironic-conductor process’s operating environment.
Ironic now explicitly enforces a list of approved image formats as a [conductor]permitted_image_formats list, which mirrors the image formats the Ironic project has historically tested and expressed as known working. Testing is not based upon file extension, but upon content fingerprinting of the disk image files. This is tracked as CVE-2024-44082 via bug 2071740.
Fixes a security issue where Ironic would fail to checksum disk image files it downloads when Ironic had been requested to download and convert the image to a raw image format. This required the image_download_source to be explicitly set to local, which is not the default.
This fix can be disabled by setting [conductor]disable_file_checksum to True, however this option will be removed in new major Ironic releases.
As a result of this, parity has been introduced to align Ironic to Ironic-Python-Agent’s support for checksums used by standalone users of Ironic. This includes support for remote checksum files to be supplied by URL, in order to prevent breaking existing users which may have inadvertently been leveraging the prior code path. This support can be disabled by setting [conductor]disable_support_for_checksum_files to True.
Fixes Ironic integration with Cinder because of changes which resulted as part of the recent Security related fix in bug 2004555. The work in Ironic to track this fix was logged in bug 2019892. Ironic now sends a service token to Cinder, which allows for access restrictions added as part of the original CVE-2023-2088 fix to be appropriately bypassed. Ironic was not vulnerable, but the restrictions added as a result did impact Ironic’s usage. This is because Ironic volume attachments are not on a shared “compute node”, but instead mapped to the physical machines and Ironic handles the attachment life-cycle after initial attachment.
Fixes rebooting into the agent after changing BIOS settings in fast-track mode with the redfish-virtual-media boot interface. Previously, the ISO would not be configured.
Adds the driver_info/irmc_verify_ca option to specify a certificate file. The default value of driver_info/irmc_verify_ca is True.
16.0.5¶
Bug Fixes¶
Fixes connection caching issues with Redfish BMCs where AccessErrors were previously not disqualifying the cached connection from being re-used. Ironic will now explicitly open a new connection instead of using the previous connection in the cache. Under normal circumstances, the sushy redfish library would detect and refresh sessions, however a prior case exists where it may not detect a failure and contain cached session credential data which is ultimately invalid, blocking future access to the BMC via Redfish until the cache entry expired or the ironic-conductor service was restarted. For more information please see story 2009719.
16.0.4¶
Security Issues¶
Fixes an issue with the /v1/nodes/detail endpoint where an authenticated user could explicitly ask for an instance_uuid lookup and the associated node would be returned to the user with sensitive fields redacted in the result payload if the user did not explicitly have owner or lessee permissions over the node. This is considered a low-impact low-risk issue as it requires the API consumer to already know the UUID value of the associated instance, and the returned information is mainly metadata in nature. More information can be found in Storyboard story 2008976.
Bug Fixes¶
If the agent accepts a command but is unable to reply to Ironic (which sporadically happens because of eventlet’s TLS implementation), Ironic previously retried the request and failed because the command was already executing. Ironic now detects this situation by checking the list of executing commands after receiving a connection error. If the requested command is the last one, it assumes that the command request succeeded.
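The detection logic in the note above can be sketched as follows. This is a simplified illustration, not the conductor's code: the AgentUnreachable exception and the client object with send() and list_commands() methods are hypothetical stand-ins.

```python
class AgentUnreachable(Exception):
    """Hypothetical stand-in for a connection error while awaiting a reply."""


def execute_command(client, name, params):
    """Send a command; if the reply is lost, check whether it is running.

    ``client`` is a hypothetical object offering ``send(name, params)`` and
    ``list_commands()``, the latter returning dicts ordered oldest first,
    each with a ``command_name`` key.
    """
    try:
        return client.send(name, params)
    except AgentUnreachable:
        commands = client.list_commands()
        if commands and commands[-1]["command_name"] == name:
            # The agent accepted the request; only the reply was lost.
            return commands[-1]
        # The command never started, so the error is genuine.
        raise
```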
Fixes fast-track to prevent marking the agent as alive if trying to rebuild a node before the fast-track timeout has expired.
Fixes potential cache coherency issues by caching the AgentClient per task, rather than globally.
Fixes the [deploy]configdrive_use_object_store option that was broken during the Python 3 transition.
Fixes an issue with the /v1/nodes/detail endpoint where requests for an explicit instance_uuid match would not follow the standard query handling path and thus not be filtered based on policy determined access level and node level owner or lessee fields appropriately. Additional information can be found in story 2008976.
Fixes recognition of a busy agent to also handle recognition during deployment steps by more uniformly detecting and identifying when the ironic-python-agent service is busy.
Fixes a problem with the grub2 configuration file name. Some newer versions of grub2 (e.g. 2.05 or 2.06-rc1) use grub.cfg-01-MAC, while older versions of grub2 (e.g. 2.04) use MAC.conf, so Ironic now generates both paths in order to be compatible with both.
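A minimal sketch of generating both file names for one port is shown below. The function name and the TFTP root are hypothetical, and the exact MAC formatting (lowercase, dash-separated in the grub.cfg-01- form, colon-separated in the .conf form) is an assumption made for illustration rather than something this note specifies.

```python
import posixpath


def grub_config_paths(tftp_root, mac):
    """Return both per-MAC grub2 config paths for one port address."""
    mac = mac.lower()
    return [
        # Name looked up by newer grub2 versions: grub.cfg-01-<MAC>
        # (dash-separated form assumed here).
        posixpath.join(tftp_root, "grub.cfg-01-" + mac.replace(":", "-")),
        # Name looked up by older grub2 versions: <MAC>.conf
        posixpath.join(tftp_root, mac + ".conf"),
    ]
```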
Fixes idrac-wsman management interface set_boot_device method that would fail deployment when there are existing jobs present with error “Failed to change power state to ‘’power on’’ by ‘’rebooting’’. Error: DRAC operation failed. Reason: Unfinished config jobs found: <list of existing jobs>. Make sure they are completed before retrying.”. Now there can be non-BIOS jobs present during deployment. This will still fail for cases when there are BIOS jobs present. In such cases, consider moving to idrac-redfish, which does not have this limitation when setting the boot device.
Fixed an issue where provisioning/cleaning would fail on IPv6 routed provider networks. See bug: 2009773.
Fixes idrac-wsman BIOS apply_configuration and factory_reset clean and deploy steps to fail correctly in case of error when checking completed jobs. Before the fix, when a BIOS job failed, node cleaning or deployment failed with a timeout instead of the actual error in the clean or deploy step.
Fixes redfish firmware update for ilo5 based hardware by checking whether sushy_task.messages is present, since on iLO the task data does not contain a messages attribute. Also, prepare_ramdisk() was not being called before rebooting the system to update the firmware; this has been fixed as well.
Fixes idrac-wsman power interface to wait for the hardware to reach the target state before returning. For systems where soft power off at the end of deployment to boot to instance failed and forced hard power off was used, this left the node successfully deployed in the off state without any errors. This broke other workflows expecting the node to be on and booted into the OS at the end of deployment. Additional information can be found in story 2009204.
Correctly wipes agent token on inspection start and abort.
Calculating the ipmitool -N and -R arguments from ironic.conf [ipmi] command_retry_timeout and min_command_interval now takes into account the 1 second interval increment that ipmitool adds on each retry event.
Failure-path ipmitool run duration will now be just less than command_retry_timeout instead of much longer.
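The arithmetic behind the two notes above can be sketched as follows. This is an assumed model, not the formula Ironic uses: it supposes that the wait before each retry starts at min_command_interval and grows by one second per retry, and it picks the largest retry count whose total wait still fits within command_retry_timeout. The function name is hypothetical.

```python
def ipmitool_timing_args(command_retry_timeout, min_command_interval):
    """Return (retries for -R, interval for -N) under the assumed model."""
    interval = max(1, min_command_interval)
    tries = 0
    elapsed = 0
    wait = interval
    # Each attempt is assumed to wait one second longer than the last.
    while elapsed + wait <= command_retry_timeout:
        elapsed += wait
        wait += 1
        tries += 1
    # Always allow at least one attempt.
    return max(1, tries), interval
```

For example, with a 60 second timeout and a 5 second interval, the waits 5 + 6 + 7 + 8 + 9 + 10 + 11 total 56 seconds, so seven tries fit, whereas a naive 60 / 5 = 12 tries would run for well over the timeout once the increments are counted.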
Adds handling of Redfish BMCs which lack a BootSourceOverrideMode flag, such that it is no longer a fatal error for a deployment if the BMC does not support this field. This is most common on BMCs which feature only a partial implementation of the ComputerSystem resource boot, but may also be observable on some older generations of BMCs which received updates to have partial Redfish support.
The redfish-virtual-media boot interface no longer passes validation for Dell nodes. The idrac-redfish-virtual-media boot interface must be used for these nodes instead.
The fix for story 2008252 synced the boot mode after changing the boot device because Supermicro nodes reset the boot mode if not included in the boot device set. However this can cause a problem on Dell nodes when changing the mode uefi->bios or bios->uefi, see story 2008712 for details. Restrict the syncing of the boot mode to Supermicro.
Retries virtual media insert on failure to allow for an eject that may not have finished. https://storyboard.openstack.org/#!/story/2008504
Fixes a bug where a conductor could fail to complete a deployment if there was contention on a shared lock. This would manifest as an instance being stuck in the “deploying” state, though the node had in fact started or even completed its final boot.
When Ironic configures the BootSourceOverrideTarget setting via Redfish, on Supermicro BMCs it must always configure BootSourceOverrideEnabled or that will revert to default (Once) on the BMC, see story 2008547 for details. This is different than what is currently implemented for other BMCs in which the BootSourceOverrideEnabled is not configured if it matches the current setting (see story 2007355).
This requires that node.properties[‘vendor’] be ‘supermicro’ which will be set by Ironic from the Redfish system response or can be set manually.
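The vendor-specific behavior described above can be sketched as a small payload builder. BootSourceOverrideTarget and BootSourceOverrideEnabled are the Redfish property names mentioned in the note; the function name, its arguments, and the way the vendor string is passed in are hypothetical and only illustrate the decision.

```python
def boot_override_payload(vendor, target, enabled, current_enabled):
    """Build a Redfish ``Boot`` patch body for a boot device change.

    Sketch only: BootSourceOverrideEnabled is always sent for Supermicro,
    and otherwise only when it differs from the BMC's current setting.
    """
    boot = {"BootSourceOverrideTarget": target}
    if vendor == "supermicro" or enabled != current_enabled:
        boot["BootSourceOverrideEnabled"] = enabled
    return {"Boot": boot}
```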
Introduces lazy-loading of ports, portgroups, volume connections and volume targets in task manager to fix performance issues. For periodic tasks which create a task manager object but don’t require the aforementioned data (e.g. power sync), this change should reduce the number of database interactions by around two thirds, speeding up overall execution.
Fixes an issue of powering off with the idrac-wsman management interface while the execution of a clear job queue cleaning step is proceeding. Prior to this fix, the clean step would fail when powering off a node.
16.0.3¶
Upgrade Notes¶
An automated detection of an IPMI BMC hardware vendor has been added to appropriately handle IPMI BMC variations. Ironic will now query this and save this value if not already set in order to avoid querying for every single operation. Operators upgrading should expect an elongated first power state synchronization for nodes with the ipmi hardware type.
Bug Fixes¶
Fixes idrac-wsman RAID create_configuration clean step, apply_configuration deploy step and delete_configuration clean and deploy step to fail correctly in case of error when checking completed jobs. Before the fix, when a RAID job failed, node cleaning or deployment failed with a timeout instead of the actual error in the clean or deploy step.
Fixes issues when UEFI boot mode has been requested with persistent boot to DISK where some versions of ipmitool do not properly handle multiple options being set at the same time. While some of this logic was addressed in upstream ipmitool development, new versions are not released and vendors maintain downstream forks of the ipmitool utility. When considering vendor specific selector differences along with the current stance of new versions from the upstream ipmitool community, it only made sense to handle this logic within Ironic. In part this was because if already set the selector value would not be updated. Now Ironic always transmits the selector value for UEFI.
Fixes handling of Supermicro UEFI supporting BMCs with the ipmi hardware type such that an appropriate boot device selector value is sent to the remote BMC to indicate boot from local storage. This is available for both persistent and one-time boot applications. For more information, please consult story 2008241.
Fixes handling of the ipmi hardware type where UEFI boot mode and “one-time” boot to PXE has been requested. As Ironic now specifically transmits the raw commands, this setting should be properly applied where previously PXE boot operations may have occurred in Legacy BIOS mode.
Fixes cleaning with the ramdisk deploy interface by reusing the same procedure as for the direct deploy interface.
Boot mode is now correctly handled when using redfish-virtual-media boot with locally booted images.
Failed cleaning no longer results in maintenance mode if no clean step is running, e.g. on PXE timeout or failed clean steps validation.
Fixes permission issues when injecting network data into a virtual media.
Other Notes¶
Adds a detect_vendor management interface method to the ipmi hardware type. This method is being promoted as a higher level interface as the fundamental need to be able to have logic aware of the hardware vendor is necessary with vendor agnostic drivers where slight differences require slightly different behavior.
The configdrive argument to some utils in ironic.common.images and ironic.drivers.modules.image_utils has been replaced with a new inject_files argument. The previous approach did not really work in all situations and we don’t expect 3rd party drivers to use it.
16.0.2¶
Known Issues¶
When redfish-virtual-media is used, fast-track mode will not work as expected: nodes will be rebooted between operations.
Upgrade Notes¶
The default value of [api]api_workers is now limited to 4. Set it explicitly if you need a higher value.
Bug Fixes¶
No longer launches too many API workers on systems with a lot of CPU cores by default.
Correctly handles the node’s custom network data when the noop network interface is used. Previously it was ignored.
Fixes incorrect injected network data location when using virtual media.
Fixes redfish BIOS apply_configuration clean and deploy step to fail correctly in case of error when checking if BIOS updates are successfully applied. Before the fix, when BIOS updates were unsuccessful, node cleaning or deployment failed with a timeout instead of the actual error in the clean or deploy step.
When configured to use json-rpc, the [DEFAULT].host configuration option to ironic-conductor can now be set to an IPv6 address. Previously it could only be an IPv4 address or a DNS name.
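One reason IPv6 hosts need special handling is that an IPv6 literal must be wrapped in square brackets when placed in a URL. The sketch below shows this with the standard ipaddress module; the function name and the port number are hypothetical and are not taken from Ironic's code.

```python
import ipaddress


def json_rpc_url(host, port, use_ssl=False):
    """Build a URL for a conductor host that may be an IPv6 literal."""
    try:
        if ipaddress.ip_address(host).version == 6:
            # IPv6 literals must be bracketed inside a URL authority.
            host = "[%s]" % host
    except ValueError:
        # Not an IP literal, so treat it as a DNS name.
        pass
    scheme = "https" if use_ssl else "http"
    return "%s://%s:%d" % (scheme, host, port)
```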
No longer tries to pass BOOTIF=None as a kernel parameter when using virtual media. This could break inspection.
Fixes an issue where, when a port group without a MAC address set is attached to an instance, the resulting bond port cannot get an IP address due to an inconsistent MAC address between the tenant port and the one initially allocated in the config drive.
After changing the boot device via Redfish, check that the boot mode being reported matches what is configured and, if not, set it to the configured value. Some BMCs change the boot mode when the device is set via Redfish, see story 2008252 for details.
The virtual media ISO image building process now respects the default_boot_mode configuration option.
Fixes timeout in fast-track mode with redfish-virtual-media when running one operation after another (e.g. cleaning after inspection).
16.0.1¶
Bug Fixes¶
Fix an issue when using idrac with vmedia and trying to inspect a node.
Fixes wiping agent token on rebooting via API.
16.0.0¶
Prelude¶
The Ironic team is proud to announce the release of Ironic 16.0.
For over six years, the contributors to this project have continued to drive forth and provide what we collectively feel is the best platform for managing and deploying bare metal hardware.
The innovation, the drive, and the pursuit of improving infrastructure operators’ lives has yet to cease, and has no signs of stopping anytime soon.
As with any release, we have some things we are particularly proud of:
Support for TLS encryption of Agent communications.
Support for in-band deployment steps enabling software RAID to be configured at deployment time.
Ramdisk/Virtual Media pass-through of ISO images.
BMC-less agent power control, so BMCs are not required for deployments.
Network configuration injection with virtual media based ramdisks.
Integrated basic authentication for standalone Ironic operators.
And with any major release, a number of bugs have been fixed. Cross-vendor features see increased parity. Every contributor has something to be proud of in this release. And with that, we hope you enjoy it!
New Features¶
Adds ilo-uefi-https boot interface to ilo5 hardware type. This boot interface leverages the iLO UEFI firmware capability to boot from given HTTPS URLs hosted securely over HTTPS webserver with standard/custom certificates.
Adds functionality to the ilo and ilo5 hardware types by enabling virtual media boot without user-built deploy/rescue/boot ISO images. Instead, ironic will build necessary images out of common kernel/ramdisk pair (though user needs to provide ESP image). User provided deploy/rescue/boot ISO images are also supported.
Adds support of DHCP less deploy to ilo and ilo5 hardware types. Using the network_data property on the node field, operators can now apply network configuration to be embedded in iLO based Virtual Media based deployment ramdisks which include networking configuration enabling the deployment to operate without the use of DHCP.
Adds an ability to accept a custom TLS certificate in the heartbeat API.
Adds a configuration option webserver_verify_ca to support custom certificates to validate URLs hosted on a HTTPS webserver.
Using the network_data property on the node field, operators can now apply network configuration to be embedded in Redfish based Virtual Media based deployment ramdisks which include networking configuration enabling the deployment to operate without the use of DHCP. See Redfish driver documentation for more information.
file:// images are now supported in the direct deploy interface.
Adds a new possible value for image_download_source: local. When used, even http:// images are downloaded, converted to RAW if needed and served from the conductor’s HTTP server. This feature targets primarily nodes with low RAM.
Adds support in idrac-wsman inspect hardware interface for reporting number of GPU devices connected to a system. This information is advertised through capability pci_gpu_devices, which can be used to make scheduling decisions for the node. Currently, NVIDIA Tesla T4 GPU devices are reported.
Adds support for managing BIOS settings via the Redfish out-of-band (OOB) management protocol to the idrac hardware type. The new hardware BIOS interface implementation which offers it is named idrac-redfish.
The idrac hardware type declares support for that new interface implementation, in addition to all BIOS interface implementations it has been supporting. The highest priority BIOS interface remains the same, the one which relies on the Web Services Management (WS-Man) OOB management protocol. The new idrac-redfish immediately follows it. It now supports the following BIOS interface implementations, listed in priority order from highest to lowest: idrac-wsman, idrac-redfish, and no-bios.
For more information, see story 2008100.
Adds a new configuration option [ilo]verify_ca and a new driver_info parameter ilo_verify_ca to enhance certificate verification for hardware type ilo and ilo5 which can take directory and boolean values apart from file.
Adds functionality to perform out-of-band one button secure erase operation for iLO5 based HPE Proliant servers as a management clean step one_button_secure_erase for ilo5 hardware type.
The image_download_source configuration option can now also be set per node in the instance_info or driver_info (the former having the highest priority).
Allows configuring IPMI cipher suite via the new driver_info parameter ipmi_cipher_suite.
Adds driver_internal_info field to the node-related notification baremetal.node.provision_set.*, new payload version 1.16.
Adds support for performing firmware updates using the redfish and idrac hardware types.
A new firmware update cleaning step has been added to the redfish hardware type. The idrac hardware type also automatically gains this capability through inheritance.
A new configuration option [agent]require_tls allows rejecting ramdisk callback URLs that don’t use the https:// scheme.
Supports the Fujitsu irmc hardware type again. The Third Party CI for the driver has started to work correctly in September 2020.
Upgrade Notes¶
The one_button_secure_erase clean step in the ilo5 hardware type requires proliantutils version 2.10.0. Please upgrade this library to leverage this feature.
The default value of the configuration option [agent]image_download_source has been changed to http to simplify transition from the iscsi deploy interface. Set it to swift explicitly to maintain the previous behavior.
The deprecated iscsi deploy interface is no longer enabled by default; set enabled_deploy_interfaces to override. It is also no longer the first in the list of deploy interface priorities, so it has to be requested explicitly if the direct deploy is also enabled.
Since the direct deploy interface is now used by default, you need to configure [deploy]http_url and [deploy]http_root to point at a local HTTP server or configure access to Swift.
Support for token-less agents has been removed as the token-less agent support was deprecated in the Ussuri development cycle. The ironic-python-agent must be updated to 6.1.0 or higher to support communicating with the Ironic deployment after upgrade. This will generally require deployment, cleaning, and rescue kernels and ramdisks to be updated. If this is not done, actions such as cleaning and deployment will time out as the agent will be unable to record heartbeats with Ironic. For more information, please see the agent token documentation.
The redfish-virtual-media boot interface is now the last in the list of priorities from the redfish hardware type. This means that new nodes will be created with ipxe or pxe boot if they are enabled. The reason for this change is limited support for pure Redfish virtual media from hardware vendors.
To use virtual media with Redfish, please provide an explicit boot_interface parameter when creating nodes. If you enable only the redfish hardware type, you can also set the default_boot_interface configuration option to redfish-virtual-media.
Deprecation Notes¶
The [ilo]ca_file configuration option is deprecated for removal, please use [ilo]verify_ca instead which can take directory and boolean values apart from file for certificate verification.
The iscsi deploy interface is now deprecated; the direct or ansible deploy interface should be used instead. We expect the complete removal of the iscsi deploy code to happen in the “X” release.
With the switch from neutronclient to openstacksdk the [neutron]/retries option has been deprecated, use [neutron]/status_code_retries and [neutron]/status_code_retry_delay instead.
Security Issues¶
Ramdisks supporting agent token are now globally required by Ironic. As this is a core security mechanism, it cannot be disabled and support for the [DEFAULT]require_agent_token configuration parameter has been removed as tokens are now always required by Ironic. For more information, please see the agent token documentation.
Bug Fixes¶
Fixes compatibility with some hardware that requires the file name of any virtual media to end with the suffix “.iso” when Ironic generates a virtual media image. We recommend operators generating their own virtual media files to name the files with proper extensions.
Fixes the deployment failure with Ussuri (and older) ramdisks that happens when another IPA command runs after prepare_image.
Fixes an issue with the ansible deployment interface where automatic root device selection would accidentally choose the system CD-ROM device, which was likely to occur when the ansible deployment interface was used with virtual media boot. The ansible deployment interface now ignores all Ramdisks, Loopbacks, CD-ROMs, and floppy disk devices.
Fixes an issue that caused in-band deploy steps inserted before write_image to be skipped when fast-track is used.
Fixes an issue where in-band deploy and clean steps were being cached across reboots of the agent.
Fixed iRMC inspection for getting MAC address.
Fixes an issue with agent token handling where the agent has not been upgraded resulting in an AgentAPIError, when the token is not required. The conductor now retries without sending an agent token.
Fixes a potential race in the hash ring code that could result in the hash rings never being updated after their initial load.
Fixes the deprecated idrac hardware interface implementation __init__ methods to call their base class __init__ methods before emitting a log message warning about their deprecation. For more information, see story 2008197.
Fixes an issue where agent heartbeats would be queued if a pre-existing lock was being held for the node which performed a heartbeat operation. The agent heartbeat implementation will no longer retry attempts to acquire an exclusive lock.
Prevents a take over from happening in the middle of a deploy step processing. This could happen if the RPC call continue_node_deploy is routed to a different conductor.
Fixes wiping the agent secret token on manual power off or reboot. Also makes sure to remove the agent URL since it may potentially change.
Fixes HTTP 500 when trying to unset the protected attribute via the CLI.
Fixes cleaning and managed inspection not respecting the default_boot_mode configuration option.
Fixes cleaning and managed inspection not following the standard boot mode handling logic, particularly, not trying to assert the requested boot mode if the driver allows it.
Fixes redfish BIOS interface apply_configuration cleaning/deploy step to work with Redfish Services that must be supplied the Distributed Management Task Force (DMTF) Redfish standard @Redfish.SettingsApplyTime annotation [1] to specify when to apply the requested settings, such as the Dell EMC integrated Dell Remote Access Controller (iDRAC).
For more information, see story 2008163.
[1] http://redfish.dmtf.org/schemas/DSP0266_1.11.0.html#settings-resource
No longer silently ignores exceptions that happen when trying to run the next clean or deploy step.
Other Notes¶
The ironic conductor internal logic has been updated to return an error if no agent version has been submitted during a heartbeat. This is because versions have been transmitted by the agents for quite some time and support for the default use of agent token forces all agents to be updated. As such, redundant code has been removed and tests updated accordingly.
Communication with neutron is now using openstacksdk, removing the dependency on neutronclient.
15.2.0¶
New Features¶
Adds inband deploy step flash_firmware_sum to the management interface of the ilo and ilo5 hardware types. The required minimum version for the proliantutils library is 2.9.5.
Adds functionality to the
ipxeboot interface to support use of aninstance_info\boot_isovalue with theramdiskdeployment interface.
Adds functionality to allow a user to supply a node
instance_info/boot_isoparameter on machines utilizing theredfish-virtual-mediaboot interface. When combined with theramdiskdeployment interface, this allows an instance to boot into a user supplied ISO image.
The new experimental
agentpower interface allows limited provisioning operations on nodes without BMC credentials. See story 2007771 for details.
The
agentRAID interface now supports building RAID as a deploy stepapply_configuration.
Adds raid configuration validation to deploy step
apply_configurationofagentRAID interface. Also, a post deploy hook has been added to this deploy step to update root device hint.
Adds a new
driver_infoparameteragent_verify_caand a corresponding configuration option[agent]verify_cathat allow specifying a file with certificates to use when accessing IPA. Set toFalseto disable certificate validation.
The deploy deploy step of the direct deploy interface has been split into three deploy steps:
deploy itself (priority 100) boots the deploy ramdisk.
write_image (priority 80) downloads the user image from inside the ramdisk and writes it to the disk.
prepare_instance_boot (priority 60) prepares the boot device and writes the bootloader (if needed).
Priorities 81 to 99 can be used for in-band deploy steps that run before the image is written. Priorities 61 to 79 can be used for in-band deploy steps that modify the written image before the bootloader is installed.
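The priority ranges for the direct deploy interface can be illustrated with a short Python sketch. It assumes that deploy steps run from highest to lowest priority; the two custom step names (configure_firmware, inject_files) are hypothetical and exist only for illustration, and the helper is not part of ironic.

```python
# Built-in steps of the direct deploy interface, with priorities from the note.
BUILT_IN_STEPS = {
    "deploy": 100,
    "write_image": 80,
    "prepare_instance_boot": 60,
}

# Hypothetical in-band steps, placed in the ranges described in the note.
CUSTOM_STEPS = {
    "configure_firmware": 90,  # 81 to 99: runs before the image is written
    "inject_files": 70,        # 61 to 79: modifies the written image
}


def execution_order(*step_maps):
    """Return step names sorted from highest to lowest priority."""
    merged = {}
    for steps in step_maps:
        merged.update(steps)
    return sorted(merged, key=merged.get, reverse=True)


order = execution_order(BUILT_IN_STEPS, CUSTOM_STEPS)
print(order)
```

A custom step with a priority between two built-in steps is therefore scheduled between them, which is why the ranges are defined relative to 100, 80 and 60.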
Provides a new option [DEFAULT]hash_ring_algorithm that specifies which cryptographic algorithm to use when building the hash ring. Set to something other than md5 when using ironic on a system in FIPS mode.
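To see why the algorithm choice matters, the sketch below maps a key to a position using a configurable algorithm from Python's hashlib. It is an illustration only, not ironic's hash ring code; the function name and the sample key are hypothetical. On a FIPS-enabled system, requesting md5 from hashlib may raise ValueError, which is the situation this option is meant to avoid.

```python
import hashlib


def ring_position(key, algorithm="sha256"):
    """Map a key to an integer position using the named hash algorithm."""
    digest = hashlib.new(algorithm, key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


# The key is a placeholder; any stable string identifier would do.
position = ring_position("node-uuid-placeholder", algorithm="sha256")
print(position)
```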
Adds support for boot mode retrieval and setting with the ilo and ilo5 hardware types.
Adds support for running custom in-band deploy steps when provisioning. Step priorities from 41 to 59 can be used for steps that run after the image is written and the bootloader is installed.
Adds the capability for an operator to set a configuration setting which tells the ironic-python-agent it is okay to skip read-only block devices when performing an erase_devices cleaning operation. This requires ironic-python-agent version 6.0.0 or greater and can be set using the [deploy]erase_skip_read_only configuration option.
The deploy deploy step of the iscsi deploy interface has been split into three deploy steps:
deploy itself (priority 100) boots the deploy ramdisk.
write_image (priority 80) writes the image to the disk exposed via iSCSI.
prepare_instance_boot (priority 60) prepares the boot device and writes the bootloader (if needed).
Priorities 81 to 99 can be used for in-band deploy steps that run before the image is written. Priorities 61 to 79 can be used for in-band deploy steps that modify the written image before the bootloader is installed.
The deploy deploy step of the ansible deploy interface has been split into two deploy steps:
deploy itself (priority 100) boots the deploy ramdisk.
write_image (priority 80) writes the image to the disk and configures the bootloader.
Priorities 81 to 99 can be used for in-band deploy steps that run before the image is written.
Adds network_data property to the node, a dictionary that represents the node static network configuration. The Ironic API performs formal JSON validation of node network_data content against user-supplied JSON schema at driver validation step.
Allow port lists to be filtered by project. Doing so checks the specified project against the port’s node’s owner and lessee.
Deprecation Notes¶
Running the whole deployment process as a monolithic deploy.deploy deploy step is now deprecated. In a future release this step will only be used to prepare deployment and start the agent, and special handling will be removed. All third party deploy interfaces must be updated to provide real deploy steps instead and set the has_decomposed_deploy_steps attribute to True on the deploy interface level.
The configuration options [json_rpc]http_basic_username and [json_rpc]http_basic_password have been deprecated in favour of the more generic [json_rpc]username and [json_rpc]password.
Bug Fixes¶
Fixes RAID apply_configuration deploy step for idrac-wsman where deployment failed with TypeError. See story 2007963.
Fixes deployment hanging on an invalid in-band deploy step in a deploy template.
Allows deleting nodes with a broken driver unless they require stopping serial console.
Fixes updating driver fields for nodes with a broken driver. This is required to be able to set maintenance for such nodes.
Fixes json_rpc client connections always using HTTP even if use_ssl was set to True.
When Ironic is doing IPMI retries, the configured min_command_interval should be used instead of a default value of 1, which may be too short for some BMCs.
Fixes missing agent RAID compatibility for the ilo5 and idrac hardware types, which prevented software RAID from working with them.
Fixes an issue where ironic-conductor initialization could return a NodeNotLocked error for requests requiring locks when the conductor was starting. This was due to the conductor removing locks after it had begun accepting new work. The lock removal has been moved to after the database connectivity has been established but before the RPC bus is initialized.
Fixes the conductor so the power sync operations are not asserted for nodes in the adopt failed state.
Fixes the issue that port auto allocation for the socat console failed to correctly identify the availability of ports under IPv6 networks.
Removes stale agent tokens when rebooting nodes using API. This prevents lookup failures for nodes that get rebooted between fast-track operations.
Removes stale agent token on rescue and unrescue operations. Previously it would cause subsequent rescue operations to fail.
Fixes the preservation of potentially incorrect power state information when adoption process fails. Power state is now wiped as part of the failure handling process instead of being preserved.
Other Notes¶
The proliantutils library version 2.9.5 enables the ssacli based in-band deploy step apply_configuration of the agent RAID interface for ilo and ilo5 hardware types.
Support for iPXE booting an ISO medium will only work if the ramdisk loaded by the bootloader contains all artifacts required for the booting operating system to load. This is a limitation of iPXE and x86 systems architecture, as the memory allocated for the rest of the ISO disk image in memory is freed by the booting kernel.
As part of the agent deploy interfaces refactoring, breaking changes will be made to implementations of AgentDeploy and ISCSIDeploy. Third party deploy interfaces must be updated to inherit HeartbeatMixin, AgentBaseMixin or AgentDeployMixin from ironic.drivers.modules.agent_base instead since their API is considered more stable.
Starting in ironic-python-agent 6.0.0, metadata erasure of read-only devices is skipped by default.
A new method supports_power_sync has been added to PowerInterface. If it returns False, the conductor will not try to assert power state for the node, merely recording the returned state instead.
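The effect of supports_power_sync can be sketched as follows. This is a simplified stand-in, not ironic's conductor code: the classes below only imitate the real PowerInterface, the method signature taking a task argument is an assumption, and AgentLikePower and sync_power_state are hypothetical names.

```python
class PowerInterface:
    """Simplified stand-in for ironic's base power interface."""

    def supports_power_sync(self, task):
        # Assumed default: the interface can have its power state asserted.
        return True


class AgentLikePower(PowerInterface):
    """Hypothetical interface that cannot enforce a power state."""

    def supports_power_sync(self, task):
        return False


def sync_power_state(power, task, recorded, actual):
    """Decide what a power sync pass would do for one node.

    Returns a tuple of (action, new_recorded_state).
    """
    if actual == recorded:
        return ("none", recorded)
    if not power.supports_power_sync(task):
        # Do not try to change the node; just record what was observed.
        return ("record", actual)
    # Simplification: interfaces that support sync get the recorded
    # state asserted on the node.
    return ("assert", recorded)


print(sync_power_state(AgentLikePower(), None, "power on", "power off"))
```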
The base agent deploy interface code now correctly handles power interfaces that do not support the power on action but support reboot.
15.1.0¶
New Features¶
Adds RAID interface for the ibmc driver which includes delete_configuration and create_configuration steps.
Enable Basic HTTP authentication middleware.
Having noauth as the only option for standalone ironic causes constraints on how the API is exposed on the network. Having some kind of authentication layer behind a TLS deployment eases these constraints.
When the config option auth_strategy is set to http_basic then non-public API calls require a valid HTTP Basic authentication header to be set. The config option http_basic_auth_user_file defaults to /etc/ironic/htpasswd and points to a file which supports the Apache htpasswd syntax [1]. This file is read for every request, so no service restart is required when changes are made.
Like the noauth auth strategy, the http_basic auth strategy is intended for standalone deployments of ironic, and integration with other OpenStack services cannot depend on a service catalog.
The only password digest supported is bcrypt, and the bcrypt Python library is used for password checks since it supports $2y$ prefixed bcrypt passwords as generated by the Apache htpasswd utility.
To try HTTP basic authentication, the following can be done:
Set /etc/ironic/ironic.conf DEFAULT auth_strategy to http_basic
Populate the htpasswd file with entries, for example:
htpasswd -nbB myName myPassword >> /etc/ironic/htpassw
Make basic authenticated HTTP requests, for example:
curl --user myName:myPassword http://localhost:6385/v1/drivers
[1] https://httpd.apache.org/docs/current/misc/password_encryptions.html
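For clients that do not use curl, the same HTTP Basic header can be built by hand. The sketch below uses only the Python standard library and reuses the example credentials and URL from this note; the helper name basic_auth_request is hypothetical. It constructs the request without sending it, so it can be run without a live ironic API.

```python
import base64
import urllib.request


def basic_auth_request(url, username, password):
    """Build (but do not send) a request carrying an HTTP Basic header."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Basic {token}")
    return request


request = basic_auth_request(
    "http://localhost:6385/v1/drivers", "myName", "myPassword"
)
header = request.get_header("Authorization")
print(header)
# To actually send the request: urllib.request.urlopen(request)
```

Because the header is only base64-encoded, not encrypted, this strategy is best combined with TLS, as the note suggests.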
Adds a new [ipmi]use_ipmitool_retries option. When set to True and timing is supported by ipmitool, the number of retries and command interval will be passed to ipmitool so that ipmitool will do the retries. When set to False, ironic will do the retries. Default is True.
Adds an ability to generate network boot templates even for nodes that use local boot via the new [pxe]enable_netboot_fallback option. This is required to work around the situation where switching boot devices does not work reliably.
Adds the ability for Ironic to attach a node to a specific port or portgroup. This is accomplished by having the node vif_attach API accept a port_uuid or portgroup_uuid key within vif_info. If one is specified, then Ironic will attempt to attach to the specified port/portgroup. Specifying both returns an error.
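A client-side sketch of the vif_info rule described above follows. The helper build_vif_info is hypothetical, the placeholder identifiers are not real UUIDs, and the use of an id key for the VIF identifier is an assumption about the request body; only the port_uuid and portgroup_uuid keys and the "not both" rule come from this note.

```python
def build_vif_info(vif_id, port_uuid=None, portgroup_uuid=None):
    """Build a vif_info dict, optionally targeting one port or portgroup."""
    if port_uuid is not None and portgroup_uuid is not None:
        # Mirrors the documented rule: specifying both is an error.
        raise ValueError("Specify port_uuid or portgroup_uuid, not both")
    vif_info = {"id": vif_id}  # "id" key is an assumption
    if port_uuid is not None:
        vif_info["port_uuid"] = port_uuid
    if portgroup_uuid is not None:
        vif_info["portgroup_uuid"] = portgroup_uuid
    return vif_info


info = build_vif_info("vif-placeholder", port_uuid="port-placeholder")
print(info)
```

Checking the rule on the client side simply avoids a round trip; the API itself remains the authority on what is accepted.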
Known Issues¶
Some BMCs do not support the Channel Cipher Suites command that newer versions of ipmitool use. These versions of ipmitool will resend this command for each ipmitool retry, resulting in long response times. Setting [ipmi]use_ipmitool_retries to false will avoid this situation by implementing retries on the ironic level.
The SNMP hardware type cannot change boot devices and thus may fail to deploy nodes with local boot. To work around this problem, set [pxe]enable_netboot_fallback to True.
Some redfish-enabled hardware is known not to support persistent boot device setting that is used by the Bare Metal service for deployed instances. The redfish hardware type tries to work around this problem, but rebooting such an instance in-band may cause it to boot incorrectly. A predictable boot order should be configured in the node’s boot firmware to avoid issues and at least metadata cleaning must be enabled. See this mailing list thread for technical details.
Upgrade Notes¶
The [conductor]api_url option was deprecated and has been removed; use [service_catalog]endpoint_override instead if a specific ironic API URL is required.
The [cinder]url option was removed; use [cinder]endpoint_override instead.
The [DEFAULT]fatal_exception_format_errors option was removed; use [ironic_lib]fatal_exception_format_errors instead.
Operators upgrading from earlier versions using PXE should explicitly set [pxe]ipxe_bootfile_name, [pxe]uefi_ipxe_bootfile_name, and possibly [pxe]ipxe_bootfile_name_by_arch settings, as well as an iPXE-specific [pxe]ipxe_config_template override, if required.
Setting the [pxe]ipxe_config_template to no value will result in the [pxe]pxe_config_template being used. The default value points to the supplied standard iPXE template, so only operators with highly customized deployments may have to tune this setting.
Updates required ibmcclient version for ibmc drivers to 0.2.2.
A permission setting has been added for the redfish-virtual-media boot interface, which allows for explicit file permission setting when the driver is used. The default for the new [redfish]file_permission setting is 0u644, or 644 if manually changed using chmod on the command line. Operators may need to check /httpboot/redfish folder permissions if using redfish-virtual-media if they were running the conductor with a specific umask to work around the permission setting defect.
Bug Fixes¶
Instead of increasing the timeout when running long synchronous tasks on ironic-python-agent, ironic now runs them asynchronously and polls the agent until completion. It is no longer necessary to account for long-running tasks when setting [agent]command_timeout.
Fixes a rare issue where agent successfully powers off a node after deployment, but ironic never learns about it and does another reboot.
Fixes deployment in fast-track mode by keeping the required internal fields (agent_url and agent_secret_token) intact when starting and finishing deployment and cleaning.
Fixes deleting nodes with maintenance mode on and an allocation present. Previously it caused an internal server error. See story 2007823 for details.
Changes the default for use_ipmitool_retries to False so that Ironic will do the retries by default. This is needed for certain BMCs that don’t support the Cipher Suites command, for which ipmitool retries take an excessively long time. See story 2007632 for additional information.
Cleans up nodes stuck in the deleting state on conductor restart.
Fixes fast-track deployments with the direct deploy interface that previously used to hang.
Fixes periodic task initialization options to prevent a negative number. If [conductor]clean_callback_timeout, [conductor]inspect_wait_timeout or [conductor]inspect_wait_timeout have a negative value, an error will be triggered.
Ironic now does not try to allocate the space needed for instance image conversion to raw format if it is already raw.
Addresses the lack of an ability to explicitly set different bootloaders for iPXE and PXE based boot operations via their respective ipxe and pxe boot interfaces.
Fixes a bug in “fast track” where Ironic would delete the agent token upon exiting cleaning steps. However, if we are in fast track mode, we can preserve the token and continue operations with the agent as it is not powered off during fast track operations.
Fixes a workaround for hardware that does not support persistent boot device setting with the redfish or idrac-redfish management interface implementation. When such a situation is detected, ironic falls back to one-time boot device setting, restoring it on every reboot or power on.
For more information, see story 2007733.
Fixes virtual disk creation by changing the PERC H740P controller mode from Enhanced HBA to RAID in the delete_configuration clean step. PERC H740P controllers support both RAID mode and Enhanced HBA mode. When the controller is in Enhanced HBA mode, it creates single-disk RAID0 virtual disks from non-RAID physical disks, so a request to create a virtual disk with a supported RAID level fails because no physical disks are available. This patch converts PERC H740P RAID controllers to RAID mode if Enhanced HBA mode is found enabled. See bug 2007711 for more details.
Fixes fast track deployment preceded by managed inspection by providing the ironic API URL to the ramdisk so that it can heartbeat.
Fixes the JSON RPC backend potentially hanging on inability to connect to a conductor. The default timeout is now 120 seconds. The timeout and the number of retries can be adjusted via the configuration options [json_rpc]timeout and [json_rpc]connect_retries respectively.
Fixes logic that is applied to port deletions to also consider the presence of a VIF attachment record, which should be removed before attempting to delete the node. Failure to do so can result in erroneous records in the Networking Service.
No longer tries to set local_gb to MAX when building RAID with the root disk using MAX for its size.
To provide a workaround for incorrect boot order problems on some hardware, the redfish hardware type now supports the noop management interface, similarly to IPMI and SNMP.
Rebooting a node with the redfish power interface is now implemented via a power off request followed by power on to avoid returning success when a node stays powered on after the reboot request.
Provides a workaround for hardware that does not support persistent boot device setting with the redfish hardware type. When such a situation is detected, ironic will fall back to one-time boot device setting, restoring it on every reboot.
Fixes an issue where the folder /httpboot/redfish was being created with incorrect permissions.
If the disk format of the image is provided in the instance_info, the memory check is skipped when the format is raw and raw image streaming is enabled. This allows streaming raw images that are provided as a URL rather than through Glance.
Other Notes¶
Ramdisk logs are now collected during cleaning the same way as during deployment.
The following configuration options can now be reloaded without restarting ironic:
From [agent]: memory_consumed_by_agent, stream_raw_images, deploy_logs_*, image_download_source, command_timeout and neutron_agent_poll_interval.
From [api]: max_limit, public_endpoint and ramdisk_heartbeat_timeout.
From [conductor]: heartbeat_timeout, force_power_state_during_sync, automated_clean, soft_power_off_timeout, power_state_change_timeout, rescue_password_hash_algorithm and require_rescue_password_hashed.
From [DEFAULT]: default_resource_class, force_raw_images, parallel_image_downloads, default_portgroup_mode and require_agent_token.
From [deploy]: enable_ata_secure_erase, erase_devices_priority, erase_devices_metadata_priority, shred_random_overwrite_iterations, shred_final_overwrite_with_zeros, continue_if_disk_secure_erase_fails, disk_erasure_concurrency, power_off_after_deploy_failure, default_boot_option, default_boot_mode, configdrive_use_object_store, fast_track, and fast_track_timeout.
From [ipmi]: kill_on_timeout, disable_boot_timeout, command_retry_interval, min_command_interval, debug and additional_retryable_ipmi_errors.
From [iscsi]: portal_port, conv_flags and verify_attempts.
From [neutron]: port_setup_delay, *_network, *_network_security_groups, request_timeout, add_all_ports and dhcpv6_stateful_address_count.
From [nova]: send_power_notifications.
From [pxe]: pxe_append_params, default_ephemeral_format, pxe_config_template, uefi_pxe_config_template, pxe_config_template_by_arch, ip_version and ipxe_use_swift.
From [redfish]: use_swift, swift_container, swift_object_expiry_timeout and kernel_append_params.