Ironic deployment guide documentation #1010

======
Ironic
======

Review comment: Thanks for sharing the information you have @assumptionsandg. I think we need to consider what the scope of this doc is. We have some general notes on ironic deployment in https://docs.google.com/document/d/1H3hGYzJzieX8w7phxS3xmD6i6OEZgiHmBjvZ7EJEbU4/edit that are possibly more detailed than this but not public. We might also consider whether a generic description of ironic config might be better placed in kayobe or kolla-ansible upstream docs. Docs that might belong in SKC are generally either […] What was the original request for these docs about? Was it to cover the use of overcloud ironic to manage hypervisors? That could be considered under 1.

Author reply: I think the purpose was originally managing hypervisors, but I wasn't sure whether to include all of the RAL documentation I had up to this point. I do still need to add the section about managing hypervisors after deployment. Maybe it would make sense to split the deployment part out and propose that upstream? The openstack-config part might be too opinionated for upstream, though.

Ironic networking
=================

Ironic will require the workload provisioning and cleaning networks to be
configured in ``networks.yml``.

The workload provisioning network will require an allocation pool for Ironic
inspection and one for Neutron; an example configuration is shown below.

Review comment: It would be helpful to clarify what these pools are used for, i.e. note that one pool is for statically assigned IPs that will be used by OpenStack Ironic services, and the other pools (Neutron ones) are for dynamically assigning IPs to hosts being provisioned / inspected. I like the idea of suggesting a /16 to start with, to leave room for growth.

.. code-block:: yaml

   # Workload provisioning network IP information.
   provision_wl_net_cidr: "172.0.0.0/16"
   provision_wl_net_allocation_pool_start: "172.0.0.4"
   provision_wl_net_allocation_pool_end: "172.0.0.6"
   provision_wl_net_inspection_allocation_pool_start: "172.0.1.4"
   provision_wl_net_inspection_allocation_pool_end: "172.0.1.250"
   provision_wl_net_neutron_allocation_pool_start: "172.0.2.4"
   provision_wl_net_neutron_allocation_pool_end: "172.0.2.250"
   provision_wl_net_neutron_gateway: "172.0.1.1"

The cleaning network will also require a Neutron allocation pool.

.. code-block:: yaml

   # Cleaning network IP information.
   cleaning_net_cidr: "172.1.0.0/16"
   cleaning_net_allocation_pool_start: "172.1.0.4"
   cleaning_net_allocation_pool_end: "172.1.0.6"
   cleaning_net_neutron_allocation_pool_start: "172.1.2.4"
   cleaning_net_neutron_allocation_pool_end: "172.1.2.250"
   cleaning_net_neutron_gateway: "172.1.0.1"

OpenStack Config
================

Overcloud Ironic will require a router to exist between the internal API
network and the workload provisioning network. One way to achieve this is to
use `OpenStack Config <https://github.com/stackhpc/openstack-config>`_ to
define the internal API network in Neutron and set up a router with a
gateway.

Review comment: I haven't seen a site use a Neutron router for this yet. It's an interesting idea. Is there a specific reason for using one? Normally, at least in the past, we have avoided a dedicated router (see https://docs.google.com/document/d/1H3hGYzJzieX8w7phxS3xmD6i6OEZgiHmBjvZ7EJEbU4/edit#heading=h.gyl3n9h885m). It would be helpful to start with why a router is required: specifically, because the TFTP/HTTP server hosting the deploy images is bound exclusively to the internal API network, rather than all possible networks the node might PXE boot on. When baremetal nodes PXE boot, they are given the location of the TFTP/HTTP server to fetch the images from. Since the cleaning/provisioning/inspection network is generally not the same as the internal API network, routing is required.

Review comment: Have you tested this configuration? I imagine you still need policy-based routing here to prevent traffic bypassing the router. E.g. a node on the cleaning network tries an HTTP fetch of the IPA image, the request is routed to a controller, and the controller replies not via the router but directly to the node via the cleaning interface. The node sees that the traffic came back from a different MAC, and ignores it.

It is not necessary to define the provision and cleaning networks in this
configuration as they will be generated during

.. code-block:: console

   kayobe overcloud post configure

The openstack-config file could resemble the network, subnet and router
configuration shown below:

.. code-block:: yaml

   networks:
     - "{{ openstack_network_internal }}"

   openstack_network_internal:
     name: "internal-net"
     project: "admin"
     provider_network_type: "vlan"
     provider_physical_network: "physnet1"
     provider_segmentation_id: 458  # Review comment: nit, reference the VLAN variable.
     shared: false
     external: true

   subnets:
     - "{{ openstack_subnet_internal }}"

   openstack_subnet_internal:
     name: "internal-net"
     project: "admin"
     cidr: "10.10.3.0/24"
     enable_dhcp: true
     allocation_pool_start: "10.10.3.3"
     allocation_pool_end: "10.10.3.3"

   openstack_routers:
     - "{{ openstack_router_ironic }}"

   openstack_router_ironic:
     name: ironic
     project: admin
     interfaces:
       - net: "provision-net"
         subnet: "provision-net"
         portip: "172.0.1.1"
       - net: "cleaning-net"
         subnet: "cleaning-net"
         portip: "172.1.0.1"
     network: internal-net

To provision baremetal nodes in Nova you will also need to define a flavor
specific to that type of baremetal host. You will need to replace the custom
resource ``resources:CUSTOM_<YOUR_BAREMETAL_RESOURCE_CLASS>`` placeholder with
the resource class of your baremetal hosts; you will also need this later when
configuring the baremetal-compute inventory.

.. code-block:: yaml

   openstack_flavors:
     - "{{ openstack_flavor_baremetal_A }}"

   # Bare metal compute node.
   openstack_flavor_baremetal_A:
     name: "baremetal-A"
     ram: 1048576
     disk: 480
     vcpus: 256
     extra_specs:
       "resources:CUSTOM_<YOUR_BAREMETAL_RESOURCE_CLASS>": 1
       "resources:VCPU": 0
       "resources:MEMORY_MB": 0
       "resources:DISK_GB": 0
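
For reference, Placement derives the custom resource class name from the
Ironic node's ``resource_class`` by upper-casing it, replacing punctuation
with underscores and prefixing ``CUSTOM_``. The sketch below uses a
hypothetical resource class ``baremetal-a``, not one from this deployment.

.. code-block:: yaml

   # Hypothetical example: Ironic resource_class "baremetal-a" maps to
   # the Placement resource class CUSTOM_BAREMETAL_A.
   extra_specs:
     "resources:CUSTOM_BAREMETAL_A": 1
     "resources:VCPU": 0
     "resources:MEMORY_MB": 0
     "resources:DISK_GB": 0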

Enabling conntrack
==================

UEFI booting requires conntrack_helper to be configured on the Ironic Neutron
router, because TFTP traffic is UDP and would otherwise be dropped. You will
need to define some extension drivers in ``neutron.yml`` to ensure conntrack
is enabled in the Neutron server.

Review comment: This is another complexity that would go away if we didn't use the Neutron router, assuming conntrack is already enabled on the controller.

.. code-block:: yaml

   kolla_neutron_ml2_extension_drivers:
     - port_security
     - conntrack_helper
     - dns_domain_ports

The Neutron L3 agent also requires conntrack_helper to be set as an extension
in ``kolla/config/neutron/l3_agent.ini``.

.. code-block:: ini

   [agent]
   extensions = conntrack_helper

It is also required to load the conntrack kernel modules ``nf_nat_tftp``,
``nf_conntrack`` and ``nf_conntrack_tftp`` on network nodes. You can load
these modules using modprobe, or define them in ``/etc/modules-load.d/`` to
make them persistent.
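
For example, a minimal sketch of doing both on a network node; the file name
``ironic-conntrack.conf`` is an arbitrary choice, not something Kayobe
requires.

.. code-block:: console

   # Load the modules immediately.
   sudo modprobe nf_conntrack
   sudo modprobe nf_conntrack_tftp
   sudo modprobe nf_nat_tftp

   # Persist them across reboots (the file name is arbitrary).
   cat <<EOF | sudo tee /etc/modules-load.d/ironic-conntrack.conf
   nf_conntrack
   nf_conntrack_tftp
   nf_nat_tftp
   EOF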

The Ironic Neutron router will also need to be configured to use
conntrack_helper.

.. code-block:: json

   "conntrack_helpers": {
       "protocol": "udp",
       "port": 69,
       "helper": "tftp"
   }

Currently it is not possible to add this helper via the OpenStack CLI. To add
it to the Ironic router you will need to make a request to the Neutron API
directly, for example via cURL.

.. code-block:: console

   curl -g -i -X POST \
     http://<internal_api_vip>:9696/v2.0/routers/<ironic_router_uuid>/conntrack_helpers \
     -H "Accept: application/json" \
     -H "User-Agent: openstacksdk/2.0.0 keystoneauth1/5.4.0 python-requests/2.31.0 CPython/3.9.18" \
     -H "X-Auth-Token: <issued_token>" \
     -d '{ "conntrack_helper": {"helper": "tftp", "protocol": "udp", "port": 69 } }'
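
The placeholder values can be looked up with the OpenStack CLI beforehand; a
short sketch, assuming the router is named ``ironic`` as in the
openstack-config example above.

.. code-block:: console

   # Token to pass as <issued_token>.
   openstack token issue -f value -c id

   # UUID to use as <ironic_router_uuid>.
   openstack router show ironic -f value -c id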

TFTP server
===========

By default the Ironic TFTP server (the ``ironic_pxe`` container) names the
UEFI boot file ``ipxe-x86_64.efi`` instead of ``ipxe.efi``, meaning no boot
file will be sent during the PXE boot process in the default configuration.

For now this is solved with a workaround: renaming the boot file manually
inside the ``ironic_pxe`` container.

.. code-block:: console

   docker exec ironic_pxe mv /tftpboot/ipxe-x86_64.efi /tftpboot/ipxe.efi
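
To check that the rename has taken effect (a quick verification step, not
part of the original workaround), list the TFTP root and confirm
``ipxe.efi`` is present.

.. code-block:: console

   docker exec ironic_pxe ls /tftpboot/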

Baremetal inventory
===================

To begin enrolling nodes you will need to define them in the inventory hosts
file.

.. code-block:: ini

   [r1]
   hv1 ipmi_address=10.1.28.16
   hv2 ipmi_address=10.1.28.17
   …

   [r1:vars]
   ironic_driver=redfish
   resource_class=<your_resource_class>
   redfish_system_id=<your_redfish_system_id>
   redfish_verify_ca=<your_redfish_verify_ca>
   redfish_username=<your_redfish_username>
   redfish_password=<your_redfish_password>

   [baremetal-compute:children]
   r1

Baremetal nodes are typically laid out by rack. For instance, in the rack 1
example above, the BMC addresses are defined per node, while Redfish
information such as the username, password and system ID is defined for the
rack as a whole.

You can add more racks to the deployment by replicating the rack 1 example
and adding each new rack group as an entry to the ``baremetal-compute``
group.
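
For example, a hypothetical second rack ``r2`` (the host names and BMC
addresses below are purely illustrative) would follow the same pattern.

.. code-block:: ini

   [r2]
   hv3 ipmi_address=10.1.29.16
   hv4 ipmi_address=10.1.29.17

   [r2:vars]
   ironic_driver=redfish
   resource_class=<your_resource_class>
   redfish_system_id=<your_redfish_system_id>
   redfish_verify_ca=<your_redfish_verify_ca>
   redfish_username=<your_redfish_username>
   redfish_password=<your_redfish_password>

   [baremetal-compute:children]
   r1
   r2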

Node enrollment
===============

When nodes are defined in the inventory you can begin enrolling them by
invoking the Kayobe command (note that only the Redfish driver is supported
by this command):

.. code-block:: console

   kayobe baremetal compute register

Following registration, the baremetal nodes can be inspected and made
available for provisioning by Nova via the Kayobe commands:

.. code-block:: console

   kayobe baremetal compute inspect
   kayobe baremetal compute provide
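
As a quick sanity check (not part of the Kayobe workflow above), the
OpenStack CLI can confirm that the nodes have reached the ``available``
provision state after ``provide``.

.. code-block:: console

   openstack baremetal node list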

Review comment: You'd need to reference this doc from configuration/index.rst in order for it to be included in the docs.