
Commit ed717da

Revert "Separate node and partition configuration (#183)"
This reverts commit 0dec9d8.
1 parent 0dec9d8 commit ed717da

30 files changed: +237 −267 lines

.github/workflows/ci.yml

+1
@@ -59,6 +59,7 @@ jobs:
          - test11
          - test12
          - test13
+         - test14
        exclude:
          # mariadb package provides /usr/bin/mysql on RL8 which doesn't work with geerlingguy/mysql role
          - scenario: test4

README.md

+60 −152
@@ -50,53 +50,32 @@ each list element:
 
 ### slurm.conf
 
-`openhpc_nodegroups`: Optional, default `[]`. List of mappings, each defining a
-unique set of homogenous nodes:
-* `name`: Required. Name of node group.
-* `ram_mb`: Optional. The physical RAM available in each node of this group
-  ([slurm.conf](https://slurm.schedmd.com/slurm.conf.html) parameter `RealMemory`)
-  in MiB. This is set using ansible facts if not defined, equivalent to
-  `free --mebi` total * `openhpc_ram_multiplier`.
-* `ram_multiplier`: Optional. An override for the top-level definition
-  `openhpc_ram_multiplier`. Has no effect if `ram_mb` is set.
+`openhpc_slurm_partitions`: Optional. List of one or more slurm partitions, default `[]`. Each partition may contain the following values:
+* `groups`: If there are multiple node groups that make up the partition, a list of group objects can be defined here.
+  Otherwise, `groups` can be omitted and the following attributes can be defined in the partition object:
+  * `name`: The name of the nodes within this group.
+  * `cluster_name`: Optional. An override for the top-level definition `openhpc_cluster_name`.
+  * `extra_nodes`: Optional. A list of additional node definitions, e.g. for nodes in this group/partition not controlled by this role. Each item should be a dict, with keys/values as per the ["NODE CONFIGURATION"](https://slurm.schedmd.com/slurm.conf.html#lbAE) docs for slurm.conf. Note the key `NodeName` must be first.
+  * `ram_mb`: Optional. The physical RAM available in each node of this group ([slurm.conf](https://slurm.schedmd.com/slurm.conf.html) parameter `RealMemory`) in MiB. This is set using ansible facts if not defined, equivalent to `free --mebi` total * `openhpc_ram_multiplier`.
+  * `ram_multiplier`: Optional. An override for the top-level definition `openhpc_ram_multiplier`. Has no effect if `ram_mb` is set.
 * `gres`: Optional. List of dicts defining [generic resources](https://slurm.schedmd.com/gres.html). Each dict must define:
   - `conf`: A string with the [resource specification](https://slurm.schedmd.com/slurm.conf.html#OPT_Gres_1) but requiring the format `<name>:<type>:<number>`, e.g. `gpu:A100:2`. Note the `type` is an arbitrary string.
   - `file`: A string with the [File](https://slurm.schedmd.com/gres.conf.html#OPT_File) (path to device(s)) for this resource, e.g. `/dev/nvidia[0-1]` for the above example.
+
   Note [GresTypes](https://slurm.schedmd.com/slurm.conf.html#OPT_GresTypes) must be set in `openhpc_config` if this is used.
-* `features`: Optional. List of [Features](https://slurm.schedmd.com/slurm.conf.html#OPT_Features) strings.
-* `node_params`: Optional. Mapping of additional parameters and values for
-  [node configuration](https://slurm.schedmd.com/slurm.conf.html#lbAE).
-  **NB:** Parameters which can be set via the keys above must not be included here.
-
-Each nodegroup will contain hosts from an Ansible inventory group named
-`{{ openhpc_cluster_name }}_{{ group_name}}`. Note that:
-- Each host may only appear in one nodegroup.
-- Hosts in a nodegroup are assumed to be homogenous in terms of processor and memory.
-- Hosts may have arbitrary hostnames, but these should be lowercase to avoid a
-  mismatch between inventory and actual hostname.
-- An inventory group may be missing or empty, in which case the node group
-  contains no hosts.
-- If the inventory group is not empty the play must contain at least one host.
-  This is used to set `Sockets`, `CoresPerSocket`, `ThreadsPerCore` and
-  optionally `RealMemory` for the nodegroup.
-
-`openhpc_partitions`: Optional. List of mappings, each defining a
-partition. Each partition mapping may contain:
-* `name`: Required. Name of partition.
-* `nodegroups`: Optional. List of node group names. If omitted, the node group
-  with the same name as the partition is used.
-* `default`: Optional. A boolean flag for whether this partion is the default. Valid settings are `YES` and `NO`.
-* `maxtime`: Optional. A partition-specific time limit overriding `openhpc_job_maxtime`.
-* `partition_params`: Optional. Mapping of additional parameters and values for
-  [partition configuration](https://slurm.schedmd.com/slurm.conf.html#SECTION_PARTITION-CONFIGURATION).
-  **NB:** Parameters which can be set via the keys above must not be included here.
-
-If this variable is not set one partition per nodegroup is created, with default
-partition configuration for each.
-
-`openhpc_job_maxtime`: Maximum job time limit, default `'60-0'` (60 days), see
-[slurm.conf:MaxTime](https://slurm.schedmd.com/slurm.conf.html#OPT_MaxTime).
-**NB:** This should be quoted to avoid Ansible conversions.
+
+* `default`: Optional. A boolean flag for whether this partion is the default. Valid settings are `YES` and `NO`.
+* `maxtime`: Optional. A partition-specific time limit following the format of [slurm.conf](https://slurm.schedmd.com/slurm.conf.html) parameter `MaxTime`. The default value is
+  given by `openhpc_job_maxtime`. The value should be quoted to avoid Ansible conversions.
+* `partition_params`: Optional. Mapping of additional parameters and values for [partition configuration](https://slurm.schedmd.com/slurm.conf.html#SECTION_PARTITION-CONFIGURATION).
+
+For each group (if used) or partition any nodes in an ansible inventory group `<cluster_name>_<group_name>` will be added to the group/partition. Note that:
+- Nodes may have arbitrary hostnames but these should be lowercase to avoid a mismatch between inventory and actual hostname.
+- Nodes in a group are assumed to be homogenous in terms of processor and memory.
+- An inventory group may be empty or missing, but if it is not then the play must contain at least one node from it (used to set processor information).
+
+
+`openhpc_job_maxtime`: Maximum job time limit, default `'60-0'` (60 days). See [slurm.conf](https://slurm.schedmd.com/slurm.conf.html) parameter `MaxTime` for format. The default is 60 days. The value should be quoted to avoid Ansible conversions.
 
 `openhpc_cluster_name`: name of the cluster.
 
@@ -165,121 +144,50 @@ accessed (with facts gathering enabled) using `ansible_local.slurm`. As per the
 in mixed case are from from config files. Note the facts are only refreshed
 when this role is run.
 
-## Example
+## Example Inventory
 
-### Simple
+And an Ansible inventory as this:
 
-The following creates a cluster with a a single partition `compute`
-containing two nodes:
+    [openhpc_login]
+    openhpc-login-0 ansible_host=10.60.253.40 ansible_user=centos
 
-```ini
-# inventory/hosts:
-[hpc_login]
-cluster-login-0
+    [openhpc_compute]
+    openhpc-compute-0 ansible_host=10.60.253.31 ansible_user=centos
+    openhpc-compute-1 ansible_host=10.60.253.32 ansible_user=centos
 
-[hpc_compute]
-cluster-compute-0
-cluster-compute-1
+    [cluster_login:children]
+    openhpc_login
 
-[hpc_control]
-cluster-control
-```
+    [cluster_control:children]
+    openhpc_login
+
+    [cluster_batch:children]
+    openhpc_compute
+
+## Example Playbooks
+
+To deploy, create a playbook which looks like this:
+
+    ---
+    - hosts:
+      - cluster_login
+      - cluster_control
+      - cluster_batch
+      become: yes
+      roles:
+        - role: openhpc
+          openhpc_enable:
+            control: "{{ inventory_hostname in groups['cluster_control'] }}"
+            batch: "{{ inventory_hostname in groups['cluster_batch'] }}"
+            runtime: true
+          openhpc_slurm_service_enabled: true
+          openhpc_slurm_control_host: "{{ groups['cluster_control'] | first }}"
+          openhpc_slurm_partitions:
+            - name: "compute"
+          openhpc_cluster_name: openhpc
+          openhpc_packages: []
+    ...
 
-```yaml
-#playbook.yml
----
-- hosts: all
-  become: yes
-  tasks:
-    - import_role:
-        name: stackhpc.openhpc
-      vars:
-        openhpc_cluster_name: hpc
-        openhpc_enable:
-          control: "{{ inventory_hostname in groups['cluster_control'] }}"
-          batch: "{{ inventory_hostname in groups['cluster_compute'] }}"
-          runtime: true
-        openhpc_slurm_control_host: "{{ groups['cluster_control'] | first }}"
-        openhpc_nodegroups:
-          - name: compute
-        openhpc_partitions:
-          - name: compute
 ---
-```
-
-### Multiple nodegroups
-
-This example shows how partitions can span multiple types of compute node.
-
-This example inventory describes three types of compute node (login and
-control nodes are omitted for brevity):
-
-```ini
-# inventory/hosts:
-...
-[hpc_general]
-# standard compute nodes
-cluster-general-0
-cluster-general-1
-
-[hpc_large]
-# large memory nodes
-cluster-largemem-0
-cluster-largemem-1
-
-[hpc_gpu]
-# GPU nodes
-cluster-a100-0
-cluster-a100-1
-...
-```
-
-Firstly the `openhpc_nodegroups` is set to capture these inventory groups and
-apply any node-level parameters - in this case the `largemem` nodes have
-2x cores reserved for some reason, and GRES is configured for the GPU nodes:
-
-```yaml
-openhpc_cluster_name: hpc
-openhpc_nodegroups:
-  - name: general
-  - name: large
-    node_params:
-      CoreSpecCount: 2
-  - name: gpu
-    gres:
-      - conf: gpu:A100:2
-        file: /dev/nvidia[0-1]
-```
-
-Now two partitions can be configured - a default one with a short timelimit and
-no large memory nodes for testing jobs, and another with all hardware and longer
-job runtime for "production" jobs:
-
-```yaml
-openhpc_partitions:
-  - name: test
-    nodegroups:
-      - general
-      - gpu
-    maxtime: '1:0:0' # 1 hour
-    default: 'YES'
-  - name: general
-    nodegroups:
-      - general
-      - large
-      - gpu
-    maxtime: '2-0' # 2 days
-    default: 'NO'
-```
-Users will select the partition using `--partition` argument and request nodes
-with appropriate memory or GPUs using the `--mem` and `--gres` or `--gpus*`
-options for `sbatch` or `srun`.
-
-Finally here some additional configuration must be provided for GRES:
-```yaml
-openhpc_config:
-  GresTypes:
-    - gpu
-```
 
 <b id="slurm_ver_footnote">1</b> Slurm 20.11 removed `accounting_storage/filetxt` as an option. This version of Slurm was introduced in OpenHPC v2.1 but the OpenHPC repos are common to all OpenHPC v2.x releases. [](#accounting_storage)
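Taken together, the restored README documents a single `openhpc_slurm_partitions` structure covering both partition-level keys (`default`, `maxtime`, `partition_params`) and group-level keys (`name`, `gres`, `ram_mb`, etc.). A minimal sketch of a definition using those keys, with hypothetical partition names and values shown for orientation only:

```yaml
openhpc_slurm_partitions:
  - name: compute            # nodes come from inventory group <cluster_name>_compute
    default: 'YES'           # quoted, as the README advises, to avoid Ansible type conversion
    maxtime: '1:0:0'
  - name: gpu
    default: 'NO'
    gres:
      - conf: gpu:A100:2     # format <name>:<type>:<number>
        file: /dev/nvidia[0-1]
```

If `gres` is used, `GresTypes` must also be set via `openhpc_config`, as noted in the hunk above.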

defaults/main.yml

+1 −2
@@ -4,8 +4,7 @@ openhpc_slurm_service_started: "{{ openhpc_slurm_service_enabled }}"
 openhpc_slurm_service:
 openhpc_slurm_control_host: "{{ inventory_hostname }}"
 #openhpc_slurm_control_host_address:
-openhpc_partitions: "{{ openhpc_nodegroups }}"
-openhpc_nodegroups: []
+openhpc_slurm_partitions: []
 openhpc_cluster_name:
 openhpc_packages:
   - slurm-libpmi-ohpc

molecule/README.md

+2 −2
@@ -10,7 +10,7 @@ test1 | 1 | N | 2x compute node, sequential na
 test1b | 1 | N | 1x compute node
 test1c | 1 | N | 2x compute nodes, nonsequential names
 test2 | 2 | N | 4x compute node, sequential names
-test3 | 1 | Y | 4x compute nodes in 2x groups, single partition
+test3 | 1 | Y | -
 test4 | 1 | N | 2x compute node, accounting enabled
 test5 | 1 | N | As for #1 but configless
 test6 | 1 | N | 0x compute nodes, configless
@@ -21,7 +21,7 @@ test10 | 1 | N | As for #5 but then tries to ad
 test11 | 1 | N | As for #5 but then deletes a node (actually changes the partition due to molecule/ansible limitations)
 test12 | 1 | N | As for #5 but enabling job completion and testing `sacct -c`
 test13 | 1 | N | As for #5 but tests `openhpc_config` variable.
-test14 | 1 | N | [removed, extra_nodes removed]
+test14 | 1 | N | As for #5 but also tests `extra_nodes` via State=DOWN nodes.
 test15 | 1 | Y | As for #5 but also tests `partitions with different name but with the same NodeName`.
 
 

molecule/test1/converge.yml

+1 −1
@@ -7,7 +7,7 @@
       batch: "{{ inventory_hostname in groups['testohpc_compute'] }}"
       runtime: true
     openhpc_slurm_control_host: "{{ groups['testohpc_login'] | first }}"
-    openhpc_nodegroups:
+    openhpc_slurm_partitions:
       - name: "compute"
     openhpc_cluster_name: testohpc
   tasks:

molecule/test10/converge.yml

+1 −1
@@ -7,7 +7,7 @@
       batch: "{{ inventory_hostname in groups['testohpc_compute'] }}"
       runtime: true
     openhpc_slurm_control_host: "{{ groups['testohpc_login'] | first }}"
-    openhpc_nodegroups:
+    openhpc_slurm_partitions:
       - name: "compute"
     openhpc_cluster_name: testohpc
     openhpc_slurm_configless: true

molecule/test10/verify.yml

+1 −1
@@ -29,7 +29,7 @@
       batch: "{{ inventory_hostname in groups['testohpc_compute'] }}"
       runtime: true
     openhpc_slurm_control_host: "{{ groups['testohpc_login'] | first }}"
-    openhpc_nodegroups:
+    openhpc_slurm_partitions:
       - name: "compute"
     openhpc_cluster_name: testohpc
     openhpc_slurm_configless: true

molecule/test11/converge.yml

+1 −1
@@ -11,7 +11,7 @@
       batch: "{{ inventory_hostname in groups['testohpc_compute'] }}"
       runtime: true
     openhpc_slurm_control_host: "{{ groups['testohpc_login'] | first }}"
-    openhpc_nodegroups:
+    openhpc_slurm_partitions:
       - name: "compute_orig"
     openhpc_cluster_name: testohpc
     openhpc_slurm_configless: true

molecule/test11/verify.yml

+1 −1
@@ -26,7 +26,7 @@
       batch: "{{ inventory_hostname in groups['testohpc_compute'] }}"
       runtime: true
     openhpc_slurm_control_host: "{{ groups['testohpc_login'] | first }}"
-    openhpc_nodegroups:
+    openhpc_slurm_partitions:
       - name: "compute_new"
     openhpc_cluster_name: testohpc
     openhpc_slurm_configless: true

molecule/test12/converge.yml

+1 −1
@@ -11,7 +11,7 @@
       batch: "{{ inventory_hostname in groups['testohpc_compute'] }}"
       runtime: true
     openhpc_slurm_control_host: "{{ groups['testohpc_login'] | first }}"
-    openhpc_nodegroups:
+    openhpc_slurm_partitions:
       - name: "compute"
     openhpc_cluster_name: testohpc
     openhpc_slurm_configless: true

molecule/test13/converge.yml

+1 −1
@@ -7,7 +7,7 @@
       batch: "{{ inventory_hostname in groups['testohpc_compute'] }}"
       runtime: true
     openhpc_slurm_control_host: "{{ groups['testohpc_control'] | first }}"
-    openhpc_nodegroups:
+    openhpc_slurm_partitions:
       - name: "compute"
     openhpc_cluster_name: testohpc
     openhpc_slurm_configless: true

molecule/test14/converge.yml

+29
@@ -0,0 +1,29 @@
+---
+- name: Converge
+  hosts: all
+  vars:
+    openhpc_enable:
+      control: "{{ inventory_hostname in groups['testohpc_login'] }}"
+      batch: "{{ inventory_hostname in groups['testohpc_compute'] }}"
+      runtime: true
+    openhpc_slurm_control_host: "{{ groups['testohpc_login'] | first }}"
+    openhpc_slurm_partitions:
+      - name: "compute"
+        extra_nodes:
+          # Need to specify IPs for the non-existent State=DOWN nodes, because otherwise even in this state slurmctld will exclude a node with no lookup information from the config.
+          # We use invalid IPs here (i.e. starting 0.) to flag the fact the nodes shouldn't exist.
+          # Note this has to be done via slurm config rather than /etc/hosts due to Docker limitations on modifying the latter.
+          - NodeName: fake-x,fake-y
+            NodeAddr: 0.42.42.0,0.42.42.1
+            State: DOWN
+            CPUs: 1
+          - NodeName: fake-2cpu-[3,7-9]
+            NodeAddr: 0.42.42.3,0.42.42.7,0.42.42.8,0.42.42.9
+            State: DOWN
+            CPUs: 2
+    openhpc_cluster_name: testohpc
+    openhpc_slurm_configless: true
+  tasks:
+    - name: "Include ansible-role-openhpc"
+      include_role:
+        name: "{{ lookup('env', 'MOLECULE_PROJECT_DIRECTORY') | basename }}"
