Commit c5b053c

Merge pull request #29 from stackhpc/README-improvements
Readme improvements
2 parents 952ae6e + b1fb515 commit c5b053c

1 file changed: +12 -13 lines changed

README.md

Lines changed: 12 additions & 13 deletions
@@ -2,28 +2,30 @@
 
 # stackhpc.openhpc
 
-This Ansible role is used to install the necessary packages to have a fully functional OpenHPC cluster.
+This Ansible role installs packages and performs configuration to provide a fully functional OpenHPC cluster. It can also be used to drain and resume nodes.
+
+As a role it must be used from a playbook, for which a simple example is given below. This approach means it is totally modular with no assumptions about available networks or any cluster features except for some hostname conventions. Any desired cluster filesystem or other required functionality may be freely integrated using additional Ansible roles or other approaches.
 
 Role Variables
 --------------
 
-`openhpc_slurm_service_enabled`: checks whether `openhpc_slurm_service` is enabled
-
-`openhpc_slurm_service`: name of the slurm service e.g. `slurmd`
+`openhpc_slurm_service_enabled`: boolean, whether to enable the appropriate slurm service (slurmd/slurmctld)
 
 `openhpc_slurm_control_host`: ansible host name of the controller e.g. `"{{ groups['cluster_control'] | first }}"`
 
 `openhpc_slurm_partitions`: list of one or more slurm partitions. Each partition may contain the following values:
 * `groups`: If there are multiple node groups that make up the partition, a list of group objects can be defined here.
-Otherwise, `groups` can be omitted and the following attributes can be defined in the partition object.
+Otherwise, `groups` can be omitted and the following attributes can be defined in the partition object:
 * `name`: The name of the nodes within this group.
 * `cluster_name`: Optional. An override for the top-level definition `openhpc_cluster_name`.
 * `num_nodes`: Nodes within the group are assumed to number `0:num_nodes-1`.
-* `ram_mb`: Optional. The physical RAM available in each server of this group.
-Compute node hostnames are assumed to take the form: `cluster_name-group_name-{0..num_nodes-1}`
-* `default`: Optional. A boolean flag for whether this partition. Valid settings are `YES` and `NO`.
-* `maxtime`: Optional. A partition-specific time limit in hours, minutes and seconds. The default value is
-`openhpc_job_maxtime`, which defaults to `24:00:00`.
+* `ram_mb`: Optional. The physical RAM available in each server of this group ([slurm.conf](https://slurm.schedmd.com/slurm.conf.html) parameter `RealMemory`).
+
+For each group (if used) or partition there must be an ansible inventory group `cluster_name-group_name`. The compute nodes in this group must have hostnames in the form `cluster_name-group_name-{0..num_nodes-1}`.
+
+* `default`: Optional. A boolean flag for whether this partition is the default. Valid settings are `YES` and `NO`.
+* `maxtime`: Optional. A partition-specific time limit in hours, minutes and seconds ([slurm.conf](https://slurm.schedmd.com/slurm.conf.html) parameter `MaxTime`). The default value is
+given by `openhpc_job_maxtime`.
 
 `openhpc_job_maxtime`: A maximum time job limit in hours, minutes and seconds. The default is `24:00:00`.
 
@@ -80,10 +82,7 @@ To deploy, create a playbook which looks like this:
 openhpc_slurm_control_host: "{{ groups['cluster_control'] | first }}"
 openhpc_slurm_partitions:
   - name: "compute"
-    flavor: "compute-A"
-    image: "CentOS7.5-OpenHPC"
     num_nodes: 8
-    user: "centos"
 openhpc_cluster_name: openhpc
 openhpc_packages: []
 ...
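
The hostname convention described in the updated README can be illustrated with a short Python sketch. It is not part of the role; the function name `expected_hostnames` and the dictionary layout are hypothetical and simply mirror the `openhpc_slurm_partitions` structure. It assumes that a partition without `groups` acts as its own single group, and that a group-level `cluster_name` overrides `openhpc_cluster_name`, as the README text suggests.

```python
from typing import Any, Dict, List


def expected_hostnames(
    cluster_name: str, partitions: List[Dict[str, Any]]
) -> Dict[str, List[str]]:
    """Map each assumed inventory group name to the hostnames it should contain.

    Hypothetical helper: follows the README convention
    `cluster_name-group_name-{0..num_nodes-1}`.
    """
    hosts: Dict[str, List[str]] = {}
    for partition in partitions:
        # A partition either lists node groups under `groups`,
        # or is treated here as a single group itself (assumption).
        for group in partition.get("groups", [partition]):
            # Optional per-group override of the top-level cluster name.
            name = group.get("cluster_name", cluster_name)
            inventory_group = f"{name}-{group['name']}"
            hosts[inventory_group] = [
                f"{inventory_group}-{i}" for i in range(group["num_nodes"])
            ]
    return hosts


if __name__ == "__main__":
    # Values taken from the example playbook in the diff above.
    result = expected_hostnames("openhpc", [{"name": "compute", "num_nodes": 8}])
    for inventory_group, names in result.items():
        print(inventory_group, names)
```

With the values from the example playbook (`openhpc_cluster_name: openhpc`, one `compute` partition with `num_nodes: 8`), this yields a single group `openhpc-compute` containing `openhpc-compute-0` through `openhpc-compute-7`.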
