Commit 0187d6a

AnnikaLau and mjaehn authored
Restructure HPC systems and add Balfrin (#370)
* Add Balfrin
* Fix link
* Replace Alps with general HPC section
* GitHub Action: Apply external link format
* Update vClusters
* Update compile instructions
* Fix links
* Fix more links
* Fix links
* Small fixes

---------

Co-authored-by: Michael Jähn <mjaehn@ethz.ch>
Co-authored-by: mjaehn <mjaehn@users.noreply.github.com>
1 parent 8f3a0dc commit 0187d6a

17 files changed

Lines changed: 131 additions & 67 deletions

docs/SUMMARY.md

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@
 * [Events](events/)
 * [Tasks](tasks/)
 * [User Support](support/)
-* [Alps](alps/)
+* [HPC](hpc/)
 * [Models](models/)
 * [Tools](tools/)
 * [Datasets](datasets/)

docs/alps/SUMMARY.md

Lines changed: 0 additions & 3 deletions
This file was deleted.

docs/datasets/ecmwf_data_cube.md

Lines changed: 1 addition & 1 deletion
@@ -13,7 +13,7 @@ which is a storage system at the ECMWF Data Center in Bologna.
 - 0.5 PB of fast SSD storage
 - 2 PB of “slow” storage
 - Build via work package of SwissTwins project
-- Physically installed at ECMWF Bologna with a direct connection to Alps Lugano [ALPS](../alps/index.md)
+- Physically installed at ECMWF Bologna with a direct connection to Alps Lugano [ALPS](../hpc/index.md)
 - Part of the multi-site distributed infrastructure (Lugano, Lausanne, Bologna)

 ## Usage

docs/events/icon_meetings/2022-4.md

Lines changed: 1 addition & 1 deletion
@@ -89,6 +89,6 @@ BG says EXTPAR can process the CORINE dataset. She didn’t get any errors but h
 ### General
 WS asks if there are any news about the ICON seamless effort.
 MA cannot give any more stable information.
-JB will send mail. There is information on the [COSMO website :material-open-in-new:](https://www.cosmo-model.org/content/tasks/workGroups/wg3b/default.htm){:target="_blank"}.
+JB will send mail. There is information on the [COSMO website :material-open-in-new:](https://www.cosmo-model.org/content/tasks/workGroups/wgPHY/default.htm){:target="_blank"}.

 DB sends a reminder about the COSMO/ICON user workshop on 2 February. Please send him your registration and indicate if you want to present a poster.

docs/events/icon_meetings/2024-3.md

Lines changed: 1 addition & 1 deletion
@@ -49,7 +49,7 @@ AH asks whether Daint or Säntis will be the long-term solution for their applic
 DF then asks if they can expect the images from Tödi to be available on Daint. ML replies that although it's not a long-term solution, those images should eventually be deployed on Daint. However, it won’t be as straightforward. DF points out that many people need to complete benchmarking by October. AH asks if ICON can be compiled by loading the user environment without those missing images. ML explains that the software stack image is the same as the user environment and can be accessed on each cluster, but using it isn't always straightforward. Currently, the images are interchangeable, though they still depend on the operating system. In the future, user environments on one cluster may not work on another.
 MK confirms that the risk of incompatibility exists and they've already seen this with MCH. Regarding the missing images on Daint, MK explains that the NVIDIA part wasn’t provided completely, which is why he created another environment. These images are not preloaded on the system, so users need to search for them using the user environment tool. MK suggests trying the command `uenv image find -s todi` to locate the images.

-BG mentions using Tödi and that while the instructions on C2SM initially worked, two weeks later the executables stopped working and the instructions disappeared. ML responds that there is now an Alps section on the landing page. It should be possible to do the same tasks as before, but without using absolute paths to his environment. Instead, tags can be used. The updated instructions are available on the [workshop materials page :material-open-in-new:](https://c2sm.github.io/alps/#introductory-workshop-material){:target="_blank"} as well as the [user landing page :material-open-in-new:](https://c2sm.github.io/alps/uenvs/#the-uenv-command-line-tool){:target="_blank"}. ML encourages anybody to reach out to him on Slack if any issues arise.
+BG mentions using Tödi and that while the instructions on C2SM initially worked, two weeks later the executables stopped working and the instructions disappeared. ML responds that there is now an Alps section on the landing page. It should be possible to do the same tasks as before, but without using absolute paths to his environment. Instead, tags can be used. The updated instructions are available on the [workshop materials page](../../hpc/index.md#introductory-workshop-material) as well as in the CSCS Documentation under [uenv :material-open-in-new:](https://docs.cscs.ch/software/uenv/#downloading-uenv){:target="_blank"}. ML encourages anybody to reach out to him on Slack if any issues arise.

 AD from Empa asks if ML has managed to run ICON-ART. ML responds that he has only been testing regular ICON, but Erik was successful in compiling it. AH confirms that Erik followed a workaround. AD asks if there is a plan for C2SM to support ICON-ART. ML explains that ICON-ART is not in the C2SM pipeline and suggests that users create a test case, and C2SM can help set up the CI infrastructure. AH mentions they have an old test case (icon-kit) that needs updating. MJ adds that ICON-ART is now part of ICON-NWP, and a test case could be set up and added.

docs/hpc/SUMMARY.md

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+* [CSCS](index.md)
+* [Santis](santis.md)
+* [Balfrin](balfrin.md)
+* [Eiger](eiger.md)
+* ETHZ
+* [Euler](euler.md)

docs/hpc/balfrin.md

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+# Balfrin
+
+Balfrin is an Alps cluster used by MeteoSwiss.
+
+C2SM does not officially support Balfrin. Nevertheless, you can find instructions on how
+to set up ICON on Balfrin at the [Compile section of ICON](../models/icon/compile.md#balfrin).
File renamed without changes.

docs/hpc/euler.md

Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
+# Euler
+
+Euler is ETH Zurich's central high-performance computing (HPC) cluster, providing computational resources for research across all disciplines.
+
+## Access
+
+Groups at IAC have shares for storage and CPU via the Euler **Climate** group. As a group member, you should have an existing account and access to Euler. If this is not the case, contact your group leader or Urs Beyerle.
+
+## Useful Links
+
+- [Euler documentation :material-open-in-new:](https://docs.hpc.ethz.ch/){:target="_blank"}
+- [Climate Euler users :material-open-in-new:](https://wiki.iac.ethz.ch/Collaboration/EulerUsers){:target="_blank"} (:material-lock: IAC login required)
+- [Accounting :material-open-in-new:](https://wiki.iac.ethz.ch/Collaboration/EulerAccounting){:target="_blank"} (:material-lock: IAC login required)
+- [Euler Climate members :material-open-in-new:](https://wiki.iac.ethz.ch/Collaboration/EulerClimateMembers){:target="_blank"} (:material-lock: IAC login required)
+
+## Software Stack
+
+Euler provides a [software stack :material-open-in-new:](https://docs.hpc.ethz.ch/software/software-stack/){:target="_blank"} via the module command:
+
+```bash
+module load stack openmpi
+module list
+# Currently Loaded Modules:
+# 1) gcc/12.2.0 2) stack/2025-06 3) openmpi/4.1.7
+```
+
+Afterwards, one can load specific software such as:
+
+```bash
+module load cdo nco ncview netcdf-c
+module list
+# Currently Loaded Modules:
+# 1) gcc/12.2.0 2) stack/2025-06 3) openmpi/4.1.7 4) cdo/2.4.4 5) nco/5.2.4 6) ncview/2.1.9 7) netcdf-c/4.9.2
+```
+
+With that, one has access to the most commonly used tools for climate applications:
+
+```bash
+which cdo ncdump ncview ncrcat
+# /cluster/software/stacks/2025-06/linux-ubuntu22.04-x86_64_v3/gcc-12.2.0/cdo-2.4.4-spns4mysmzmbj7eh37zswj64efou4xvr/bin/cdo
+# /cluster/software/stacks/2025-06/linux-ubuntu22.04-x86_64_v3/gcc-12.2.0/netcdf-c-4.9.2-sekz6xps6vd4zyacqlj6e4gesed7hi7t/bin/ncdump
+# /cluster/software/stacks/2025-06/linux-ubuntu22.04-x86_64_v3/gcc-12.2.0/ncview-2.1.9-hszhkgti42fealjswfivjjk5r3i3xcob/bin/ncview
+# /cluster/software/stacks/2025-06/linux-ubuntu22.04-x86_64_v3/gcc-12.2.0/nco-5.2.4-zhb3mn5upr3rniqoebeyfreb7uabpohi/bin/ncrcat
+
+```
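
Note on the `which` check added in `docs/hpc/euler.md`: the same verification could be scripted so that a job fails early when a module did not load. The following is only a minimal sketch, assuming a POSIX shell; the function name `check_tools` is hypothetical and is not part of the Euler documentation or of this commit.

```shell
#!/bin/sh
# Hypothetical helper (not from the Euler docs): report which of the
# given commands are available on PATH, e.g. after `module load`.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    # Non-zero exit status if at least one tool was not found.
    return "$missing"
}

# Intended use on Euler, after the module commands shown in the diff:
#   module load stack openmpi
#   module load cdo nco ncview netcdf-c
#   check_tools cdo ncdump ncview ncrcat || echo "some tools are missing"
```

The function only inspects `PATH`, so it makes no assumption about module names or versions.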
Lines changed: 6 additions & 7 deletions
@@ -1,4 +1,4 @@
-# The Alps System
+# The Alps System at CSCS

 [Alps :material-open-in-new:](https://www.cscs.ch/computers/alps){:target="_blank"} is a distributed high-performance computing (HPC) infrastructure managed by CSCS.
 Unlike traditional HPC systems, it is composed of several logical units called vClusters (versatile clusters).
@@ -20,15 +20,14 @@ The following table shows current clusters distribution on Alps at CSCS
 | Santis | Weather & Climate | ~ 430 Grace-Hopper nodes | :white_check_mark: |
 | Balfrin | MeteoSwiss | ~ 40 A100 GPU nodes | :yellow_circle: |
 | Eiger | CPU-only workloads | ~ 580 multicore nodes | :yellow_circle: |
-| Daint | User Lab | ~ 600 Grace-Hopper nodes | :x: |
-| Clariden | Machine Learning | ~ 800 Grace-Hopper nodes | :x: |

 <small>
 :white_check_mark: Full C2SM support<br />
 :yellow_circle: Partial or limited C2SM support (help available on request)<br />
-:x: No C2SM support
 </small>

+Additional vClusters without C2SM support include Daint (User Lab) and Clariden (Machine Learning).
+
 More information about clusters on Alps is available on the
 [official CSCS documentation :material-open-in-new:](https://docs.cscs.ch/clusters/){:target="_blank"}.

@@ -44,7 +43,7 @@ Host ela
 User cscsusername
 IdentityFile ~/.ssh/cscs-key

-Host santis* daint*
+Host santis* eiger*
 Hostname %h.alps.cscs.ch
 User cscsusername
 IdentityFile ~/.ssh/cscs-key
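
Note on the hunk above: replacing `daint*` with `eiger*` in the `Host` pattern changes which aliases get the `.alps.cscs.ch` suffix, because `%h` in `Hostname` expands to the host name given on the command line. The sketch below only mimics that matching in plain shell to illustrate the effect; `resolve_alps_host` is a hypothetical helper, and it assumes no other `Host` block in the user's configuration matches the alias.

```shell
#!/bin/sh
# Hypothetical helper: mimic the effect of
#   Host santis* eiger*
#       Hostname %h.alps.cscs.ch
# where %h is replaced by the host name typed on the command line.
resolve_alps_host() {
    case "$1" in
        santis*|eiger*) echo "$1.alps.cscs.ch" ;;  # pattern matches: suffix added
        *)              echo "$1" ;;               # no match: name used as given
    esac
}

resolve_alps_host santis-ln002   # prints santis-ln002.alps.cscs.ch
resolve_alps_host eiger          # prints eiger.alps.cscs.ch
resolve_alps_host daint          # prints daint (no longer matched after this commit)
```

With OpenSSH installed, the configuration actually in effect can be inspected with `ssh -G santis-ln002`, which prints the resolved options, including the `hostname` line, without connecting.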
@@ -62,9 +61,9 @@ e.g., `ssh santis-ln002`. Replace `cscsusername` with your actual username.

 ## User Environments

-Software stacks at CSCS are now accessible through the so-called User Environments (uenv).
+Software stacks at CSCS are now accessible through the so-called User Environments (uenvs).
 User environments contain the minimal software stack required for a certain activity, say, building and running ICON.
-They are generated by `spack`, packed into single `squashfs` file and then mounted by the user.
+They are generated by `spack`, packed into single `squashfs` files and then mounted by the user.
 In a way, they can be considered as poor man's containers.

 !!! success "Main Advantages of Uenvs"
