create a proposal for addon dependency management
Signed-off-by: Yang Le <[email protected]>
1 parent c16b1da commit 8954e74

# Addon Dependency Management

## Release Signoff Checklist

- [ ] Enhancement is `implementable`
- [ ] Design details are appropriately documented from clear requirements
- [ ] Test plan is defined
- [ ] Graduation criteria for dev preview, tech preview, GA
- [ ] User-facing documentation is created in [website](https://github.com/open-cluster-management/website/)

## Summary

This proposal introduces a dependency management mechanism for managed cluster addons, allowing addon authors to declare dependencies between addons. The system validates that the addons an addon depends on are installed and available on the managed cluster before the addon is considered healthy.

## Motivation

Currently, there is no way to declare or enforce dependencies between addons in Open Cluster Management. This creates several challenges:

1. **Manual coordination required**: Administrators must manually ensure that prerequisite addons are installed before installing the addons that depend on them.

2. **Silent failures**: When a prerequisite addon is missing, the addons that depend on it may fail at runtime with unclear error messages, which makes troubleshooting difficult.

3. **Configuration complexity**: Some addons rely on Custom Resource Definitions (CRDs) or resources provided by other addons (e.g., the Managed Service Account addon provides the `ManagedServiceAccount` API that other addons use). Without dependency tracking, these relationships are implicit and undocumented.

4. **Installation ordering**: There is no automated way to ensure correct installation ordering when multiple interdependent addons are deployed simultaneously.

### Goals

- Provide a declarative way for addon authors to specify dependencies on other addons
- Automatically validate that all dependencies are satisfied before considering an addon healthy
- Provide clear status feedback when dependencies are not met
- Maintain backward compatibility with existing addons that have no dependencies

### Non-Goals

- Automatic installation of dependencies (users still need to explicitly install required addons)
- Dependency resolution and ordering during installation
- Support for circular dependencies
- Dependency management across different managed clusters (dependencies are validated per cluster)
- Version constraints for dependencies (addon API does not currently track version information)

## Proposal

We propose to extend the `ClusterManagementAddOn` API to include a `dependencies` field that allows addon authors to declare dependencies on other addons. The addon-manager component will validate these dependencies on each managed cluster and set appropriate status conditions on the `ManagedClusterAddOn` resource.

### User Stories

#### Story 1: Addon depending on Managed Service Account

As an addon author, I want to declare a dependency on the Managed Service Account addon because my addon uses the `ManagedServiceAccount` API to provision service accounts on managed clusters. When the Managed Service Account addon is not installed, I want users to see a clear error message in the addon status.

## Design Details

### API Changes

#### ClusterManagementAddOn

Add a new `dependencies` field to the `ClusterManagementAddOnSpec`:

```go
// ClusterManagementAddOnSpec provides information for the add-on.
type ClusterManagementAddOnSpec struct {
	// ... existing fields ...

	// dependencies is a list of add-ons that this add-on depends on.
	// The addon-manager validates that each dependency is installed and available
	// on the managed cluster and reports unsatisfied dependencies through the
	// Degraded condition of the ManagedClusterAddOn.
	// An empty list means the add-on has no dependencies.
	// The default is an empty list.
	// +optional
	Dependencies []AddonDependency `json:"dependencies,omitempty"`
}

// AddonDependency represents a dependency on another add-on.
type AddonDependency struct {
	// name is the name of the add-on that this add-on depends on.
	// This should match the name of a ClusterManagementAddOn resource.
	// +required
	// +kubebuilder:validation:Required
	// +kubebuilder:validation:MinLength:=1
	Name string `json:"name"`

	// type specifies the type of the dependency.
	// Valid values are:
	// - "Optional" (default): The addon can work with reduced functionality without this dependency.
	//   The Degraded condition will be set with reason DependencyNotSatisfied when the dependency is not satisfied,
	//   but the Available condition may remain True if the addon is otherwise functional.
	// - "Required": The addon cannot function without this dependency.
	//   The Degraded condition will be set with reason RequiredDependencyNotSatisfied, and the Available
	//   condition should be set to False when the dependency is not satisfied.
	// +optional
	// +kubebuilder:validation:Enum=Optional;Required
	// +kubebuilder:default=Optional
	Type DependencyType `json:"type,omitempty"`
}

// DependencyType describes the type of dependency
// +kubebuilder:validation:Enum=Optional;Required
type DependencyType string

const (
	// DependencyTypeOptional indicates the addon can work with reduced functionality without this dependency
	DependencyTypeOptional DependencyType = "Optional"
	// DependencyTypeRequired indicates the addon cannot function without this dependency
	DependencyTypeRequired DependencyType = "Required"
)
```

#### ManagedClusterAddOn Status

Add two new reasons for the existing `Degraded` condition:

```go
// the reasons of condition ManagedClusterAddOnConditionDegraded
const (
	// AddonDegradedReasonDependencyNotSatisfied is the reason of condition Degraded indicating that one or more
	// soft (optional) dependencies of the addon are not satisfied (not installed or not available).
	// The addon may still be Available with reduced functionality.
	AddonDegradedReasonDependencyNotSatisfied = "DependencyNotSatisfied"

	// AddonDegradedReasonRequiredDependencyNotSatisfied is the reason of condition Degraded indicating that one or more
	// hard (required) dependencies of the addon are not satisfied (not installed or not available).
	// The Available condition should also be set to False as the addon cannot function.
	AddonDegradedReasonRequiredDependencyNotSatisfied = "RequiredDependencyNotSatisfied"
)
```

#### Dependency Types

There are two types of dependencies:

- **Optional dependencies (default, type=Optional)**: The addon can still function with reduced functionality when the dependency is missing. When an optional dependency is not satisfied, the addon-manager will set `Degraded=True` with reason `DependencyNotSatisfied`, but the klusterlet-agent will not modify the `Available` condition, allowing the addon to remain available if its health checks pass.

- **Required dependencies (type=Required)**: The addon cannot function at all without the dependency. When a required dependency is not satisfied, the addon-manager will set `Degraded=True` with reason `RequiredDependencyNotSatisfied`, and the klusterlet-agent will detect this specific reason and set `Available=False`.

### Implementation Details

#### Dependency Validation

**Addon-Manager (on Hub):**

The addon-manager component will implement dependency validation with the following logic:

1. **Read dependencies**: When reconciling a `ManagedClusterAddOn`, read the corresponding `ClusterManagementAddOn` to get the list of dependencies.

2. **Check each dependency**: For each dependency in the list:
   - Check if a `ManagedClusterAddOn` with the dependency's name exists in the same namespace (managed cluster namespace)
   - Check if that addon's `Available` condition is `True`

3. **Set Degraded condition based on dependency type**:
   - **If all dependencies are satisfied**:
     - Ensure the `Degraded` condition is not set with reason `DependencyNotSatisfied` or `RequiredDependencyNotSatisfied`

   - **If any optional dependency is not satisfied** (type=Optional):
     - Set the `Degraded` condition to `True` with:
       - Reason: `DependencyNotSatisfied`
       - Message: Clear description of which dependencies are missing (e.g., "Optional addon 'managed-serviceaccount' is not installed or not available")
     - Do NOT modify the `Available` condition

   - **If any required dependency is not satisfied** (type=Required):
     - Set the `Degraded` condition to `True` with:
       - Reason: `RequiredDependencyNotSatisfied` (distinct from the optional case)
       - Message: Clear description of which dependencies are missing (e.g., "Required addon 'managed-serviceaccount' is not installed or not available")
     - Do NOT modify the `Available` condition (klusterlet-agent will do this)

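The addon-manager steps above can be sketched in Go roughly as follows. This is a minimal illustration under stated assumptions, not the actual addon-manager code: `evaluateDependencies`, `degradedResult`, `describe`, and the `available` map (addon name to whether a `ManagedClusterAddOn` with `Available=True` exists in the cluster namespace) are hypothetical, and the proposed API types are re-declared locally so the snippet compiles on its own. The proposal does not state which reason wins when both a required and an optional dependency are unsatisfied; the sketch assumes the required reason takes precedence. The message format for several missing addons is also an assumption.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Local copies of the types proposed in this document.
type DependencyType string

const (
	DependencyTypeOptional DependencyType = "Optional"
	DependencyTypeRequired DependencyType = "Required"
)

type AddonDependency struct {
	Name string
	Type DependencyType
}

// Proposed Degraded reasons.
const (
	ReasonDependencyNotSatisfied         = "DependencyNotSatisfied"
	ReasonRequiredDependencyNotSatisfied = "RequiredDependencyNotSatisfied"
)

// degradedResult is a hypothetical summary of what would be written
// to the Degraded condition.
type degradedResult struct {
	Degraded bool
	Reason   string
	Message  string
}

// describe builds a message in the style of the examples in this proposal.
func describe(kind string, names []string) string {
	sort.Strings(names)
	quoted := make([]string, len(names))
	for i, n := range names {
		quoted[i] = "'" + n + "'"
	}
	noun, verb := "addon", "is"
	if len(names) > 1 {
		noun, verb = "addons", "are"
	}
	return fmt.Sprintf("%s %s %s %s not installed or not available",
		kind, noun, strings.Join(quoted, ", "), verb)
}

// evaluateDependencies checks every dependency against the availability map.
// A name that is absent from the map counts as "not installed".
func evaluateDependencies(deps []AddonDependency, available map[string]bool) degradedResult {
	var required, optional []string
	for _, d := range deps {
		if available[d.Name] {
			continue // dependency satisfied
		}
		if d.Type == DependencyTypeRequired {
			required = append(required, d.Name)
		} else {
			// An empty Type is treated as the default, Optional.
			optional = append(optional, d.Name)
		}
	}
	switch {
	case len(required) > 0: // assumption: required takes precedence
		return degradedResult{Degraded: true, Reason: ReasonRequiredDependencyNotSatisfied, Message: describe("Required", required)}
	case len(optional) > 0:
		return degradedResult{Degraded: true, Reason: ReasonDependencyNotSatisfied, Message: describe("Optional", optional)}
	default:
		return degradedResult{}
	}
}

func main() {
	deps := []AddonDependency{{Name: "managed-serviceaccount", Type: DependencyTypeRequired}}
	res := evaluateDependencies(deps, map[string]bool{})
	fmt.Println(res.Reason)
	fmt.Println(res.Message)
}
```

Note that the function only reports on the `Degraded` condition and never touches `Available`, which matches the ownership split described in this section.
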
**Klusterlet-Agent (on Managed Cluster):**

The klusterlet-agent component will check the `Degraded` condition when determining addon availability:

1. **When reconciling addon health**: Evaluate the `Degraded` condition after checking the lease/probe health status
2. **Check for required dependency failures**:
   - If `Degraded=True` with reason `RequiredDependencyNotSatisfied`, set `Available=False`
   - If `Degraded=True` with reason `DependencyNotSatisfied` (optional dependency), do not modify `Available` - let the addon's own health checks determine availability

This design ensures clear component ownership and avoids conflicts:
- **addon-manager** owns dependency validation and sets `Degraded` with appropriate reason
- **klusterlet-agent** owns `Available` and considers dependency information when making availability decisions

**Key Design Benefits:**
1. The klusterlet-agent does not need access to the `ClusterManagementAddOn` API - all dependency type information is encoded in the `Degraded` condition reason
2. Component ownership is clear and prevents components from fighting over the same condition
3. Different dependency types (Optional vs Required) result in different status behaviors automatically

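The klusterlet-agent rule described above can be illustrated with a short Go sketch. It is only an approximation under stated assumptions: `condition` is a simplified local stand-in for a Kubernetes status condition, `decideAvailable` and `findCondition` are hypothetical helpers, and the `Unhealthy` reason used for a failed health check is a placeholder that does not come from this proposal.

```go
package main

import "fmt"

// condition is a simplified, local stand-in for a status condition.
type condition struct {
	Type    string
	Status  string
	Reason  string
	Message string
}

const reasonRequiredDependencyNotSatisfied = "RequiredDependencyNotSatisfied"

// findCondition returns the first condition of the given type, or nil.
func findCondition(conds []condition, condType string) *condition {
	for i := range conds {
		if conds[i].Type == condType {
			return &conds[i]
		}
	}
	return nil
}

// decideAvailable derives the Available condition from the lease/probe result
// (healthy) and the Degraded condition written by the addon-manager.
func decideAvailable(healthy bool, conds []condition) condition {
	d := findCondition(conds, "Degraded")
	if d != nil && d.Status == "True" && d.Reason == reasonRequiredDependencyNotSatisfied {
		// A required dependency is missing: the addon cannot be Available.
		return condition{Type: "Available", Status: "False", Reason: d.Reason, Message: d.Message}
	}
	if !healthy {
		// "Unhealthy" is a placeholder reason for this sketch only.
		return condition{Type: "Available", Status: "False", Reason: "Unhealthy", Message: "Addon health check failed"}
	}
	// Optional dependency failures do not affect availability.
	return condition{Type: "Available", Status: "True", Reason: "AddonAvailable", Message: "Addon is available"}
}

func main() {
	conds := []condition{{Type: "Degraded", Status: "True", Reason: "DependencyNotSatisfied"}}
	fmt.Println(decideAvailable(true, conds).Status) // prints True

	conds[0].Reason = reasonRequiredDependencyNotSatisfied
	fmt.Println(decideAvailable(true, conds).Status) // prints False
}
```

The agent only inspects the condition reason, so it needs no knowledge of the `ClusterManagementAddOn` spec, which is the first design benefit listed above.
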
### Example Usage

#### Example 1: Optional Dependency (Default)

An addon that can work with reduced functionality without the dependency:

```yaml
apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ClusterManagementAddOn
metadata:
  name: my-addon
spec:
  addOnMeta:
    displayName: "My Addon"
    description: "An addon that optionally uses ManagedServiceAccount API"
  dependencies:
    - name: managed-serviceaccount
      # type: Optional is the default, can be omitted
  installStrategy:
    type: Manual
```

When the Managed Service Account addon is not installed, the `ManagedClusterAddOn` status would show:

```yaml
apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ManagedClusterAddOn
metadata:
  name: my-addon
  namespace: cluster1
status:
  conditions:
    - type: Available
      status: "True"  # Addon is still available with reduced functionality
      reason: AddonAvailable
      message: "Addon is available"
      lastTransitionTime: "2025-10-22T10:00:00Z"
    - type: Degraded
      status: "True"
      reason: DependencyNotSatisfied
      message: "Optional addon 'managed-serviceaccount' is not installed or not available"
      lastTransitionTime: "2025-10-22T10:00:00Z"
```

#### Example 2: Required Dependency

An addon that cannot function without the dependency:

```yaml
apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ClusterManagementAddOn
metadata:
  name: my-critical-addon
spec:
  addOnMeta:
    displayName: "My Critical Addon"
    description: "An addon that requires ManagedServiceAccount API"
  dependencies:
    - name: managed-serviceaccount
      type: Required  # Required dependency
  installStrategy:
    type: Manual
```

When the Managed Service Account addon is not installed, the `ManagedClusterAddOn` status would show:

```yaml
apiVersion: addon.open-cluster-management.io/v1alpha1
kind: ManagedClusterAddOn
metadata:
  name: my-critical-addon
  namespace: cluster1
status:
  conditions:
    - type: Available
      status: "False"  # Set by klusterlet-agent
      reason: RequiredDependencyNotSatisfied
      message: "Required addon 'managed-serviceaccount' is not installed or not available"
      lastTransitionTime: "2025-10-22T10:00:00Z"
    - type: Degraded
      status: "True"  # Set by addon-manager
      reason: RequiredDependencyNotSatisfied
      message: "Required addon 'managed-serviceaccount' is not installed or not available"
      lastTransitionTime: "2025-10-22T10:00:00Z"
```

### Risks and Mitigations

#### Risk: Circular Dependencies

**Risk**: Users might accidentally create circular dependencies (A depends on B, B depends on A).

**Mitigation**:
- Document that circular dependencies are not supported and will result in both addons being marked as degraded
- Consider adding validation webhook to detect and reject circular dependencies
- Future enhancement: Add a validation controller that detects circular dependencies

#### Risk: Dependency Chain Complexity

**Risk**: Long dependency chains might make troubleshooting difficult.

**Mitigation**:
- Provide clear error messages that list all unsatisfied dependencies
- Document best practices for keeping dependency chains shallow
- Consider adding status field to show the full dependency tree

### Test Plan

#### Unit Tests

- Test dependency validation logic with various scenarios:
  - No dependencies
  - Single dependency (satisfied/unsatisfied)
  - Multiple dependencies (all satisfied, some unsatisfied, none satisfied)

#### Integration Tests

- Test complete workflow:
  1. Install addon A with dependency on addon B (B not installed) - verify the Degraded condition is set on addon A
  2. Install addon B - verify addon A becomes Available and its dependency-related Degraded condition is cleared
  3. Delete addon B - verify addon A becomes Degraded again

#### E2E Tests

- Deploy real addons with dependencies on a test cluster
- Verify status conditions are correctly set
- Verify addon behavior when dependencies are not met

### Graduation Criteria

#### Beta (v1beta1)

- API changes implemented in `open-cluster-management.io/api` addon v1beta1
- Dependency validation fully supported by addon-manager component
- Klusterlet-agent updated to handle `RequiredDependencyNotSatisfied` reason
- Unit and integration tests passing
- E2E tests with real addons
- At least 2 real addons using dependency declarations
- Metrics for dependency validation failures
- Documentation of the feature and API

#### GA (v1)

- Proven stability over at least 2 releases in v1beta1
- Comprehensive documentation including troubleshooting guides
- No critical bugs reported
- Widely adopted by addon authors (at least 5 addons using dependencies in production)
- Performance validated at scale (tested with 1000+ clusters)

334+
### Upgrade / Downgrade Strategy
335+
336+
#### Upgrade
337+
338+
- **From version without dependency support to version with dependency support**:
339+
- Existing addons without dependencies continue to work unchanged
340+
- New or updated addons can add dependencies field
341+
- No migration required
342+
343+
#### Downgrade
344+
345+
- **From version with dependency support to version without dependency support**:
346+
- Dependencies field will be ignored by older controllers
347+
- Addons will function as if they have no dependencies
348+
- Status conditions related to dependencies will not be updated
349+
350+
### Version Skew Strategy
351+
352+
- Hub and managed cluster components need to be aware of version compatibility
353+
- Dependency validation happens on the hub side in the addon-manager component
354+
- No version skew issues expected as long as the API version is compatible
355+
## Implementation History

- 2025-10-22: Initial KEP draft
- TBD: Beta implementation
- TBD: GA promotion

## Drawbacks

- Adds complexity to the addon API
- Requires addon authors to maintain accurate dependency information
- Does not automatically install dependencies (users must still manually install them)

## Alternatives

### Alternative 1: Only Set Degraded Condition

Instead of adding a `type` field to each dependency, the addon-manager could only set the `Degraded` condition when dependencies are not satisfied, and no component would modify the `Available` condition based on dependencies.

**Pros**:
- Simpler implementation - no need for the `type` field
- Addon controllers have full control over their own availability status
- Cleaner separation of concerns

**Cons**:
- Less clear semantics - users cannot declare hard dependencies in the API
- Addon controllers must implement their own logic to set `Available=False` when dependencies are missing
- More work for addon authors who want hard dependency behavior

**Decision**: Not chosen as the primary approach because the `type` field provides clearer semantics and reduces boilerplate for addon authors. However, this remains a valid alternative if the `type` field proves too complex in practice.

## Infrastructure Needed

- No new infrastructure required
- Existing CI/CD pipelines can be used for testing
- Documentation updates needed in open-cluster-management website
