Skip to content

Commit d7d2e08

Browse files
maltesanderclaudeadwk67
authored
refactor: extract dereference/validate pipeline from reconcile (#824)
* refactor: rename druid_controller module to controller Frees up the `controller/` directory for upcoming dereference and validate submodules, matching the trino-operator layout. No behavior change. * refactor: extract dereference step from reconcile Move all pre-loop client I/O (ZK ConfigMap, OPA URL, S3 connection, deep storage bucket, AuthenticationClasses) into a new `controller::dereference` submodule that returns `DereferencedObjects`. Matches the trino-operator pattern (trino-operator commit 7004062). No behavior change. * refactor: extract validate step from reconcile Move pure validation (product image resolution, TLS security construction, authentication config conversion, product-config validation) into a new `controller::validate` submodule that returns `ValidatedInputs`. Matches the trino-operator pattern (trino-operator commit 7004062). No behavior change. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs: update stale druid_controller.rs reference Follow-up to the druid_controller → controller module rename. * fix: run fmt * refactor(tests): scope 50-assert to cluster readiness only Move the StatefulSet and PodDisruptionBudget shape assertions into the upcoming 52-assert (resource shape) so that 50-assert mirrors trino-operator's 10-assert (readiness-only). The duplicated broker PDB and missing coordinator PDB will be corrected in 52-assert. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(smoke): add 52-assert shape checks for druid resources Declarative shape assertions for every operator-managed druid resource in the smoke test (5 StatefulSets, 3 ClusterIP + 5 headless Services, 3 Listeners, 5 PDBs, the shared internal Secret, the ServiceAccount, and the RoleBinding). Mirrors trino's 13-assert and kafka's 33-assert.yaml.j2 patterns. The duplicative *-metrics Services are intentionally not asserted. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(smoke): tighten 52-assert volume + selector assertions Fill in the bare log-config/log volume entries on the historical StatefulSet so the assertion matches the live shape, and add the selector blocks on the five headless Services (mirrors the trino/kafka equivalents). Also clarify the vector_enabled comment. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(smoke): add 53-assert discovery cm + secret snapshot First slice of the ConfigMap snapshot assertion: the discovery ConfigMap and the operator-managed Secret's key presence. The five role-group ConfigMap snapshots follow in the next commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(smoke): snapshot all five role-group ConfigMaps Adds full .data snapshots for druid-{broker,coordinator,historical, middlemanager,router}-default ConfigMaps. The actual side is normalized for namespace and the random znode UUID before diffing. The current snapshot was captured with VECTOR_AGGREGATOR unset. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(smoke): drop unused vector_enabled scaffolding from 52-assert vector_enabled was declared in 52-assert.yaml.j2 as forward scaffolding but has no consumer. Drop it; it can be re-introduced in the commit that actually uses it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(smoke): cover all five roles in 51-assert containerdebug check The previous check listed druid-router-default-0 twice and never checked druid-broker-default-0. Replace the duplicate with the broker entry and sort the list by role name for readability. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(smoke): add vector_enabled branch to role-group cm snapshot When VECTOR_AGGREGATOR is set, the operator adds a vector.yaml key to each role-group ConfigMap; the other four keys are unchanged. The vector.yaml payload is the same across all five roles except for a single `.role = "<role>"` line, so the content is declared once via a jinja2 `{% set %}` block and substituted per role inside the heredoc. Verified against a live VECTOR_AGGREGATOR=vector-aggregator-discovery cluster in namespace kuttl-test-distinct-camel. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(tests): place vector_enabled conditional at column 0 The `{% if vector_enabled -%}` and `{%- endif %}` blocks had 6 leading spaces. When vector_enabled was false, jinja2's strip markers joined the leading whitespace of both blocks, putting YAMLEOF at 12 spaces in the rendered output. After the kuttl YAML block scalar strips 6 common-leading spaces, YAMLEOF ended up at 6 spaces inside the script body, but the heredoc opener is `<<'YAMLEOF'` (not `<<-`), which requires the terminator at column 0. The heredoc therefore never closed and `\$(...)` never terminated, producing `Syntax error: end of file unexpected (expecting ")")` from dash. Moving `{% if %}` and `{% endif %}` to column 0 (no strip markers) mirrors kafka-operator's 34-assert.yaml.j2 pattern and renders both branches with YAMLEOF at the correct indent. Verified: - non-vector render: 8 scripts pass against kuttl-test-fluent-mastodon - vector render heredoc parses to the 5 expected keys (jvm.config, log4j2.properties, runtime.properties, security.properties, vector.yaml) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs: adapt changelog * fix(test): make configmap diff better readable * move dereferenced fields to validated struct --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: Andrew Kenworthy <andrew.kenworthy@stackable.tech>
1 parent 68a2442 commit d7d2e08

11 files changed

Lines changed: 1995 additions & 312 deletions

File tree

CHANGELOG.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -16,6 +16,7 @@ All notable changes to this project will be documented in this file.
1616
The `.clusterConfig.metadataStorageDatabase` has subfields according to the supported db types: `postgresql`, `mysql` and `derby`.
1717
- BREAKING: The `.clusterConfig.metadataStorageDatabase` field has been renamed to `.clusterConfig.metadataDatabase` for consistency ([#814]).
1818
- Document Helm deployed RBAC permissions and remove unnecessary permissions ([#810]).
19+
- Internal operator refactoring: introduce dereference() and validate() steps in the reconciler ([#824]).
1920
- test: Bump vector-aggregator to 0.55.0, replace /graphql call with gRPC calls ([#826]).
2021

2122
### Deleted
@@ -26,6 +27,7 @@ All notable changes to this project will be documented in this file.
2627
[#813]: https://github.com/stackabletech/druid-operator/pull/813
2728
[#814]: https://github.com/stackabletech/druid-operator/pull/814
2829
[#818]: https://github.com/stackabletech/druid-operator/pull/818
30+
[#824]: https://github.com/stackabletech/druid-operator/pull/824
2931
[#826]: https://github.com/stackabletech/druid-operator/pull/826
3032

3133
## [26.3.0] - 2026-03-16

rust/operator-binary/src/druid_controller.rs renamed to rust/operator-binary/src/controller.rs

Lines changed: 39 additions & 164 deletions
Original file line numberDiff line numberDiff line change
@@ -13,7 +13,7 @@ use product_config::{
1313
types::PropertyNameKind,
1414
writer::{PropertiesWriterError, to_java_properties_string},
1515
};
16-
use snafu::{OptionExt, ResultExt, Snafu};
16+
use snafu::{ResultExt, Snafu};
1717
use stackable_operator::{
1818
builder::{
1919
self,
@@ -26,11 +26,7 @@ use stackable_operator::{
2626
},
2727
cli::OperatorEnvironmentOptions,
2828
cluster_resources::{ClusterResourceApplyStrategy, ClusterResources},
29-
commons::{
30-
opa::OpaApiVersion,
31-
product_image_selection::{self, ResolvedProductImage},
32-
rbac::build_rbac_resources,
33-
},
29+
commons::{product_image_selection::ResolvedProductImage, rbac::build_rbac_resources},
3430
constants::RESTART_CONTROLLER_ENABLED_LABEL,
3531
crd::s3,
3632
database_connections::drivers::jdbc::JdbcDatabaseConnection as _,
@@ -49,7 +45,6 @@ use stackable_operator::{
4945
},
5046
kvp::{KeyValuePairError, LabelError, LabelValueError, Labels},
5147
logging::controller::ReconcilerError,
52-
product_config_utils::{transform_all_roles_to_config, validate_all_roles_and_groups_config},
5348
product_logging::{
5449
self,
5550
framework::LoggingError,
@@ -77,7 +72,6 @@ use crate::{
7772
LOG_CONFIG_DIRECTORY, MAX_DRUID_LOG_FILES_SIZE, METRICS_PORT, METRICS_PORT_NAME,
7873
OPERATOR_NAME, RUNTIME_PROPS, RW_CONFIG_DIRECTORY, S3_ACCESS_KEY, S3_ENDPOINT_URL,
7974
S3_PATH_STYLE_ACCESS, S3_SECRET_KEY, STACKABLE_LOG_DIR, ZOOKEEPER_CONNECTION_STRING,
80-
authentication::AuthenticationClassesResolved, authorization::DruidAuthorization,
8175
build_recommended_labels, build_string_list, security::DruidTlsSecurity, v1alpha1,
8276
},
8377
discovery::{self, build_discovery_configmaps},
@@ -92,10 +86,13 @@ use crate::{
9286
service::{build_rolegroup_headless_service, build_rolegroup_metrics_service},
9387
};
9488

89+
mod dereference;
90+
mod validate;
91+
9592
pub const DRUID_CONTROLLER_NAME: &str = "druidcluster";
9693
pub const FULL_CONTROLLER_NAME: &str = concatcp!(DRUID_CONTROLLER_NAME, '.', OPERATOR_NAME);
9794

98-
const CONTAINER_IMAGE_BASE_NAME: &str = "druid";
95+
pub(super) const CONTAINER_IMAGE_BASE_NAME: &str = "druid";
9996

10097
// volume names
10198
const DRUID_CONFIG_VOLUME_NAME: &str = "config";
@@ -139,63 +136,19 @@ pub enum Error {
139136
rolegroup: RoleGroupRef<v1alpha1::DruidCluster>,
140137
},
141138

142-
#[snafu(display("invalid product configuration"))]
143-
InvalidProductConfig {
144-
source: stackable_operator::product_config_utils::Error,
145-
},
146-
147-
#[snafu(display("invalid authentication configuration"))]
148-
InvalidDruidAuthenticationConfig {
149-
source: crate::authentication::Error,
150-
},
151-
152139
#[snafu(display("object is missing metadata to build owner reference"))]
153140
ObjectMissingMetadataForOwnerRef {
154141
source: stackable_operator::builder::meta::Error,
155142
},
156143

157-
#[snafu(display(
158-
"failed to get ZooKeeper discovery config map for cluster: {}",
159-
cm_name
160-
))]
161-
GetZookeeperConnStringConfigMap {
162-
source: stackable_operator::client::Error,
163-
cm_name: String,
164-
},
165-
166-
#[snafu(display(
167-
"failed to get OPA discovery config map and/or connection string for cluster: {}",
168-
cm_name
169-
))]
170-
GetOpaConnString {
171-
source: stackable_operator::commons::opa::Error,
172-
cm_name: String,
173-
},
174-
175-
#[snafu(display("failed to get valid S3 connection"))]
176-
GetS3Connection { source: crate::crd::Error },
144+
#[snafu(display("failed to dereference cluster objects"))]
145+
Dereference { source: dereference::Error },
177146

178147
#[snafu(display("failed to configure S3 connection"))]
179148
ConfigureS3 {
180149
source: stackable_operator::crd::s3::v1alpha1::ConnectionError,
181150
},
182151

183-
#[snafu(display("failed to get deep storage bucket"))]
184-
GetDeepStorageBucket {
185-
source: stackable_operator::crd::s3::v1alpha1::BucketError,
186-
},
187-
188-
#[snafu(display(
189-
"failed to get ZooKeeper connection string from config map {}",
190-
cm_name
191-
))]
192-
MissingZookeeperConnString { cm_name: String },
193-
194-
#[snafu(display("failed to transform configs"))]
195-
ProductConfigTransform {
196-
source: stackable_operator::product_config_utils::Error,
197-
},
198-
199152
#[snafu(display("failed to format runtime properties"))]
200153
PropertiesWriteError { source: PropertiesWriterError },
201154

@@ -245,17 +198,9 @@ pub enum Error {
245198
name: String,
246199
},
247200

248-
#[snafu(display("object defines no namespace"))]
249-
ObjectHasNoNamespace,
250-
251201
#[snafu(display("failed to initialize security context"))]
252202
FailedToInitializeSecurityContext { source: crate::crd::security::Error },
253203

254-
#[snafu(display("failed to retrieve AuthenticationClass"))]
255-
AuthenticationClassRetrieval {
256-
source: crate::crd::authentication::Error,
257-
},
258-
259204
#[snafu(display("failed to get JVM config"))]
260205
GetJvmConfig { source: crate::config::jvm::Error },
261206

@@ -363,10 +308,8 @@ pub enum Error {
363308
#[snafu(display("failed to configure service"))]
364309
ServiceConfiguration { source: crate::service::Error },
365310

366-
#[snafu(display("failed to resolve product image"))]
367-
ResolveProductImage {
368-
source: product_image_selection::Error,
369-
},
311+
#[snafu(display("failed to validate cluster"))]
312+
ValidateCluster { source: validate::Error },
370313

371314
#[snafu(display("invalid metadata database connection"))]
372315
InvalidMetadataDatabaseConnection {
@@ -394,89 +337,18 @@ pub async fn reconcile_druid(
394337
.context(InvalidDruidClusterSnafu)?;
395338

396339
let client = &ctx.client;
397-
let namespace = &druid
398-
.metadata
399-
.namespace
400-
.clone()
401-
.with_context(|| ObjectHasNoNamespaceSnafu {})?;
402-
let resolved_product_image = druid
403-
.spec
404-
.image
405-
.resolve(
406-
CONTAINER_IMAGE_BASE_NAME,
407-
&ctx.operator_environment.image_repository,
408-
crate::built_info::PKG_VERSION,
409-
)
410-
.context(ResolveProductImageSnafu)?;
411340

412-
let zk_confmap = druid.spec.cluster_config.zookeeper_config_map_name.clone();
413-
let zk_connstr = client
414-
.get::<ConfigMap>(&zk_confmap, namespace)
341+
let dereferenced_objects = dereference::dereference(client, druid)
415342
.await
416-
.context(GetZookeeperConnStringConfigMapSnafu {
417-
cm_name: zk_confmap.clone(),
418-
})?
419-
.data
420-
.and_then(|mut data| data.remove("ZOOKEEPER"))
421-
.context(MissingZookeeperConnStringSnafu {
422-
cm_name: zk_confmap.clone(),
423-
})?;
343+
.context(DereferenceSnafu)?;
424344

425-
// Assemble the OPA connection string from the discovery and the given path, if a spec is given.
426-
let opa_connstr = if let Some(DruidAuthorization { opa: opa_config }) =
427-
&druid.spec.cluster_config.authorization
428-
{
429-
Some(
430-
opa_config
431-
.full_document_url_from_config_map(client, druid, Some("allow"), &OpaApiVersion::V1)
432-
.await
433-
.context(GetOpaConnStringSnafu {
434-
cm_name: opa_config.config_map_name.clone(),
435-
})?,
436-
)
437-
} else {
438-
None
439-
};
440-
441-
// Get the s3 connection if one is defined
442-
let s3_conn = druid
443-
.get_s3_connection(client)
444-
.await
445-
.context(GetS3ConnectionSnafu)?;
446-
447-
let deep_storage_bucket_name = match &druid.spec.cluster_config.deep_storage {
448-
DeepStorageSpec::S3(s3_spec) => Some(
449-
s3_spec
450-
.bucket
451-
.clone()
452-
.resolve(client, namespace)
453-
.await
454-
.context(GetDeepStorageBucketSnafu)?
455-
.bucket_name,
456-
),
457-
_ => None,
458-
};
459-
460-
let resolved_auth_classes =
461-
AuthenticationClassesResolved::from(&druid.spec.cluster_config, client)
462-
.await
463-
.context(AuthenticationClassRetrievalSnafu)?;
464-
465-
let druid_tls_security =
466-
DruidTlsSecurity::new_from_druid_cluster(druid, &resolved_auth_classes);
467-
468-
let druid_auth_config = DruidAuthenticationConfig::try_from(resolved_auth_classes)
469-
.context(InvalidDruidAuthenticationConfigSnafu)?;
470-
471-
let role_config = transform_all_roles_to_config(druid, &druid.build_role_properties());
472-
let validated_role_config = validate_all_roles_and_groups_config(
473-
&resolved_product_image.product_version,
474-
&role_config.context(ProductConfigTransformSnafu)?,
345+
let validated = validate::validate(
346+
druid,
347+
&dereferenced_objects,
348+
&ctx.operator_environment,
475349
&ctx.product_config,
476-
false,
477-
false,
478350
)
479-
.context(InvalidProductConfigSnafu)?;
351+
.context(ValidateClusterSnafu)?;
480352

481353
let mut cluster_resources = ClusterResources::new(
482354
APP_NAME,
@@ -510,7 +382,7 @@ pub async fn reconcile_druid(
510382

511383
let mut ss_cond_builder = StatefulSetConditionBuilder::default();
512384

513-
for (role_name, role_config) in validated_role_config.iter() {
385+
for (role_name, role_config) in validated.validated_role_config.iter() {
514386
let druid_role = DruidRole::from_str(role_name).context(UnidentifiedDruidRoleSnafu {
515387
role: role_name.to_string(),
516388
})?;
@@ -533,7 +405,7 @@ pub async fn reconcile_druid(
533405
let role_group_service_recommended_labels = build_recommended_labels(
534406
druid,
535407
DRUID_CONTROLLER_NAME,
536-
&resolved_product_image.app_version_label_value,
408+
&validated.resolved_product_image.app_version_label_value,
537409
&rolegroup.role,
538410
&rolegroup.role_group,
539411
);
@@ -548,7 +420,7 @@ pub async fn reconcile_druid(
548420

549421
let rg_headless_service = build_rolegroup_headless_service(
550422
druid,
551-
&druid_tls_security,
423+
&validated.druid_tls_security,
552424
&druid_role,
553425
&rolegroup,
554426
role_group_service_recommended_labels.clone(),
@@ -565,27 +437,27 @@ pub async fn reconcile_druid(
565437

566438
let rg_configmap = build_rolegroup_config_map(
567439
druid,
568-
&resolved_product_image,
440+
&validated.resolved_product_image,
569441
&rolegroup,
570442
rolegroup_config,
571443
&merged_rolegroup_config,
572-
&zk_connstr,
573-
opa_connstr.as_deref(),
574-
s3_conn.as_ref(),
575-
deep_storage_bucket_name.as_deref(),
576-
&druid_tls_security,
577-
&druid_auth_config,
444+
&validated.zookeeper_connection_string,
445+
validated.opa_connection_string.as_deref(),
446+
validated.s3_connection.as_ref(),
447+
validated.deep_storage_bucket_name.as_deref(),
448+
&validated.druid_tls_security,
449+
&validated.druid_auth_config,
578450
)?;
579451
let rg_statefulset = build_rolegroup_statefulset(
580452
druid,
581-
&resolved_product_image,
453+
&validated.resolved_product_image,
582454
&druid_role,
583455
&rolegroup,
584456
rolegroup_config,
585457
&merged_rolegroup_config,
586-
s3_conn.as_ref(),
587-
&druid_tls_security,
588-
&druid_auth_config,
458+
validated.s3_connection.as_ref(),
459+
&validated.druid_tls_security,
460+
&validated.druid_auth_config,
589461
&rbac_sa,
590462
)?;
591463

@@ -628,14 +500,14 @@ pub async fn reconcile_druid(
628500
build_recommended_labels(
629501
druid,
630502
DRUID_CONTROLLER_NAME,
631-
&resolved_product_image.app_version_label_value,
503+
&validated.resolved_product_image.app_version_label_value,
632504
role_name,
633505
"none",
634506
),
635507
listener_class.to_string(),
636508
listener_group_name,
637509
&druid_role,
638-
&druid_tls_security,
510+
&validated.druid_tls_security,
639511
)
640512
.context(ListenerConfigurationSnafu)?;
641513

@@ -649,8 +521,8 @@ pub async fn reconcile_druid(
649521
for discovery_cm in build_discovery_configmaps(
650522
druid,
651523
druid,
652-
&resolved_product_image,
653-
&druid_tls_security,
524+
&validated.resolved_product_image,
525+
&validated.druid_tls_security,
654526
listener,
655527
)
656528
.await
@@ -1387,9 +1259,12 @@ pub fn error_policy(
13871259
mod test {
13881260
use product_config::{ProductConfigManager, writer};
13891261
use rstest::*;
1262+
use stackable_operator::product_config_utils::{
1263+
transform_all_roles_to_config, validate_all_roles_and_groups_config,
1264+
};
13901265

13911266
use super::*;
1392-
use crate::crd::PROP_SEGMENT_CACHE_LOCATIONS;
1267+
use crate::crd::{PROP_SEGMENT_CACHE_LOCATIONS, authentication::AuthenticationClassesResolved};
13931268

13941269
#[derive(Snafu, Debug, EnumDiscriminants)]
13951270
#[strum_discriminants(derive(IntoStaticStr))]

0 commit comments

Comments
 (0)