Claude/explain codebase mmljhjn4c3wypjfr kt7 lf #251

Open
chengaoying wants to merge 3 commits into cubefs:main from chengaoying:claude/explain-codebase-mmljhjn4c3wypjfr-Kt7Lf

@chengaoying

No description provided.

claude added 3 commits March 11, 2026 05:57
Infrastructure:
- docker/compose.yml: remove ZooKeeper (Kafka now uses KRaft mode),
  replace 3-node Redis cluster with single-node Redis,
  upgrade Kafka to bitnami 3.6.1 (KRaft), OpenSearch to 2.13.0

Build quality:
- pom.xml: remove testFailureIgnore from parent and all 8 module POMs
  (test failures now block the build — quality gate restored)
- Add MyBatis-Plus and Micrometer/Prometheus to dependencyManagement
- Replace FastJSON2 with Jackson in dependencyManagement
- Upgrade Spring Boot 2.7.8 → 3.2.4, Java 8 → 17
- Upgrade OpenSearch client 1.3.12 → 2.13.0

Service merges (8 services → 5):
- task-analyzer (NEW): merges task-detect + task-parser
  - Eliminates Redis queue intermediary via AnalyzerBridge
  - Direct in-process call to JobManager.run() after detection
  - Saves: Redis Cluster 3-node, Lua scripts, 2-5s latency per task
  - Kafka max-poll-records: 1 → 50 for throughput
- task-collector (NEW): merges task-application + task-metadata
  - Unified "discover YARN/Spark applicationIds and log paths" service
  - Kafka event-driven (from task-application) + scheduled polling
    (from task-metadata) run in same JVM
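
The task-analyzer merge above replaces a Redis queue hop with a direct in-process call. A minimal sketch of that idea follows. Only the names `AnalyzerBridge`, `analyze`, and `JobManager.run()` come from the commit message; the `DetectedTask` fields and every signature here are assumptions made for illustration, not the PR's actual code.

```java
import java.util.Objects;

final class AnalyzerBridgeSketch {

    /** Hypothetical minimal shape of a detected task; the real class has more fields. */
    static final class DetectedTask {
        final String applicationId;

        DetectedTask(String applicationId) {
            this.applicationId = Objects.requireNonNull(applicationId);
        }
    }

    /** Stand-in for the parser entry point; the real run() signature is not shown in the PR. */
    interface JobManager {
        void run(DetectedTask task);
    }

    /** Hands a detected task straight to the parser in the same JVM, with no queue in between. */
    static final class AnalyzerBridge {
        private final JobManager jobManager;

        AnalyzerBridge(JobManager jobManager) {
            this.jobManager = Objects.requireNonNull(jobManager);
        }

        void analyze(DetectedTask task) {
            // Synchronous call: when analyze() returns, parsing has already run.
            jobManager.run(task);
        }
    }
}
```

The trade-off of a synchronous hand-off is that detection and parsing now share one failure domain and one thread of execution, so a slow parse delays the detecting consumer unless the call is moved onto an executor.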

Security fixes (P0):
- task-portal: JWT secret moved to ${JWT_SECRET} env variable
  (was hardcoded: '8ff3bf2c8344', 12 hex characters, i.e. only 6 bytes)
- task-application/StringUtil: new toParameterizedQuery() method
  converts ${key} template SQL to PreparedStatement with ? placeholders,
  eliminating SQL injection risk in LogParserServiceImpl
- task-syncer: credentials moved to env vars (DB_URL, SCHEDULER_DB_URL, etc.)
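
To illustrate the `toParameterizedQuery()` fix described above, here is a rough sketch of turning `${key}` template SQL into a `?`-placeholder statement plus an ordered parameter list. The class name, the `Query` holder, the placeholder character set, and the method signature are all assumptions; the PR's real implementation in `StringUtil` may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ParameterizedQuerySketch {

    // Matches ${key}; the allowed key characters are an assumption.
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\$\\{([A-Za-z0-9_.]+)\\}");

    /** SQL text with ? placeholders plus the values to bind, in order. */
    static final class Query {
        final String sql;
        final List<Object> params;

        Query(String sql, List<Object> params) {
            this.sql = sql;
            this.params = params;
        }
    }

    static Query toParameterizedQuery(String template, Map<String, ?> values) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuilder sql = new StringBuilder();
        List<Object> params = new ArrayList<>();
        while (m.find()) {
            String key = m.group(1);
            if (!values.containsKey(key)) {
                throw new IllegalArgumentException("No value for placeholder: " + key);
            }
            params.add(values.get(key));   // the value never touches the SQL text
            m.appendReplacement(sql, "?"); // only a bind marker is emitted
        }
        m.appendTail(sql);
        return new Query(sql.toString(), params);
    }
}
```

Each entry of `params` would then be bound with `PreparedStatement.setObject(i + 1, value)`. One caveat with this sketch: a template that wraps a placeholder in quotes, such as `'${name}'`, would yield the string literal `'?'` rather than a bind marker, so such templates need their quotes removed first.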

Config improvements:
- All services: Redis cluster → single-node (host/port/password env vars)
- All services: MyBatis StdOutImpl → Slf4jImpl (no SQL in prod logs)
- All services: Kafka bootstrap-servers → ${KAFKA_BOOTSTRAP_SERVERS}
- All services: Add Actuator + Prometheus endpoints

task-syncer: Canal-free JDBC polling mode:
- New JdbcPollingConsumer (@ConditionalOnProperty syncer.mode=jdbc)
- Set SYNCER_MODE=jdbc when Compass can directly connect to scheduler DB
- Eliminates: Canal 3-process deployment + ZooKeeper dependency for HA
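
The commit message does not say how `JdbcPollingConsumer` tracks what it has already read. One common approach, sketched here purely as an assumption, is a watermark on a monotonically increasing id column. The query is abstracted behind a function so the loop can be shown without a database; in a real consumer it might run something like `SELECT ... WHERE id > ? ORDER BY id`. All class and field names below are hypothetical.

```java
import java.util.List;
import java.util.Objects;
import java.util.function.Consumer;
import java.util.function.LongFunction;

final class JdbcPollingSketch {

    /** Hypothetical row shape: an increasing id plus an opaque payload. */
    static final class Row {
        final long id;
        final String payload;

        Row(long id, String payload) {
            this.id = id;
            this.payload = payload;
        }
    }

    static final class Poller {
        private final LongFunction<List<Row>> fetchAfter; // returns rows with id > argument
        private final Consumer<Row> sink;                 // downstream handler for each row
        private long watermark;                           // highest id handed to the sink so far

        Poller(LongFunction<List<Row>> fetchAfter, Consumer<Row> sink, long startAfter) {
            this.fetchAfter = Objects.requireNonNull(fetchAfter);
            this.sink = Objects.requireNonNull(sink);
            this.watermark = startAfter;
        }

        /** Runs one polling cycle and returns how many rows were delivered. */
        int pollOnce() {
            List<Row> rows = fetchAfter.apply(watermark);
            for (Row row : rows) {
                sink.accept(row);
                watermark = Math.max(watermark, row.id);
            }
            return rows.size();
        }

        long watermark() {
            return watermark;
        }
    }
}
```

A scheduler would call `pollOnce()` at a fixed interval. Unlike a binlog-based feed, id-based polling of this kind does not see updates or deletes to existing rows, which is worth checking against what the downstream services expect.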

FastJSON2 → Jackson migration (key files):
- DetectedTask.java: JSON.parseObject → ObjectMapper.readValue
- DetectServiceImpl.java: JSONObject.toJSONString → AnalyzerBridge.analyze
- LogParserServiceImpl.java: JSONObject/JSON → ObjectMapper

https://claude.ai/code/session_01RTbU1cwhd7LLes6sJ5jaGF

Add a lightweight Docker Compose deployment for debugging with only
the 3 core services (portal, collector, analyzer) and JPDA remote
debug ports (5005-5007). Skip optional services (Canal, Syncer, Flink, GPT).

Changes:
- compose-debug.yml: minimal deps + compass-debug profile
- compass-debug.Dockerfile: debug container with JPDA support
- compass_env_debug.sh: debug-specific env (single Redis, no Canal)
- start_debug.sh: starts only core services with debug JVM agents
- Fix Redis config mismatch (cluster nodes -> single node)
- Update assembly.xml for new merged modules (collector, analyzer)
- Add startup/stop scripts for task-collector and task-analyzer
- Update start_all.sh hadoop conf copy for merged modules

https://claude.ai/code/session_01RTbU1cwhd7LLes6sJ5jaGF

Remove the four old modules that have been superseded by task-collector
and task-analyzer, and update all references across the codebase.

Deleted modules:
- task-application  → functionality moved to task-collector
- task-metadata     → functionality moved to task-collector
- task-detect       → functionality moved to task-analyzer
- task-parser       → functionality moved to task-analyzer

Changes:
- pom.xml: remove commented-out module entries and stale
  dependencyManagement entries for task-detect, task-metadata, task-parser
- task-assembly/assembly.xml: already updated to reference new modules
- Document updates (en + zh):
  - architecture.md / architecture_zh.md: update module descriptions
  - deployment_detail.md / deployment_zh_detail.md: replace old module
    sections with task-collector and task-analyzer documentation
  - deployment.md / deployment_zh.md: fix config comments

https://claude.ai/code/session_01RTbU1cwhd7LLes6sJ5jaGF
@chengaoying
Author

v20250315

@chengaoying
Author

v20260315

@chengaoying
Author

v20250315

@chengaoying
Author

v20260315

@chengaoying
Author

v20260315

@chengaoying
Author

111

@chengaoying
Author

da d

2 participants