Federated Plan Computation & Optimization in Program-Level Fed Planner #2238

Open: wants to merge 9 commits into base: main

@min-guk (Contributor) commented Feb 25, 2025

Overview

This PR enhances the Program-Level Fed Planner by improving federated plan computation, cost modeling, and visualization, and by handling redundant TWrite/TRead pairs.


1. Federated Plan Computation

  • Computes federated plans for all Hop DAGs in DMLProgram.
  • Implements rewiring logic for TWrite/TRead:
    • Uses TransTable (Map<String, Hop>) to track each TWrite by variable name.
    • At each TRead, retrieves the matching TWrite from TransTable and connects the two.
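
The rewiring step above can be sketched roughly as follows. This is a minimal Python illustration, not the PR's Java code: the `Hop` class, its fields, and the `rewire` function are hypothetical stand-ins, and the only idea taken from the description is a name-keyed table that records TWrite hops and is consulted at TRead.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # identity-based equality, so hops are compared as objects
class Hop:
    """Hypothetical stand-in for a Hop; not the SystemDS class."""
    name: str                      # variable name for TWrite/TRead
    op: str                        # "TWrite", "TRead", or any operator name
    inputs: list = field(default_factory=list)


def rewire(hops, trans_table=None):
    """Record each TWrite by variable name and attach it to later TReads."""
    trans_table = {} if trans_table is None else trans_table
    for hop in hops:
        if hop.op == "TWrite":
            # The latest write to a variable replaces any earlier one.
            trans_table[hop.name] = hop
        elif hop.op == "TRead":
            writer = trans_table.get(hop.name)
            # A TRead with no recorded TWrite stays unconnected.
            if writer is not None and all(w is not writer for w in hop.inputs):
                hop.inputs.append(writer)
    return trans_table
```

Passing the returned table into the next call would carry writes across Hop DAGs in this sketch; how the PR threads its table through statement blocks is not shown in the description.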

2. Cost Model & Optimization

  • Statement Block-Based Cost Weighting:
    • while block: ×10
    • if block: ×0.5
    • Example: an if block inside a while block: 1 × 10 × 0.5 = 5
  • Cost Deduplication:
    • Cost *= weight / (number of parents), so a hop shared by several parents is not counted once per parent.
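
As a worked illustration of the weighting and deduplication rules above, here is a small Python sketch. The function names and the list-of-block-kinds representation are assumptions made for the example. Block kinds other than while and if are given weight 1 here, which is also an assumption, since the description only lists those two.

```python
WHILE_WEIGHT = 10.0  # from the description: while block ×10
IF_WEIGHT = 0.5      # from the description: if block ×0.5


def block_weight(enclosing_blocks):
    """Multiply the weights of all enclosing statement blocks, outermost first."""
    weight = 1.0
    for kind in enclosing_blocks:
        if kind == "while":
            weight *= WHILE_WEIGHT
        elif kind == "if":
            weight *= IF_WEIGHT
        # Other block kinds are left at weight 1 (assumption).
    return weight


def weighted_cost(base_cost, enclosing_blocks, num_parents):
    """Apply Cost *= weight / number of parents to a single hop."""
    return base_cost * block_weight(enclosing_blocks) / max(num_parents, 1)


# An if block nested in a while block: 1 × 10 × 0.5 = 5
print(block_weight(["while", "if"]))
```

Guarding with `max(num_parents, 1)` is a choice made for the sketch so that root hops with no parent keep their full weighted cost.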

3. Federated Plan Visualization

  • Converts the FedPlan text output to a graph using NetworkX, Matplotlib, and Graphviz.
    • These libraries are only imported, not modified, so there are no license conflicts.
  • Helps with debugging.
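
To give a feel for the text-to-graph conversion, the sketch below emits Graphviz DOT text using only the Python standard library. It is a stand-in, not the PR's script: it does not use NetworkX or Matplotlib, it does not parse the real FedPlan output (whose format is not shown here), and it assumes the plan has already been reduced to hypothetical `(parent_id, child_id)` pairs with optional labels.

```python
def to_dot(edges, labels=None):
    """Render (parent_id, child_id) pairs as Graphviz DOT text.

    Edges are drawn child -> parent, i.e. in the direction data flows.
    """
    labels = labels or {}
    lines = ["digraph FedPlan {"]
    # Declare every node once, in a stable order.
    for node in sorted({n for edge in edges for n in edge}):
        lines.append(f'  "{node}" [label="{labels.get(node, str(node))}"];')
    for parent, child in edges:
        lines.append(f'  "{child}" -> "{parent}";')
    lines.append("}")
    return "\n".join(lines)


print(to_dot([(1, 2), (1, 3)], {1: "TWrite X"}))
```

The resulting text can be rendered with the Graphviz `dot` command-line tool, for example `dot -Tpng plan.dot -o plan.png`.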

4. Handling Redundant TWrite/TRead in If-Else Blocks

  • Problem:
    • TWrite occurs in both the if and the else branch, but the following TRead needs both as inputs.
  • Solution:
    • Keep an immutable table holding the pre-if-else state.
    • Build an if-table and an else-table, then merge them into a unified table.
    • The same scheme supports nested if-else blocks.
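
One way to picture the table merge is the Python sketch below. It is an illustration under assumptions rather than the PR's implementation: tables map a variable name to a list of candidate writers (plain strings here), each branch table holds only the variables written in that branch, and `merge_branch_tables` is a hypothetical name.

```python
from types import MappingProxyType


def merge_branch_tables(pre_table, if_table, else_table):
    """Merge per-branch write tables into one unified table.

    A variable not written in a branch falls back to its pre-if-else
    writers, so a later TRead sees every writer that can reach it.
    """
    frozen = MappingProxyType(pre_table)  # read-only view of the outer state
    merged = {}
    for name in [*frozen, *if_table, *else_table]:
        if name in merged:
            continue
        if_side = if_table.get(name, frozen.get(name, []))
        else_side = else_table.get(name, frozen.get(name, []))
        # Keep first-seen order and drop duplicate writers.
        merged[name] = list(dict.fromkeys([*if_side, *else_side]))
    return merged


pre = {"X": ["w0"]}
inner = merge_branch_tables(pre, {"X": ["a"]}, {"X": ["b"]})
# Nested case: the inner merge result acts as the outer if-table.
outer = merge_branch_tables(pre, inner, {})
print(outer)
```

In the nested case the outer else branch writes nothing, so the pre-if-else writer remains a candidate alongside the two inner writers.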

5. Hop DAG Disconnection Fixes

  • Disconnected TWrite in Predicates:
    • Fix: Connect the predicate's TWrite to the root dummy node.
  • Unreferenced Hops (e.g., print):
  • Loop TWrite Not Used Outside:
    • No computing/forwarding cost.

6. TWrite/TRead Constraint Handling

  • Forwarding Cost: Network cost when a node's federated output type differs from its parent.
  • Self Computing Cost: Computation cost at the given node.
  • Cumulative Cost = Self Computing Cost + each child's Cumulative Cost and Forwarding Cost.

| Node Type | Federated Output Type | Computing Cost | Forwarding Cost |
|---|---|---|---|
| Child of TWrite | Identical | | ❌ (Op → TWrite) |
| TWrite | Identical | | ❌ (TWrite → TRead) |
| TRead | Identical | | ✅ (TRead → Parent) |
| Parent of TRead | Can Change | | |
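
The cost definitions above can be read as a simple recursion. The Python sketch below is a hedged illustration: `PlanNode` and its fields are hypothetical, the output-type strings are placeholders, the plan is treated as a tree (no shared children), and the TWrite/TRead constraints from the table are not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class PlanNode:
    """Hypothetical plan node; field names are illustrative only."""
    op: str
    fed_out: str          # federated output type, e.g. "FOUT" or "LOUT"
    self_cost: float      # self computing cost at this node
    forward_cost: float   # network cost if the output must be forwarded
    children: list = field(default_factory=list)


def cumulative_cost(node):
    """Self cost plus each child's cumulative cost and forwarding cost.

    Forwarding is charged only when a child's federated output type
    differs from that of its parent.
    """
    total = node.self_cost
    for child in node.children:
        total += cumulative_cost(child)
        if child.fed_out != node.fed_out:
            total += child.forward_cost
    return total


leaf = PlanNode("rand", "FOUT", self_cost=3.0, forward_cost=2.0)
root = PlanNode("sum", "LOUT", self_cost=1.0, forward_cost=0.0, children=[leaf])
print(cumulative_cost(root))  # 1 + 3 + 2
```

With both nodes set to the same output type, the forwarding term drops out and the total is just the sum of the self costs.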
