[debug don't merge] pr to reproduce error in aot_eager+post_grad_custom_post_pass #1785
+40
−3
A minimal repro of the loss and perf discrepancy. This PR does not consider any bucketing strategy; it only enables or disables run_with_post_grad_graph. (Note: if the code runs perfectly in vLLM, my suspicion is that something is wrong with the backward pass.)

Run:
Set torch._inductor.config.run_with_post_grad_graph in torchtitan/experiments/simple_fsdp/parallelize.py to True (run without Inductor-generated code). You will see the loss as follows:
Set torch._inductor.config.run_with_post_grad_graph in torchtitan/experiments/simple_fsdp/parallelize.py to False (run with Inductor-generated code). You will see the loss as follows:
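To pin down where the two runs start to diverge, one option is to copy the per-step losses out of each log and compare them with a small helper. The snippet below is only a sketch and is not part of this PR: compare_losses and its tolerances are hypothetical, and the loss lists have to be filled in from the actual True and False runs.

```python
import math


def compare_losses(losses_a, losses_b, rel_tol=1e-5, abs_tol=1e-8):
    """Return (step, loss_a, loss_b) for every step where the two runs disagree.

    Hypothetical helper: losses_a / losses_b are per-step losses copied from
    the two training logs, in step order. Steps are numbered from 1.
    """
    if len(losses_a) != len(losses_b):
        raise ValueError("both runs must log the same number of steps")
    mismatches = []
    for step, (a, b) in enumerate(zip(losses_a, losses_b), start=1):
        # math.isclose treats the values as equal if either tolerance is met.
        if not math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol):
            mismatches.append((step, a, b))
    return mismatches
```

An empty result means the two loss curves agree within tolerance; otherwise the first entry gives the earliest step at which they differ, which is a reasonable place to start looking at the backward pass.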