[rocprof-compute] Adding Triton backend to marker injection #6901
Merged: ggottipa-amd merged 40 commits into rocprofiler-compute-develop from users/ggottipa-amd/add-triton-backend on Jun 30, 2026
Commits (40)
c35b15a  Add Triton ROCTX api-trace backend and analyze integration (ggottipa-amd)
bf054ba  updating the tests (ggottipa-amd)
04737fe  removing stale references (ggottipa-amd)
440cd28  adding test_profile_triton_trace (ggottipa-amd)
3eb5201  changing api-trace to ml-api-trace (ggottipa-amd)
eab045d  removing nested functions, using partial and partial method while wra… (ggottipa-amd)
69cb684  update changelog;correct encoding;remove globals (ggottipa-amd)
39e098a  changelog correction
31d1948  removing torch specific function (ggottipa-amd)
a0cfcd6  removing API_ALIAS (ggottipa-amd)
91cfaac  logging triton (ggottipa-amd)
a1eb5b2  fixing calll count and adding checks on triton flags in analyze (ggottipa-amd)
a9d7784  adding test for ml-api-trace (ggottipa-amd)
081fe5d  Update projects/rocprofiler-compute/docs/how-to/analyze/cli.rst (ggottipa-amd)
6a0685c  Update projects/rocprofiler-compute/docs/how-to/analyze/cli.rst (ggottipa-amd)
44e4625  Update projects/rocprofiler-compute/docs/how-to/analyze/cli.rst (ggottipa-amd)
3cdade3  Update projects/rocprofiler-compute/docs/how-to/analyze/cli.rst (ggottipa-amd)
608ba3b  Update projects/rocprofiler-compute/docs/how-to/analyze/cli.rst (ggottipa-amd)
3013095  Update projects/rocprofiler-compute/docs/how-to/analyze/cli.rst (ggottipa-amd)
a4234ae  Update projects/rocprofiler-compute/docs/how-to/analyze/cli.rst (ggottipa-amd)
3e3b712  Update projects/rocprofiler-compute/docs/how-to/analyze/cli.rst (ggottipa-amd)
322fe9a  Update projects/rocprofiler-compute/docs/how-to/analyze/cli.rst (ggottipa-amd)
586ad67  Update projects/rocprofiler-compute/docs/how-to/profile/mode.rst (ggottipa-amd)
4d0473e  docs correction (ggottipa-amd)
d76c2d6  Update projects/rocprofiler-compute/docs/how-to/analyze/cli.rst (ggottipa-amd)
2b9c318  Update projects/rocprofiler-compute/docs/how-to/analyze/cli.rst (ggottipa-amd)
b971d98  test marker changed to triton-trace (ggottipa-amd)
6d60ec3  Moving flags to experimental section (ggottipa-amd)
7a3ba52  added idempotency test (ggottipa-amd)
3a46ca5  Update projects/rocprofiler-compute/tests/test_profile_general.py (ggottipa-amd)
7656cd3  Update projects/rocprofiler-compute/tests/test_profile_general.py (ggottipa-amd)
7b8c9cb  resolving comments (ggottipa-amd)
dae1b20  removing stale flag check (ggottipa-amd)
90cac98  adding sample triton operators listing (ggottipa-amd)
491b779  adding cached workloads for testing new flags (ggottipa-amd)
894decb  registering test_profile_triton_trace (ggottipa-amd)
eadf69d  Merge branch 'rocprofiler-compute-develop' into users/ggottipa-amd/ad… (ggottipa-amd)
60e1e65  refactor(rocprof-compute): tidy ml-api-trace review nits (vedithal-amd)
ccd5d6d  Addressing comments (ggottipa-amd)
8db192c  Merge branch 'rocprofiler-compute-develop' into users/ggottipa-amd/ad… (ggottipa-amd)
33 changes: 33 additions & 0 deletions
projects/rocprofiler-compute/sample/torch_compile_triton.py
@@ -0,0 +1,33 @@
+# Copyright (c) Advanced Micro Devices, Inc.
+# SPDX-License-Identifier: MIT
+
+"""Minimal torch.compile workload that generates Triton kernels."""
+
+import sys
+
+import torch
+
+
+@torch.compile
+def fused(x, y):
+    return torch.relu(x) * y + x
+
+
+def main():
+    if not torch.cuda.is_available():
+        print("GPU is required for this sample. Exiting.")
+        sys.exit(1)
+
+    x = torch.randn(4096, 4096, device="cuda")
+    y = torch.randn(4096, 4096, device="cuda")
+
+    # First call compiles; later calls reuse the generated Triton kernels.
+    for _ in range(3):
+        fused(x, y)
+
+    torch.cuda.synchronize()
+    print("Compiled workload completed")
+
+
+if __name__ == "__main__":
+    main()
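
Commits eab045d and 7a3ba52 mention wrapping with `partial` and an idempotency test, but the wrapping code itself does not appear in this view. As a rough illustration only, the sketch below shows one way a callable could be wrapped with `functools.partial` so that injecting markers twice does not double-wrap it. Every name here (`push_range`, `pop_range`, `inject_markers`, `_WRAPPED_FLAG`) is hypothetical and not taken from the PR, and the push/pop functions merely record events rather than calling ROCTX.

```python
import functools
import types

# Hypothetical stand-ins for marker range push/pop; they only record events.
EVENTS = []


def push_range(name):
    EVENTS.append(("push", name))


def pop_range(name):
    EVENTS.append(("pop", name))


# Hypothetical attribute used to recognise an already-wrapped callable.
_WRAPPED_FLAG = "_marker_wrapped"


def _call_with_markers(original, name, /, *args, **kwargs):
    """Run `original` between a push/pop marker pair."""
    push_range(name)
    try:
        return original(*args, **kwargs)
    finally:
        pop_range(name)


def inject_markers(namespace, attr):
    """Wrap namespace.attr once; repeated calls leave it unchanged."""
    original = getattr(namespace, attr)
    if getattr(original, _WRAPPED_FLAG, False):
        return original  # already wrapped, so do nothing
    wrapped = functools.partial(_call_with_markers, original, attr)
    setattr(wrapped, _WRAPPED_FLAG, True)  # partial objects accept attributes
    setattr(namespace, attr, wrapped)
    return wrapped


# Usage: wrapping twice still yields a single push/pop pair per call.
ns = types.SimpleNamespace(relu=lambda v: max(v, 0))
inject_markers(ns, "relu")
inject_markers(ns, "relu")
result = ns.relu(-3)
```

Using a module-level helper bound through `partial`, rather than a nested closure, keeps the wrapper easy to inspect and tag with an attribute, which is what makes the second `inject_markers` call a no-op in this sketch.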