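This post analyzes AITER, the kernel library AMD used to double ROCm inference performance. It examines the four kernel backend strategies (Triton, CK, HIP, ASM), the JIT compilation pipeline, and the architecture that achieved a 2x throughput improvement on DeepSeek R1.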
posts/rocm-aiter/
https://hyper-accel.github.io/posts/rocm-aiter/