[WIP] Add New Kernel Information #532
base: develop
… array dimensions
src/algorithm/ATOMIC.cpp
Outdated
    setComplexity(Complexity::N);

    setNestedLoops(0);
    setArrayDimensions(1);
By Array Dimensions, are you asking whether the problem the kernel works on is 1-, 2-, or 3-dimensional, and so on? How about calling it Dimensions, Dimensionality, or something like that?
There are some kernels where the problem and the arrays have different dimensionalities. For example, LTIMES has a 4D loop that goes over 2D and 3D arrays.
I think we can rename these to MaxLoopDimensions and MaxArrayDimensions to clarify we are interested in recording the largest dimensionality loop & array. So for LTIMES MaxLoopDimensions=4 and MaxArrayDimensions=3
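As a rough illustration of the counts proposed above, here is a minimal LTIMES-like loop nest. The function name, struct, array layout, and sizes are all hypothetical and are not taken from the actual kernel source; the sketch only shows a 4-level loop over two 3D arrays and one 2D array, which would give MaxLoopDimensions=4 and MaxArrayDimensions=3.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical result type: the two counts discussed above plus the output.
struct LtimesLikeResult {
  int loop_depth;       // levels in the loop nest (MaxLoopDimensions)
  int max_array_dims;   // largest array dimensionality (MaxArrayDimensions)
  std::vector<double> phi;
};

// LTIMES-like sketch: phi(z,g,m) += ell(m,d) * psi(z,g,d).
// Arrays are stored flat; the index arithmetic encodes their dimensionality.
LtimesLikeResult run_ltimes_like(std::size_t nz, std::size_t ng,
                                 std::size_t nm, std::size_t nd)
{
  std::vector<double> phi(nz * ng * nm, 0.0);  // 3D array: (z, g, m)
  std::vector<double> psi(nz * ng * nd, 1.0);  // 3D array: (z, g, d)
  std::vector<double> ell(nm * nd, 2.0);       // 2D array: (m, d)

  for (std::size_t z = 0; z < nz; ++z) {        // loop level 1
    for (std::size_t g = 0; g < ng; ++g) {      // loop level 2
      for (std::size_t m = 0; m < nm; ++m) {    // loop level 3
        for (std::size_t d = 0; d < nd; ++d) {  // loop level 4
          phi[(z * ng + g) * nm + m] +=
              ell[m * nd + d] * psi[(z * ng + g) * nd + d];
        }
      }
    }
  }
  // Four loop levels; the largest array (phi or psi) has three dimensions.
  return {4, 3, phi};
}
```

Calling `run_ltimes_like(2, 3, 4, 5)` fills every entry of `phi` with the sum over `d` of `2.0 * 1.0`.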
    setNestedLoops(0);
    setArrayDimensions(1);
    setNumArrays(1);
In most cases this is simple, but in some cases, as mentioned for LTIMES, there are arrays of differing dimensionalities. Do you have a good idea of what you want to count here?
We want to count all arrays regardless of dimensionality, i.e., if an array is included in the BytesPerRep count, we are going to count it here.
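A minimal sketch of that counting rule, assuming a hypothetical `ArrayDesc` descriptor (the struct and both helper functions are illustrative and not part of the PR): every array that contributes bytes per rep is counted once, whatever its dimensionality, and the largest dimensionality is tracked separately.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical descriptor for one array a kernel touches.
struct ArrayDesc {
  std::string name;
  int dims;           // dimensionality of the array
  std::size_t bytes;  // bytes this array contributes per rep
};

// Count every array that contributes bytes, regardless of dimensionality.
int count_arrays(const std::vector<ArrayDesc>& arrays)
{
  return static_cast<int>(std::count_if(
      arrays.begin(), arrays.end(),
      [](const ArrayDesc& a) { return a.bytes > 0; }));
}

// Largest dimensionality among the arrays (0 for an empty list).
int max_array_dims(const std::vector<ArrayDesc>& arrays)
{
  int result = 0;
  for (const ArrayDesc& a : arrays) {
    result = std::max(result, a.dims);
  }
  return result;
}
```

For an LTIMES-like kernel with two 3D arrays and one 2D array, `count_arrays` would return 3 and `max_array_dims` would return 3.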
Regarding BatchSize, are you ultimately trying to count the number of GPU synchronizations? I assume you don't mean omp barriers, though this is something that should be covered by the KernelsPerRep attribute. I think that the number of GPU synchronizations should be the same for all tunings, but we can check.
It is possible something like |
Also, comparison sorts are O(n*lg(n)), but there is only O(n) parallelism.
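One way to read that remark, sketched with a bottom-up merge sort chosen purely for illustration (the function name is hypothetical, and no particular sort kernel from the suite is implied): the total work is bounded by n*lg(n) comparisons, but the roughly lg(n) passes run one after another, and each pass offers at most O(n) operations to spread across workers.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Bottom-up merge sort that sorts `a` in place and returns the number of
// element comparisons performed.
std::size_t merge_sort_count(std::vector<int>& a)
{
  const std::size_t n = a.size();
  std::vector<int> buf(n);
  std::size_t comparisons = 0;

  // About lg(n) passes, each dependent on the previous one.
  for (std::size_t width = 1; width < n; width *= 2) {
    // Merges within one pass are independent of each other.
    for (std::size_t lo = 0; lo < n; lo += 2 * width) {
      const std::size_t mid = std::min(lo + width, n);
      const std::size_t hi = std::min(lo + 2 * width, n);
      std::size_t i = lo, j = mid, k = lo;
      while (i < mid && j < hi) {
        ++comparisons;
        buf[k++] = (a[j] < a[i]) ? a[j++] : a[i++];
      }
      while (i < mid) buf[k++] = a[i++];
      while (j < hi) buf[k++] = a[j++];
    }
    a.swap(buf);
  }
  return comparisons;
}
```

Each pass performs fewer than n comparisons, so the returned count stays below n times the number of passes.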
Summary
- src/common/KernelBase.hpp/cpp to add new kernel attributes for:
  - MaxLoopDimensions: Number of levels in the largest nested loop
  - MaxPerfectLoopDimensions: Number of levels in the largest perfectly nested loop
  - MaxArrayDimensions: Number of dimensions in the highest-dimensionality array.
  - NumArrays: Total number of arrays initialized in the kernel.
  - BatchSize: Number of executions between global synchronization points. Decided not to proceed with this. Too difficult to define what this attribute should mean, e.g. can depend on tuning in case of shared_replication. And not sure how attribute would be used. Closest information to this is the Launch feature, for RAJA team-level parallelism.
- src/*/*.cpp
- ProblemDimensionality instead of MaxArrayDimensions - [WIP] Add New Kernel Information #532 (comment)
- AlgorithmParallelism - [WIP] Add New Kernel Information #532 (comment)
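A minimal sketch of how the attributes listed in the summary might be stored and set. The class below is a hypothetical stand-in, not the actual KernelBase, and the setter and getter names are assumed by analogy with the `setNestedLoops` and `setNumArrays` calls in the diff. The LTIMES values for MaxLoopDimensions and MaxArrayDimensions come from the review discussion; the other two values are assumptions.

```cpp
// Hypothetical stand-in for the attribute storage; not the actual KernelBase.
class KernelAttributesSketch {
public:
  void setMaxLoopDimensions(int n) { m_max_loop_dims = n; }
  void setMaxPerfectLoopDimensions(int n) { m_max_perfect_loop_dims = n; }
  void setMaxArrayDimensions(int n) { m_max_array_dims = n; }
  void setNumArrays(int n) { m_num_arrays = n; }

  int getMaxLoopDimensions() const { return m_max_loop_dims; }
  int getMaxPerfectLoopDimensions() const { return m_max_perfect_loop_dims; }
  int getMaxArrayDimensions() const { return m_max_array_dims; }
  int getNumArrays() const { return m_num_arrays; }

private:
  int m_max_loop_dims = 0;
  int m_max_perfect_loop_dims = 0;
  int m_max_array_dims = 0;
  int m_num_arrays = 0;
};

// Attribute values for an LTIMES-like kernel.
KernelAttributesSketch make_ltimes_like_attributes()
{
  KernelAttributesSketch k;
  k.setMaxLoopDimensions(4);         // from the review discussion
  k.setMaxPerfectLoopDimensions(4);  // assumption: the nest is perfect
  k.setMaxArrayDimensions(3);        // from the review discussion
  k.setNumArrays(3);                 // assumption: three arrays in total
  return k;
}
```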