Doc: update description for efficiency test #304
Pull Request Overview
This PR revises the efficiency test description to reflect updated sample counts, warm-up phase, and dynamic GPU utilization.
- Reduced the number of frames from 2000 to 1000 and adjusted the warm-up phase from 20% to 10%.
- Added a binary search step for dynamically sizing each frame to fully utilize GPU capacity.
- Updated formulas and denominators to match the new 900-configuration average.
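The warm-up arithmetic behind these numbers can be sketched as follows. This is a minimal illustration, not the benchmark's actual code: the function names `split_warmup` and `average_efficiency` and the assumption that per-frame timings arrive as a plain list are hypothetical.

```python
from statistics import fmean


def split_warmup(timings, warmup_percent=10):
    """Split per-frame timings into (warm-up, measured) parts.

    Hypothetical helper: the first `warmup_percent` percent of the samples
    are treated as warm-up and are excluded from the timing.
    """
    n_warmup = len(timings) * warmup_percent // 100  # integer arithmetic avoids float rounding
    return timings[:n_warmup], timings[n_warmup:]


def average_efficiency(timings, warmup_percent=10):
    """Average the timings that remain after dropping the warm-up samples."""
    _, measured = split_warmup(timings, warmup_percent)
    if not measured:
        raise ValueError("no samples left after removing the warm-up phase")
    return fmean(measured)
```

With 1000 frames and a 10% warm-up, 100 samples are dropped and the average is taken over the remaining 900. The previous setup (2000 frames, 20% warm-up) dropped 400 and averaged over 1600.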
| ### Efficiency | ||
| To assess the efficiency of the model, we randomly selected 2000 frames from the domain of Inorganic Materials and Catalysis using the aforementioned out-of-distribution datasets. Each frame was expanded to include 800 to 1000 atoms through the replication of the unit cell, ensuring that measurements of inference efficiency occurred within the regime of convergence. The initial 20% of the test samples were considered a warm-up phase and thus were excluded from the efficiency timing. We have reported the average efficiency across the remaining 1600 frames. | ||
| To assess the efficiency of the model, we randomly selected 1000 frames from the domain of Inorganic Materials and Catalysis using the aforementioned out-of-distribution datasets. Each frame was expanded to contain between 800 and 1000 atoms — dynamically determined using a binary search algorithm to fully utilize GPU capacity — by replicating the unit cell. This ensured that measurements of inference efficiency were conducted within the regime of convergence. The initial 10% of the test samples were considered a warm-up phase and thus were excluded from the efficiency timing. We have reported the average efficiency across the remaining 900 frames. |
Copilot AI, May 28, 2025
[nitpick] Consider adding a brief reference or link to where the binary search algorithm is implemented, or a note on its termination criteria, to improve reproducibility.
| To assess the efficiency of the model, we randomly selected 1000 frames from the domain of Inorganic Materials and Catalysis using the aforementioned out-of-distribution datasets. Each frame was expanded to contain between 800 and 1000 atoms — dynamically determined using a binary search algorithm to fully utilize GPU capacity — by replicating the unit cell. This ensured that measurements of inference efficiency were conducted within the regime of convergence. The initial 10% of the test samples were considered a warm-up phase and thus were excluded from the efficiency timing. We have reported the average efficiency across the remaining 900 frames. | |
| To assess the efficiency of the model, we randomly selected 1000 frames from the domain of Inorganic Materials and Catalysis using the aforementioned out-of-distribution datasets. Each frame was expanded to contain between 800 and 1000 atoms — dynamically determined using a binary search algorithm to fully utilize GPU capacity — by replicating the unit cell. The binary search algorithm iteratively adjusts the number of atoms until the GPU capacity is maximized, terminating when the difference between the estimated and actual GPU utilization falls below a predefined threshold. For implementation details, refer to [Binary Search Algorithm Documentation](https://example.com/binary-search-doc). This ensured that measurements of inference efficiency were conducted within the regime of convergence. The initial 10% of the test samples were considered a warm-up phase and thus were excluded from the efficiency timing. We have reported the average efficiency across the remaining 900 frames. |
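To make the discussion of termination criteria concrete, here is one way such a search could look. It is a sketch under stated assumptions, not the project's implementation: the PR does not show the algorithm, so the function `largest_fitting_size` and the `fits` callback (for example, "inference on a frame of this many atoms completes without exhausting GPU memory") are hypothetical. The sketch assumes `fits` is monotone, meaning that once a size fails every larger size fails too, and it terminates when the search interval is empty rather than on a utilization threshold.

```python
from typing import Callable, Optional


def largest_fitting_size(lo: int, hi: int, fits: Callable[[int], bool]) -> Optional[int]:
    """Return the largest n in [lo, hi] for which fits(n) is True, or None.

    Assumes fits is monotone: True up to some size, False beyond it.
    The loop ends when the interval [lo, hi] becomes empty, which takes
    roughly log2(hi - lo + 1) calls to fits.
    """
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        if fits(mid):
            best = mid      # mid works; try something larger
            lo = mid + 1
        else:
            hi = mid - 1    # mid is too large; shrink the upper bound
    return best


# Stand-in predicate for illustration only: pretend the device can hold
# at most 937 atoms. A real check would run the model on the GPU.
def hypothetical_fits(n_atoms: int) -> bool:
    return n_atoms <= 937


chosen = largest_fitting_size(800, 1000, hypothetical_fits)  # 937
```

Returning `None` when no size in the range fits lets the caller decide whether to skip the frame or fall back to a smaller range.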
Codecov Report

All modified and coverable lines are covered by tests ✅

Additional details and impacted files

Coverage Diff

| Metric | main | #304 |
|---|---|---|
| Coverage | 65.54% | 65.54% |
| Files | 35 | 35 |
| Lines | 1550 | 1550 |
| Branches | 185 | 185 |
| Hits | 1016 | 1016 |
| Misses | 496 | 496 |
| Partials | 38 | 38 |
This PR updates the description of the efficiency test. The changes were made in #264.