Description
Hi there,
when trying to run GA4HPC to get information on a single job, GA4HPC crashes as follows:
`sh myCarbonFootprint.sh -S 2024-01-20 --filterJobIDs 4198761
Virtualenv: OK
Python versions: OK
Traceback (most recent call last):
File "GreenAlgo4HPC/GA_env/lib/python3.11/site-packages/pandas/core/indexes/base.py", line 3652, in get_loc
return self._engine.get_loc(casted_key)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "pandas/_libs/index.pyx", line 147, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 176, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 2606, in pandas._libs.hashtable.Int64HashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 2630, in pandas._libs.hashtable.Int64HashTable.get_item
KeyError: 0
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "GreenAlgo4HPC/__init__.py", line 155, in <module>
extracted_data = main_backend(args)
^^^^^^^^^^^^^^^^^^
File "GreenAlgo4HPC/backend/__init__.py", line 240, in main_backend
summary_stats = summarise_data(df2, args=args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "GreenAlgo4HPC/backend/__init__.py", line 193, in summarise_data
userID = df.UserX[0]
~~~~~~~~^^^
File "GreenAlgo4HPC/GA_env/lib/python3.11/site-packages/pandas/core/series.py", line 1012, in __getitem__
return self._get_value(key)
^^^^^^^^^^^^^^^^^^^^
File "GreenAlgo4HPC/GA_env/lib/python3.11/site-packages/pandas/core/series.py", line 1121, in _get_value
loc = self.index.get_loc(label)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "GreenAlgo4HPC/GA_env/lib/python3.11/site-packages/pandas/core/indexes/base.py", line 3654, in get_loc
raise KeyError(key) from err
KeyError: 0`
As the traceback indicates, the error originates from line 193 in backend/__init__.py, specifically the expression userID = df.UserX[0]. From what I can tell, if there's only one job, then df.UserX can no longer be looked up with the key 0.
What seems to work as a fix is to turn df.UserX into a string for the case of just one job (but this of course completely ignores any Pandas-based solution):
if len(df.UserX) == 1:
    userID = str(df.UserX).split()[1]
else:
    userID = df.UserX[0]
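For comparison, here is a minimal sketch of what a pandas-based alternative might look like. It assumes the KeyError comes from df.UserX[0] being a label-based lookup: after filtering down to one job, the surviving row keeps its original index label, which need not be 0. The DataFrame below, including the JobID column name and the user name, is a made-up stand-in; only the UserX column name is taken from the traceback.

```python
import pandas as pd

# Hypothetical stand-in for the job table; the real GA4HPC dataframe has other columns.
df = pd.DataFrame(
    {
        "JobID": [4198759, 4198760, 4198761],
        "UserX": ["alice", "alice", "alice"],
    }
)

# Filtering keeps the original row labels, so the remaining row has label 2, not 0.
df_one = df[df["JobID"] == 4198761]

# Label-based lookup: there is no row labelled 0 any more, so this raises KeyError.
try:
    df_one.UserX[0]
    label_lookup_failed = False
except KeyError:
    label_lookup_failed = True

# Position-based lookup: returns the first row regardless of its index label.
user_id = df_one.UserX.iloc[0]
```

If this assumption holds, replacing df.UserX[0] with df.UserX.iloc[0] (or calling reset_index(drop=True) on the filtered dataframe first) would avoid the string-splitting workaround. I have not checked this against the actual GA4HPC code path.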