⚡️ Speed up function `fillna` by 5,211% #72
📄 5,211% (52.11x) speedup for `fillna` in `src/numpy_pandas/dataframe_operations.py`
⏱️ Runtime: 155 milliseconds → 2.91 milliseconds (best of 652 runs)
📝 Explanation and details
The optimized version achieves a 5211% speedup by replacing an inefficient row-by-row loop with pandas' vectorized operations.
Key Performance Issues in Original Code:
- The `for i in range(len(df))` loop processes each row individually, which is extremely slow for pandas DataFrames
- `pd.isna(df.iloc[i][column])` performs individual cell access and NA checking in each iteration
- `result.iloc[i, df.columns.get_loc(column)] = value` uses positional indexing for each assignment

Optimizations Applied:
- `mask = result[column].isna()` creates a boolean mask for all NA values in one operation
- `result.loc[mask, column] = value` assigns the fill value to all NA positions simultaneously

Why This Creates Massive Speedup:
Performance Characteristics by Test Case:
The optimization is particularly effective for larger datasets where the vectorization benefits far outweigh the setup costs, making it ideal for real-world data processing scenarios.
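For readers who want to see the two approaches side by side, here is a minimal sketch reconstructed from the fragments quoted in this description. The function names `fillna_loop` and `fillna_vectorized`, the signature `(df, column, value)`, and the `df.copy()` step are assumptions made for illustration; the actual code in `src/numpy_pandas/dataframe_operations.py` may differ.

```python
import numpy as np
import pandas as pd


def fillna_loop(df: pd.DataFrame, column: str, value) -> pd.DataFrame:
    """Row-by-row fill (hypothetical reconstruction of the original)."""
    result = df.copy()  # assumed: the input frame is not mutated
    for i in range(len(df)):
        # One cell lookup and one NA check per row.
        if pd.isna(df.iloc[i][column]):
            # One positional assignment per missing cell.
            result.iloc[i, df.columns.get_loc(column)] = value
    return result


def fillna_vectorized(df: pd.DataFrame, column: str, value) -> pd.DataFrame:
    """Mask-based fill (hypothetical reconstruction of the optimized version)."""
    result = df.copy()  # assumed: the input frame is not mutated
    mask = result[column].isna()      # boolean mask for every NA in the column
    result.loc[mask, column] = value  # single assignment to all masked rows
    return result


# Small illustrative frame; the data is made up for this sketch.
example = pd.DataFrame(
    {"a": [1.0, np.nan, 3.0, np.nan], "b": ["x", "y", None, "z"]}
)
filled_loop = fillna_loop(example, "a", 0.0)
filled_vectorized = fillna_vectorized(example, "a", 0.0)
```

Both functions should return equal frames for the same inputs. The loop version makes one pandas indexing call per row, while the mask version performs a fixed number of column-level operations regardless of row count, which is where the gap on larger frames comes from.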
✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
To edit these changes, `git checkout codeflash/optimize-fillna-mdpetka0` and push.