⚡️ Speed up function milliseconds_to_proto_timestamp by 8%
#128
📄 8% (0.08x) speedup for `milliseconds_to_proto_timestamp` in `mlflow/utils/proto_json_utils.py`
⏱️ Runtime: 13.0 milliseconds → 12.0 milliseconds (best of 61 runs)
📝 Explanation and details
The optimization replaces the protobuf library's `FromMilliseconds()` method with direct field assignment, achieving an 8% speedup by eliminating method call overhead and internal validation.
Key Changes:
- Instead of calling `t.FromMilliseconds(milliseconds)`, the code now manually sets `t.seconds = milliseconds // 1000` and `t.nanos = (milliseconds % 1000) * 1_000_000`.
- The `FromMilliseconds()` method involves internal parsing, validation, and potentially more complex logic, while direct assignment is a simple memory write operation.
Why This is Faster:
The line profiler shows the critical bottleneck was the `FromMilliseconds()` call, taking 29.7% of execution time (9.8ms out of 33ms total). The optimized version replaces it with two simple arithmetic assignments (7.4% + 6.6% = 14% of total time), roughly halving the conversion overhead.
Performance Characteristics:
This optimization is most valuable if the function is called frequently in timestamp conversion workflows, where the 8% per-call improvement accumulates at scale.
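The arithmetic behind the two field assignments can be sketched in plain Python, without protobuf. The helper name `split_milliseconds` is hypothetical and is not part of MLflow or protobuf; it only mirrors the `// 1000` and `% 1000` expressions quoted in the explanation. Because Python's floor division rounds toward negative infinity, the nanosecond part stays non-negative even for inputs before the epoch.

```python
def split_milliseconds(milliseconds: int) -> tuple[int, int]:
    """Split epoch milliseconds into (seconds, nanos).

    Hypothetical helper for illustration only; it mirrors the two
    assignments described in the PR text.
    """
    # Floor division: -1 // 1000 == -1, so pre-epoch values round down.
    seconds = milliseconds // 1000
    # Python's % takes the sign of the divisor, so this is always in 0..999.
    nanos = (milliseconds % 1000) * 1_000_000
    return seconds, nanos


print(split_milliseconds(1001))  # (1, 1000000)
print(split_milliseconds(-1))    # (-1, 999000000)
```

The pair always satisfies `seconds * 1000 + nanos // 1_000_000 == milliseconds`, which is a quick invariant to check when comparing against another implementation.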
✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
import pytest  # used for our unit tests
from google.protobuf.timestamp_pb2 import Timestamp
from mlflow.utils.proto_json_utils import milliseconds_to_proto_timestamp

# unit tests

# -----------------------------
# Basic Test Cases
# -----------------------------

def test_zero_milliseconds():
    # 0 ms should be the Unix epoch
    codeflash_output = milliseconds_to_proto_timestamp(0)  # 13.6μs -> 13.0μs (4.64% faster)

def test_one_millisecond():
    # 1 ms after epoch
    codeflash_output = milliseconds_to_proto_timestamp(1)  # 14.1μs -> 14.2μs (0.979% slower)

def test_exact_second():
    # 1000 ms = 1 second
    codeflash_output = milliseconds_to_proto_timestamp(1000)  # 13.5μs -> 12.3μs (10.4% faster)

def test_exact_minute():
    # 60,000 ms = 1 minute
    codeflash_output = milliseconds_to_proto_timestamp(60000)  # 13.4μs -> 12.4μs (8.55% faster)

def test_exact_hour():
    # 3,600,000 ms = 1 hour
    codeflash_output = milliseconds_to_proto_timestamp(3600000)  # 13.9μs -> 12.9μs (7.42% faster)

def test_arbitrary_timestamp():
    # 1650000000000 ms = 2022-04-15T05:20:00Z
    codeflash_output = milliseconds_to_proto_timestamp(1650000000000)  # 14.5μs -> 13.2μs (9.58% faster)

def test_subsecond_precision():
    # 123456789 ms = 1970-01-02T10:17:36.789Z
    codeflash_output = milliseconds_to_proto_timestamp(123456789)  # 15.1μs -> 14.5μs (4.10% faster)

# -----------------------------
# Edge Test Cases
# -----------------------------

def test_negative_milliseconds():
    # Negative ms should be before the epoch
    codeflash_output = milliseconds_to_proto_timestamp(-1)  # 14.7μs -> 14.3μs (3.27% faster)
    codeflash_output = milliseconds_to_proto_timestamp(-1000)  # 4.05μs -> 3.75μs (7.97% faster)
    codeflash_output = milliseconds_to_proto_timestamp(-123456789)  # 4.07μs -> 3.99μs (1.85% faster)

def test_maximum_timestamp():
    # 9999999999999 ms (the largest 13-digit value) = 2286-11-20T17:46:39.999Z.
    # This is well below the protobuf Timestamp maximum of 9999-12-31T23:59:59.999999999Z.
    codeflash_output = milliseconds_to_proto_timestamp(9999999999999)  # 15.5μs -> 14.8μs (4.27% faster)

def test_minimum_timestamp():
    # Timestamp min value is -62135596800000 ms = 0001-01-01T00:00:00Z
    codeflash_output = milliseconds_to_proto_timestamp(-62135596800000)  # 14.9μs -> 13.1μs (13.4% faster)

def test_round_trip_consistency():
    # Ensure round-trip conversion works for a variety of inputs
    for ms in [0, 1, 1000, 123456789, 1650000000000, -1, -62135596800000, 9999999999999]:
        codeflash_output = milliseconds_to_proto_timestamp(ms); s = codeflash_output  # 44.4μs -> 41.5μs (6.88% faster)
        # Parse back using Timestamp
        t = Timestamp()
        t.FromJsonString(s)

def test_fractional_second_boundaries():
    # 999 ms should be .999Z, 1001 ms should be 1.001Z
    codeflash_output = milliseconds_to_proto_timestamp(999)  # 14.7μs -> 13.9μs (5.79% faster)
    codeflash_output = milliseconds_to_proto_timestamp(1001)  # 4.50μs -> 4.21μs (6.79% faster)

def test_leap_year():
    # 2020-02-29T12:34:56.789Z (leap day)
    import datetime
    dt = datetime.datetime(2020, 2, 29, 12, 34, 56, 789000, tzinfo=datetime.timezone.utc)
    ms = int(dt.timestamp() * 1000)
    codeflash_output = milliseconds_to_proto_timestamp(ms)  # 14.5μs -> 13.9μs (4.17% faster)

def test_millisecond_just_before_epoch():
    # -1 ms is just before the epoch
    codeflash_output = milliseconds_to_proto_timestamp(-1)  # 15.3μs -> 14.5μs (5.36% faster)

def test_millisecond_just_after_epoch():
    # 1 ms is just after the epoch
    codeflash_output = milliseconds_to_proto_timestamp(1)  # 15.4μs -> 14.1μs (8.88% faster)

def test_large_negative_timestamp():
    # -9999999999999 ms: 1653-02-10T06:13:20.001Z
    codeflash_output = milliseconds_to_proto_timestamp(-9999999999999)  # 16.3μs -> 15.3μs (6.95% faster)

# -----------------------------
# Large Scale Test Cases
# -----------------------------

def test_many_arbitrary_timestamps():
    # Test a large number of random timestamps for consistency
    import random
    random.seed(42)
    for _ in range(1000):
        ms = random.randint(-62135596800000, 9999999999999)
        codeflash_output = milliseconds_to_proto_timestamp(ms); s = codeflash_output  # 2.91ms -> 2.66ms (9.33% faster)
        t = Timestamp()
        t.FromJsonString(s)

def test_sequential_milliseconds():
    # Test a sequence of milliseconds to ensure monotonicity and correctness
    start = 1650000000000
    for i in range(1000):
        ms = start + i
        codeflash_output = milliseconds_to_proto_timestamp(ms); s = codeflash_output  # 2.43ms -> 2.27ms (7.09% faster)
        t = Timestamp()
        t.FromJsonString(s)
        # Ensure that the timestamp string increases lexicographically as ms increases
        if i > 0:
            codeflash_output = milliseconds_to_proto_timestamp(start + i - 1); prev_s = codeflash_output

def test_performance_large_batch():
    # Test performance for a large batch of conversions
    # Not a strict performance test, but ensures no exceptions for large batches
    batch = [i * 1000000 for i in range(1000)]  # 0 ms to ~1e9 ms
    results = [milliseconds_to_proto_timestamp(ms) for ms in batch]
    for s in results:
        # Should be ISO8601 format: YYYY-MM-DDTHH:MM:SS(.sss)?Z
        import re

# -----------------------------
# Additional Robustness Tests
# -----------------------------

@pytest.mark.parametrize("invalid_input", [None, "123", 123.456, [], {}, object()])
def test_invalid_input_types(invalid_input):
    # Should raise TypeError or ValueError for non-int input
    with pytest.raises((TypeError, ValueError)):
        milliseconds_to_proto_timestamp(invalid_input)  # 27.2μs -> 20.6μs (31.9% faster)

codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
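
The date comments in these tests can be cross-checked with the standard library alone. The sketch below is a reference formatter under stated assumptions: `reference_utc_string` is a hypothetical name, and the output shape follows the `YYYY-MM-DDTHH:MM:SS(.sss)?Z` pattern mentioned in the test comments rather than protobuf's own implementation. It relies on `datetime` arithmetic, so it only covers years 1 through 9999.

```python
from datetime import datetime, timedelta

# Naive datetime standing in for the Unix epoch in UTC.
_EPOCH = datetime(1970, 1, 1)


def reference_utc_string(milliseconds: int) -> str:
    """Format epoch milliseconds as a UTC string (illustrative only)."""
    # Integer timedelta arguments are handled exactly, with no float rounding.
    moment = _EPOCH + timedelta(milliseconds=milliseconds)
    if moment.microsecond == 0:
        # Whole seconds: omit the fractional part.
        return moment.isoformat(timespec="seconds") + "Z"
    return moment.isoformat(timespec="milliseconds") + "Z"


print(reference_utc_string(1650000000000))  # 2022-04-15T05:20:00Z
print(reference_utc_string(123456789))      # 1970-01-02T10:17:36.789Z
print(reference_utc_string(-1))             # 1969-12-31T23:59:59.999Z
```

Comparing such a reference string against the function's output is one way to restore the assertions that the timing harness leaves out.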
#------------------------------------------------
import pytest  # used for our unit tests
from google.protobuf.timestamp_pb2 import Timestamp
from mlflow.utils.proto_json_utils import milliseconds_to_proto_timestamp

# unit tests

# --- Basic Test Cases ---

def test_zero_milliseconds():
    # 0 milliseconds should be the Unix epoch
    codeflash_output = milliseconds_to_proto_timestamp(0)  # 13.3μs -> 12.3μs (8.44% faster)

def test_one_second():
    # 1000 milliseconds is exactly 1 second after epoch
    codeflash_output = milliseconds_to_proto_timestamp(1000)  # 12.7μs -> 12.0μs (5.68% faster)

def test_one_minute():
    # 60,000 milliseconds is exactly 1 minute after epoch
    codeflash_output = milliseconds_to_proto_timestamp(60000)  # 13.0μs -> 12.3μs (5.14% faster)

def test_specific_datetime():
    # 1680000000000 ms = 2023-03-28T10:40:00Z
    codeflash_output = milliseconds_to_proto_timestamp(1680000000000)  # 14.4μs -> 14.0μs (2.65% faster)

def test_milliseconds_with_fraction():
    # 1680000000123 ms = 2023-03-28T10:40:00.123Z
    codeflash_output = milliseconds_to_proto_timestamp(1680000000123)  # 15.8μs -> 15.0μs (5.62% faster)

# --- Edge Test Cases ---

def test_negative_milliseconds():
    # -1000 ms is 1 second before epoch
    codeflash_output = milliseconds_to_proto_timestamp(-1000)  # 13.7μs -> 12.8μs (7.38% faster)

def test_negative_with_fraction():
    # -123 ms is 123 ms before epoch
    codeflash_output = milliseconds_to_proto_timestamp(-123)  # 15.1μs -> 14.0μs (7.81% faster)

def test_maximum_timestamp():
    # Maximum protobuf Timestamp value: 9999-12-31T23:59:59.999999999Z
    # At millisecond precision this is 253402300799999 ms
    codeflash_output = milliseconds_to_proto_timestamp(253402300799999)  # 15.5μs -> 14.4μs (7.50% faster)

def test_minimum_timestamp():
    # Minimum protobuf Timestamp value: 0001-01-01T00:00:00Z
    # This is -62135596800000 ms
    codeflash_output = milliseconds_to_proto_timestamp(-62135596800000)  # 14.7μs -> 13.5μs (8.69% faster)

def test_just_before_epoch():
    # -1 ms is just before epoch
    codeflash_output = milliseconds_to_proto_timestamp(-1)  # 14.8μs -> 14.4μs (3.00% faster)

def test_just_after_epoch():
    # 1 ms is just after epoch
    codeflash_output = milliseconds_to_proto_timestamp(1)  # 14.6μs -> 14.0μs (4.41% faster)

def test_fractional_second_rounding():
    # 123456789 ms = 1970-01-02T10:17:36.789Z
    codeflash_output = milliseconds_to_proto_timestamp(123456789)  # 15.1μs -> 14.9μs (1.22% faster)

def test_large_negative():
    # -62135596801234 ms = 0000-12-31T23:59:58.766Z
    # This is before the minimum valid timestamp, should raise ValueError
    with pytest.raises(ValueError):
        milliseconds_to_proto_timestamp(-62135596801234)  # 6.38μs -> 8.12μs (21.4% slower)

def test_large_positive_overflow():
    # 253402300800000 ms is just over the max valid timestamp, should raise ValueError
    with pytest.raises(ValueError):
        milliseconds_to_proto_timestamp(253402300800000)  # 6.01μs -> 7.65μs (21.4% slower)

def test_non_integer_input():
    # Should raise TypeError for non-integer input
    with pytest.raises(TypeError):
        milliseconds_to_proto_timestamp("1000")  # 3.89μs -> 2.74μs (42.2% faster)
    with pytest.raises(TypeError):
        milliseconds_to_proto_timestamp(1000.5)  # 4.39μs -> 2.74μs (60.4% faster)
    with pytest.raises(TypeError):
        milliseconds_to_proto_timestamp(None)  # 1.22μs -> 1.02μs (20.1% faster)

# --- Large Scale Test Cases ---

def test_many_sequential_timestamps():
    # Test conversion for 1000 sequential milliseconds
    start_ms = 1680000000000
    expected_base = "2023-03-28T10:40:00"
    for i in range(1000):
        codeflash_output = milliseconds_to_proto_timestamp(start_ms + i); ts = codeflash_output  # 2.43ms -> 2.24ms (8.34% faster)
        # The seconds should be 2023-03-28T10:40:00 for i < 1000
        if i < 1000:
            # Extract seconds and milliseconds
            if i == 0:
                pass
            elif i < 1000:
                sec = expected_base
                ms = i

def test_large_gap():
    # Test conversion for timestamps spaced by 1,000,000 ms (16m40s)
    start_ms = 1680000000000
    for i in range(1000):
        ms = start_ms + i * 1000000
        codeflash_output = milliseconds_to_proto_timestamp(ms); ts = codeflash_output  # 2.19ms -> 2.00ms (9.18% faster)
        # Should have a valid format: YYYY-MM-DDTHH:MM:SS(.mmm)?Z
        import re

def test_leap_year_and_dst():
    # 1582934400000 ms = 2020-02-29T00:00:00Z (leap day)
    codeflash_output = milliseconds_to_proto_timestamp(1582934400000)  # 15.3μs -> 14.2μs (7.58% faster)
    # 1615680000000 ms = 2021-03-14T00:00:00Z (DST change in US, but UTC unaffected)
    codeflash_output = milliseconds_to_proto_timestamp(1615680000000)  # 3.70μs -> 3.39μs (9.09% faster)

def test_end_of_months():
    # 1640995199000 ms = 2021-12-31T23:59:59Z (end of year)
    codeflash_output = milliseconds_to_proto_timestamp(1640995199000)  # 13.9μs -> 13.2μs (5.26% faster)
    # 1612137599000 ms = 2021-01-31T23:59:59Z (end of Jan)
    codeflash_output = milliseconds_to_proto_timestamp(1612137599000)  # 3.83μs -> 3.48μs (10.0% faster)
    # 1614556799000 ms = 2021-02-28T23:59:59Z (end of Feb, non-leap year)
    codeflash_output = milliseconds_to_proto_timestamp(1614556799000)  # 2.49μs -> 2.32μs (7.28% faster)

def test_far_future():
    # 2208988800000 ms = 2040-01-01T00:00:00Z
    codeflash_output = milliseconds_to_proto_timestamp(2208988800000)  # 14.3μs -> 13.6μs (4.83% faster)

def test_far_past():
    # -2208988800000 ms = 1900-01-01T00:00:00Z
    codeflash_output = milliseconds_to_proto_timestamp(-2208988800000)  # 14.6μs -> 13.7μs (6.79% faster)

codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
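
The boundary tests above use -62135596800000 ms and 253402300799999 ms as the limits of the valid range and expect `ValueError` just outside it. A minimal sketch of an explicit guard with those limits follows. The name `check_timestamp_range_ms` and the constants are hypothetical; nothing here is taken from MLflow or protobuf, and the PR text does not say where the real range check happens.

```python
# Bounds quoted in the test comments, at millisecond precision.
MIN_TIMESTAMP_MS = -62_135_596_800_000  # 0001-01-01T00:00:00Z
MAX_TIMESTAMP_MS = 253_402_300_799_999  # 9999-12-31T23:59:59.999Z


def check_timestamp_range_ms(milliseconds: int) -> int:
    """Return the value unchanged if it is a valid int within bounds.

    Hypothetical guard for illustration; not the library's own check.
    """
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(milliseconds, bool) or not isinstance(milliseconds, int):
        raise TypeError(f"expected int milliseconds, got {type(milliseconds).__name__}")
    if not MIN_TIMESTAMP_MS <= milliseconds <= MAX_TIMESTAMP_MS:
        raise ValueError(f"timestamp out of range: {milliseconds} ms")
    return milliseconds
```

A guard like this adds its own per-call cost, so whether it belongs on a hot path is a trade-off against the speedup reported here.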
#------------------------------------------------
from mlflow.utils.proto_json_utils import milliseconds_to_proto_timestamp

def test_milliseconds_to_proto_timestamp():
    milliseconds_to_proto_timestamp(0)

To edit these changes, run `git checkout codeflash/optimize-milliseconds_to_proto_timestamp-mhuhvtn1` and push.