-
Notifications
You must be signed in to change notification settings - Fork 1
[sc-241273] Add interpolation to Transpose & Sync #70
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: feature/sc-241110-add-step-column
Are you sure you want to change the base?
[sc-241273] Add interpolation to Transpose & Sync #70
Conversation
…73-add-interpolation-to-Tranpose-n-Sync
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The basics work but there are a couple of things to fix.
And just to clarify the behaviour, i think it should be as follows:
Without interpolation (no change):
No previous value -> No value
Both previous and subsequent -> use previous value
No subsequent value -> use previous value
Timestamp col value - always previous value
With interpolation:
No previous value -> No value
Both previous and subsequent -> use interpolated value
No subsequent value -> use previous value
Timestamp col value - always previous value (which is the first value in interpolation)
(Though perhaps we should have two columns in this case, one for the first value and one for the second in the interpolation - though that might complicate things)
if should_add_timestamps_columns: | ||
values.update({ | ||
"{}{}".format(attribute_path, OSIsoftConstants.TIMESTAMP_COLUMN_SUFFIX): current_timestamps_cache[attribute_index], | ||
"{}{}".format(attribute_path, OSIsoftConstants.VALUE_COLUMN_SUFFIX): current_values_cache[attribute_index] | ||
"{}{}".format(attribute_path, OSIsoftConstants.TIMESTAMP_COLUMN_SUFFIX): next_cached_timestamp, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here you have changed current_timestamps_cache[attribute_index]
to next_cached_timestamp
-
This does not give us the correct timestamp for the value for the non-interpolated case.
For the interpolated case I'm not sure what the correct value would be as both are relevant - but it is simplest to just do the same as the non-interpolated case and give the current timestamp
next_timestamp = get_epoch_from_string(next_timestamp) | ||
time_now = get_epoch_from_string(time_now) | ||
if previous_timestamp is None or next_timestamp is None or time_now is None: | ||
return None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
current_timestamps_cache[attribute_index] = next_timestamps_cache[attribute_index]
current_values_cache[attribute_index] = next_values_cache[attribute_index]
This will create a divide by zero error in the code below - an easy thing to do is catch it here like this.
(I.e. use the previous value just like the non-interpolated case in this situation)
if previous_timestamp == next_timestamp:
return previous_value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact we could make it the following - this will avoid doing the interpolation calculation when it is not necessary as we have an exact value:
if previous_timestamp == next_timestamp or time_now == previous_timestamp :
return previous_value
value_now = ( | ||
( | ||
float(next_value) - float(previous_value) | ||
) / ( | ||
float(next_timestamp) - float(previous_timestamp) | ||
) | ||
) * ( | ||
float(time_now) - float(previous_timestamp) | ||
) + float(previous_value) | ||
return value_now |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👓 readability: a bit hard to parse for the reader. Consider changing to something like this:
rate_of_change = (float(next_value) - float(previous_value)) / (float(next_timestamp) - float(previous_timestamp))
value_now = float(previous_value) + rate_of_change * (float(time_now) - float(previous_timestamp))
utc_time = datetime.strptime(datetime_string, "%Y-%m-%dT%H:%M:%SZ") | ||
epoch_time = (utc_time - datetime(1970, 1, 1)).total_seconds() | ||
except Exception: | ||
return None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably log an error here? I don't think the calling code will be able to handle this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM & retested 👍
Story details: https://app.shortcut.com/dataiku/story/241273