Possible inconsistent state between federated room and remote room if a cloud notification is retried #13079
Labels
1. to develop
bug
feature: api 🛠️
OCS API for conversations, chats and participants
feature: federation 🌐
If the cloud notification that the host server sends to the federated server to update a room property fails for some reason, it is retried later by a background job. However, if the same property was modified again in the meantime and that newer notification was delivered successfully, the background job will resend the previous value and overwrite the current one.
Although syncing the room properties when joining the room mitigates the problem, it does not fully solve it.
A possible solution would be to add a property to rooms that keeps track of how many times the room was modified, so that a resent cloud notification is ignored if the federated room is already in a newer state.
An unsigned 32-bit integer can hold 4294967296 values. If a property was changed every second, and thus the counter was increased by 1 every second, it would take 4294967296 / (60 * 60 * 24 * 365) ≈ 136 years to overflow, so... I guess that an integer should be enough, even if it is signed and we start at 0 ;-)
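For reference, the estimate can be checked with a few lines of Python. This is only a sanity check of the arithmetic; the helper name `years_until_overflow` is made up for illustration and the worst case of one modification per second is the assumption stated above.

```python
SECONDS_PER_YEAR = 60 * 60 * 24 * 365


def years_until_overflow(bits: int, signed: bool = False) -> float:
    """Years until a counter incremented once per second runs out of values."""
    # A signed integer starting at 0 only has half of the range available.
    usable_values = 2 ** (bits - 1 if signed else bits)
    return usable_values / SECONDS_PER_YEAR


unsigned_years = years_until_overflow(32)
signed_years = years_until_overflow(32, signed=True)
print(f"unsigned: {unsigned_years:.1f} years, signed: {signed_years:.1f} years")
```

Even the signed case leaves decades of headroom at one modification per second, which is far more frequent than room properties are realistically changed.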
This would also require sending the full state on each notification; otherwise, since currently each notification carries only a single property modification, some state could be lost if a cloud notification is ignored.
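A minimal sketch of that idea, in Python for brevity rather than in the actual server code. The class `FederatedRoom`, the method `apply_notification` and the property names are hypothetical and only illustrate the comparison; they are not existing Talk APIs.

```python
from dataclasses import dataclass, field


@dataclass
class FederatedRoom:
    """Hypothetical federated-side copy of a room (not the real Talk model)."""
    version: int = 0
    properties: dict = field(default_factory=dict)

    def apply_notification(self, version: int, full_state: dict) -> bool:
        """Apply a full-state notification unless it is stale."""
        if version <= self.version:
            # A retried notification that is older than (or equal to) the
            # current state is ignored; nothing is lost because every
            # notification carries the full state.
            return False
        self.version = version
        self.properties = dict(full_state)
        return True


room = FederatedRoom()
# Notification 1 failed on first delivery; notification 2 arrives first.
room.apply_notification(2, {"name": "B", "description": "new"})
# The background job now retries notification 1 with the previous values.
applied = room.apply_notification(1, {"name": "A", "description": "old"})
```

In this sketch the retried notification is dropped and the room keeps the values from notification 2.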
A mixed approach between the current behaviour (sending a delta without a modification version) and the fixed behaviour (sending the full state with a modification version) would be to send a delta with a modification version. If the federated server receives a cloud notification with a modification version higher than its current version + 1, it would explicitly request the full state from the remote server and set it. But I am not sure if there could be some race condition with that approach.
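The mixed approach could look roughly like the following Python sketch. Everything here is an assumption made for illustration: `DeltaRoom`, `apply_delta` and the `fetch_full_state` callback (standing in for a request to the remote server) do not exist in Talk, and the sketch is single-threaded, so it does not answer the race condition question.

```python
from typing import Callable, Dict, Tuple


class DeltaRoom:
    """Hypothetical federated-side room that receives versioned deltas."""

    def __init__(self, fetch_full_state: Callable[[], Tuple[int, Dict]]):
        self.version = 0
        self.properties: Dict = {}
        # Stand-in for explicitly requesting the full state from the remote server.
        self._fetch_full_state = fetch_full_state

    def apply_delta(self, version: int, delta: Dict) -> str:
        if version <= self.version:
            return "ignored"  # stale retry, already in a newer state
        if version == self.version + 1:
            self.properties.update(delta)  # next expected modification
            self.version = version
            return "applied"
        # Gap detected: at least one notification is missing, so resync.
        remote_version, state = self._fetch_full_state()
        if remote_version > self.version:
            self.properties = dict(state)
            self.version = remote_version
        return "resynced"


# Host side after three modifications: name=A (v1), name=B (v2), lobby on (v3).
host_state = (3, {"name": "B", "lobby": True})
room = DeltaRoom(fetch_full_state=lambda: host_state)

results = [
    room.apply_delta(1, {"name": "A"}),      # delivered normally
    room.apply_delta(3, {"lobby": True}),    # v2 failed, so v3 reveals a gap
    room.apply_delta(2, {"name": "B"}),      # background job retries v2
]
```

Here the resync already brings in the change from the failed notification, so its later retry is ignored. A real implementation would still need to consider what happens if further notifications arrive while the full state is being requested.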
In any case, note that full state does not necessarily mean all the room properties; it could be limited only to those actually needed in federated rooms (like done when joining the room).