-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat!: Ensure that all autoevent resource scheduler gets invoked properly during every boot of the target machine #445
base: main
Are you sure you want to change the base?
Conversation
include/edgex/edgex-base.h
Outdated
@@ -28,6 +28,12 @@ typedef struct devsdk_nvpairs | |||
struct devsdk_nvpairs *next; | |||
} devsdk_nvpairs; | |||
|
|||
typedef struct devsdk_new_device |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of defining a new structure to hold a list of device names, use a string map:
iot_data_t *devices;
Initialisation (create empty map):
devices = iot_data_alloc_map (IOT_DATA_STRING);
Add entry:
iot_data_map_add (devices, iot_data_alloc_string (name, IOT_DATA_COPY), iot_data_alloc_null ());
Test for existing name:
if (iot_data_string_map_get (devices, name))
Clean up
iot_data_free (devices);
Also, store the map as a member of the devmap or service structs rather than a global
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @iain-anderson for this suggestion... This shall be more efficient to be used...
result = search_devsdk_device(device_first, dup->name); | ||
if (result) | ||
{ | ||
edgex_device_autoevent_start (map->svc, dup); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what effect this is intended to have - looks as though autoevents will only be started for devices added via edgex_add_device?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what effect this is intended to have - looks as though autoevents will only be started for devices added via edgex_add_device?
@iain-anderson Thanks for the review... I am putting my observation and analysis below point wise:
The autoevents gets invoked in 2 ways for 2 different scenarios:
1. When we boot target for the very first time:
In this scenario the device profile and default device interfaces (viz. eth0 in this case) gets added to metadata for the very first time.
After the successful addition of the device the autoevents gets scheduled and invoked from the API's in the below order:
edgex_devmap_populate_devices---->add_locked---->edgex_device_autoevent_start---->ae_runner
2. When we reboot the target or poweroff and start the target:
In this scenario the device profile and default device interfaces (viz. eth0 in this case) is already in the entry and hence when sdk tries
to add the device in metadata, it goes for a check initially where it is found that device already exits in the entry and hence the device
profile and device just needs to be updated in the sdk in order to invoke autoevents with the help of device profile update callback.
Issue in target machine:
It runs fine when target gets booted for the very first time. But after reboot a flaky issue is found. Only 1st autoevent is invoked and rest
all of them are not appearing. Hence, in this case when we try to acquire value of rest of the autoevent resources from edgex developer's UI
we get 502 error.
Root cause of the issue:
After reboot when service starts, sometimes the below order of API's gets triggered first before the device exist check while device addition.
edgex_devmap_populate_devices---->add_locked--->edgex_device_autoevent_start---->ae_runner
Since the device is already in metadata even before the existing check, the above API order tries to invoke the autoevents for the device with schedulers.
The ae_runner works in schedulers running parallely. Meanwhile there is an attempt for adding device where it is found under the check that
device already exists. Under the same check there is an API edgex_device_release call which has call for edgex_device_autoevent_stop inside
which actually stops and deletes the remaining autoevent schedulers which are yet to be run. Once stopped further no autoevents gets triggered ever.
In this scenario no device profile update callback gets invoked which also triggers the autoevents.
Possible fix:
We can make a data structure which would contain the list of only newly added device. We will do a search operation on the data structure just before edgex_device_autoevent_start
during the API order call as mentioned above earlier. This will make sure that autoevents gets invoked with this API order call only for newly added device.
edgex_devmap_populate_devices---->add_locked---->search_devsdk_device--->{if search for a new device successful}---->edgex_device_autoevent_start---->ae_runner
For devices already in the metadata entry, the autoevents will be automatically scheduled and invoked by the device profile update callback.
…thout getting missed Signed-off-by: TuhinDasgupta-eaton <[email protected]>
Signed-off-by: TuhinDasgupta-eaton <[email protected]>
Signed-off-by: Tuhin Dasgupta [email protected] Signed-off-by: TuhinDasgupta-eaton <[email protected]>
Signed-off-by: TuhinDasgupta-eaton <[email protected]>
Signed-off-by: TuhinDasgupta-eaton <[email protected]>
2a3632d
to
7918e98
Compare
Autoevents resources are getting triggered properly during first boot of the target. But, we are facing a flaky EdgeX service issue where all the device autoevents doesn't get triggered when we reboot the target. It has been found that whenever our sdk tries to add device to metadata through edgex_add_device API, there is a device existing check inside that and if the device exists already, then an API edgex_device_release gets called which further calls edgex_device_autoevent_stop. If suppose the respective device autoevents are already invoked and running, it stops all autoevents and further no autoevent gets called ever. Due to this we are getting 502 error while acquiring autoevent resources value from edgex developer's UI.
This change would ensure that all the device service autoevents gets triggered properly during every boot...
Signed-off-by: Tuhin Dasgupta [email protected]
If your build fails due to your commit message not passing the build checks, please review the guidelines here: https://github.com/edgexfoundry/device-sdk-c/blob/main/.github/Contributing.md
PR Checklist
Please check if your PR fulfills the following requirements:
BREAKING CHANGE:
describing the break)Testing Instructions
New Dependency Instructions (If applicable)