Uploaded devices sometimes not in devmap (race condition) #516

Open
eaton-coreymutter opened this issue Oct 22, 2024 · 0 comments · May be fixed by #517
Assignees
Labels
bug Something isn't working

Comments

@eaton-coreymutter
Contributor

🐞 Bug Report

Affected Services [REQUIRED]

The issue is located in: device-sdk-c

Is this a regression?

Yes, the previous version in which this bug was not present was: ....

Description and Minimal Reproduction [REQUIRED]

We have seen instances where an uploaded device (e.g. res/devices/xxx.yaml) is created and exists in core-metadata but is missing from the service's devmap, so resource operations return 404 until the service is restarted.

This seems to be a race in startConfigured() in service.c. We call edgex_bus_register_handler() to register edgex_callback_add_devices(), the callback that actually adds devices to the devmap, and then immediately call edgex_device_devices_upload(). However, bus-mqtt.c subscribes with an async function, and that subscribe has apparently not finished by the time the upload runs. core-metadata then posts the system event before our subscription has taken effect, so we miss the callback.
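The ordering problem described above can be reproduced in a small, deterministic model. This is only a sketch: `toy_bus`, `toy_subscribe_async()`, `toy_broker_ack()` and the other `toy_*` names are hypothetical stand-ins, not SDK or MQTT client functions. The broker acknowledgement is modelled as an explicit call so that the two orderings can be compared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy bus: a subscription only takes effect once the
   "broker" has acknowledged it, as with an async subscribe. */
typedef void (*toy_handler)(const char *device, void *ctx);

typedef struct {
    toy_handler handler; /* callback registered by the service */
    void *ctx;
    bool requested;      /* subscribe request has been sent */
    bool active;         /* broker has acknowledged the subscribe */
} toy_bus;

/* Returns immediately; the subscription is NOT active yet. */
void toy_subscribe_async(toy_bus *bus, toy_handler h, void *ctx) {
    assert(bus != NULL);
    bus->handler = h;
    bus->ctx = ctx;
    bus->requested = true;
    bus->active = false;
}

/* Stands in for the broker acknowledging the subscribe. */
void toy_broker_ack(toy_bus *bus) {
    if (bus->requested) bus->active = true;
}

/* Stands in for core-metadata posting a "device added" system event.
   Events published before the subscription is active are lost. */
void toy_publish_device_added(toy_bus *bus, const char *device) {
    if (bus->active && bus->handler) bus->handler(device, bus->ctx);
}

/* Stand-in for the devmap: it only counts the devices added. */
typedef struct { int count; } toy_devmap;

void toy_add_device(const char *device, void *ctx) {
    (void)device;
    ((toy_devmap *)ctx)->count++;
}

/* Runs the startup sequence and returns how many devices ended up
   in the toy devmap. */
int toy_run(bool wait_for_ack_before_upload) {
    toy_bus bus = {0};
    toy_devmap map = {0};
    toy_subscribe_async(&bus, toy_add_device, &map);
    if (wait_for_ack_before_upload) toy_broker_ack(&bus);
    toy_publish_device_added(&bus, "xxx"); /* upload triggers the event */
    toy_broker_ack(&bus);                  /* a late ack is too late */
    return map.count;
}
```

In this model `toy_run(false)` leaves the devmap empty because the event arrives before the acknowledgement, while `toy_run(true)` adds the device.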

🔥 Exception or Error





🌍 Your Environment

Deployment Environment: Yocto, MQTT message bus

EdgeX Version [REQUIRED]: Post-3.1 main branch

Anything else relevant?

I am posting a PR with the fix that worked for us: edgex_bus_mqtt_subscribe() no longer returns until the async subscribe is complete, which it ensures by calling MQTTAsync_waitForCompletion().

The fix has been getting lots of exercise in our automated system test framework with no apparent ill effects.
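The pattern behind the fix (start the async subscribe, then block until it completes or a timeout expires) can be sketched with plain pthreads. This is a model under stated assumptions, not the code in the PR: the `toy_*` names are hypothetical, a background thread plays the broker, and in the SDK the waiting is done by MQTTAsync_waitForCompletion() rather than a condition variable.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical completion token for one in-flight subscribe. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t done;
    bool complete;      /* set once the "broker" has acknowledged */
    long ack_delay_ms;  /* simulated broker latency */
} toy_token;

/* Plays the broker: acknowledges the subscribe after a delay. */
static void *toy_broker_thread(void *arg) {
    toy_token *t = arg;
    struct timespec d = { t->ack_delay_ms / 1000,
                          (t->ack_delay_ms % 1000) * 1000000L };
    nanosleep(&d, NULL);
    pthread_mutex_lock(&t->lock);
    t->complete = true;
    pthread_cond_signal(&t->done);
    pthread_mutex_unlock(&t->lock);
    return NULL;
}

/* Blocks until the token completes or timeout_ms elapses.
   Returns true only if the acknowledgement arrived. */
bool toy_wait_for_completion(toy_token *t, long timeout_ms) {
    assert(t != NULL);
    struct timespec deadline;
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }
    pthread_mutex_lock(&t->lock);
    int rc = 0;
    while (!t->complete && rc == 0)  /* loop guards against spurious wakeups */
        rc = pthread_cond_timedwait(&t->done, &t->lock, &deadline);
    bool ok = t->complete;
    pthread_mutex_unlock(&t->lock);
    return ok;
}

/* Subscribe that does not return until the subscription is in effect
   (true) or the wait has timed out (false). */
bool toy_subscribe_blocking(long ack_delay_ms, long timeout_ms) {
    toy_token t;
    pthread_mutex_init(&t.lock, NULL);
    pthread_cond_init(&t.done, NULL);
    t.complete = false;
    t.ack_delay_ms = ack_delay_ms;

    pthread_t broker;
    if (pthread_create(&broker, NULL, toy_broker_thread, &t) != 0) {
        pthread_cond_destroy(&t.done);
        pthread_mutex_destroy(&t.lock);
        return false;
    }
    bool ok = toy_wait_for_completion(&t, timeout_ms);

    pthread_join(broker, NULL);  /* the token must outlive the thread */
    pthread_cond_destroy(&t.done);
    pthread_mutex_destroy(&t.lock);
    return ok;
}
```

A caller that only proceeds to the upload when `toy_subscribe_blocking()` returns true cannot publish ahead of its own subscription. The timeout keeps startup from hanging indefinitely if the broker never answers; how a timeout should be handled in the SDK is outside the scope of this sketch.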

@eaton-coreymutter eaton-coreymutter added the bug Something isn't working label Oct 22, 2024
@eaton-coreymutter eaton-coreymutter self-assigned this Oct 22, 2024
eaton-coreymutter added a commit to eaton-coreymutter/device-sdk-c that referenced this issue Oct 22, 2024
…exfoundry#516)

MQTTAsync_subscribe() is async. There are some places e.g. startConfigured()
where we expect the subscription to be in effect right after this call.
So wait for completion before returning.

Signed-off-by: Corey Mutter <[email protected]>
eaton-coreymutter added a commit to eaton-coreymutter/device-sdk-c that referenced this issue Oct 22, 2024
…exfoundry#516)

MQTTAsync_subscribe() is async. There are some places e.g. startConfigured()
where we expect the subscription to be in effect right after this call.
So wait for completion before returning.

Fixes: edgexfoundry#516

Signed-off-by: Corey Mutter <[email protected]>
eaton-coreymutter added a commit to eaton-coreymutter/device-sdk-c that referenced this issue Oct 23, 2024
…exfoundry#516)

MQTTAsync_subscribe() is async. There are some places e.g. startConfigured()
where we expect the subscription to be in effect right after this call.
So wait for completion before returning.

Fixes: edgexfoundry#516

Signed-off-by: Corey Mutter <[email protected]>
@github-project-automation github-project-automation bot moved this to New Issues in Technical WG Oct 23, 2024
Projects
Status: QA/Code Review
1 participant