Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Callback safety #464

Open
MarijnS95 opened this issue Jan 9, 2024 · 0 comments
Open

Callback safety #464

MarijnS95 opened this issue Jan 9, 2024 · 0 comments

Comments

@MarijnS95
Copy link
Member

MarijnS95 commented Jan 9, 2024

Important

This report/analysis is unfinished.

In order to properly represent the semantics around callbacks registered to Android via NDK bindings, there are a few things we need to know in order to map them safely, and unfortunately the NDK documentation only barely ever specifies them to us. I've reported this upstream (https://issuetracker.google.com/issues/318944941) and will backlink to this issue, which I'll use to track what we need to know, and what we know for which callbacks.

This issue also serves as a guideline when implementing future APIs that contain callbacks.

What

  • Is this callback only called once, or multiple times?
    • If called once, we can have FnOnce for move semantics and drop the Box<dyn Fn> after being called;
      • But what if it is never called -> we'll leak a box.
  • Will a NULL call invalidate a previous callback?
    • Allows us to drop the previous Fn;
  • Will a repeat ..setCallback..() overwrite a previous callback (if used with the same parameters)?
    • Allows us to only track one Fn and overwrite - hence drop - the previous;
  • Will _delete() on the object invalidate any registered callbacks?
    • Also check _reset() functions!
    • Allows us to track the Fn in the object that owns the pointer;
      • Impossible when the object is refcounted, or merely a reference to a (thread-local) global.
    • Some callbacks, such as those for ASurfaceTransaction conveniently outlive the transaction. Together with FnOnce semantics (and repeat registrations not overwriting previous callbacks), tracking the lifetimes is trivial;
  • On what thread is this callback called?
    • Important to know that it is synchronized with _delete() or ..setCallback..() in order to drop the Fn;
    • If called on the same thread, no marker traits are needed;
    • If called on one other thread, serially, only Send is needed. Implementation can be FnOnce or FnMut;
    • If called on multiple other threads, potentially concurrently, Send+Sync is needed. the implementation can only be FnOnce or Fn as it is unsafe to mutate captured state concurrently.
  • Is a previous callback cleared if the registration function returns an error?
  • On the side: is the object itself thread-safe? That may affect where (relatively) callbacks are invoked, and how objects should be handled when Android gives a pointer to them inside a callback.

Which callbacks

Caution

A bunch of obvious things are not yet filled out in this table. I want to post this before (trivially) testing a few things and adding more callback registration functions.

Name Thread Times called NULL clears callback Repeated call overwrites previous _delete() invalidates callback Cleared on failure Resulting type Track Box with object
AMediaCodec_setAsyncNotifyCallback() One NDK internal thread >0 Yes Probably? Yes Manual check: yes FnMut<Send> Yes
AMediaCodec_setOnFrameRenderedCallback() 1 One NDK internal thread ? Yes Probably? Yes ? FnMut<Send> Yes
AAudioStreamBuilder_setDataCallback() and AAudioStreamBuilder_setErrorCallback() "real-time thread, never simultaneously" ? ? ? ? - FnMut<Send> ?
AndroidBitmap_compress() Current thread, while function is called >1 - - - - FnMut No, drop immediately after call
ALooper_addFd() with a callback Looper thread (inferred) ? No 2 Yes for same FD No 3 ? FnMut for ThreadLooper, FnMut<Send> for ForeignLooper Can't 3
AImageReader_setBufferRemovedListener() and AImageReader_setImageListener() Manual test: One internal thread >0 Manual test: yes Yes (manual test: race conditions?) Manual test: race conditions in concurrently-received callback! ? FnMut<Send> Probably?
AThermal_registerThermalStatusListener() Manual test: internal thread. Code: callback is guarded by a mutex >0 No; there's an unregister() call No; adding same callback+userdata is an error Yes, behind mutex No FnMut<Send> Probably, to free on Drop if user forgets to unregister?
AChoreographer_postFrameCallback(Delayed)(64)() Looper (current!) thread (don't make Choreographer Send/Sync) 0-1 ? No; each registration fires individually No No failure FnOnce No
AChoreographer_postVsyncCallback() Looper (current!) thread (don't make Choreographer Send/Sync) 0-1 ? No; each registration fires individually No No failure FnOnce No
AChoreographer_registerRefreshRateCallback() Looper (current) thread >0 ? No, assuming fn/userdata ptr changes? Our instance is just a reference No failure FnMut No, Choreographer is just a reference
ASurfaceTransaction_setOnComplete() (One?) NDK internal thread 0-1 ? No, each registration fires individually No No failure FnOnce<Send> No
ASurfaceTransaction_setOnCommit() (One?) NDK internal thread 0-1 ? No, each registration fires individually No No failure FnOnce<Send> No

ImageReader

This is a strange beast. No docs and it's very easy to trip up by deleting the object or changing the callback at runtime, while a callback may be coming in. I've already locally moved the overwriting of Boxes until after the setxxxListener() call so that we can't drop them too early, but it's still finicky.
Not to mention some inconsistencies when dropping the NativeWindow, which we've previously acquired an extra reference on (docs say _release()ing should not be done, but no comment on first acquiring it...).

Also note that we don't mark the type as Send/Sync, but the both callbacks reference a to ImageReader, so it should be thead-safe?

References
#454
#455

Footnotes

  1. Missing from the NDK crate

  2. NULL callback will register a regular non-callback event. It can be cleared via ALooper_removeFd(), but the callback might still run after removal under specific circumstances.

  3. Looper is refcounted (but what if the owning thread disappears or never calls poll()?). It can also be registered from any thread, even if poll() and callback invocations will only happen on the main thread. 2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant