Free threaded resources on Android #2761

alasram · 2024-08-24T01:44:34Z

When tiles are streamed, GPU resources are uploaded to the GPU synchronously in the render thread. A quick change in Zoom or location can block the render thread due to very slow uploads: Many uploads are serialized delaying actual rendering and affecting the user experience. The following Tracy screenshot shows a slow frame. Most of the frame is processing uploads of vertex buffers to the GPU:

Zoomed version of the same screenshot where we see actual rendering at the very end of the frame:

Pipelining the renderer would alleviate the problem but it won't be enough when many uploads are processed in the same frame. Also, pipelining the renderer requires significant efforts. An easier solution that may scale better is to distribute uploads to as many threads as possible and only wait for them to finish when they are needed in a draw call. This can scale better when many free CPU cores exist.

This PR adds shared EGL contexts to the Android backend to issue buffer uploads in worker threads.
The existing thread pool has been updated to enable persistent EGL contexts (I noticed very poor performance on Qualcomm drivers when migrating contexts between threads hence the thread pool with persistent contexts: no calls to eglMakeCurrent every frame and in different threads). This allows concurrent uploads to the GPU:

The uploads are synchronized with the draws: the render thread only waits for a buffer when a draw needs it. This allow drawing while uploading data:

The uploads are currently randomly scheduled. Parallelizing drawing and uploading can be improved by first scheduling uploads for resources that get used late in the frame, e.g. uploading resources used by translucent objects before opaque objects since opaques are rendered first.

This PR is specific to Android and is made such as other backends are unaffected and have minimal code change. Once Vulkan is the official backend on Android similar and better handling of resources can be done.

Texture uploads are also slow but less frequent and will be handled in a separate PR:

There are unused index buffer that we currently wait on before we destroy them and this is slowing down the render thread #2760

We can destroy them in a future frame but we should understand why they are there in the first place.

There are slowdowns in the the render thread that will be investigated separately (mailbox related):

github-actions · 2024-08-24T02:36:44Z

Bloaty Results (iOS) 🐋

Compared to main

    FILE SIZE        VM SIZE    
 --------------  -------------- 
  +0.0% +3.16Ki  [ = ]       0    TOTAL

Full report: https://maplibre-native.s3.eu-central-1.amazonaws.com/bloaty-results-ios/pr-2761-compared-to-main.txt

github-actions · 2024-08-24T03:07:44Z

Bloaty Results 🐋

Compared to main

    FILE SIZE        VM SIZE    
 --------------  -------------- 
  +0.3%  +447Ki  +0.3%  +111Ki    TOTAL

Full report: https://maplibre-native.s3.eu-central-1.amazonaws.com/bloaty-results/pr-2761-compared-to-main.txt

Compared to d387090 (legacy)

    FILE SIZE        VM SIZE    
 --------------  -------------- 
   +28% +33.0Mi  +427% +25.5Mi    TOTAL

Full report: https://maplibre-native.s3.eu-central-1.amazonaws.com/bloaty-results/pr-2761-compared-to-legacy.txt

github-actions · 2024-08-24T03:10:20Z

Benchmark Results ⚡

Benchmark                                                     Time             CPU      Time Old      Time New       CPU Old       CPU New
------------------------------------------------------------------------------------------------------------------------------------------
OVERALL_GEOMEAN                                            -0.0125         -0.0124             0             0             0             0

Full report: https://maplibre-native.s3.eu-central-1.amazonaws.com/benchmark-results/pr-2761-compared-to-main.txt

src/mbgl/gl/fence.cpp

sjg-wdw · 2024-09-17T16:30:23Z

Just tagging @mwilsnd and @alexcristici to take a look at this.

alasram · 2024-09-17T17:06:07Z

I will be making this opt-in because the performance is variable depending on vendors:

Emulator. Shared EGL contexts not well supported (known issue)
Mali drivers (Google pixel 6 and 7) on Android: uploads are like memcpy so no benefit of using shared contexts
Adreno drivers on Android: buffer updates are slow compared to a memcpy so this is useful
Adreno drivers on QNX: buffer upload perf is much lower than a regular Android driver and shared contexts scale the perf (screenshots are on QNX)

louwers

Is the added complexity worth it if it is opt-in?

Probably a tiny fraction of users has enough OpenGL knowledge and control over their distribution for this to be useful as a feature.

If the performance gains are significant, can we make an informed choice for the user automatically?

…ze GL contexts but only use fences with GLES 3.

alasram requested review from TimSylvester, mwilsnd, nyurik, sjg-wdw and adrian-cojocaru August 24, 2024 01:44

louwers reviewed Aug 24, 2024

View reviewed changes

src/mbgl/gl/fence.cpp Show resolved Hide resolved

louwers added cpp-core OpenGL Issues related to the OpenGL renderer backend ⚡ performance labels Aug 24, 2024

alasram force-pushed the alasram/free_threaded_resources branch 3 times, most recently from c676779 to aa00054 Compare August 29, 2024 19:16

alasram force-pushed the alasram/free_threaded_resources branch 4 times, most recently from bb58491 to 8c1f7c7 Compare September 5, 2024 23:15

alasram force-pushed the alasram/free_threaded_resources branch 3 times, most recently from f314bdf to 0d64d6c Compare September 13, 2024 17:45

alasram force-pushed the alasram/free_threaded_resources branch from 0d64d6c to 22e5b23 Compare September 18, 2024 00:59

louwers reviewed Sep 18, 2024

View reviewed changes

alasram force-pushed the alasram/free_threaded_resources branch 2 times, most recently from 242bcf5 to 0e1812e Compare September 20, 2024 00:33

alasram added 2 commits September 20, 2024 16:46

Add Free-Threaded resource upload to Android OpenGL Backend

d5857a1

Use multi-threaded uploads by default. Always use fences to synchroni…

20ba5d1

…ze GL contexts but only use fences with GLES 3.

alasram force-pushed the alasram/free_threaded_resources branch from 0e1812e to 20ba5d1 Compare September 21, 2024 00:25

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Free threaded resources on Android #2761

Free threaded resources on Android #2761

alasram commented Aug 24, 2024

github-actions bot commented Aug 24, 2024 •

edited

Loading

github-actions bot commented Aug 24, 2024 •

edited

Loading

github-actions bot commented Aug 24, 2024 •

edited

Loading

sjg-wdw commented Sep 17, 2024

alasram commented Sep 17, 2024

louwers left a comment •

edited

Loading

Free threaded resources on Android #2761

Are you sure you want to change the base?

Free threaded resources on Android #2761

Conversation

alasram commented Aug 24, 2024

github-actions bot commented Aug 24, 2024 • edited Loading

Bloaty Results (iOS) 🐋

Full report: https://maplibre-native.s3.eu-central-1.amazonaws.com/bloaty-results-ios/pr-2761-compared-to-main.txt

github-actions bot commented Aug 24, 2024 • edited Loading

Bloaty Results 🐋

Full report: https://maplibre-native.s3.eu-central-1.amazonaws.com/bloaty-results/pr-2761-compared-to-main.txt

github-actions bot commented Aug 24, 2024 • edited Loading

sjg-wdw commented Sep 17, 2024

alasram commented Sep 17, 2024

louwers left a comment • edited Loading

Choose a reason for hiding this comment

github-actions bot commented Aug 24, 2024 •

edited

Loading

github-actions bot commented Aug 24, 2024 •

edited

Loading

github-actions bot commented Aug 24, 2024 •

edited

Loading

louwers left a comment •

edited

Loading