MSAA for metal. #20506

Open: wants to merge 3 commits into base: v4


arsen-aar

  1. MSAA can be enabled for the general Metal pipeline with the new flag:

ENABLE_MSAA_GLOBAL_METAL (CCRenderer.cpp:156).

  2. MSAA can also be enabled when drawing into a render texture (Metal only).
    A new test is available under the "RenderTextureTest" section, where one can compare the rendering quality with and without MSAA.

RenderTextureWithMsaaSprite3D (RenderTextureTest.h:200)

  3. I have a mixed 2D/3D pipeline due to the specifics of my current project, and I noticed that it takes a lot of time to create each new CommandEncoder.
    An ad hoc solution to this issue is provided here.

MsaaMode::MtlUnified (CCRenderTexture.h:71, CommandBufferMTL.cpp::100, CCRenderer.cpp:900 ..)

You can also compare the performance of common and unified multisampling in the RenderTextureWithMsaaSprite3D test.

I realize that this solution is too dirty and situational to be merged into cocos2d-x master, but it's a good starting point for future development. Here is a capture of this test running at a scale of 6.
https://www.youtube.com/watch?v=8LinKjoj4Y8
The frame rate went from 12 to 60 fps on my iPhone 10.

  4. The creation of DepthStencilStateMTL was a bottleneck for mixed 2D/3D rendering. A depth/stencil state cache was implemented to make its creation faster.

_depthStencilStateCache (DeviceMTL.h:160)
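
To illustrate the idea behind item 4, here is a minimal sketch of a state cache keyed by a descriptor. It is not the code from this PR: `DepthStencilDesc`, `DepthStencilState`, and `DepthStencilStateCache` are hypothetical stand-ins, the descriptor fields are simplified assumptions, and no Metal API is called. The point is only that an identical descriptor returns the already-built state instead of creating a new one.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <memory>
#include <unordered_map>

// Hypothetical, simplified descriptor (an assumption for illustration only).
struct DepthStencilDesc {
    bool depthTestEnabled = false;
    bool depthWriteEnabled = false;
    bool stencilTestEnabled = false;
    std::uint8_t depthCompareFunc = 0;

    bool operator==(const DepthStencilDesc& o) const {
        return depthTestEnabled == o.depthTestEnabled &&
               depthWriteEnabled == o.depthWriteEnabled &&
               stencilTestEnabled == o.stencilTestEnabled &&
               depthCompareFunc == o.depthCompareFunc;
    }
};

// Hash functor: packs all fields into one integer (3 flag bits + 8 bits).
struct DepthStencilDescHash {
    std::size_t operator()(const DepthStencilDesc& d) const {
        std::size_t packed = (std::size_t(d.depthTestEnabled) << 0) |
                             (std::size_t(d.depthWriteEnabled) << 1) |
                             (std::size_t(d.stencilTestEnabled) << 2) |
                             (std::size_t(d.depthCompareFunc) << 3);
        return std::hash<std::size_t>{}(packed);
    }
};

// Stand-in for an expensive-to-create GPU state object.
struct DepthStencilState {
    explicit DepthStencilState(const DepthStencilDesc& d) : desc(d) {}
    DepthStencilDesc desc;
};

class DepthStencilStateCache {
public:
    // Returns the cached state for desc, creating it only on the first request.
    std::shared_ptr<DepthStencilState> get(const DepthStencilDesc& desc) {
        auto it = _cache.find(desc);
        if (it != _cache.end())
            return it->second;
        ++_creations;
        auto state = std::make_shared<DepthStencilState>(desc);
        _cache.emplace(desc, state);
        return state;
    }

    // Number of states actually constructed (cache misses).
    std::size_t creations() const { return _creations; }

private:
    std::unordered_map<DepthStencilDesc, std::shared_ptr<DepthStencilState>,
                       DepthStencilDescHash> _cache;
    std::size_t _creations = 0;
};
```

In a mixed 2D/3D frame the same few descriptors tend to alternate, so a cache like this pays the creation cost once per distinct descriptor rather than once per state switch.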

Fix win build. Elvis operator removed.
@@ -219,6 +278,24 @@ inline int clamp(int value, int min, int max) {

CommandBufferMTL::~CommandBufferMTL()
{
{
// [translated from Russian] if the current CommandBuffer is still being executed by the GPU
Contributor

The comment needs to be translated to English.

@halx99 halx99 mentioned this pull request Sep 1, 2020
67 tasks
code not related to MSAA has been removed.
@arsen-aar
Author

Sorry, it was my colleague's code that is not related to the MSAA optimization.

Without it, he gets an AddressSanitizer warning every time the macOS app closes. The behavior reproduces on only half of our team's MacBooks. The sanitizer stack is attached:


#0 0x105b4fa4e in invocation function for block in cocos2d::backend::CommandBufferMTL::endFrame() CommandBufferMTL.mm:449
    #1 0x7fff3656c37d in MTLDispatchListApply (Metal:x86_64+0x1b37d)
    #2 0x7fff3656c802 in -[_MTLCommandBuffer didCompleteWithStartTime:endTime:error:] (Metal:x86_64+0x1b802)
    #3 0x7fff3656c667 in -[MTLIOAccelCommandBuffer didCompleteWithStartTime:endTime:error:] (Metal:x86_64+0x1b667)
    #4 0x7fff3656c556 in -[_MTLCommandQueue commandBufferDidComplete:startTime:completionTime:error:] (Metal:x86_64+0x1b556)
    #5 0x7fff55ce7389 in ioAccelCommandQueueBlockFenceCallback (IOAccelerator:x86_64+0x3389)
    #6 0x7fff33dd479d in IODispatchCalloutFromCFMessage (IOKit:x86_64+0x579d)
    #7 0x7fff33dd461e in _IODispatchCalloutWithDispatch (IOKit:x86_64+0x561e)
    #8 0x7fff6b0183d3 in dispatch_mig_server (libdispatch.dylib:x86_64+0x193d3)
    #9 0x10f69226a in __wrap_dispatch_source_set_event_handler_block_invoke (libclang_rt.asan_osx_dynamic.dylib:x86_64h+0x4526a)
    #10 0x7fff6b001657 in _dispatch_client_callout (libdispatch.dylib:x86_64+0x2657)
    #11 0x7fff6b003817 in _dispatch_continuation_pop (libdispatch.dylib:x86_64+0x4817)
    #12 0x7fff6b0134bd in _dispatch_source_invoke (libdispatch.dylib:x86_64+0x144bd)
    #13 0x7fff6b006af5 in _dispatch_lane_serial_drain (libdispatch.dylib:x86_64+0x7af5)
    #14 0x7fff6b0075d5 in _dispatch_lane_invoke (libdispatch.dylib:x86_64+0x85d5)
    #15 0x7fff6b010c08 in _dispatch_workloop_worker_thread (libdispatch.dylib:x86_64+0x11c08)
    #16 0x7fff6b25ba3c in _pthread_wqthread (libsystem_pthread.dylib:x86_64+0x2a3c)
    #17 0x7fff6b25ab76 in start_wqthread (libsystem_pthread.dylib:x86_64+0x1b76)
0x6110000b87f8 is located 120 bytes inside of 256-byte region [0x6110000b8780,0x6110000b8880)
freed by thread T0 here:
    #0 0x10f69f3fd in wrap__ZdlPv (libclang_rt.asan_osx_dynamic.dylib:x86_64h+0x523fd)
    #1 0x105b48937 in cocos2d::backend::CommandBufferMTL::~CommandBufferMTL() CommandBufferMTL.mm:262
    #2 0x1050ded5a in cocos2d::Ref::release() CCRef.cpp:150
    #3 0x10506138e in cocos2d::Renderer::~Renderer() CCRenderer.cpp:189
    #4 0x105061594 in cocos2d::Renderer::~Renderer() CCRenderer.cpp:183
    #5 0x10577a412 in cocos2d::Director::~Director() CCDirector.cpp:183
    #6 0x10577a692 in cocos2d::Director::~Director() CCDirector.cpp:161
    #7 0x10577a76b in cocos2d::Director::~Director() CCDirector.cpp:161
    #8 0x1050ded5a in cocos2d::Ref::release() CCRef.cpp:150
    #9 0x1057860f0 in cocos2d::Director::purgeDirector() CCDirector.cpp:1086
    #10 0x10578a046 in cocos2d::Director::mainLoop() CCDirector.cpp:1426
    #11 0x1050df98f in cocos2d::Application::run() CCApplication-mac.mm:103
    #12 0x1003ce0c9 in main main.cpp:33
    #13 0x7fff6b05acc8 in start (libdyld.dylib:x86_64+0x1acc8)
previously allocated by thread T0 here:
    #0 0x10f69f1fd in wrap__ZnwmRKSt9nothrow_t (libclang_rt.asan_osx_dynamic.dylib:x86_64h+0x521fd)
    #1 0x1052640c4 in cocos2d::backend::DeviceMTL::newCommandBuffer() DeviceMTL.mm:91
    #2 0x1050616fd in cocos2d::Renderer::init() CCRenderer.cpp:201
    #3 0x10578052b in cocos2d::Director::setOpenGLView(cocos2d::GLView*) CCDirector.cpp:434
    #4 0x10037a180 in AppDelegate::applicationDidFinishLaunching() AppDelegate.cpp:300
    #5 0x1050df69b in cocos2d::Application::run() CCApplication-mac.mm:67
    #6 0x1003ce0c9 in main main.cpp:33
    #7 0x7fff6b05acc8 in start (libdyld.dylib:x86_64+0x1acc8)
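
The trace above appears to show a completion callback from `endFrame()` running on a worker thread after `~CommandBufferMTL()` has already freed the object on the main thread. A minimal sketch of the general remedy, waiting in the destructor until in-flight work has completed, is shown below. It is an assumption-laden illustration, not the code removed from this PR: `FrameSubmitter` and its members are hypothetical names, and a `std::thread` with a short sleep stands in for the GPU and Metal's completion handler.

```cpp
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a command buffer wrapper with async completion.
class FrameSubmitter {
public:
    explicit FrameSubmitter(std::atomic<int>& completedFrames)
        : _completedFrames(completedFrames) {}

    ~FrameSubmitter() {
        // If a frame is still "executing on the GPU", block until its
        // completion callback has run, so the callback never sees freed memory.
        {
            std::unique_lock<std::mutex> lock(_mutex);
            _idle.wait(lock, [this] { return _inFlight == 0; });
        }
        // Join the simulated GPU threads before members are destroyed.
        for (auto& worker : _workers)
            worker.join();
    }

    // Submits a frame; completion is reported later on another thread.
    void endFrame() {
        {
            std::lock_guard<std::mutex> lock(_mutex);
            ++_inFlight;
        }
        _workers.emplace_back([this] {
            // Simulated GPU latency (assumption for the sketch).
            std::this_thread::sleep_for(std::chrono::milliseconds(20));
            onFrameCompleted();
        });
    }

private:
    // Runs on the worker thread and touches members of *this.
    void onFrameCompleted() {
        std::lock_guard<std::mutex> lock(_mutex);
        _completedFrames.fetch_add(1);
        --_inFlight;
        _idle.notify_all();
    }

    std::atomic<int>& _completedFrames;
    std::mutex _mutex;
    std::condition_variable _idle;
    int _inFlight = 0;
    std::vector<std::thread> _workers;
};
```

Destroying a `FrameSubmitter` right after calling `endFrame()` blocks until every callback has finished, which is the property the sanitizer report suggests was missing at application shutdown.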

@halx99
Contributor

halx99 commented Sep 3, 2020

Yeah, great. I think you can post a separate PR to fix this. I also checked the Metal renderer backend of Google's ANGLE project, and it has this code too.

@arsen-aar
Author

Done
#20575

zhongfq added a commit to zhongfq/cocos-lua that referenced this pull request Nov 10, 2020
@paulocoutinhox

Can someone bring it to Axmol?
axmolengine/axmol#1813

3 participants