Implement the Ground Truth Ambient Occlusion (GTAO) effect #110997
|
Looks really good. I found an artifact, I think, when you are under and close to geometry. ASSAO: Screencast.From.2025-09-28.11-22-23.webm GTAO: Screencast.From.2025-09-28.11-22-42.webm |
May I have the model and the parameters (especially the quality and radius) to debug what is happening? I clamped the UV while searching for largest angle in the latest commit. Though this may bring some new artifacts, it will hopefully fix the artifact demonstrated in the video. To completely eliminate the artifact near the border we have to introduce temporal filtering. Besides, reducing the radius will also help alleviate the artifact. |
|
Great work! Can you be specific about which areas were copied from Unreal Engine? For clarity, we cannot merge any code that was copied from Unreal since their licence is restrictive and does not allow it. Anything in this PR that was copied from Unreal needs to be removed before we can consider merging this |
Might be me, but at medium settings ASSAO seems really broken or blocky while GTAO looks far cleaner. Is it a bug or a flaw in the current algorithm? Either way, for me the difference is day and night. Great job, hopefully it gets merged when it's ready. |
Issue seems to be fixed. Same algorithm could be implemented for SSIL? |
Godot uses a mirrored repeat sampler for all screen space effects. I don't really know the real purpose for this so I didn't change it to clamp directly. |
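To make the difference between the two addressing modes concrete, here is a minimal Python sketch of how an out-of-range UV coordinate resolves under clamp-to-edge versus mirrored repeat. The function names are illustrative and are not taken from Godot; in the engine this behaviour comes from the GPU sampler state, not from shader code like this.

```python
def clamp_uv(u: float) -> float:
    """Clamp-to-edge: anything outside [0, 1] snaps to the nearest border."""
    return min(max(u, 0.0), 1.0)


def mirrored_repeat_uv(u: float) -> float:
    """Mirrored repeat: the coordinate reflects back at every integer boundary."""
    t = u % 2.0  # Python's % is non-negative for a positive modulus
    return 2.0 - t if t > 1.0 else t
```

Under mirrored repeat, a horizon search that steps past the screen edge reads reflected on-screen depth instead of the border texel, which is presumably why switching modes trades one kind of border artifact for another.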
Well, none of the code is actually copied, the relationship to Unreal is only on the logic level and somehow indirect. Direct references are the flower renderer (MIT licensed) and O3DE (Apache 2 Licensed), both of which are logic transcripts of the Unreal implementation. Thus, since there's no direct copying, I believe that these changes won't encounter license issues (do they?). |
I am very confused by this response. In the same paragraph you say that you didn't use any of the code, but then say it is a "logic transcripts of the Unreal implementation" which sounds like the code is copied. Which is it? For clarity, you cannot copy any code from Unreal Engine, nor can you look at the code in Unreal Engine and then write code that is functionally the same, but uses different variable names / coding style. The implementation needs to be totally fresh without any reliance on the Unreal Engine source code. The fact that there exist open source implementations that use the same algorithm as Unreal doesn't change that. Unreal's licence doesn't cover the GTAO technique, it covers the code in their engine. Any copying of their code violates the licence. Similarly, if Flower copied Unreal's code and then we copy them, it would still be a copyright violation. |
Thanks for your explanation. I didn't "look at the code in Unreal Engine and then write code that is functionally the same, but uses different variable names / coding style". So my version isn't a line-by-line logic transcription of the UE version. However, by your definition, I used code snippets from O3DE and flower that are somewhat similar to the corresponding parts of the UE source code and might have been taken from UE (or might not; I don't know, so there is a potential violation here). |
The latest commit rebases the logic on the MIT-licensed XeGTAO instead of Unreal. It also adds paper references for some of the magic-looking logic to make it more traceable. The changes might slightly affect the visual appearance, and performance might be very slightly worse (due to the lack of some Unreal magic in the inner integration part), but it's now free of potential copyright violations. |
Could you benchmark the performance? |
On my NVIDIA RTX4070 Laptop (unit: ms):
The Ultra version of GTAO aims to create an ultimate AO quality, so it uses far more samples than Ultra ASSAO, which makes sense that it takes considerably longer. In other quality levels, GTAO is no slower than ASSAO. |
|
You can try to run it once and see what kind of differences you see. I wouldn't be entirely surprised if it's a mistake; it wouldn't be the first and it won't be the last. Also, I'm curious whether fast acos is actually faster than just using normal acos. |
Approximate acos is way faster than regular acos. See the benchmarks in #101973. |
Visually there's none, but theoretically there will be some. For example, the falloff is calculated differently (this makes the attenuation look different but can be compensated for by adjusting the radius parameter), and when calculating step length Unreal uses
Well, the original slide from SIGGRAPH indicates using fast acos and fast invsqrt, so I followed it. |
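For reference, this is a Python sketch of one widely used polynomial approximation of acos, of the kind that GTAO-style shaders tend to use. The constant and the function name are illustrative, and this is not claimed to be the exact routine in this PR. It shows the shape of the trick: one multiply-add, one square root, and a mirror for negative inputs.

```python
import math


def fast_acos(x: float) -> float:
    """Approximate acos(x) for x in [-1, 1] with a linear term times sqrt(1 - |x|)."""
    ax = abs(x)
    res = (-0.156583 * ax + math.pi / 2.0) * math.sqrt(1.0 - ax)
    # acos(-x) = pi - acos(x), so negative inputs are mirrored.
    return res if x >= 0.0 else math.pi - res
```

The approximation is exact at -1, 0, and 1 and stays within roughly a hundredth of a radian in between. The speed benefit discussed above applies to GPU code; this Python version only illustrates the accuracy side.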
|
Well, it'd also be worth seeing how much of a performance difference going from 4 passes to 1 pass gives, especially if nobody's able to tell the difference. If it's a decent perf bump, why not? And you can always add a comment that it used to be 4, in case somebody in the future runs into issues and wants to check whether that was a solution we unknowingly removed. |
Still having a similar problem when running GTAO at half size. |
You cannot copyright an optimization, what? As long as it benefits performance, even if it's a few ns, it's good enough. |
1 pass for performance, even if it affects visuals, could be beneficial for the Compatibility renderer. |
This https://github.com/Jamsers/Bistro-Demo-Tweaked I suppose. |
|
What I don't understand is why not opt to do the complete implementation of XeGTAO, as was mentioned at the GodotCon? I know this requires temporal filters, but in the end, it will be an alternative to ASSAO and not the default. Clarification: At GodotCon 2024, as I understood it, it was said that there is a desire to improve the quality of ASSAO (in addition to other ss effects) with a temporal filter, I assume to resemble XeGTAO, but in 2023 it was implied that XeGTAO was intended to be implemented, I assume as an alternative to ASSAO. Am I wrong? |
A temporal filter within GTAO is what UE does, not XeGTAO. AFAIK, XeGTAO just uses a spatial edge-aware smart blur within the GTAO pass and relies on a subsequent TAA effect to filter temporally. In Godot we can do the same (enable GTAO + TAA to get temporal filtering). The reason Unreal does temporal filtering within GTAO is, I guess, to make GTAO look better without TAA, at a cost in performance. So, in terms of XeGTAO, the current implementation is already "full". |
Out of curiosity, did you run these tests at full or half resolution? Since XeGTAO runs at full resolution by default, I thought that is the reason this PR is almost 2x faster, aside from lacking some optimizations. Also, if you think this is ready for review, you could let the rendering devs know in the rendering channel of the Godot Rocket.Chat. |
You are right. Godot always runs the gather pass at half resolution. If "half resolution" is checked, then Godot actually runs gather at 1/4 resolution. This would be an important reason.
Oh, I didn't know that. I will go for a post. |
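As a rough illustration of what that means for the amount of work, here is a small Python sketch. It assumes "half" and "1/4" refer to the per-axis scale, which the thread does not state explicitly, and the function name is hypothetical.

```python
def gather_resolution(width: int, height: int, half_size: bool) -> tuple[int, int]:
    """Gather-pass size, ASSUMING per-axis divisors of 2 (default) and 4 (half size)."""
    divisor = 4 if half_size else 2
    # Round up so odd viewport sizes still cover every pixel.
    return ((width + divisor - 1) // divisor, (height + divisor - 1) // divisor)
```

Under that assumption, a 1920x1080 viewport gathers at 960x540 by default and at 480x270 with half size enabled, that is 1/4 and 1/16 of the full-resolution pixel count. A full-resolution gather, as described for XeGTAO above, would therefore process about four times as many pixels as the default here.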
clayjohn
left a comment
This looks really cool, great work!
But I'm not sure what the goal is here. The visual results seem to be very similar to the current SSAO implementation, but with a much higher cost. Particularly in your screenshots it appears that ASSAO ultra looks just as good as GTAO ultra, but is almost as cheap as GTAO medium quality. So the quality/performance tradeoff seems to get quite a bit worse.
I don't think adding GTAO as an option alongside ASSAO is a good idea. Doing so adds complexity for maintainers who now have another permutation to worry about when touching SSAO code and it also adds confusion for users (especially when there is no clear quality improvement at the high quality settings). There needs to be a compelling reason to add such complexity and I don't see a compelling reason in this PR.
For context, we have discussed improving SSAO for a while in the rendering meetings, but what we had in mind was to switch to a new technique completely. The idea was to replace ASSAO with something like XeGTAO or SSAOVB to significantly improve quality. Then, at the same time, add support for optional temporal supersampling in order to make it even cheaper. The benefit of such an approach is that it keeps our code simpler and easier to maintain, and it brings in the quality improvements that users have been wanting.
|
This is basically XeGTAO. Also, there are quality improvements, such as the removal of an ASSAO artifact that is clearly visible in ASSAO medium but also occurs in ASSAO Ultra, just less so and only in more complex environments. |
There is: GTAO looks sharper and less blurry than ASSAO. The only reasons it might look worse than XeGTAO are the blur logic (in Godot it looks really noisy) and that, unlike XeGTAO, GTAO does not compute bent normals (doing so would cost 25% of fps). Moreover, I don't think XeGTAO would bring more performance, since based on his benchmarks XeGTAO seems to be more expensive (gtao max costs 26 and xegtao 0.52). I think this PR is a good middle ground: at least for me it looks better than ASSAO but falls behind XeGTAO, so we could just add this and then, when you guys have the time for it, add XeGTAO. |
Yes, this is adapted from XeGTAO. There are notable visual improvements. In the scenes we've tested, ASSAO creates large regions of dilute AO, whereas GTAO creates darker AO that spreads across a smaller range, looking sharper and more realistic. As for SSAOVB, I've tested it locally, but I found that it underperforms the original GTAO in some circumstances. This is because SSAOVB doesn't have falloff, so far and near occluders within the radius contribute the same, causing some visual degradation compared with orthodox GTAO. In mainstream game engines (Unreal, Unity), GTAO remains the popular choice. |
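The falloff point can be sketched in a few lines of Python. This contrasts a hard cut at the radius with a linear distance falloff. Both function names and the `falloff_fraction` parameter are hypothetical, and this is not the exact weighting used by XeGTAO or by this PR.

```python
def hard_cut_weight(distance: float, radius: float) -> float:
    """No falloff: every occluder inside the radius counts fully."""
    return 1.0 if distance <= radius else 0.0


def linear_falloff_weight(distance: float, radius: float,
                          falloff_fraction: float = 0.5) -> float:
    """Full weight near the receiver, fading linearly to zero at the radius.

    falloff_fraction (assumed > 0) is the share of the radius used for the fade.
    """
    falloff_range = radius * falloff_fraction
    t = (radius - distance) / falloff_range
    return min(max(t, 0.0), 1.0)
```

With the hard cut, an occluder at 0.99 of the radius contributes as much as one touching the receiver, while one at 1.01 contributes nothing. With the linear version the contribution fades out smoothly, which is the behaviour the comment above attributes to orthodox GTAO.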
My implementation is mostly XeGTAO. The blur part might be one of the reasons. Another possible reason is that Godot uses four downsampled depth textures for all screen space effects (possibly for bandwidth reduction), whereas XeGTAO uses a full-resolution depth texture. |
@clayjohn All right, I understand what you would like to have. From now on I'll split the GTAO implementation from ASSAO and extend it to a full XeGTAO implementation (port the blur part of XeGTAO and stop reusing the blur passes of ASSAO). That will make this a complete XeGTAO that runs parallel to ASSAO and could serve as a drop-in replacement for it. I'll notify you when I finish. |
Also take into account that XeGTAO includes bent normal computation for better AO, which helps a lot in test scenes like Bistro. |
Be aware that bent normals don't actually affect the AO term itself. Bent normals are encoded during AO and used in the subsequent lighting pass for anisotropic and specular lighting, so they shouldn't make a difference in the SSAO debug view. I doubt that the real difference with and without bent normals is large. |
|
I don't understand where this is going if XeGTAO is going to replace ASSAO. That would lead to the forced use of TAA since XeGTAO is very noisy despite having a spatial denoiser. And yes, a spatio-temporal denoiser can be used to avoid forcing the use of TAA, but let's be honest, it produces the same results as having TAA active, such as ghosting and the noise is still visible when there's movement.
Godot already supports bent normals, the only detail is that the user has to generate the bent normal maps. I can understand that GTAO would generate them automatically, but I think this would be redundant. |
Yes, this is why I kept ASSAO as an option whilst implementing GTAO. I believe ASSAO could still be useful for low end devices without AA techniques enabled. However, it would be up to the Godot devs whether to keep or completely replace ASSAO.
That would make sense. I will investigate if I have extra time. But keep in mind that I may end up without bent normals. |
Sadly, that's the reality nowadays: all screen-space effects use motion vectors to look good; look at UE5 or Unity HDRP. It's kind of similar to the SSR overhaul PR, where you need temporal supersampling to fix some issues when moving. As of now there isn't any temporal filter in this PR, though using TAA does roughly the same thing.
That's fine, this is just a feature that would be nice to have (since bent normals are awkward to bake in programs like Blender), though it's better as a toggleable option since it cuts 25% of performance. |
It's possible to implement a temporal reprojection pass separate from TAA, so TAA is not a strict requirement for it. We do the same for volumetric fog right now (it's enabled by default). Volumetric fog is low-frequency data, so it works well for this use case. SSAO is proportionally higher-frequency data, but still rather low-frequency when you consider the blur passes and half-resolution rendering (by default). That said, games that feature high-end visuals (and are likely going to use SSAO in general) almost always use TAA or some upscaler by default nowadays, so you might as well make use of it to improve performance and visuals. |
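A standalone temporal pass of this kind usually boils down to blending the current frame's AO with a reprojected history value. The Python sketch below shows only that blend for a single pixel. The names `alpha` and `history_valid` are hypothetical, and reprojection with motion vectors is assumed to have happened already; nothing here is taken from Godot's volumetric fog code.

```python
def temporal_blend(history: float, current: float,
                   alpha: float = 0.1, history_valid: bool = True) -> float:
    """Exponential moving average of AO over frames.

    alpha is the weight of the new frame; history_valid is False when the
    reprojected sample was disoccluded or fell off-screen.
    """
    if not history_valid:
        return current  # nothing trustworthy to blend with, restart accumulation
    return history + alpha * (current - history)
```

A small `alpha` smooths noise more strongly but takes more frames to react to changes, which is where the ghosting mentioned earlier in the thread comes from.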
|
This is a bit off-topic from GTAO, but if this is going to be the case, is there any chance of replacing Separable Subsurface Scattering with Burley Subsurface Scattering? |
|
We could create another PR and try adding a temporal pass to see if it improves the quality without incurring a significant performance cost. |
|
New experiment with SSILVB (GTAO with visibility bitmask), as mentioned in the 4.x Rendering Roadmap: This might be because of a hard cut at the edge of the radius: points within the radius have a considerable contribution, whereas just a tiny step outside the sampling radius the value becomes zero, leaving a sharp border in the AO value. I will investigate whether other existing implementations also have this artifact and how they avoid it. |
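For readers unfamiliar with the visibility-bitmask idea, here is a minimal Python sketch of the bookkeeping for one slice. It assumes 32 sectors and horizon values already normalized to [0, 1] across the slice. The function names are illustrative, and this is not the code from this experiment.

```python
import math

SECTOR_COUNT = 32  # one bit per angular sector of the slice


def occlude_sectors(mask: int, min_horizon: float, max_horizon: float) -> int:
    """Mark the sectors covered by one sample as occluded.

    min_horizon and max_horizon are assumed normalized to [0, 1], min <= max.
    """
    start = int(min_horizon * SECTOR_COUNT)
    count = math.ceil((max_horizon - min_horizon) * SECTOR_COUNT)
    count = max(0, min(count, SECTOR_COUNT - start))
    bits = ((1 << count) - 1) << start
    return (mask | bits) & 0xFFFFFFFF


def visibility(mask: int) -> float:
    """Fraction of sectors that are still unoccluded."""
    return 1.0 - bin(mask).count("1") / SECTOR_COUNT
```

In this form, any sample that is accepted sets its sectors completely and any sample that is rejected sets none, with no distance weighting in between. That would be consistent with the sharp border described above, though it is only one possible explanation.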
You can check cybereality's implementation, which looks dope. It's for his web game engine, which I think is open source on GitHub, so you could check there: https://cybereality.com/screen-space-indirect-lighting-with-visibility-bitmask-improvement-to-gtao-ssao-real-time-ambient-occlusion-algorithm-glsl-shader-implementation/ plus some optimization he did: https://x.com/cybereality/status/1816678708671021247 Also, this would be better as an SSIL rework since we already have SSIL. |
Yes I did reference this article for implementation. But I checked and found that my implementation is logically identical to the code it provided. I also referenced Spartan engine. Maybe it's the setting of parameters? |
Could be; you might need to ask clayjohn about it on Rocket.Chat or here. Here is another link that discusses this: https://github.com/mrdoob/three.js/issues/29668 |
|
Closing in favor of #113304 . |




Resolves godotengine/godot-proposals#3223, but for 4.x.
Showcase:

At medium quality, the visual quality of GTAO is comparable to ASSAO, while at Ultra quality GTAO looks conspicuously better than ASSAO.
Note: Due to algorithm differences, the parameters cannot be mapped perfectly between ASSAO and GTAO (for example, ASSAO intensity 2.0 and GTAO intensity 1.0 give visually similar AO strengths). This could be balanced by adding a multiplier for some parameters if required.
Togglable by UI:

Implementation details
Main gather logic was taken from the GTAO implementation of Unreal Engine and flower renderer. Also referenced XeGTAO to check the correctness of some computational details.
The blur and interleave parts reuse the edge-aware smart blur that ASSAO also depends on.
GTAO isn't much more costly than ASSAO, but as a high-quality effect, it isn't intended for super low-end scenarios. Thus, the effect is only implemented for RendererRD without support for the Compatibility backend.
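For context on what the gather computes, the published GTAO formulation integrates cosine-weighted visibility analytically between the two horizon angles of each screen-space slice. Below is a Python sketch of that per-slice integral as I understand it from the paper. The function name and argument conventions are assumptions for illustration, not code from this PR.

```python
import math


def slice_visibility(h1: float, h2: float, n: float) -> float:
    """Cosine-weighted visibility of one slice.

    h1 <= 0 <= h2 are the horizon angles on either side of the view vector and
    n is the angle of the projected normal, all in radians. The horizons are
    assumed to be already clamped to the hemisphere around the normal.
    """
    def arc(h: float) -> float:
        return 0.25 * (-math.cos(2.0 * h - n) + math.cos(n) + 2.0 * h * math.sin(n))

    return arc(h1) + arc(h2)
```

With the normal facing the viewer (`n = 0`), fully open horizons at -pi/2 and +pi/2 give a visibility of 1, and horizons that both collapse to 0 give a visibility of 0. The full effect averages this over several slice directions and weights each slice by the length of the projected normal.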
Questions
While implementing the effect, I noticed that the ASSAO gather is run four times with four different depth textures that are packed into a texture array, and the four resulting values are then averaged. I don't understand the real purpose of this, or whether I can remove this logic and run the gather only once for GTAO. Could anyone explain this to me?
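One plausible reading of the four-layer depth array is a 2x2 deinterleave, where each layer holds every second pixel in both directions with a different offset. The Python sketch below shows that layout on a plain nested list. It is an assumption about the packing, not a description of Godot's actual downsampling shader, and the function name is made up.

```python
def deinterleave_depth(depth: list[list[float]]) -> list[list[list[float]]]:
    """Split a depth image into four half-resolution layers (2x2 deinterleave).

    Layer index = (y % 2) * 2 + (x % 2); odd trailing rows/columns are dropped.
    """
    height, width = len(depth), len(depth[0])
    layers = [[[0.0] * (width // 2) for _ in range(height // 2)] for _ in range(4)]
    for y in range(height - height % 2):
        for x in range(width - width % 2):
            layer = (y % 2) * 2 + (x % 2)
            layers[layer][y // 2][x // 2] = depth[y][x]
    return layers
```

If the layout is indeed like this, each of the four gather passes reads a compact half-resolution texture, which tends to be friendlier to the texture cache than sparse reads from the full-resolution buffer. Whether a single pass would be enough for GTAO is exactly the open question above.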
Possible future improvements
A "full" implementation of GTAO would have to include a temporal filter pass. The current implementation omits the temporal filtering step, since leaving it out does not considerably degrade visual quality (if a better result is needed, enabling TAA compensates for the missing temporal filter, though it still looks fine without TAA). Temporal filtering could be added in the future through a separate PR, by me or anyone else who's willing to contribute, to make the effect look even better.