Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Huge increase in allocations when timing a function with return values #81

Open
sloede opened this issue Mar 14, 2020 · 3 comments
Open

Comments

@sloede
Copy link
Contributor

sloede commented Mar 14, 2020

I directly apologize for the convoluted example, but I was not able to reproduce this with a more minimum WE.

When I time this piece of code (version A)

  @timeit timer() "blending factors" begin                                                          
  begin                                                                                             
    alpha, element_ids_dg, element_ids_dgfv = calc_blending_factors(dg, dg.elements.u)              
  end   

instead of this (version B)

  out = Any[]                                                                                       
  @timeit timer() "blending factors" calc_blending_factors(out, dg, dg.elements.u)                  
  alpha, element_ids_dg, element_ids_dgfv = out

where I did not change anything in comparison to A but wrap the return values in an Any[] container instead of returning them directly, I get a huge increase in memory allocations (and thus run time):

With return values (version A):

     volume integral         415    19.2s  93.8%  46.3ms   8.99GiB  97.4%  22.2MiB
       blended DG-FV         415    18.9s  92.3%  45.6ms   8.87GiB  96.1%  21.9MiB
       blending factors      415    285ms  1.39%   686μs    119MiB  1.26%   294KiB
       pure DG               415   5.49ms  0.03%  13.2μs    396KiB  0.00%     976B

With out (version B):

     volume integral         415    3.26s  75.0%  7.85ms    124MiB  38.5%   307KiB
       blended DG-FV         415    2.99s  68.8%  7.21ms   1.31MiB  0.41%  3.23KiB
       blending factors      415    261ms  6.01%   629μs    119MiB  37.0%   294KiB
       pure DG               415   4.94ms  0.11%  11.9μs    454KiB  0.14%  1.09KiB

If I do not time the call to calc_blending_factors (using either version A-type return values or version B), I get roughly the same timings as in B. This does seem very odd to me and I don't have the slightest idea where I should start looking for the problem.

(If this is not the correct place to report such behavior, but rather e.g. Discourse, please let me know)

@KristofferC
Copy link
Owner

KristofferC commented Mar 14, 2020

Hard to do much here without some kind of reproducer. My guess is that it is related to JuliaLang/julia#15276

@jonas-schulze
Copy link

Hi @sloede and @ranocha, I could not find where exactly you mentioned this issue within trixi-framework/Trixi.jl#200, but do you have any updates on this TimerOutputs issue?

PS: I really enjoyed your JuliaCon talk on YouTube.

@ranocha
Copy link
Contributor

ranocha commented Jan 26, 2024

We couldn't find a really good way to resolve this issue. Thus, we have basically created a copy of @timeit_debug but without the try block in

https://github.com/trixi-framework/Trixi.jl/blob/70d365f405c2494e314277f22e63dcc32d04f095/src/auxiliary/auxiliary.jl#L257-L275

That fixed the severe performance problems we got with the macros implemented in TimerOutputs.jl.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants