Replies: 3 comments
-
Hello Peter! I totally understand what you are talking about. Quite often the culprit hides across multiple flames, and digging it out requires a bit of effort. There are not one or two but three ways to do it.
Hope this helps!
-
Hi Alex, thanks for your tips. I realized that the flamegraphs are generated from a simple txt file that contains the stack traces and the number of samples for each, so it's not hard to generate a rudimentary table from that. Here's a sample of where I got in 30 minutes:

```clojure
(require '[clojure.java.io :as io]
         '[clojure.pprint :as pp]
         '[clojure.string :as str]
         'clj-async-profiler.post-processing)

;; Split one collapsed-stack line into the stack string and the sample count.
(defn st&nm [ln]
  (let [ix (str/last-index-of ln " ")]
    [(subs ln 0 ix) (Long/valueOf (subs ln (inc ix)))]))

;; Demunge the frames of one stack and return the set of distinct calls in it.
(defn st->calls [s]
  (set (str/split (clj-async-profiler.post-processing/demunge-java-clojure-frames s)
                  #";")))

;; Walk all stack lines, crediting each stack's samples to every call that
;; appears in it; return the per-call totals and the grand total of samples.
(defn profile->data [sts]
  (loop [mp {} tc 0 sts sts]
    (if-some [[ln & mr] sts]
      (let [[st nm] (st&nm ln)
            cls (st->calls st)]
        (recur (reduce (fn [ac cl] (update ac cl #(if % (+ % nm) nm))) mp cls)
               (+ tc nm)
               mr))
      [mp tc])))

;; Print the calls sorted by sample count, filtering out clojure.* and
;; nrepl.* frames, with each call's share of the total as a percentage.
(let [[mp tc] (profile->data
               (line-seq (io/reader "/tmp/clj-async-profiler/results/18-cpu-2020-03-03-14-04-30.txt")))]
  (println "Total calls:" tc)
  (pp/print-table
   (reduce (fn [ac [mt cn]]
             (if (re-matches #"(clojure|nrepl)\..*" mt)
               ac
               (conj ac {:method mt :count cn :usage-% (-> cn (/ tc) (* 100) long)})))
           []
           (sort-by (comp - second) mp))))
```

and the output is something like this:

This output makes me happy that flamegraphs are the default :) Nevertheless, the table output can be helpful in certain scenarios, in addition to the flamegraphs. Thanks for your time, and let me know your thoughts on this!
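P.S. For reference, the txt file is in the collapsed-stack format: one stack per line, frame names separated by semicolons, and the sample count after the last space. A minimal sketch of what the helpers above do with such a line (the stack line and names are made up for illustration, and the exact demunged names depend on clj-async-profiler's demunging):

```clojure
;; A made-up collapsed-stack line: frames separated by ";",
;; sample count after the last space.
(def sample-line
  "java.lang.Thread.run;my.app$handler.invoke;my.app$slow_fn.invoke 42")

(st&nm sample-line)
;; => ["java.lang.Thread.run;my.app$handler.invoke;my.app$slow_fn.invoke" 42]

;; st->calls demunges the Clojure frames into readable names, so this returns
;; something like #{"java.lang.Thread.run" "my.app/handler" "my.app/slow-fn"}
(st->calls (first (st&nm sample-line)))
```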
-
Since this isn't really an issue, I'm closing it. I'm open to further discussion, if there's any interest.
-
Imagine a flame graph like this:
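A simplified, made-up sketch of such a graph (frames grow upward; widths are proportional to samples):

```
|  X   |        |  X  | |  X  |   <- X's combined width dominates, but it is split up
|  Y           ||  Z  | |  W  |   <- Y looks like the big child of A
|  A                          |   <- A covers the whole graph
```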
We see most of the time was spent in A, and there in Y; however, after a closer look one might see that X is the method where most of the CPU time is actually being spent. With an arbitrarily complex call tree it gets extremely hard to find such culprits in the flame graph.

What I'm trying to say is that there are several ways to look at the results:

- A is taking longest to run, therefore we need to optimize it. Maybe Y can be optimized. Maybe calls to A can be partially cached. Maybe ...
- X took longest to run, therefore optimizing it will yield great overall benefits.

Is there a way to obtain the latter results? In the underlying async-profiler I see something similar being printed to the JVM's console output; maybe giving access to that would be a start?
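If it helps, the latter view can also be approximated from the same collapsed txt file by crediting each stack's samples only to its leaf frame, i.e. a self-time flat profile. A minimal sketch, assuming the collapsed-stack format (one stack per line, semicolon-separated frames, sample count last); the function name is mine:

```clojure
(require '[clojure.string :as str])

;; Credit each stack's samples only to its leaf (last) frame,
;; producing a self-time flat profile: frame -> samples.
(defn self-time [lines]
  (reduce (fn [acc ln]
            (let [ix    (str/last-index-of ln " ")
                  stack (subs ln 0 ix)
                  n     (Long/valueOf (subs ln (inc ix)))
                  leaf  (last (str/split stack #";"))]
              (update acc leaf (fnil + 0) n)))
          {}
          lines))

;; Usage: top 10 frames by self time, given a line-seq of the txt file.
;; (take 10 (sort-by (comp - val) (self-time lines)))
```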