Custom loop profile

Hi,

I am trying to benchmark adaptive finite element simulations using Caliper, and I am stuck finding the correct Caliper configuration. I have spent the last 3 days cycling between the documentation and permuting combinations of environment variables without any progress, so I am asking here for help.

Basically, what I want is:
1. the output of loop-report
2. separate output per MPI rank
3. inclusive, aggregated times for some selected regions
4. information about waiting processes
5. additional metadata per loop iteration (e.g. number of elements on the current rank)
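
To make the target concrete, here is a minimal sketch of one row of the per-rank time series I have in mind. It is hand-written C++ and does not use Caliper at all; the struct `IterationRow`, its fields, `csv_header`, and `to_csv` are hypothetical names chosen for illustration, and the CSV layout is only an assumption about a convenient output shape.

```cxx
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical record: one row per "Time Loop" iteration on one MPI rank.
struct IterationRow {
    int           rank;             // MPI rank that produced the row
    int           iteration;        // index of the "Time Loop" iteration
    std::uint64_t total_ns;         // inclusive time of the whole iteration
    std::uint64_t amr_ns;           // inclusive time of the "AMR" region
    std::uint64_t halo_ns;          // inclusive time of all "Halo Exchange" regions
    std::uint64_t waitall_ns;       // time spent waiting in MPI_Waitall
    std::uint64_t num_elements;     // elements owned by this rank
    std::uint64_t num_inner_steps;  // iterations of the "Update Loop"
};

// Column names in the same order that to_csv() writes the fields.
inline std::string csv_header() {
    return "rank,iteration,total_ns,amr_ns,halo_ns,waitall_ns,num_elements,num_inner_steps";
}

// Serialize one row as a comma-separated line (no trailing newline).
inline std::string to_csv(const IterationRow& r) {
    assert(r.rank >= 0 && r.iteration >= 0);  // both are indices, never negative
    std::ostringstream os;
    os << r.rank << ',' << r.iteration << ',' << r.total_ns << ',' << r.amr_ns << ','
       << r.halo_ns << ',' << r.waitall_ns << ',' << r.num_elements << ','
       << r.num_inner_steps;
    return os.str();
}
```

Each rank would write the header once, followed by one such line per iteration of "Time Loop", into its own file.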

At a very high level, my program looks like this:

```cxx
...
CALI_CXX_MARK_LOOP_BEGIN(loop_ann_outer, "Time Loop");
for (auto t = 0.0; t < t_final; t += Δt) {
    timestep_index++;
    CALI_CXX_MARK_LOOP_ITERATION(loop_ann_outer, timestep_index);

    CALI_MARK_BEGIN("AMR");
    ...
       CALI_MARK_BEGIN("Refinement");
       ...
       CALI_MARK_END("Refinement");
       CALI_MARK_BEGIN("Derefinement");
       ...
       CALI_MARK_END("Derefinement");
    ...
    CALI_MARK_END("AMR");
    {
      cali::Annotation::Guard g( cali::Annotation("num_elements").set(num_elements_local) );
    }

    CALI_MARK_BEGIN("Prepare Update");
    ...
    CALI_MARK_END("Prepare Update");

    CALI_CXX_MARK_LOOP_BEGIN(loop_ann_inner, "Update Loop");
    for (...) {
        CALI_CXX_MARK_LOOP_ITERATION(loop_ann_inner, ...);
        ...

        CALI_MARK_BEGIN("Halo Exchange");
        ...
        CALI_MARK_END("Halo Exchange");
    }
    CALI_CXX_MARK_LOOP_END(loop_ann_inner);

    {
      cali::Annotation::Guard g( cali::Annotation("num_inner_steps").set(num_inner_steps_local) );
    }
}
CALI_CXX_MARK_LOOP_END(loop_ann_outer);
...
```

To be specific, I want to generate a time series that contains, for each iteration of "Time Loop", the time spent in MPI_Waitall, the time spent in selected regions, the total time, and the two annotations. The goal is to investigate how load imbalances evolve for different load balancing strategies and numbers of processes. So my question is: how can this be achieved with Caliper? I would also be happy with an external example to start from, or a pointer to the relevant docs page in case I missed something.

Related to this: is it possible that the docs are out of date? I could not figure out where to find the code for the example at http://software.llnl.gov/Caliper/services.html#example.
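
For the load imbalance analysis mentioned above, this is roughly the per-iteration quantity I would compute once the per-rank times are available. The sketch assumes the common definition of load imbalance as the maximum over the mean of the per-rank times; `imbalance_factor` is a hypothetical helper and not part of Caliper.

```cxx
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Load imbalance of one iteration: slowest rank relative to the average rank.
// 1.0 means perfectly balanced; larger values mean more time spent waiting.
inline double imbalance_factor(const std::vector<double>& time_per_rank) {
    assert(!time_per_rank.empty());
    const double max_t =
        *std::max_element(time_per_rank.begin(), time_per_rank.end());
    const double mean_t =
        std::accumulate(time_per_rank.begin(), time_per_rank.end(), 0.0) /
        static_cast<double>(time_per_rank.size());
    assert(mean_t > 0.0);  // avoid dividing by zero for empty iterations
    return max_t / mean_t;
}
```

For example, four ranks taking 1, 1, 1, and 5 time units in one iteration have a mean of 2 and a maximum of 5, so the factor is 2.5.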

### What I tried so far
My first try was to just write the raw data and use cali-query to bring it into the correct shape. This almost worked, but I hit hard drive limits very quickly (because I could not figure out how to filter the event traces correctly), and I could not get the Caliper query exactly right. Here is how I tried to generate the data:
```sh
export CALI_SERVICES_ENABLE=mpi:event:trace:report
for NP in 1 2 4 8 16 32 64
do
    CALI_CONFIG=event-trace\(trace.mpi,output="$NP/performance-report-%mpi.rank%.cali"\) mpirun -np $NP executable ...
done
```
and for the query:
```
"SELECT *,inclusive_sum(time.duration.ns) FORMAT json(human) GROUP BY \"iteration#Time Loop\",region,mpi.rank ORDER BY \"iteration#Time Loop\""
```
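
For reference, this is what I expect the query above to compute, written as a small stand-alone C++ sketch rather than CalQL. It assumes that "inclusive" means a region's own time plus the time of everything nested inside it; the `Record` struct, the `/`-separated region paths, and `inclusive_sum_by_region` are hypothetical stand-ins and not Caliper data structures.

```cxx
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical trace record: exclusive time spent at one region path
// (e.g. "AMR/Refinement") during one "Time Loop" iteration.
struct Record {
    int           iteration;
    std::string   path;          // nested regions separated by '/'
    std::uint64_t exclusive_ns;
};

using Key = std::pair<int, std::string>;  // (iteration, region path)

// Add each record's time to its own region and to every enclosing region,
// keeping iterations separate. A record at "AMR/Refinement" contributes to
// "AMR/Refinement" and to "AMR".
inline std::map<Key, std::uint64_t>
inclusive_sum_by_region(const std::vector<Record>& records) {
    std::map<Key, std::uint64_t> result;
    for (const Record& rec : records) {
        assert(!rec.path.empty());
        std::string prefix = rec.path;
        while (true) {
            result[Key(rec.iteration, prefix)] += rec.exclusive_ns;
            const std::string::size_type pos = prefix.rfind('/');
            if (pos == std::string::npos)
                break;          // reached the outermost region
            prefix.erase(pos);  // step up to the parent region
        }
    }
    return result;
}
```

For example, records of 10 ns in "AMR", 30 ns in "AMR/Refinement", and 20 ns in "AMR/Derefinement" within the same iteration give an inclusive time of 60 ns for "AMR" in that iteration.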



My second attempt was to generate the required data in situ. I first tried the aggregation service:
```sh
export CALI_LOG_VERBOSITY=2
export CALI_SERVICES_ENABLE=event,trace,timestamp,recorder,aggregate,report,mpi,debug
export CALI_AGGREGATE_ATTRIBUTES="???"
export CALI_AGGREGATE_KEY=???
for NP in 1 2 4 8 16 32 64
do
    export CALI_REPORT_FILENAME="$NP/performance-report-%mpi.rank%.cali"
    mpirun -np $NP executable ...
done
```
Here, no matter what I put into `CALI_AGGREGATE_ATTRIBUTES` and `CALI_AGGREGATE_KEY`, I could not get anything meaningful. I also do not understand what I am doing wrong and could not deduce it from the docs, because the output is faulty in every case (the number of output columns changes with each iteration and the data starts to interleave). I have just updated to master and can still reproduce this.



My latest idea was to write a custom loop reporter, because loop-report is closest to what I want. However, I was not sure where to even start after copy-pasting `LoopReportController`. I also could not find out how to extend the output of the loop-report controller from the command line, or even just how to redirect its output to a specific file.

```sh
export CALI_LOG_VERBOSITY=10
export CALI_SERVICES_ENABLE=mpi,debug
for NP in 1 2 4 8 16 32 64
do
    export CALI_REPORT_FILENAME="$NP/performance-report-%mpi.rank%.cali"
    CALI_CONFIG=loop-report,iteration_interval=1,timeseries.maxrows=0 mpirun -np $NP ...
done
```

Thanks in advance,
Dennis



Custom loop profile #521
