Skip to content

Conversation

FASTSHIFT
Copy link

Try to fix the problem of abnormal benchmark scores caused by low timestamp precision: lvgl/lvgl#8751

@AndreCostaaa
Copy link
Contributor

Hi @FASTSHIFT

As I suspected, the results are less stable with this version. These are the results with LVGL master commit d406e5953ca5cc8f601ba7893e6df1fbd88fe2b9

All scenes avg.,65%, 34, 69, 69, 0
All scenes avg.,64%, 34, 87, 87, 0
All scenes avg.,64%, 34, 82, 82, 0

Full results here:

Benchmark Summary (9.4.0 dev)
Name, Avg. CPU, Avg. FPS, Avg. time, render time, flush time
Empty screen, 37%, 37, 258, 258, 0
Moving wallpaper, 97%, 43, 66, 66, 0
Single rectangle, 18%, 43, 0, 0, 0
Multiple rectangles, 31%, 43, 39, 39, 0
Multiple RGB images, 40%, 43, 5, 5, 0
Multiple ARGB images, 60%, 43, 49, 49, 0
Rotated ARGB images, 98%, 18, 44, 44, 0
Multiple labels, 61%, 43, 14, 14, 0
Screen sized text, 98%, 19, 74, 74, 0
Multiple arcs, 96%, 36, 68, 68, 0
Containers, 55%, 42, 53, 53, 0
Containers with overlay, 79%, 16, 199, 199, 0
Containers with opa, 60%, 40, 15, 15, 0
Containers with opa_layer, 70%, 36, 68, 68, 0
Containers with scrolling, 92%, 25, 92, 92, 0
Widgets demo, 62%, 26, 74, 74, 0
All scenes avg.,65%, 34, 69, 69, 0
Benchmark Summary (9.4.0 dev)
Name, Avg. CPU, Avg. FPS, Avg. time, render time, flush time
Empty screen, 37%, 37, 258, 258, 0
Moving wallpaper, 97%, 43, 66, 66, 0
Single rectangle, 17%, 43, 0, 0, 0
Multiple rectangles, 32%, 43, 32, 32, 0
Multiple RGB images, 23%, 43, 5, 5, 0
Multiple ARGB images, 60%, 43, 59, 59, 0
Rotated ARGB images, 88%, 18, 187, 187, 0
Multiple labels, 58%, 43, 14, 14, 0
Screen sized text, 99%, 18, 168, 168, 0
Multiple arcs, 95%, 37, 68, 68, 0
Containers, 55%, 43, 77, 77, 0
Containers with overlay, 79%, 16, 200, 200, 0
Containers with opa, 61%, 41, 14, 14, 0
Containers with opa_layer, 70%, 37, 67, 67, 0
Containers with scrolling, 92%, 25, 92, 92, 0
Widgets demo, 62%, 26, 98, 98, 0
All scenes avg.,64%, 34, 87, 87, 0
Benchmark Summary (9.4.0 dev)
Name, Avg. CPU, Avg. FPS, Avg. time, render time, flush time
Empty screen, 37%, 37, 258, 258, 0
Moving wallpaper, 87%, 43, 54, 54, 0
Single rectangle, 18%, 43, 0, 0, 0
Multiple rectangles, 31%, 43, 32, 32, 0
Multiple RGB images, 24%, 43, 5, 5, 0
Multiple ARGB images, 70%, 43, 49, 49, 0
Rotated ARGB images, 98%, 18, 187, 187, 0
Multiple labels, 71%, 43, 62, 62, 0
Screen sized text, 98%, 19, 74, 74, 0
Multiple arcs, 96%, 36, 68, 68, 0
Containers, 39%, 43, 6, 6, 0
Containers with overlay, 79%, 16, 199, 199, 0
Containers with opa, 59%, 40, 15, 15, 0
Containers with opa_layer, 70%, 37, 67, 67, 0
Containers with scrolling, 92%, 25, 141, 141, 0
Widgets demo, 62%, 26, 98, 98, 0
All scenes avg.,64%, 34, 82, 82, 0

At least the FPS and the %CPU are very stable, I suspect there maybe some issues with how LVGL calculates the render and flush time, specially when we have scenes that take > 200ms to render, it could be useful to run with LVGL Profiler that I recently learned about

Btw to run these tests locally, you can build the same docker image that is used in the LVGL CI:

#Inside so3
docker build . -f docker/Dockerfile.lvperf_32b -t so3-lv_perf32b

And then use it in LVGL:

#Inside LVGL
cp lv_conf_template.h lv_conf_perf32b.h

python scripts/generate_lv_conf.py \
   --template lv_conf_template.h \
   --config lv_conf_perf32b.h \
   --defaults configs/ci/perf/lv_conf_perf32b.defaults
   
docker run -t --privileged \
            -v /dev:/dev \
            -v $(pwd)/lv_conf_perf32b.h:/so3/usr/lib/lv_conf.h \
            -v $(pwd):/so3/usr/lib/lvgl \
            -v $(pwd)/scripts/perf/lvperf_dependencies.sh:/so3/install_dependencies.sh  \
            -v $(pwd)../so3-persistence:/persistence \
            -v $(pwd)../so3-usr-build-32b:/so3/usr/build \
            so3-lv_perf32b

For 64b, simply replace 32b with 64b in the above commands

@FASTSHIFT
Copy link
Author

@AndreCostaaa

I'm more inclined to believe it's a timestamp issue, as the profiler's trace files show that each frame doesn't run in 16ms.

image

log.txt

I'm also wondering if the "usleep" and "clock_gettime" commands in this environment are actually retrieving the QEMU virtual machine's time or the physical PC's time. My understanding is that both use the virtual machine's timestamp, unrelated to the actual PC's time.

Also, the usleep function seems to be inaccurate. Here's a test case:

#include <time.h>
static uint32_t tick_get(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	uint32_t tick = ts.tv_sec * 1000 + ts.tv_nsec / 1000000;

	return tick;
}

void tick_test()
{
    uint32_t start_time = tick_get();
    usleep(5000);
    LV_LOG_USER("elapsed time: %d ms", tick_get() - start_time);

    start_time = tick_get();
    usleep(10000);
    LV_LOG_USER("elapsed time: %d ms", tick_get() - start_time);

    start_time = tick_get();
    usleep(15000);
    LV_LOG_USER("elapsed time: %d ms", tick_get() - start_time);
}

I got this log:

[User]  (2.37, +2037)    tick_test: elapsed time: 9 ms lv_demo_benchmark.c:546
[User]  (2.57, +20)      tick_test: elapsed time: 20 ms lv_demo_benchmark.c:550
[User]  (2.77, +20)      tick_test: elapsed time: 20 ms lv_demo_benchmark.c:555

It seems to me that the system scheduling cycle has only 10ms accuracy. How can I set the scheduling cycle to 1ms?

@AndreCostaaa
Copy link
Contributor

It does, it's defined here:

/* SCHEDULE_FREQ is the scheduler tick expressed in ms */
#define SCHEDULE_FREQ 10

@FASTSHIFT
Copy link
Author

It does, it's defined here:

/* SCHEDULE_FREQ is the scheduler tick expressed in ms */
#define SCHEDULE_FREQ 10

@AndreCostaaa

I tried #define SCHEDULE_FREQ 1 and then ran docker build . -f docker/Dockerfile.lvperf_32b -t so3-lv_perf32b to recompile, but there was still no improvement after running LVGL:

[User]  (2.20, +2020)    tick_test: elapsed time: 7 ms lv_demo_benchmark.c:546
[User]  (2.40, +20)      tick_test: elapsed time: 20 ms lv_demo_benchmark.c:550
[User]  (2.60, +20)      tick_test: elapsed time: 20 ms lv_demo_benchmark.c:555

Am I missing something?

@FASTSHIFT
Copy link
Author

FASTSHIFT commented Aug 29, 2025

Moreover, when the same code is run for the first time, the first time measurement will have a very large probability deviation:

[User]  (2.2, +2002)     tick_test: elapsed time: 1007 ms lv_demo_benchmark.c:546
[User]  (2.22, +20)      tick_test: elapsed time: 20 ms lv_demo_benchmark.c:550
[User]  (2.42, +20)      tick_test: elapsed time: 20 ms lv_demo_benchmark.c:555

This may be the cause of the abnormal benchmark rendering time measurement: Empty screen, 37%, 37, 258, 258, 0.

@AndreCostaaa
Copy link
Contributor

AndreCostaaa commented Aug 29, 2025

Am I missing something?

I might be the one that is missing something, there's a timer that raises an IRQ every SCHEDULE_FREQ ms.
This busy loop shows that we have a precision of around 31.2 useconds so the system should be able to keep up with a lower schedule frequency, I don't know why it doesn't. Any ideas @daniel-rossier ?

void busy_loop()
{
	uint32_t start_tick = tick_get();
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	uint64_t last_nsec = ts.tv_nsec;

	while (1) {
		clock_gettime(CLOCK_MONOTONIC, &ts);
		uint32_t current_tick = tick_get();
		if (current_tick - start_tick > 500) {
			break;
		}
		printf("Tick: %" PRIu32, current_tick - start_tick);
		printf(" Timestamp: %" PRIu64 " %" PRIu64 " %" PRIu64 "\n", ts.tv_sec, ts.tv_nsec, ts.tv_nsec - last_nsec);
		last_nsec = ts.tv_nsec;
	}
}
Tick: 499 Timestamp: 1 1506603744 31232
Tick: 499 Timestamp: 1 1506634976 31232
Tick: 499 Timestamp: 1 1506666208 31232
Tick: 499 Timestamp: 1 1506697440 31232
Tick: 499 Timestamp: 1 1506728672 31232
Tick: 499 Timestamp: 1 1506759904 31232
Tick: 499 Timestamp: 1 1506791136 31232
Tick: 499 Timestamp: 1 1506822368 31232
Tick: 499 Timestamp: 1 1506853584 31216
Tick: 499 Timestamp: 1 1506884816 31232
Tick: 499 Timestamp: 1 1506916048 31232
Tick: 499 Timestamp: 1 1506947280 31232
Tick: 499 Timestamp: 1 1506978512 31232
Tick: 500 Timestamp: 1 1507009744 31232
Tick: 500 Timestamp: 1 1507040976 31232
Tick: 500 Timestamp: 1 1507072208 31232
Tick: 500 Timestamp: 1 1507103456 31248
Tick: 500 Timestamp: 1 1507134688 31232
Tick: 500 Timestamp: 1 1507165920 31232
Tick: 500 Timestamp: 1 1507197152 31232
Tick: 500 Timestamp: 1 1507228384 31232
Tick: 500 Timestamp: 1 1507259616 31232
Tick: 500 Timestamp: 1 1507290848 31232
Tick: 500 Timestamp: 1 1507322080 31232
Tick: 500 Timestamp: 1 1507353328 31248
Tick: 500 Timestamp: 1 1507384560 31232

Yes, the first tick isn't from the point where the application starts but rather an unspecified starting point that I would guess it's the boot time of SO3 so the fact that it's not 0 is "normal". After modifying the tick_get function to this:

uint32_t first_tick = 0;
static uint32_t tick_get(void)
{
	struct timespec ts;
	clock_gettime(CLOCK_MONOTONIC, &ts);
	uint32_t tick = ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
	if (first_tick == 0) {
		first_tick = tick;
	}
	return tick - first_tick;
}

I don't see much improvements, actually, the 10 millisecond sleep now sometimes takes 1020 milliseconds:

[User]	(0.8, +7)	tick_test: elapsed time: 7 ms lvgl_benchmark.c:47
[User]	(1.28, +1020)	tick_test: elapsed time: 1020 ms lvgl_benchmark.c:51
[User]	(1.48, +20)	tick_test: elapsed time: 20 ms lvgl_benchmark.c:55

I wasn't expecting to see usleep being that inaccurate actually, I wrongly assumed it would wake up the thread much quicker than that, maybe a problem with the scheduler algorithm since there's literally nothing else to do other than run this application (no separate tasks nor programs run at the same time as the benchmark)

@AndreCostaaa
Copy link
Contributor

@FASTSHIFT

I figured it out, it was the internal timer configuration, CONFIG_HZ is set to 100 so the timer has a granularity of 10ms

For the lvgl perf tests, the configurations are here:

And here:

Running the benchmark now

@FASTSHIFT
Copy link
Author

@FASTSHIFT

I figured it out, it was the internal timer configuration, CONFIG_HZ is set to 100 so the timer has a granularity of 10ms

For the lvgl perf tests, the configurations are here:

And here:

Running the benchmark now

@AndreCostaaa

There is improvement, but the second time seems to still have problems:

[User]  (0.999, +999)    lv_demo_benchmark: elapsed time: 5 ms lv_demo_benchmark.c:546
[User]  (2.10, +1011)    lv_demo_benchmark: elapsed time: 1011 ms lv_demo_benchmark.c:550
[User]  (2.25, +15)      lv_demo_benchmark: elapsed time: 15 ms lv_demo_benchmark.c:555

@FASTSHIFT
Copy link
Author

FASTSHIFT commented Aug 29, 2025

The benchmark score has improved slightly, but the rendering time of scenes below 16ms has not reached 60FPS. It may be necessary to continue to track the timestamp stability issue.

Benchmark Summary (9.4.0 dev)
Name, Avg. CPU, Avg. FPS, Avg. time, render time, flush time
Empty screen, 33%, 44, 8, 8, 0
Moving wallpaper, 97%, 43, 66, 66, 0
Single rectangle, 18%, 52, 0, 0, 0
Multiple rectangles, 35%, 51, 27, 27, 0
Multiple RGB images, 34%, 52, 4, 4, 0
Multiple ARGB images, 48%, 50, 10, 10, 0
Rotated ARGB images, 88%, 18, 187, 187, 0
Multiple labels, 80%, 51, 42, 42, 0
Screen sized text, 99%, 18, 74, 74, 0
Multiple arcs, 98%, 37, 92, 92, 0
Containers, 55%, 51, 52, 52, 0
Containers with overlay, 81%, 17, 199, 199, 0
Containers with opa, 62%, 48, 12, 12, 0
Containers with opa_layer, 59%, 43, 17, 17, 0
Containers with scrolling, 92%, 25, 92, 92, 0
Widgets demo, 63%, 29, 87, 87, 0
All scenes avg.,65%, 39, 60, 60, 0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants