perf: Replace positions indexes from BTREE to BRIN to reduce memory usage #5075
Conversation
✅ Deploy Preview for teslamate ready!
Nice find! @swiffer, may I ask you to review in the coming weeks?
JakobLichterfeld left a comment:
Thanks for your contribution. Sounds like a smart solution.
Please add the following to avoid the typos spell-check failure in CI:
diff --git a/.typos.toml b/.typos.toml
index ea78907a..268764e6 100644
--- a/.typos.toml
+++ b/.typos.toml
@@ -13,6 +13,7 @@ dur = "dur"
pn = "pn"
Derivated = "Derivated"
Bre = "Bre"
+BRIN = "BRIN"
[type.po]
extend-glob = ["*.po"]
Hi - will have a look, however I do have fairly powerful hardware; guess it's more interesting if folks with lower-spec hardware (RPi) test this. BRIN seems to be a good choice for our case, just wondering about the import functionality (TeslaFi). It's only problematic when you already started logging data and decide to import old data later. Guess the chance is low, and we can ask for REINDEXing the database after import in the docs.
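A minimal sketch of what that docs note could ask users to run (assuming the index names used in this PR; note that REINDEX rebuilds the BRIN summaries but does not physically re-sort the table, and it briefly locks writes):

-- Rebuild the positions indexes after a bulk import of historical data.
REINDEX INDEX positions_date_index;
REINDEX INDEX positions_drive_id_date_index;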
I haven't noticed any slowdown here, all dashboards load instantly. 👍 Will keep using the BRIN indexes, saving ~500 MB. One thought: when multiple cars are driving simultaneously, the drive_id_date_index becomes interleaved, right?
Hi @swiffer, thanks for your comments.
Yes, when multiple cars insert rows concurrently, the BRIN block ranges become interleaved, and this reduces the index's effectiveness. According to my calculations, efficiency drops once 3+ drive_id values land in the same range. From what I see in the database, it's about 3 records per second per car, which means one BRIN block covers roughly 1 hour of data, and for 3 cars about 20 minutes per block; overall that should resolve collisions and index inefficiencies quite effectively, at the cost of a little extra scanning. But in general, yes, the index only stores min/max information per block range.
Agreed, unfortunately I don't have a Raspberry Pi for testing either; my machine has an Intel N100, which is a fairly powerful processor.
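If interleaving ever became a real problem, one knob worth knowing about (not part of this PR, purely an illustration) is BRIN's pages_per_range storage parameter, which trades index size for narrower, less polluted ranges:

-- Hypothetical variant: 32 heap pages per range instead of the
-- default 128, so each index entry summarizes a narrower window.
CREATE INDEX positions_drive_id_date_index
    ON public.positions USING brin (drive_id, date)
    WITH (pages_per_range = 32);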
Hi @JakobLichterfeld, thanks for the help, CI should be fixed.
You are welcome. Spell check now succeeds as expected, ty.
(The following is my guess.) On low-RAM hardware, if the BTREE index is 0.5-1 GB in size, it no longer fits in memory and has to be read from disk, where performance might drop more than with simply reading and filtering BRIN ranges. It would be very interesting to test this; I will try to find something.
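A quick way to check whether the indexes even fit in RAM is to compare their on-disk sizes (a standard catalog query, shown here as a sketch):

-- On-disk size of every index on the positions table.
SELECT indexrelname,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE relname = 'positions';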
@JakobLichterfeld could you please run this query with your data? I'm still thinking about a good query for this data to test index effectiveness, but this is a real query from
Not sure which queries benefit the most - I tried this one; it's modified from the "Visited" dashboard to ensure it uses the BRIN index.
EXPLAIN (ANALYZE, BUFFERS)
SELECT
date_trunc('minute', timezone('UTC', date), 'Europe/Berlin') as time,
avg(latitude) as latitude,
avg(longitude) as longitude
FROM
positions
WHERE
date BETWEEN '2025-09-29T19:28:28.56Z' AND '2025-12-29T20:28:28.56Z'
GROUP BY 1
ORDER BY 1
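When benchmarking like this, one way to make sure the planner actually picks the index (a session-local testing trick, not something from this PR) is to disable sequential scans:

-- For the current session only: discourage seq scans so EXPLAIN
-- shows the index path even when the planner would prefer a full scan.
SET enable_seqscan = off;
-- ... run the EXPLAIN (ANALYZE, BUFFERS) query here ...
RESET enable_seqscan;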
Sorry for the late reply.
As my test data is a bit old, I changed the date range to 2024-09-29 till 2025-12-29 (restore currently fails as the db is too big for the memory...).
v2.2.0 on Raspberry Pi 3B+ with PostgreSQL 17 and the old index:
EXPLAIN (ANALYZE, BUFFERS)
SELECT
date_trunc('minute', timezone('UTC', date), 'Europe/Berlin') as time,
avg(latitude) as latitude,
avg(longitude) as longitude
FROM
positions
WHERE
date BETWEEN '2024-09-29T19:28:28.56Z' AND '2025-12-29T20:28:28.56Z'
GROUP BY 1
ORDER BY 1
.
WARNING: database "teslamate" has a collation version mismatch
DETAIL: The database was created using collation version 2.36, but the operating system provides version 2.41.
HINT: Rebuild all objects in this database that use the default collation and run ALTER DATABASE teslamate REFRESH COLLATION VERSION, or build PostgreSQL with the right library version.
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
GroupAggregate (cost=19746.63..22359.65 rows=87098 width=72) (actual time=5853.133..7236.225 rows=4911 loops=1)
Group Key: (date_trunc('minute'::text, timezone('UTC'::text, date), 'Europe/Berlin'::text))
Buffers: shared hit=457014 read=7271, temp read=2061 written=2068
-> Sort (cost=19746.63..19964.39 rows=87106 width=24) (actual time=5852.638..6295.889 rows=468056 loops=1)
Sort Key: (date_trunc('minute'::text, timezone('UTC'::text, date), 'Europe/Berlin'::text))
Sort Method: external merge Disk: 16488kB
Buffers: shared hit=457014 read=7271, temp read=2061 written=2068
-> Index Scan using positions_date_index on positions (cost=0.43..12599.37 rows=87106 width=24) (actual time=20.947..4960.103 rows=468056 loops=1)
Index Cond: ((date >= '2024-09-29 19:28:28.56'::timestamp without time zone) AND (date <= '2025-12-29 20:28:28.56'::timestamp without time zone))
Buffers: shared hit=457011 read=7271
Planning:
Buffers: shared hit=231
Planning Time: 3.500 ms
Execution Time: 7261.860 ms
(14 rows)

Will install the current Raspberry Pi OS and set up a new dev env on the slow hardware.
Cool, thanks @JakobLichterfeld! Do you have an update with the new index? I see that the index scan took almost 5 seconds, and I expect a significant improvement here.
New test env now ready:
v2.2.0 with OLD (BTREE) index:
pi@raspberrypi:~/teslamate $ docker compose exec -T database psql -U teslamate -d teslamate << .
EXPLAIN (ANALYZE, BUFFERS)
SELECT
date_trunc('minute', timezone('UTC', date), 'Europe/Berlin') as time,
avg(latitude) as latitude,
avg(longitude) as longitude
FROM
positions
WHERE
date BETWEEN '2020-01-01T00:01:28.56Z' AND '2020-12-31T23:59:28.56Z'
GROUP BY 1
ORDER BY 1
.
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------
GroupAggregate (cost=188814.63..220723.95 rows=1063644 width=72) (actual time=11941.707..14379.999 rows=16007 loops=1)
Group Key: (date_trunc('minute'::text, timezone('UTC'::text, date), 'Europe/Berlin'::text))
Buffers: shared hit=1027295 read=3020, temp read=9332 written=9350
-> Sort (cost=188814.63..191473.74 rows=1063644 width=24) (actual time=11924.570..12735.596 rows=1059916 loops=1)
Sort Key: (date_trunc('minute'::text, timezone('UTC'::text, date), 'Europe/Berlin'::text))
Sort Method: external merge Disk: 37352kB
Buffers: shared hit=1027295 read=3020, temp read=9332 written=9350
-> Index Scan using positions_date_index on positions (cost=0.43..60525.26 rows=1063644 width=24) (actual time=189.695..9104.986 rows=1059916 loops=1)
Index Cond: ((date >= '2020-01-01 00:01:28.56'::timestamp without time zone) AND (date <= '2020-12-31 23:59:28.56'::timestamp without time zone))
Buffers: shared hit=1027292 read=3020
Planning:
Buffers: shared hit=204 read=18
Planning Time: 41.026 ms
JIT:
Functions: 9
Options: Inlining false, Optimization false, Expressions true, Deforming true
Timing: Generation 13.397 ms (Deform 1.557 ms), Inlining 0.000 ms, Optimization 54.909 ms, Emission 137.305 ms, Total 205.611 ms
Execution Time: 15397.036 ms
(18 rows)
New index creation took quite a long time (I assume >20 s each):
pi@raspberrypi:~/teslamate $ docker compose exec database psql teslamate teslamate
psql (17.7 (Debian 17.7-3.pgdg13+1))
Type "help" for help.
teslamate=# DROP INDEX positions_date_index;
DROP INDEX
teslamate=# DROP INDEX positions_drive_id_date_index;
DROP INDEX
teslamate=# CREATE INDEX IF NOT EXISTS positions_date_index ON public.positions USING brin (date);
CREATE INDEX
teslamate=# CREATE INDEX IF NOT EXISTS positions_drive_id_date_index ON public.positions USING brin (drive_id, date);
CREATE INDEX
teslamate=#
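On a live instance the same swap could be done without blocking writes for the whole build (a sketch assuming the same index names; CONCURRENTLY cannot run inside a transaction block):

-- Swap one index without holding a long write lock.
DROP INDEX CONCURRENTLY positions_date_index;
CREATE INDEX CONCURRENTLY positions_date_index
    ON public.positions USING brin (date);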
With v2.2.0 and the NEW (BRIN) index, around a 50% improvement:
pi@raspberrypi:~/teslamate $ docker compose exec -T database psql -U teslamate -d teslamate << .
EXPLAIN (ANALYZE, BUFFERS)
SELECT
date_trunc('minute', timezone('UTC', date), 'Europe/Berlin') as time,
avg(latitude) as latitude,
avg(longitude) as longitude
FROM
positions
WHERE
date BETWEEN '2020-01-01T00:01:28.56Z' AND '2020-12-31T23:59:28.56Z'
GROUP BY 1
ORDER BY 1
.
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Finalize GroupAggregate (cost=172812.56..320717.29 rows=1063267 width=72) (actual time=3897.530..5146.864 rows=16007 loops=1)
Group Key: (date_trunc('minute'::text, timezone('UTC'::text, date), 'Europe/Berlin'::text))
Buffers: shared hit=13705 read=1041, temp read=4672 written=4690
-> Gather Merge (cost=172812.56..288376.25 rows=886056 width=72) (actual time=3897.352..4914.164 rows=36252 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=13705 read=1041, temp read=4672 written=4690
-> Partial GroupAggregate (cost=171812.54..185103.38 rows=443028 width=72) (actual time=3791.563..4621.997 rows=12084 loops=3)
Group Key: (date_trunc('minute'::text, timezone('UTC'::text, date), 'Europe/Berlin'::text))
Buffers: shared hit=13705 read=1041, temp read=4672 written=4690
-> Sort (cost=171812.54..172920.11 rows=443028 width=24) (actual time=3788.129..4058.001 rows=353305 loops=3)
Sort Key: (date_trunc('minute'::text, timezone('UTC'::text, date), 'Europe/Berlin'::text))
Sort Method: external merge Disk: 12504kB
Buffers: shared hit=13705 read=1041, temp read=4672 written=4690
Worker 0: Sort Method: external merge Disk: 12440kB
Worker 1: Sort Method: external merge Disk: 12432kB
-> Parallel Bitmap Heap Scan on positions (cost=289.25..121177.07 rows=443028 width=24) (actual time=322.758..2639.754 rows=353305 loops=3)
Recheck Cond: ((date >= '2020-01-01 00:01:28.56'::timestamp without time zone) AND (date <= '2020-12-31 23:59:28.56'::timestamp without time zone))
Rows Removed by Index Recheck: 19620
Heap Blocks: lossy=4986
Buffers: shared hit=13689 read=1041
-> Bitmap Index Scan on positions_date_index (cost=0.00..23.44 rows=1082603 width=0) (actual time=4.324..4.326 rows=147200 loops=1)
Index Cond: ((date >= '2020-01-01 00:01:28.56'::timestamp without time zone) AND (date <= '2020-12-31 23:59:28.56'::timestamp without time zone))
Buffers: shared hit=10
Planning:
Buffers: shared hit=211 read=10 dirtied=3
Planning Time: 40.271 ms
JIT:
Functions: 30
Options: Inlining false, Optimization false, Expressions true, Deforming true
Timing: Generation 22.367 ms (Deform 3.811 ms), Inlining 0.000 ms, Optimization 57.043 ms, Emission 747.963 ms, Total 827.373 ms
Execution Time: 7951.743 ms
(32 rows)
Thanks @JakobLichterfeld for testing that and ensuring there is up-to-date slow hardware available for testing! OK, I support changing those indexes to BRIN. It's fairly easy to revert if some slow queries we are not aware of right now pop up after the next release. The car_id / date index would be another candidate, but with the car_id / date / ideal_battery_range_km is not null index used for most queries, I think we're fine with query performance right now, and the space savings are smaller for that one. @ilya-y-synth - could you add a note to the TeslaFi import docs to hint users to reindex their positions table after running an import? Otherwise -> LGTM
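If the car_id / date index ever comes up again, per-index scan counters make it easy to verify which indexes queries actually hit (a standard statistics view, shown as a sketch):

-- How often each index on positions has been used since the stats
-- were last reset; rarely scanned ones are conversion candidates.
SELECT indexrelname, idx_scan
FROM pg_stat_user_indexes
WHERE relname = 'positions'
ORDER BY idx_scan DESC;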
Thanks @JakobLichterfeld @swiffer! Docs updated.
JakobLichterfeld left a comment:
Awesome, ty for your contribution! 🚀
I noticed that the indexes on the positions table take up the same amount of space as the data itself, even though the rows are already sorted by date and new data is only appended at the end, which means the BTREE indexes can be replaced with the more memory-efficient BRIN index type.
Raw data size: 50.5 MiB
Indexes before: 41 MB total
Indexes after: 15.1 MB total (-63%)
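The before/after numbers above can be reproduced with standard size functions (a sketch; run it once with the old indexes and once with the new ones):

-- Table data vs. total index size for the positions table.
SELECT pg_size_pretty(pg_table_size('positions'))   AS data_size,
       pg_size_pretty(pg_indexes_size('positions')) AS index_size;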