IpRange.objects.by_ip(ip) generate sequential scans #35

Open
amezhenin opened this issue Aug 2, 2013 · 6 comments

Comments

@amezhenin

I'm using django-geoip==0.3 in production and noticed strange behavior in PostgreSQL 9.1 when executing its queries:

SELECT relname, idx_scan, seq_scan,
       100 * idx_scan / (seq_scan + idx_scan) AS "index_used, %",
       n_live_tup AS rows_in_table
FROM pg_stat_user_tables
WHERE seq_scan + idx_scan > 0
ORDER BY n_live_tup DESC;
             relname              | idx_scan | seq_scan | index_used, % | rows_in_table 
----------------------------------+----------+----------+---------------+---------------
 django_geoip_iprange             |  5323362 |  4830742 |            52 |        160320
 django_geoip_city                |  8258720 |      185 |            99 |           990
 django_geoip_country             |   164658 |  2780330 |             5 |           252
 django_geoip_region              |    44712 |   749790 |             5 |           110

Only 52% of scans on django_geoip_iprange use an index; the rest are sequential scans, which leads to long execution times. These queries currently average 15 ms, which is way too long.

I tinkered a bit with your code and reduced the problem to this Stack Overflow question. Do you have any thoughts about this issue?
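
For readers following along, here is a minimal sketch of one common rewrite for this kind of range lookup. It assumes the ranges never overlap: rather than filtering on both bounds at once, it fetches the single row with the greatest start_ip not exceeding the address, then checks end_ip in application code, so an ordinary index on start_ip can answer the query. The table and column names (iprange, start_ip, end_ip, city) are hypothetical, and SQLite stands in for PostgreSQL only so the example runs anywhere; this is not django-geoip's actual implementation.

```python
import ipaddress
import sqlite3


def ip_to_int(ip):
    """Convert a dotted-quad IPv4 string to its integer value."""
    return int(ipaddress.IPv4Address(ip))


def find_range(conn, ip):
    """Return (start_ip, end_ip, city) for the range containing ip, or None.

    Assumes ranges never overlap, so the only candidate is the row with the
    greatest start_ip that is <= the address.
    """
    number = ip_to_int(ip)
    row = conn.execute(
        "SELECT start_ip, end_ip, city FROM iprange "
        "WHERE start_ip <= ? ORDER BY start_ip DESC LIMIT 1",
        (number,),
    ).fetchone()
    # The candidate may end before the address (a gap between ranges).
    if row is None or row[1] < number:
        return None
    return row


# Hypothetical sample data; the schema is illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE iprange (start_ip INTEGER, end_ip INTEGER, city TEXT)")
conn.execute("CREATE INDEX iprange_start ON iprange (start_ip)")
conn.executemany(
    "INSERT INTO iprange VALUES (?, ?, ?)",
    [
        (ip_to_int("10.0.0.0"), ip_to_int("10.0.0.255"), "A"),
        (ip_to_int("10.0.2.0"), ip_to_int("10.0.2.255"), "B"),
    ],
)
```

If ranges can overlap, the single-candidate assumption no longer holds and this sketch would need a different strategy.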

@coagulant
Member

Unfortunately, the queries are suboptimal. There are several possible speedups: http://habrahabr.ru/post/138067/
The fastest is to avoid database queries for IP ranges entirely and use an nginx module or something similar.
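
As a rough illustration of what skipping the database can look like, the sketch below keeps the ranges sorted in memory and uses binary search from Python's standard bisect module. It is neither the nginx module nor any existing library; the IpRangeIndex class and the sample data are hypothetical, and it assumes non-overlapping IPv4 ranges.

```python
import bisect
import ipaddress


def ip_to_int(ip):
    """Convert a dotted-quad IPv4 string to its integer value."""
    return int(ipaddress.IPv4Address(ip))


class IpRangeIndex:
    """In-memory lookup over non-overlapping (start, end, value) ranges."""

    def __init__(self, ranges):
        # Sort by range start so bisect can locate the candidate range.
        self._ranges = sorted(
            (ip_to_int(start), ip_to_int(end), value)
            for start, end, value in ranges
        )
        self._starts = [r[0] for r in self._ranges]

    def lookup(self, ip):
        """Return the value of the range containing ip, or None."""
        number = ip_to_int(ip)
        # Index of the last range whose start is <= number.
        i = bisect.bisect_right(self._starts, number) - 1
        if i < 0:
            return None
        _start, end, value = self._ranges[i]
        return value if number <= end else None


# Hypothetical sample data.
index = IpRangeIndex([
    ("10.0.2.0", "10.0.2.255", "B"),
    ("10.0.0.0", "10.0.0.255", "A"),
])
```

Each lookup costs O(log n) comparisons, at the price of loading every range into the process and losing relations between your models and the geodata.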

Currently I have no plans to implement any of them; however, pull requests are highly welcome. I'll be happy to discuss and help add working solutions.

@antonagestam

FWIW I'm also seeing really slow queries, up to 70 ms, on PostgreSQL 9.3.2. Is there really no way to optimize them?

@antonagestam

Found this: http://blog.jcole.us/2007/11/24/on-efficiently-geo-referencing-ips-with-maxmind-geoip-and-mysql-gis/. Do you think this would be feasible to implement?

@amezhenin
Author

@antonagestam, I ended up using this library: https://github.com/idlesign/pysyge. If you don't need relations between your models and geodata, this will be fine.

@coagulant
Member

@antonagestam There are some, which I mentioned earlier. As for MySQL GIS, I'd rather use a database-agnostic solution for django-geoip.

> If you don't need relations between your models and geodata, this will be fine.

Exactly.

@antonagestam

Might create a fork implementing something using Postgres' GiST if I find the time.
