With #101, ADM_LEVEL_NAMES in processor.py maps GeoNames feature codes to generic English names (ADM1 → region, ADM2 → department, etc.). These are used when two places share the same name and are in an ancestor/descendant relationship, producing labels like "Rumilly (district)" vs "Rumilly (city)".
The problem is that ADM levels are country-specific: ADM1 is a "state" in the US, a "region" in France, a "Land" in Germany, a "province" in Canada, etc. The generic fallback names are misleading for most countries.
GeoNames' original definitions are probably intentionally much vaguer: https://www.geonames.org/export/codes.html
Proposed solution
Replace the single flat dict with a bundled JSON asset (src/maps2zim/assets/adm_level_names.json) structured as:
{
"_default": {"ADM1": "region", "ADM2": "department", "ADM3": "district", "ADM4": "city"},
"US": {"ADM1": "state", "ADM2": "county", "ADM3": "city"},
"DE": {"ADM1": "state", "ADM2": "district", "ADM3": "municipality"},
...
}
_compute_discriminating_labels looks up the place's country_code first, falling back to _default.
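The lookup could be sketched roughly as follows. This is only an illustration under assumptions: the helper name adm_level_name is hypothetical, the sample data is inlined instead of being read from the bundled asset, and falling back to _default level by level (rather than only when the country is missing entirely) is one possible reading of the proposal.

```python
import json

# Inline sample mirroring the proposed structure; the real data would be
# loaded from the bundled JSON asset instead.
ADM_LEVEL_NAMES = json.loads("""
{
  "_default": {"ADM1": "region", "ADM2": "department", "ADM3": "district", "ADM4": "city"},
  "US": {"ADM1": "state", "ADM2": "county", "ADM3": "city"},
  "DE": {"ADM1": "state", "ADM2": "district", "ADM3": "municipality"}
}
""")


def adm_level_name(feature_code, country_code, names=ADM_LEVEL_NAMES):
    """Hypothetical helper: return the label for a feature code in a country.

    Country-specific entries win; a level missing for that country (or an
    unknown/empty country) falls back to the "_default" entry.
    Returns None for feature codes that are not listed at all.
    """
    country_names = names.get(country_code or "", {})
    if feature_code in country_names:
        return country_names[feature_code]
    return names["_default"].get(feature_code)
```

With this sketch, adm_level_name("ADM1", "US") gives "state", while a country absent from the file, such as "FR", gets the _default value "region". Whether a missing level should borrow from _default or produce no label at all is a design choice left open here.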
Data sourcing
No off-the-shelf file is known to exist. The initial dataset should cover the most frequent countries in the GeoNames data and can be extended over time. Wikidata SPARQL or the OpenStreetMap wiki (which documents per-country admin levels) are the best reference sources for curating the initial content.