Platform-independent versions of Pleiades gazetteer data

isawnyu/pleiades.datasets

Pleiades gazetteer datasets

Please report problems and make feature requests via the main Pleiades Gazetteer Issue Tracker.

License: CC BY 3.0

Content is governed by the copyrights of the individual contributors responsible for its creation. Some rights are reserved. All content is distributed under the terms of a Creative Commons Attribution license (cc-by).

In order to facilitate reproducibility and to comply with license terms, we encourage use and citation of numbered releases for scholarly work that will be published in static form.

Please share notices of data reuse with the Pleiades community via email to [email protected]. These reports help us to justify continued funding and operation of the gazetteer and to prioritize updates and improvements.

Version 4.0.1 - 6 February 2025

41,200 place resources

Since release 3.2 of pleiades.datasets on 3 November 2023, the Pleiades gazetteer published 876 new and 9,555 updated place resources, reflecting the work of Johan Åhlfeldt, Ella Arnold, Jeffrey Becker, Gabriel Bodard, Sarah Bond, Catherine Bouras, Lucas Butler, Iulian Bîrzescu, Anne Chen, Birgit Christiansen, Niels Christofferson, James Cowey, Francis Deblauwe, Dan Diffendale, Anthony Durham, Denitsa Dzhigova, Tom Elliott, Jordy Didier Orellana Figueroa, Martina Filosa, Jonathan Fu, Ryosuke Furui, Maija Gierhart, Sean Gillies, Matthias Grawehr, Amelia Grissom, Maxime Guénette, Andrew Harris, Greta Hawes, Ryan M. Horne, Carolin Johansson, Daniel C. Browning Jr., Noah Kaye, Philip Kenrick, Brady Kiesling, Yaniv Korman, Mark Krier, Divya Kumar-Dumas, Thomas Landvatter, Chris de Lisle, Yuyao Liu, Stanisław Ludwiński, Sean Manning, Gabriel McKee, John Muccigrosso, Jamie Novotny, Philipp Pilhofer, Jonathan Prag, Adam Rabinowitz, Rune Rattenborg, María Jesús Redondo, Charlotte Roueché, Karen Rubinson, Thomas Seidler, Rosemary Selth, Jason M. Silverman, R. Scott Smith, Néhémie Strupler, Richard Talbert, Francis Tassaux, Clifflena Tiah, Georgios Tsolakis, Scott Vanderbilt, Athanasia Varveri and Valeria Vitale.

Highlights

  • Updated gazetteer data in this release: see "Contents" below.
  • Removed deprecated "legacy CSV" serialization. JSON or "CSV for GIS" are the recommended packages for most third-party reuse.
  • Added new "indexes" dataset: Pleiades places that reference certain external resources.
  • Improved serialization of vocabulary terms in "CSV for GIS" serialization and added the previously omitted "Time Periods" vocabulary.
  • Added new "sidebar" dataset: assertions by external datasets of relationships to Pleiades places.

Overview

This is a package of data derived from the Pleiades gazetteer of ancient places. It is used for archival and redistribution purposes and is likely to be less up-to-date than the live data at https://pleiades.stoa.org.

Pleiades is a community-built gazetteer and graph of ancient places. It publishes authoritative information about ancient places and spaces, providing unique services for finding, displaying, and reusing that information under open license. It publishes not just for individual human users, but also for search engines and for the widening array of computational research and visualization tools that support humanities teaching and research.

Pleiades is a continuously published scholarly reference work for the 21st century. We embrace the new paradigm of citizen humanities, encouraging contributions from any knowledgeable person and doing so in a context of pervasive peer review. Pleiades welcomes your contribution, no matter how small, and we have a number of useful tasks suitable for volunteers of every interest.

Access and Archiving

The latest versions of this package can be had by fork or download from the main branch at https://github.com/isawnyu/pleiades.datasets. Numbered releases are created periodically at GitHub. These are archived at:

Credits

Pleiades is brought to you by:

  • Our volunteer content contributors (see data/rdf/authors.ttl for complete list and associated identifiers or data).
  • Pleiades has received significant, periodic support from the National Endowment for the Humanities since 2006. Grant numbers: HK-230973-15, PA-51873-06, PX-50003-08, and PW-50557-10. Any views, findings, conclusions, or recommendations expressed in this publication do not necessarily reflect those of the National Endowment for the Humanities.
  • Web hosting and additional support have been provided since 2008 by the Institute for the Study of the Ancient World at New York University.
  • Additional support has been provided since 2000 by the Ancient World Mapping Center at the University of North Carolina at Chapel Hill.
  • Development hosting and other project incubation support was provided between 2000 and 2008 by Ross Scaife and the Stoa Consortium.

Contents

JavaScript Object Notation (JSON)

directory: /data/json/

The complete serialization of each published object in the Pleiades database (i.e., every place, name, location, and connection resource) is written to a single, large JSON file once daily. We periodically download this file and split it up into individual files, one for each place resource (together with its subordinate name, location, and connection resources).

Each file is named with the final, numeric portion of the place resource's Uniform Resource Identifier (URI), plus the filename extension ".json". So, for example, the URI for the Pleiades place resource describing Roman Heidelberg is https://pleiades.stoa.org/places/118731. The corresponding JSON file in this dataset is named "118731.json".

In order to avoid performance problems on operating systems that cannot handle large numbers of files in a single directory efficiently, the JSON files are distributed throughout a hierarchical directory structure using each of the first few digits in the base filename as a subdirectory. So, for the Heidelberg example, one would find the JSON file at the relative path data/json/1/1/8/7/118731.json.
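
The naming and directory rules above can be sketched in Python as follows. This is a minimal illustration, not code from the Pleiades project: the function names are hypothetical, and the shard depth of four digits is an assumption inferred from the Heidelberg example ("the first few digits" is not specified more precisely here).

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Assumption: four levels of single-digit subdirectories, as in the
# Heidelberg example (1/1/8/7). Adjust if the dataset differs.
SHARD_DEPTH = 4


def place_id_from_uri(uri: str) -> str:
    """Return the final, numeric path segment of a Pleiades place URI."""
    segment = urlparse(uri).path.rstrip("/").split("/")[-1]
    if not (segment.isascii() and segment.isdigit()):
        raise ValueError(f"no numeric place ID at the end of {uri!r}")
    return segment


def json_relative_path(place_id: str, depth: int = SHARD_DEPTH) -> PurePosixPath:
    """Build the relative path of a place's JSON file inside the package."""
    # Each of the leading digits becomes one subdirectory level.
    shards = list(place_id[:depth])
    return PurePosixPath("data", "json", *shards, f"{place_id}.json")


if __name__ == "__main__":
    pid = place_id_from_uri("https://pleiades.stoa.org/places/118731")
    print(json_relative_path(pid))
```

Keeping the ID as a string rather than converting it to an integer means the digits can be sliced directly and the filename is reproduced exactly.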

GIS Package (in CSV format)

directory: /data/gis/

A collection of CSV files derived from data in the Pleiades gazetteer of ancient places. This collection is intended to facilitate use of Pleiades data in geographic information systems software and other programming contexts where JSON is inconvenient. NB: not all attributes are included in this serialization. Files have been tested for use in QGIS. See further data/gis/README.md.

Resource Description Framework (RDF)

directory: /data/rdf/

The summary data for all places, errata, authors, place types, and time periods is available for download in Turtle (Terse RDF Triple Language) via http://atlantides.org/downloads/pleiades/rdf/pleiades-latest.tar.gz. This is a gzip-compressed TAR archive. Previous RDF dumps are also available at http://atlantides.org/downloads/pleiades/rdf/. RDF dumps are updated weekly on Sundays. We periodically download, decompress, and unarchive these files.
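
For readers who download the archive themselves, the following sketch lists the Turtle files inside a gzip-compressed TAR archive without extracting it, using only the Python standard library. The function name is hypothetical, and the assumption that Turtle files carry a ".ttl" extension is based on the data/rdf/authors.ttl file mentioned above; the actual layout of the archive is not described here.

```python
import tarfile


def list_turtle_members(archive_path: str) -> list[str]:
    """Return the sorted names of regular .ttl files in a .tar.gz archive."""
    # "r:gz" opens the archive for reading with gzip decompression.
    with tarfile.open(archive_path, mode="r:gz") as archive:
        return sorted(
            member.name
            for member in archive.getmembers()
            if member.isfile() and member.name.endswith(".ttl")
        )
```

Called on a local copy of the download, for example `list_turtle_members("pleiades-latest.tar.gz")`, this gives a quick view of the contents before deciding what to unpack.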

NB: RDF serializations of data for individual places — in both Turtle and RDF/XML syntax — can be had from links on the place pages, such as http://pleiades.stoa.org/places/579885/turtle for Athens, or by a negotiated request for the resource http://pleiades.stoa.org/places/579885#this. Please see the README in https://github.com/isawnyu/pleiades-rdf for a description of the RDF and the vocabularies and ontologies used.
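
A negotiated request, as mentioned above, is an ordinary HTTP request whose Accept header names the desired serialization. The sketch below only builds such a request and does not send it. The function name is hypothetical, and the use of the standard Turtle media type "text/turtle" is an assumption about what the server honors. Note that a URI fragment such as "#this" is never transmitted over HTTP, so the request targets the place URI without it.

```python
from urllib.request import Request

# Assumption: the server responds to the standard Turtle media type.
TURTLE_MEDIA_TYPE = "text/turtle"


def turtle_request(place_id: str) -> Request:
    """Build (but do not send) a request asking for a place as Turtle."""
    url = f"https://pleiades.stoa.org/places/{place_id}"
    return Request(url, headers={"Accept": TURTLE_MEDIA_TYPE})
```

Passing the result to `urllib.request.urlopen` would perform the request, for example `urlopen(turtle_request("579885"))` for Athens.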

Changelogs

directory: /data/changelogs/

Monthly listings, in HTML files, of new and updated place resources since 2009. The listings include: place titles and summaries, links to the live resources on the Pleiades website using canonical URIs, and information about the creators and contributors of each resource, with a special entry for the authors of the referenced changes (includes change summary notes).

Data Quality Metrics

directory: /data/data_quality/

Data quality and characterization reports, currently used by the Editorial College to prioritize and organize feature improvement and content cleanup efforts. Files include:

  • issues.json: place ids for each category of error (see below), as well as summary information used in the generation of reports
  • bad_osm_way.csv: place resources that reference OSM Way objects but that include coordinate information drawn from only the first Node in the referenced way. See Pleiades Gazetteer Issue 492: add "reimport from OSM" affordance to Pleiades locations for a feature addition that will facilitate supervised programmatic repair of such problems.
  • bad_place_type.csv: place resources that make use of deprecated place/feature-type terms.
  • missing_accuracy.csv: place resources with associated location resources that are missing positional accuracy metadata
  • missing_modern_name.csv: place resources with no associated modern name resource (may not always be an error)
  • names_romanized_only.csv: place resources that contain associated names that lack "attested" forms (i.e., original-script orthography)
  • poor_accuracy.csv: place resources none of whose associated locations provide horizontal accuracy better than 2 km.
  • question_mark_titles.csv: place resources whose titles contain question marks (i.e., legacy BAtlas entries for less-than-certain place/name/location matches that have likely not yet been revisited for data modeling improvements).
  • rough_not_unlocated.csv: place resources that are not marked (place type) as "unlocated", but that report only "rough" positions. Many of these are likely Barrington Atlas place resources that should be typed "unlocated" or that were never digitized.

Credits

file: /data/html/credits.html

A copy of the Pleiades gazetteer credits page, which is online at https://pleiades.stoa.org/credits.

Indexes

directory: /data/indexes/

Various indexes, potentially of value when working with Pleiades content.

  • name_index.json: a list of all name and place title strings occurring in the gazetteer, together with the associated Pleiades place IDs.
  • JSON files, and Markdown generated from them, reporting on references from Pleiades places to the following external resources:
    • aio: Lambert, Stephen, Polly Low, Peter Liddel, and Chris de Lisle, eds. Attic Inscriptions Online. Athens: British School at Athens, 2012-. https://www.atticinscriptions.com/.
    • cfl_ago: Veronique Chankowski, Amélie Perrier, Catherine Bouras, Sabine Fourrier, John Bennet, Michael Loy, Kostis Christakis, L. Mulot, and G. Bejjaji. Chronique Des Fouilles En Ligne = Archaeology in Greece Online. Athens: Ecole française d’Athènes and British School at Athens, 2018-. https://chronique.efa.gr/?kroute=homepage.
    • manto: Hawes, Greta, and Scott Smith. MANTO. 2020-. https://www.manto-myth.org/manto.
    • paus-spiro-perseus: Pausanias. Pausaniae Graeciae descriptio. Edited by Friedrich Spiro. 3 vols. Bibliotheca Teubneriana. Lipsiae: Teubner, 1903. As digitized and published by the Perseus Digital Library http://data.perseus.org/texts/urn:cts:greekLit:tlg0525.tlg001.perseus-grc1.
    • tm: Depauw, Mark, Tom Gheldof, Herbert Verreth, Nico Dogaer, Willy Clarysse, Yanne Broux, Gert Baetens, and Heinz-Josef Thissen. Trismegistos: An Interdisciplinary Portal of the Ancient World. Leuven, 2006-. http://www.trismegistos.org/.
    • topostext: Kiesling, Brady. ToposText – a Reference Tool for Greek Civilization. Version 2.0. Aikaterini Laskaridis Foundation, 2016-. https://topostext.org/.
    • wikidata: Wikidata: The Free Knowledge Base That Anyone Can Edit. Wikimedia Foundation, 2014-. https://www.wikidata.org/.

Sidebar

directory: /data/sidebar/

JSON files containing information about incoming links from external resources that are currently indexed. Files are named according to the corresponding Pleiades ID. This data will eventually be used for a "Linked Data" widget on Pleiades place pages. This data is generated using code in the pleiades_sidebar repository. The following resources are currently checked:

  • Alföldy, Géza, and Christian Witschel, eds. Epigraphic Database Heidelberg. Heidelberg: Heidelberg Academy of Sciences and Humanities, 1997-2021. http://edh-www.adw.uni-heidelberg.de/.
  • Brughmans, Tom, Pau de Soto, A. Pažout, and P. Bjerregaard Vahlstrup. Itiner-e: The Digital Atlas of Ancient Roads. 2024. https://itiner-e.org/.
  • MANTO
  • Meadows, Andrew, Sebastian Heath, and Ethan Gruber. Nomisma.org. New York: American Numismatic Society, 2010-. http://nomisma.org/.
  • Wikidata
