Reconciling various metadata inputs #77

Closed
rickj opened this issue Sep 29, 2021 · 3 comments


@rickj
Collaborator

rickj commented Sep 29, 2021

A given title can have three (potentially conflicting) metadata feeds:

  1. What is in the file itself
  2. What is sent via ONIX
  3. What is sent from the publisher independently of either of the above (typically via a spreadsheet)

We would like guidance on how to handle this.

Currently, when viewing the metadata, you will see a11y data in three "buckets": source file, publisher, and ONIX. These sets of data are independent of each other and can contain duplicate or conflicting information. We provide all the data we have about an asset, unaltered, so that purchasers can make informed decisions.

Our goal is to collect as much information about a title as possible, so collecting a11y data via these three buckets is additive. Updating a source file will not cause the other two buckets to empty, sending through a spreadsheet full of ONIX data will not cause the publisher information to disappear, and so on.

Any time one bucket is updated, that bucket will only contain the most recent information sent through. Sending a spreadsheet with abc data, then another with xyz data, will only display xyz. We will not display abcxyz.
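
To make the bucket behavior concrete, here is a minimal sketch in Python. It is not our actual implementation; the class name, the bucket keys, and the flat dictionary shape are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical bucket names; the real identifiers may differ.
BUCKETS = ("source_file", "publisher", "onix")


@dataclass
class TitleMetadata:
    """Holds a11y metadata for one title, one independent bucket per feed."""

    buckets: dict = field(
        default_factory=lambda: {name: {} for name in BUCKETS}
    )

    def update(self, bucket: str, data: dict) -> None:
        """Replace the contents of one bucket; the others are untouched."""
        if bucket not in self.buckets:
            raise KeyError(f"unknown bucket: {bucket}")
        # Replace rather than merge: only the most recent submission survives.
        self.buckets[bucket] = dict(data)

    def view(self) -> dict:
        """Return a copy of every bucket, unaltered and unreconciled."""
        return {name: dict(values) for name, values in self.buckets.items()}


title = TitleMetadata()
title.update("publisher", {"first_sheet": "abc"})
title.update("onix", {"onix_field": "from ONIX"})
title.update("publisher", {"second_sheet": "xyz"})  # replaces "abc"
```

After these calls the publisher bucket holds only the second spreadsheet's data, while the ONIX bucket is unchanged and the source file bucket is still empty.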

@dauwhe

dauwhe commented Sep 30, 2021

The EPUB spec itself does provide some guidance for conflicting metadata, but it does require that the EPUB contain links to the external metadata:

> When it comes to resolving discrepancies and conflicts between metadata expressed in the Package Document and in linked metadata records, Reading Systems MUST use the document order of link elements in the Package Document to establish precedence (i.e., metadata in the first linked record encountered has the highest precedence and metadata in the Package Document the lowest, regardless of whether the link elements occur before, within or after the package metadata elements).
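
As a rough illustration of that rule, the sketch below treats each linked record and the Package Document metadata as a flat property-to-value dictionary. That flattening is an assumption (real linked records would need parsing and mapping first), and the function name is hypothetical.

```python
def resolve_for_reading_system(linked_records, package_metadata):
    """Resolve conflicts using document order of the link elements.

    linked_records: list of dicts, in the document order of their
        link elements (first linked record first).
    package_metadata: dict of metadata from the Package Document itself.
    """
    # The Package Document has the lowest precedence, so start from it.
    resolved = dict(package_metadata)
    # Apply linked records last-to-first so the first linked record is
    # written last and therefore wins any conflict.
    for record in reversed(linked_records):
        resolved.update(record)
    return resolved


example = resolve_for_reading_system(
    linked_records=[
        {"title": "From first link"},
        {"title": "From second link", "language": "en"},
    ],
    package_metadata={"title": "From package", "creator": "Example Author"},
)
```

Here the title comes from the first linked record, the language from the second, and the creator from the Package Document, since nothing with higher precedence supplies it.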

@rickj
Collaborator Author

rickj commented Sep 30, 2021

Excellent point, @dauwhe. However, in this case we are talking about metadata presented to a learner considering a purchase (store page, catalog, etc.), not metadata in a reading system. Precedence order makes sense within a reading system for title-related metadata. For display to a user outside a reading system, I would imagine the desired precedence order would be reversed, as an ONIX feed or a publisher's direct feed/spreadsheet would have more current metadata than a content file previously distributed through a channel.
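
A rough sketch of what a reversed precedence for store display might look like, in Python. The function name, the bucket keys, and the sample values are hypothetical, and the relative order of ONIX versus the publisher feed is an assumption, since nothing here settles it. The order is therefore passed in as a parameter.

```python
def pick_for_display(buckets, precedence):
    """Pick one value per property for a store or catalog page.

    buckets: dict mapping bucket name to a dict of property -> value.
    precedence: bucket names, highest precedence first.
    Returns property -> (value, bucket the value came from).
    """
    chosen = {}
    for name in precedence:
        for prop, value in buckets.get(name, {}).items():
            # setdefault keeps the value from the highest-precedence
            # bucket that supplies this property.
            chosen.setdefault(prop, (value, name))
    return chosen


# Assumed ordering for illustration only: feeds before the source file.
STORE_PRECEDENCE = ("onix", "publisher", "source_file")

display = pick_for_display(
    {
        "source_file": {"accessMode": "textual", "hazard": "none"},
        "publisher": {},
        "onix": {"accessMode": "textual, visual"},
    },
    STORE_PRECEDENCE,
)
```

Keeping the source bucket alongside each value means the page can still show where a given statement came from.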

@gautierchomel
Collaborator

See #189 & #191
