Add HtmlIO asset handler for text/html (C2PA 2.4 Appendix A.7) #2188

Open
erik-sv wants to merge 5 commits into contentauth:main from erik-sv:feat/html-asset-handler

@erik-sv erik-sv commented May 29, 2026

Summary

Adds an HtmlIO asset handler implementing the C2PA 2.4 Appendix A.7 HTML embedding method, built on the c2pa-text 2.0.0 crate. This is a companion to the text/plain handler in #2117 and covers a distinct embedding method, so it is proposed separately for clean review.

What it does

  • Read (read_cai): extracts a Base64-encoded C2PA Manifest Store from an inline <script type="application/c2pa"> element. An external <link rel="c2pa-manifest"> reference is recognized but treated as an external (non-embedded) store.
  • Write (write_cai): embeds the Manifest Store as a <script> element in the document <head>, removing any existing C2PA element first so writes are idempotent.
  • Object locations: the c2pa.hash.data exclusion range covers the entire <script> element, per Appendix A.7.1.3.
  • Remove: strips the C2PA <script> element, restoring clean HTML.
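
The four operations above can be sketched with plain string search. This is a minimal illustration only: the helper names (`find_c2pa_script`, `read_payload`, `remove_manifest`, `write_manifest`) are hypothetical and are not the `c2pa_text::html` or c2pa-rs API, the exact insertion point inside `<head>` is an assumption, and a real handler would need proper HTML parsing and Base64 decoding.

```rust
use std::ops::Range;

// Hypothetical helpers for illustration; not the c2pa-rs or c2pa-text API.
const OPEN_TAG: &str = r#"<script type="application/c2pa">"#;
const CLOSE_TAG: &str = "</script>";

/// Byte range of the whole C2PA <script> element, tags included.
/// This is the span a c2pa.hash.data exclusion would need to cover.
fn find_c2pa_script(html: &str) -> Option<Range<usize>> {
    let start = html.find(OPEN_TAG)?;
    let body_start = start + OPEN_TAG.len();
    let close = html[body_start..].find(CLOSE_TAG)?;
    Some(start..body_start + close + CLOSE_TAG.len())
}

/// Base64 text between the tags, if an embedded store is present.
/// Decoding the Base64 payload is left out of this sketch.
fn read_payload(html: &str) -> Option<&str> {
    let range = find_c2pa_script(html)?;
    let inner = &html[range.start + OPEN_TAG.len()..range.end - CLOSE_TAG.len()];
    Some(inner.trim())
}

/// Strip the element; HTML without one is returned unchanged.
fn remove_manifest(html: &str) -> String {
    match find_c2pa_script(html) {
        Some(r) => format!("{}{}", &html[..r.start], &html[r.end..]),
        None => html.to_string(),
    }
}

/// Remove any existing element, then insert a fresh one just before </head>
/// (assumed position). Returns None when the document has no </head>.
fn write_manifest(html: &str, payload_b64: &str) -> Option<String> {
    let clean = remove_manifest(html);
    let head_end = clean.find("</head>")?;
    Some(format!(
        "{}{}{}{}{}",
        &clean[..head_end],
        OPEN_TAG,
        payload_b64,
        CLOSE_TAG,
        &clean[head_end..]
    ))
}
```

Because `write_manifest` always removes before inserting, calling it twice with the same payload yields the same document, which is the idempotency property described in the Write bullet.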

Registers html / htm / text/html in the reader, writer, and AssetIO handler maps and the MIME extension↔type tables.

Scope

This handler implements only the HTML method (A.7). It does not touch:

Dependency

Stacked on #2117, which introduces the c2pa-text 2.0.0 dependency this handler uses (c2pa_text::html). Until #2117 merges, the diff here also shows that version bump; it will narrow to the HTML changes once #2117 lands.

Tests

cargo test -p c2pa --lib --features file_io html_io — round-trip, replace+remove, object-location, and no-manifest cases pass, plus the jumbf_io reader/writer/AssetIO registration tests.

erik-sv and others added 5 commits May 5, 2026 20:36
Add a TextIO asset handler that embeds and extracts C2PA JUMBF manifests
in plain text files using the c2pa-text crate. The crate encodes binary
manifest data as invisible Unicode Variation Selectors, following the
C2PA text embedding specification (Section A.7).

The handler implements CAIReader, CAIWriter, and AssetPatch for full
read/write/patch support. Hash object positions span the entire text
content with an exclusion range covering the embedded manifest bytes.
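
A minimal sketch of the variation-selector idea described above, under stated assumptions: the byte-to-code-point mapping below (bytes 0..=15 to U+FE00..=U+FE0F, bytes 16..=255 to U+E0100..=U+E01EF) is one common scheme and is not confirmed here to match the c2pa-text wire format, which may add a header or placement rules. All function names are hypothetical.

```rust
use std::ops::Range;

// Assumed mapping, for illustration only:
// bytes 0..=15 -> U+FE00..=U+FE0F, bytes 16..=255 -> U+E0100..=U+E01EF.
fn byte_to_selector(b: u8) -> char {
    let cp = if b < 16 {
        0xFE00 + b as u32
    } else {
        0xE0100 + (b as u32 - 16)
    };
    char::from_u32(cp).expect("both ranges are valid Unicode scalar values")
}

// Inverse of byte_to_selector; None for any other character.
fn selector_to_byte(c: char) -> Option<u8> {
    match c as u32 {
        cp @ 0xFE00..=0xFE0F => Some((cp - 0xFE00) as u8),
        cp @ 0xE0100..=0xE01EF => Some((cp - 0xE0100 + 16) as u8),
        _ => None,
    }
}

/// Append the encoded bytes after the visible text. Also returns the byte
/// range the invisible characters occupy, i.e. the span a hash exclusion
/// would need to skip.
fn embed(text: &str, data: &[u8]) -> (String, Range<usize>) {
    let mut out = String::from(text);
    out.extend(data.iter().map(|&b| byte_to_selector(b)));
    let excluded = text.len()..out.len();
    (out, excluded)
}

/// Collect the bytes carried by every variation selector in the string.
/// Caveat: selectors already present in the visible text (for example
/// U+FE0F after an emoji) would be misread by this naive version.
fn extract(text: &str) -> Vec<u8> {
    text.chars().filter_map(selector_to_byte).collect()
}
```

The visible text is unchanged before the returned range, so hashing everything outside that range gives the same digest before and after embedding.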

Registers "txt" and "text/plain" as supported types in the MIME utility
and adds TextIO to all three handler maps (readers, writers, file-based).

The c2pa-text reference implementation is at:
  https://github.com/encypherai/c2pa-text

Git dependencies are rejected by crates.io during publish. c2pa-text
v1.1.0 is already published on crates.io, so reference it directly.

- Bump c2pa-text from 1.1.0 to 2.0.0 (released crates.io version adding the
  structured (A.9) and HTML (A.7) pipelines; TextIO's embed_manifest/
  extract_manifest usage is unchanged).
- Resolve a merge conflict left in sdk/Cargo.toml by the prior 'Merge branch
  main' commit: keep c2pa-text and drop the 'config' dependency, which main
  removed (it is unused in the sdk).

sdk compiles and the text_io tests pass.

Implements the C2PA HTML embedding method (Appendix A.7) on top of c2pa-text 2.0.0:

- read: extracts a Base64-encoded Manifest Store from an inline
  <script type="application/c2pa"> element. External <link rel="c2pa-manifest">
  references are recognized but treated as an external (non-embedded) store.
- write: embeds the Manifest Store as a <script> element in the document <head>,
  replacing any existing C2PA element so writes are idempotent.
- object locations: the c2pa.hash.data exclusion covers the <script> element
  (Appendix A.7.1.3).
- remove: strips the C2PA <script> element, restoring clean HTML.

Registers html / htm / text/html in the reader, writer, and AssetIO handler maps
and the MIME extension<->type tables. Includes round-trip, replace+remove,
object-location, and no-manifest tests.