4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -9,6 +9,10 @@ As of December 2025 and until the 1.0.0 version is released, the CAI team will o

## [Unreleased]

### Added

* Add OGG Vorbis and Opus support for C2PA manifest embedding per spec v2.3 §A.3.5

## [0.80.0](https://github.com/contentauth/c2pa-rs/compare/c2pa-v0.79.5...c2pa-v0.80.0)
_16 April 2026_

60 changes: 60 additions & 0 deletions docs/formats/ogg.md
@@ -0,0 +1,60 @@
# OGG C2PA Support

This document describes how C2PA manifest storage is implemented for OGG containers (Vorbis, Opus) in c2pa-rs.

## Spec reference

- **C2PA Specification v2.3**: [Section A.3.5 – Embedding manifests into OGG Vorbis](https://spec.c2pa.org/specifications/specifications/2.3/specs/C2PA_Specification.html) states that the C2PA Manifest Store shall be embedded in its own dedicated logical bitstream within the OGG container. The first packet of this stream starts with the 5-byte identifier `\x00c2pa`, and the manifest store data follows immediately after.
- **Hash binding**: [Section 18.7.3.7](https://spec.c2pa.org/specifications/specifications/2.3/specs/C2PA_Specification.html) defines how OGG logical bitstreams map to the box hash model. Each logical bitstream is treated as a single "box" named `Stream-{serial}`, where `serial` is the bitstream serial number as a decimal ASCII string. The C2PA manifest bitstream uses the standard `C2PA` box name.
- **OGG container**: [RFC 3533](https://www.rfc-editor.org/rfc/rfc3533) – The Ogg Encapsulation Format Version 0.
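The box naming rule from the hash-binding bullet can be illustrated with a small helper. This is a minimal sketch, not the handler's actual code: the function name `box_name` and its `is_c2pa_stream` flag are hypothetical and exist only for illustration.

```rust
/// Returns the box hash name for a logical bitstream.
/// Hypothetical helper: the name and signature are assumptions for this sketch.
fn box_name(serial: u32, is_c2pa_stream: bool) -> String {
    if is_c2pa_stream {
        // The manifest bitstream uses the standard C2PA box name.
        "C2PA".to_string()
    } else {
        // Other bitstreams use the serial number rendered as decimal ASCII.
        format!("Stream-{serial}")
    }
}
```

For example, an audio bitstream with serial number 12345 would map to `Stream-12345` under this sketch.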

## OGG structure

An OGG file consists of interleaved **pages**, each belonging to a **logical bitstream** identified by a unique 32-bit serial number. Pages carry **packets** using a lacing mechanism (segments of up to 255 bytes).
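To make the lacing mechanism concrete, the sketch below computes the lacing values for one packet and the number of pages that packet would occupy. It assumes the packet starts on a fresh page and that each page carries at most 255 lacing values, as in RFC 3533. The function names `lacing_values` and `pages_needed` are hypothetical and are not taken from the handler.

```rust
const MAX_SEGMENT_LEN: usize = 255;
const MAX_SEGMENTS_PER_PAGE: usize = 255;

/// Lacing values for a single packet: one 255 per full segment, then a
/// final value below 255 (possibly 0) that terminates the packet.
fn lacing_values(packet_len: usize) -> Vec<u8> {
    let mut values = vec![255u8; packet_len / MAX_SEGMENT_LEN];
    values.push((packet_len % MAX_SEGMENT_LEN) as u8);
    values
}

/// Number of pages needed for one packet that starts on a fresh page.
fn pages_needed(packet_len: usize) -> usize {
    let lacing_count = packet_len / MAX_SEGMENT_LEN + 1;
    // Ceiling division by the per-page segment limit.
    (lacing_count + MAX_SEGMENTS_PER_PAGE - 1) / MAX_SEGMENTS_PER_PAGE
}
```

Under these assumptions a packet of exactly 255 × 255 = 65,025 bytes already needs a second page, because the terminating zero-length lacing value no longer fits in the first one.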

Layout after C2PA embedding:

1. **C2PA BOS page** – Beginning-of-stream page for the manifest bitstream. First packet starts with `\x00c2pa` followed by JUMBF manifest data.
2. **Audio BOS page(s)** – Original Vorbis or Opus identification header.
3. **C2PA continuation/EOS pages** – If the manifest does not fit in a single page (a page body holds at most 255 × 255 = 65,025 bytes, roughly 65 KB), it spans multiple pages. The last page carries the EOS flag. For small manifests the BOS page is also the EOS page, so this group is empty.
4. **Audio data pages** – Original audio content, unmodified.

This ordering ensures all BOS pages appear before any data pages (RFC 3533 requirement).

It also keeps each bitstream's pages contiguous, which BoxHash byte-range verification requires.
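The four-group layout above can be sketched as a stable sort over page descriptors. This is an illustrative sketch only: `PageInfo`, `layout_rank`, and `order_pages` are hypothetical names, and the sketch preserves the original relative order inside each group rather than regrouping audio data pages by serial.

```rust
#[derive(Debug, Clone)]
struct PageInfo {
    serial: u32,
    is_bos: bool,
    is_c2pa: bool,
}

/// Rank of a page in the output layout (lower ranks are written first).
fn layout_rank(page: &PageInfo) -> u8 {
    match (page.is_c2pa, page.is_bos) {
        (true, true) => 0,   // C2PA BOS page
        (false, true) => 1,  // audio BOS page(s)
        (true, false) => 2,  // C2PA continuation/EOS pages
        (false, false) => 3, // audio data pages
    }
}

/// Reorders pages into the layout; `sort_by_key` is stable, so pages
/// within the same group keep their original relative order.
fn order_pages(mut pages: Vec<PageInfo>) -> Vec<PageInfo> {
    pages.sort_by_key(|p| layout_rank(p));
    pages
}
```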

**BOS page identification**:
- Vorbis: first packet starts with `\x01vorbis`
- Opus: first packet starts with `OpusHead`
- C2PA: first packet starts with `\x00c2pa`
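The prefix checks listed above amount to a simple classification of the first packet on a BOS page. The following is a minimal sketch; the `StreamKind` enum and `classify_bos_packet` function are hypothetical names, not the handler's API.

```rust
#[derive(Debug, PartialEq, Eq)]
enum StreamKind {
    C2pa,
    Vorbis,
    Opus,
    Unknown,
}

/// Classifies a logical bitstream by the magic bytes that open its first packet.
fn classify_bos_packet(packet: &[u8]) -> StreamKind {
    if packet.starts_with(b"\x00c2pa") {
        StreamKind::C2pa
    } else if packet.starts_with(b"\x01vorbis") {
        StreamKind::Vorbis
    } else if packet.starts_with(b"OpusHead") {
        StreamKind::Opus
    } else {
        StreamKind::Unknown
    }
}
```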

## No external crate dependencies

OGG page parsing and writing are implemented directly in the handler. The OGG page format is simple (27-byte fixed header + segment table + body). The CRC-32 uses a precomputed lookup table for the generator polynomial `0x04c11db7`, computed the way RFC 3533 specifies: direct (non-reflected), with an initial value of 0 and no final XOR. No external OGG parsing crate is required.
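A table-driven CRC of this kind can be sketched as follows. The code is a minimal illustration under the parameters RFC 3533 gives (polynomial `0x04c11db7`, non-reflected, initial value 0, no final XOR), not a copy of the handler's code; the names `build_crc_table`, `CRC_TABLE`, and `ogg_crc32` are hypothetical.

```rust
const OGG_CRC_POLY: u32 = 0x04c1_1db7;

/// Builds the 256-entry lookup table at compile time.
const fn build_crc_table() -> [u32; 256] {
    let mut table = [0u32; 256];
    let mut i = 0;
    while i < 256 {
        // Non-reflected: the byte enters at the top of the register.
        let mut r = (i as u32) << 24;
        let mut bit = 0;
        while bit < 8 {
            r = if r & 0x8000_0000 != 0 {
                (r << 1) ^ OGG_CRC_POLY
            } else {
                r << 1
            };
            bit += 1;
        }
        table[i] = r;
        i += 1;
    }
    table
}

static CRC_TABLE: [u32; 256] = build_crc_table();

/// CRC over `data` with initial value 0 and no final XOR.
fn ogg_crc32(data: &[u8]) -> u32 {
    data.iter().fold(0u32, |crc, &byte| {
        (crc << 8) ^ CRC_TABLE[(((crc >> 24) as u8) ^ byte) as usize]
    })
}
```

Per RFC 3533, the checksum of a page is computed over the whole page, header included, with the CRC field set to zero.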

## Implementation summary

- **Module**: `sdk/src/asset_handlers/ogg_io.rs`.
- **Handler**: `OggIO` with `supported_types()` returning `["ogg", "audio/ogg", "opus", "audio/opus"]`.
- **Traits**: `CAIReader`, `CAIWriter`, `AssetIO`, `AssetPatch`, `AssetBoxHash`.
- **Note**: `RemoteRefEmbed` is not implemented — the C2PA specification does not define XMP or remote reference embedding for OGG containers.
- **Flow**:
  - **Read**: Parse all pages and find the BOS page whose first packet starts with `\x00c2pa`. Collect all pages with that serial number, reconstruct the packet, strip the 5-byte magic prefix, and return the JUMBF bytes.
  - **Write**: Parse all pages and remove any existing C2PA bitstream. Build new C2PA pages from the manifest data, fragmenting large manifests across pages. Write the output with BOS pages first (C2PA BOS, then audio BOS), followed by C2PA continuation/EOS pages, then audio data pages grouped by serial.
  - **BoxHash**: Each logical bitstream maps to a `BoxMap` entry. Audio streams are named `Stream-{serial}` (decimal). The C2PA stream uses the `C2PA` label. If no C2PA stream exists, a placeholder entry is inserted with `excluded: true`.
  - **Patch**: For same-size manifest replacement, C2PA pages are overwritten in place with recomputed CRC checksums.
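The read flow can be sketched on a simplified page model. This is a hedged illustration, not the handler's code: the `Page` struct and `extract_manifest` function are hypothetical, pages are assumed to be already parsed in file order, and the C2PA bitstream is assumed to carry a single packet.

```rust
/// Five-byte identifier that opens the first packet of the C2PA bitstream.
const C2PA_MAGIC: &[u8] = b"\x00c2pa";

/// Simplified page: only the fields this sketch needs.
struct Page {
    serial: u32,
    body: Vec<u8>,
}

/// Reassembles the packet carried by the pages of `c2pa_serial` and returns
/// the JUMBF bytes that follow the magic prefix, or `None` if the prefix is
/// missing.
fn extract_manifest(pages: &[Page], c2pa_serial: u32) -> Option<Vec<u8>> {
    // Concatenate the bodies of all pages of the C2PA bitstream, in order.
    let packet: Vec<u8> = pages
        .iter()
        .filter(|p| p.serial == c2pa_serial)
        .flat_map(|p| p.body.iter().copied())
        .collect();
    // Strip the 5-byte identifier; the remainder is the manifest store.
    packet.strip_prefix(C2PA_MAGIC).map(|jumbf| jumbf.to_vec())
}
```

The sketch tolerates audio pages interleaved between the C2PA pages, since it filters by serial number before concatenating.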

## Opus support

While the C2PA v2.3 specification only names "OGG Vorbis", the embedding mechanism operates at the OGG container level (a separate logical bitstream) and is codec-agnostic. This implementation supports OGG containers carrying either Vorbis or Opus audio. The handler registers the `ogg` and `opus` file extensions along with the `audio/ogg` and `audio/opus` MIME types.

## Files touched

- `sdk/src/asset_handlers/ogg_io.rs` – OGG handler with inline page parser, CRC-32, and tests.
- `sdk/src/asset_handlers/mod.rs` – `pub mod ogg_io`.
- `sdk/src/error.rs` – `OggError` and `Error::OggError`.
- `sdk/src/jumbf_io.rs` – register `OggIO` in CAI_READERS, CAI_WRITERS, and handler tests.
- `sdk/src/utils/mime.rs` – `"opus"` / `"audio/opus"` in `extension_to_mime` and `format_to_extension`.
- `sdk/tests/fixtures/sample1.ogg` – minimal valid OGG Vorbis fixture.
- `sdk/tests/fixtures/sample1.opus` – minimal valid OGG Opus fixture.
- `docs/supported-formats.md` – added OGG and Opus to the format table.
2 changes: 2 additions & 0 deletions docs/supported-formats.md
@@ -19,6 +19,8 @@ The following table summarizes the supported media (asset) file formats. This i
| `mp3` | `audio/mpeg` |
| `mp4` | `video/mp4`, `application/mp4` <br/>Fragmented MP4 (DASH) supported only for file-based operations from the Rust library. |
| `mov` | `video/quicktime` |
| `ogg` | `audio/ogg` |
| `opus` | `audio/opus` |
| `pdf` | `application/pdf` (**read-only**) |
| `png` | `image/png` |
| `svg` | `image/svg+xml` |
3 changes: 2 additions & 1 deletion sdk/src/asset_handlers/README.md
@@ -319,6 +319,7 @@ All traits require `Sync + Send`. Handlers must be **stateless structs** with no
| **GifIO** (GIF) | Y | Y | Y | Y | Y | Y | Y |
| **C2paIO** (C2PA sidecar) | Y | Y | Y | -- | Y | Y | -- |
| **PdfIO** (PDF) | Y | -- | Y | -- | -- | Y | -- |
| **OggIO** (OGG, Opus) | Y | Y | Y | -- | Y | -- | Y |

### Key observations

@@ -390,7 +391,7 @@ flowchart TB
end
end

handlers["jpeg_io.rs jpegxl_io.rs png_io.rs bmff_io.rs tiff_io.rs riff_io.rs svg_io.rs mp3_io.rs gif_io.rs c2pa_io.rs pdf_io.rs"]
handlers["jpeg_io.rs jpegxl_io.rs png_io.rs bmff_io.rs tiff_io.rs riff_io.rs svg_io.rs mp3_io.rs gif_io.rs ogg_io.rs c2pa_io.rs pdf_io.rs"]

dispatch -->|"Looks up by extension/MIME"| traits
traits -->|"Implemented by asset_handlers/&lt;format&gt;_io.rs"| handlers
1 change: 1 addition & 0 deletions sdk/src/asset_handlers/mod.rs
@@ -19,6 +19,7 @@ pub(crate) mod id3_helper;
pub mod jpeg_io;
pub mod jpegxl_io;
pub mod mp3_io;
pub mod ogg_io;
pub mod png_io;
pub mod riff_io;
pub mod svg_io;