Support Unicode CJK CMap #11

@neko-para

There are a lot of CJK CMaps, but some of them are just UTF-16BE. For an otherwise unknown encoding we can check the name's prefix and treat any encoding whose name begins with `Uni` as UTF-16BE.
Here is some relevant information from the PDF 1.5 spec, p. 404 (two screenshots attached):

// fontentry.rs
let source_encoding = match base_encoding {
    Some(BaseEncoding::StandardEncoding) => Some(Encoding::AdobeStandard),
    Some(BaseEncoding::SymbolEncoding) => Some(Encoding::AdobeSymbol),
    Some(BaseEncoding::WinAnsiEncoding) => Some(Encoding::WinAnsiEncoding),
    Some(BaseEncoding::MacRomanEncoding) => Some(Encoding::MacRomanEncoding),
    Some(BaseEncoding::MacExpertEncoding) => Some(Encoding::AdobeExpert),
    ref e => {
        // We can do the prefix check here and return AdobeStandard if it matches.
        warn!("unsupported pdf encoding {:?}", e);
        None
    }
};
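
A minimal sketch of the prefix check and the decoding it implies, in case it helps the discussion. The helper names `is_unicode_cmap` and `decode_utf16_be` are hypothetical and are not part of this crate. It assumes the CMap name is available as a plain string and that the text bytes are big-endian 16-bit code units. How this would be wired into the `match` above is left open.

```rust
/// Hypothetical helper: true when a CMap name looks like one of the
/// predefined Unicode CMaps, e.g. "UniGB-UCS2-H" or "UniJIS-UTF16-V".
/// This is only the prefix test proposed in this issue.
fn is_unicode_cmap(name: &str) -> bool {
    name.starts_with("Uni")
}

/// Hypothetical helper: decode big-endian UTF-16 bytes into a String.
/// Returns None for an odd byte count or an unpaired surrogate.
fn decode_utf16_be(bytes: &[u8]) -> Option<String> {
    if bytes.len() % 2 != 0 {
        return None;
    }
    // Combine each byte pair into one 16-bit code unit, high byte first.
    let units: Vec<u16> = bytes
        .chunks_exact(2)
        .map(|pair| u16::from_be_bytes([pair[0], pair[1]]))
        .collect();
    String::from_utf16(&units).ok()
}
```

For example, `decode_utf16_be(&[0x4E, 0x2D, 0x65, 0x87])` yields `Some("中文")`. A stricter check could also require `-UCS2-` or `-UTF16-` in the name, since a bare `Uni` prefix may match CMaps that use a different Unicode encoding form.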
