98 changes: 98 additions & 0 deletions README.md
@@ -169,6 +169,104 @@ console.log(person2.mum.mum.name)
// outputs "Joanne"
```

### Managing Language Tags

RDF string literals can carry a [language tag](https://www.w3.org/TR/rdf11-concepts/#section-Graph-Literal), which lets a dataset hold the same value in several languages. This library provides `LanguagePreferences` to declaratively control which translations are read and written.

A language preference is an ordered list of [IETF language tags](https://en.wikipedia.org/wiki/IETF_language_tag) plus the special tags `@none` (matches strings without a language tag) and `@other` (matches any language not explicitly listed).

- **Read** operations return the value matching the highest-priority preference.
- **Write** operations use the first non-`@other` preference as the language tag; if that preference is `@none`, the value is written as an untagged string.
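
The read and write rules can be sketched without the library, using plain `{ value, language }` objects in place of RDF literals (an empty `language` stands for an untagged string). The helper names `matches`, `readBest`, and `writeLanguage` are illustrative assumptions, not exports of `@rdfjs/wrapper`; the logic follows the `LanguagePreferences` class added in this change, but treat it as a sketch rather than the implementation.

```javascript
// Illustrative sketch only: plain objects stand in for RDF literals.
// An empty `language` means the string has no language tag.
function matches(language, tag, tags) {
    if (tag === "@none") return language === ""
    if (tag === "@other") {
        // "@other" matches whatever no other listed preference matches
        return !tags.some(t => t !== "@other" && matches(language, t, tags))
    }
    // Exact, case-insensitive comparison (no prefix or range matching)
    return language.toLowerCase() === tag.toLowerCase()
}

// Read: walk the preferences in order and return the first matching value
function readBest(literals, tags) {
    for (const tag of tags) {
        const found = literals.find(l => matches(l.language, tag, tags))
        if (found) return found.value
    }
    return undefined
}

// Write: first preference that is not "@other"; "@none" means "no tag"
function writeLanguage(tags) {
    const tag = tags.find(t => t !== "@other")
    return tag === undefined || tag === "@none" ? "" : tag
}

const labels = [
    { value: "Hospital", language: "" },
    { value: "Hôpital", language: "fr" },
    { value: "병원", language: "ko" },
]

console.log(readBest(labels, ["es", "ko", "@none"]))    // "병원"
console.log(readBest(labels, ["de", "@other", "@none"])) // "Hôpital"
console.log(writeLanguage(["@other", "es"]))             // "es"
console.log(matches("en-GB", "en", ["en"]))              // false
```

The last line shows a consequence of exact matching: under these assumptions a preference of `en` does not pick up `en-GB` values, so regional variants would need to be listed explicitly.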

#### Singular properties

```javascript
import { LanguagePreferences, RequiredFrom, RequiredAs, TermWrapper } from "@rdfjs/wrapper"

class Hospital extends TermWrapper {
    languages = new LanguagePreferences("es", "ko", "@none")

    get label() {
        return RequiredFrom.subjectPredicateByLanguage(this, "http://www.w3.org/2000/01/rdf-schema#label", this.languages)
    }

    set label(value) {
        RequiredAs.objectByLanguage(this, "http://www.w3.org/2000/01/rdf-schema#label", value, this.languages)
    }
}
```

Assuming the following RDF has been loaded in a dataset `dataset`:

```turtle
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

<hospital>
    rdfs:label "Hospital" ;
    rdfs:label "Hôpital"@fr ;
    rdfs:label "병원"@ko .
```

Class usage:

```javascript
const hospital = new Hospital("hospital", dataset, DataFactory)

// No Spanish label, so falls through to Korean
console.log(hospital.label)
// outputs "병원"

// Writing adds a Spanish label (first preference)
hospital.label = "Hospital Español"
console.log(hospital.label)
// outputs "Hospital Español"
```

#### Set properties

```javascript
import { LanguagePreferences, SetFrom, TermWrapper } from "@rdfjs/wrapper"

class Hospital extends TermWrapper {
    languages = new LanguagePreferences("fr", "ko", "@none")

    get descriptions() {
        return SetFrom.subjectPredicateByLanguage(this, "http://www.w3.org/2000/01/rdf-schema#description", this.languages)
    }
}
```

```javascript
const hospital = new Hospital("hospital", dataset, DataFactory)

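// Assuming the dataset also holds two French descriptions for <hospital>,
// French is the highest-priority preference with a match, so only those are returned
console.log(hospital.descriptions.size)
// outputs 2

for (const desc of hospital.descriptions) {
    console.log(desc)
}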
// outputs "Guérit les malades"
// outputs "A des médecins"
```
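
The difference from singular properties is that a set returns every value in the best-matching language, not only the first. A minimal sketch of that selection, again with plain objects in place of RDF literals: `filterBest` here is a standalone illustration modelled on the method of the same name in this change, `@other` handling is left out for brevity, and the untagged sample string is made up for the example.

```javascript
// Illustrative sketch only; "@other" handling is omitted for brevity
function filterBest(literals, tags) {
    for (const tag of tags) {
        // "@none" selects untagged strings (empty language)
        const wanted = tag === "@none" ? "" : tag.toLowerCase()
        const hits = literals.filter(l => l.language.toLowerCase() === wanted)
        if (hits.length > 0) {
            // Stop at the first preference that has any match
            return new Set(hits.map(l => l.value))
        }
    }
    return new Set()
}

// Sample data: the untagged entry is invented for illustration
const descriptions = [
    { value: "Treats the sick", language: "" },
    { value: "Guérit les malades", language: "fr" },
    { value: "A des médecins", language: "fr" },
]

console.log([...filterBest(descriptions, ["fr", "ko", "@none"])])
// [ 'Guérit les malades', 'A des médecins' ]
console.log([...filterBest(descriptions, ["es", "ko", "@none"])])
// [ 'Treats the sick' ]
```

Values from lower-priority languages are never mixed in: once one preference matches, the remaining preferences are ignored.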

#### Inspecting all translations

The `languagesOf` function returns all string values for a predicate, grouped by language tag. Strings without a language tag are grouped under `@none`:

```javascript
import { languagesOf } from "@rdfjs/wrapper"

const langs = languagesOf(hospital, "http://www.w3.org/2000/01/rdf-schema#label")
// Map { "@none" => ["Hospital"], "fr" => ["Hôpital"], "ko" => ["병원"] }
```
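
The implementation of `languagesOf` is not part of this excerpt, so the following is only a guess at the grouping it performs, inferred from the example output. `groupByLanguage` is a hypothetical helper that works on plain `{ value, language }` objects rather than on a wrapped term.

```javascript
// Hypothetical helper: groups string values by language tag,
// filing untagged strings (empty language) under "@none"
function groupByLanguage(literals) {
    const groups = new Map()
    for (const { value, language } of literals) {
        const key = language === "" ? "@none" : language
        if (!groups.has(key)) {
            groups.set(key, [])
        }
        groups.get(key).push(value)
    }
    return groups
}

const grouped = groupByLanguage([
    { value: "Hospital", language: "" },
    { value: "Hôpital", language: "fr" },
    { value: "병원", language: "ko" },
])

console.log(grouped)
// Map(3) { '@none' => [ 'Hospital' ], 'fr' => [ 'Hôpital' ], 'ko' => [ '병원' ] }
```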

#### Optional properties

`OptionalFrom.subjectPredicateByLanguage` and `OptionalAs.objectByLanguage` both return `undefined` when no match is found.

When set to `undefined`, `OptionalFrom.subjectPredicateByLanguage` removes language-tagged quads for that predicate. `OptionalAs.objectByLanguage` is broader: it removes both language-tagged literals and plain string literals (for example `@none` / `xsd:string` values). Take care when clearing an `OptionalAs.objectByLanguage` value, as this can also remove untagged string values.


## Background

128 changes: 128 additions & 0 deletions src/LanguagePreferences.ts
@@ -0,0 +1,128 @@
import type { Literal, Quad } from "@rdfjs/types"
import { isStringLiteralQuad } from "./isStringLiteralQuad.js"

/**
 * Represents an ordered list of language preferences for reading and writing
 * language-tagged RDF literals ({@link https://www.w3.org/TR/rdf11-concepts/#section-Graph-Literal | `rdf:langString`}).
 *
 * @remarks
 * Language preferences control how language-tagged string literals are selected
 * for reading and which language tag is used when writing.
 *
 * Valid preference values include:
 * - Any {@link https://en.wikipedia.org/wiki/IETF_language_tag | IETF language tag} (e.g., `"en"`, `"fr"`, `"ko"`)
 * - `"@none"`: matches string literals that have no language tag (plain `xsd:string`)
 * - `"@other"`: matches any language not explicitly listed in the preferences
 *
 * For read operations, literals are searched in preference order and the first
 * match is returned. For write operations, the first preference that is not
 * `"@other"` is used as the language tag.
 *
 * @example Basic usage in a model class
 * ```ts
 * class Hospital extends TermWrapper {
 *     readonly languages = new LanguagePreferences("es", "ko", "@none")
 *
 *     get label(): string {
 *         return RequiredFrom.subjectPredicateByLanguage(this, "http://www.w3.org/2000/01/rdf-schema#label", this.languages)
 *     }
 *
 *     set label(value: string) {
 *         RequiredAs.objectByLanguage(this, "http://www.w3.org/2000/01/rdf-schema#label", value, this.languages)
 *     }
 * }
 * ```
 */
export class LanguagePreferences {
    /**
     * The ordered list of language preference tags.
     */
    public readonly tags: readonly string[]

    /**
     * Creates a new instance of {@link LanguagePreferences}.
     *
     * @param tags - An ordered list of language preferences. Earlier entries have higher priority.
     */
    constructor(...tags: string[]) {
        this.tags = tags
    }

    /**
     * The language tag to use for write operations.
     *
     * Returns the first preference that is not `"@other"`.
     * For `"@none"`, returns an empty string.
     * If all preferences are `"@other"` or the list is empty, returns an empty string.
     */
    get writeLanguage(): string {
        for (const tag of this.tags) {
            if (tag !== "@other") {
                return tag === "@none" ? "" : tag
            }
        }
        return ""
    }

    /**
     * Tests whether a literal's language tag matches a given preference tag.
     *
     * @param literalLanguage - The language tag of the literal (empty string if none).
     * @param preferenceTag - The preference tag to match against.
     */
    matchesPreference(literalLanguage: string, preferenceTag: string): boolean {
        if (preferenceTag === "@none") {
            return literalLanguage === ""
        }
        if (preferenceTag === "@other") {
            return !this.tags.some(t =>
                t !== "@other" && this.matchesPreference(literalLanguage, t)
            )
        }
        return literalLanguage.toLowerCase() === preferenceTag.toLowerCase()
    }

    /**
     * From an iterable of quads, selects the object literal of the first quad
     * whose language tag matches the highest-priority preference.
     *
     * Considers language-tagged literals (`rdf:langString`) and plain string
     * literals (`xsd:string`). The `@none` preference matches plain strings.
     *
     * @param quads - The quads to search through.
     * @returns The best-matching literal, or `undefined` if no match is found.
     */
    selectBest(quads: Iterable<Quad>): Literal | undefined {
        return this.filterBest(quads).next().value
    }

    /**
     * From an iterable of quads, yields all literals whose language tag matches
     * the highest-priority preference that has at least one match.
     *
     * Considers language-tagged literals (`rdf:langString`) and plain string
     * literals (`xsd:string`). The `@none` preference matches plain strings.
     *
     * @param quads - The quads to search through.
     * @returns An iterable of matching literals (may be empty).
     */
    * filterBest(quads: Iterable<Quad>): IterableIterator<Literal> {
        const stringLiterals = [...this.collectStringLiterals(quads)]

        for (const tag of this.tags) {
            const matches = stringLiterals.filter(l => this.matchesPreference(l.language, tag))
            if (matches.length > 0) {
                yield* matches
                return
            }
        }
    }

    private * collectStringLiterals(quads: Iterable<Quad>): Iterable<Literal> {
        for (const quad of quads) {
            if (isStringLiteralQuad(quad)) {
                yield quad.object
            }
        }
    }
}
115 changes: 115 additions & 0 deletions src/LanguageSet.ts
@@ -0,0 +1,115 @@
import type { Literal, Quad_Object, Quad_Subject, Term } from "@rdfjs/types"
import { TermWrapper } from "./TermWrapper.js"
import type { LanguagePreferences } from "./LanguagePreferences.js"
import { isStringLiteralQuad } from "./isStringLiteralQuad.js"

/**
 * A {@link Set} of strings backed by language-tagged RDF literals in a dataset, filtered by {@link LanguagePreferences}.
 *
 * @remarks
 * - Reading (iteration, `has`, `size`) returns only values matching the highest-priority language preference that has at least one match.
 * - Adding values creates literals tagged with the {@link LanguagePreferences.writeLanguage | write language}.
 * - Deleting values removes quads from the currently visible (best-matching) language.
 * - Clearing removes all string-literal quads for the predicate, including both `rdf:langString` and `xsd:string`, regardless of language.
 */
export class LanguageSet implements Set<string> {
    constructor(
        private readonly subject: TermWrapper,
        private readonly predicate: string,
        private readonly preferences: LanguagePreferences,
    ) {}

    add(value: string): this {
        const s = this.subject as Quad_Subject
        const p = this.subject.factory.namedNode(this.predicate)
        const o = this.createLangLiteral(value)
        const q = this.subject.factory.quad(s, p, o)
        this.subject.dataset.add(q)
        return this
    }

    clear(): void {
        for (const q of this.allLangStringQuads) {
            if (isStringLiteralQuad(q)) {
                this.subject.dataset.delete(q)
            }
        }
    }

    delete(value: string): boolean {
        const matchingLiterals = this.bestMatchLiterals.filter((literal) => literal.value === value)
        if (matchingLiterals.length === 0) {
            return false
        }

        const matchingQuads = Array.from(this.allLangStringQuads).filter(
            (q) => q.object.termType === "Literal" && matchingLiterals.some((literal) => literal.equals(q.object))
        )

        for (const q of matchingQuads) {
            this.subject.dataset.delete(q)
        }

        return matchingQuads.length > 0
    }

    forEach(cb: (item: string, index: string, set: Set<string>) => void, thisArg?: any): void {
        for (const item of this) {
            cb.call(thisArg, item, item, this)
        }
    }

    has(value: string): boolean {
        for (const literal of this.bestMatchLiterals) {
            if (literal.value === value) {
                return true
            }
        }
        return false
    }

    get size(): number {
        return this.bestMatchLiterals.length
    }

    [Symbol.iterator](): SetIterator<string> {
        return this.values()
    }

    * entries(): SetIterator<[string, string]> {
        for (const v of this) {
            yield [v, v]
        }
    }

    keys(): SetIterator<string> {
        return this.values()
    }

    * values(): SetIterator<string> {
        for (const literal of this.bestMatchLiterals) {
            yield literal.value
        }
    }

    get [Symbol.toStringTag](): string {
        return this.constructor.name
    }

    private get bestMatchLiterals(): Literal[] {
        return [...this.preferences.filterBest(this.allLangStringQuads)]
    }

    private get allLangStringQuads() {
        const p = this.subject.factory.namedNode(this.predicate)
        return this.subject.dataset.match(this.subject as Term, p)
    }

    private createLangLiteral(value: string): Quad_Object {
        const language = this.preferences.writeLanguage
        if (language === "") {
            return this.subject.factory.literal(value)
        }
        return this.subject.factory.literal(value, language)
    }
}
13 changes: 13 additions & 0 deletions src/isStringLiteralQuad.ts
@@ -0,0 +1,13 @@
import type { Literal, Quad } from "@rdfjs/types"
import { RDF } from "./vocabulary/RDF.js"
import { XSD } from "./vocabulary/XSD.js"

/**
 * Tests whether a quad's object is a string-typed literal
 * (`rdf:langString` or `xsd:string`).
 */
export function isStringLiteralQuad(quad: Quad): quad is Quad & { object: Literal } {
    const { object } = quad
    return object.termType === "Literal"
        && (object.datatype.value === RDF.langString || object.datatype.value === XSD.string)
}