
Commit efb83b0

Authored by Retribution98, Copilot and almilosz
[JS] Update tokenizer methods (#3012)
## Description

- Use `Tensor` from openvino-node in `Tokenizer` `encode()` and `decode()`
- Extended the `Tensor` API and added constructors for `Tensor`
- Moved `Tokenizer` to a separate TS file
- Use `BigInt` for token ids
- Store the openvino-node addon in `AddonData` so the native layer can work with entities from the core package
- Aligned tests with the new API to verify tokenizer and binding behavior
- Updated the benchmark sample to use `Tensor`-based encoding
- Updated the documentation: https://retribution98.github.io/openvino.genai/

## Ticket

CVS-174909

## Checklist:

- [x] Tests have been updated or added to cover the new code.
- [x] This patch fully addresses the ticket.
- [x] I have made corresponding changes to the documentation.

---------

Signed-off-by: Kirill Suvorov <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Alicja Miloszewska <[email protected]>
1 parent df1c52d commit efb83b0
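A minimal sketch of the updated public API (the local model path is hypothetical, and passing the `input_ids` tensor back into `decode()` is an assumption based on the description; the confirmed `encode()` options are in the docs changes below):

```js
import { LLMPipeline } from 'openvino-genai-node';

const pipe = await LLMPipeline('./model_dir', 'CPU'); // hypothetical model directory
const tokenizer = pipe.getTokenizer();

// encode() now returns openvino-node Tensors for input_ids and attention_mask.
const tokens = tokenizer.encode('The Sun is yellow because');
console.log(tokens.input_ids.getShape()); // e.g. [1, 6]

// Token ids are BigInt values after this change; decode() is assumed
// to accept the ids tensor and return the decoded string(s).
console.log(tokenizer.decode(tokens.input_ids));
```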

File tree

17 files changed: +855 −51 lines changed


samples/js/text_generation/benchmark_genai.js

Lines changed: 4 additions & 0 deletions
```diff
@@ -90,6 +90,10 @@ async function main() {
     pipe = await LLMPipeline(modelsPath, device, { schedulerConfig: schedulerConfig });
   }
 
+  const inputData = await pipe.getTokenizer().encode(prompt);
+  const promptTokenSize = inputData.input_ids.getShape()[1];
+  console.log(`Prompt token size: ${promptTokenSize}`);
+
   for (let i = 0; i < numWarmup; i++) {
     await pipe.generate(prompt, config);
   }
```
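Here `input_ids` is a 2-D tensor of shape `[batch, seq_len]` (see the shapes in the tokenization guide below), so `getShape()[1]` is the prompt length in tokens.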

site/docs/bindings/node-js.md

Lines changed: 6 additions & 0 deletions
```diff
@@ -24,6 +24,12 @@ Node.js bindings currently support:
 - Structured output
 - ReAct agent support
 - `TextEmbeddingPipeline`: Generate text embeddings for semantic search and RAG applications
+- `Tokenizer`: Fast tokenization / detokenization and chat prompt formatting
+  - Encode strings into token id and attention mask tensors
+  - Decode token sequences
+  - Apply chat template
+  - Access special tokens (BOS/EOS/PAD)
+  - Supports paired input
 
 ## Installation
 
```
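A sketch of the capabilities this list advertises, using the accessor names from the previously inlined `Tokenizer` interface (removed from `llmPipeline.ts` below); the model path is hypothetical:

```js
import { Tokenizer } from 'openvino-genai-node';

const tokenizer = new Tokenizer('./model_dir'); // hypothetical model directory

// Special-token accessors (BOS/EOS/PAD).
console.log(tokenizer.getBosToken(), tokenizer.getBosTokenId());
console.log(tokenizer.getEosToken(), tokenizer.getEosTokenId());
console.log(tokenizer.getPadToken(), tokenizer.getPadTokenId());

// Chat prompt formatting via the model's chat template.
const prompt = tokenizer.applyChatTemplate(
  [{ role: 'user', content: 'Why is the Sun yellow?' }],
  true, // addGenerationPrompt
);
```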
site/docs/guides/tokenization.mdx

Lines changed: 55 additions & 0 deletions
````diff
@@ -34,6 +34,20 @@ It can be initialized from the path, in-memory IR representation or obtained fro
     auto tokenzier = pipe.get_tokenizer();
     ```
   </TabItemCpp>
+  <TabItemJS>
+    ```js
+    import { LLMPipeline, Tokenizer } from 'openvino-genai-node';
+
+    let tokenizer;
+
+    // Initialize from the path
+    tokenizer = new Tokenizer(models_path);
+
+    // Or get the tokenizer instance from an LLMPipeline
+    const pipe = await LLMPipeline(models_path, "CPU");
+    tokenizer = pipe.getTokenizer();
+    ```
+  </TabItemJS>
 </LanguageTabs>
 
 `Tokenizer` has `encode()` and `decode()` methods which support the following arguments: `add_special_tokens`, `skip_special_tokens`, `pad_to_max_length`, `max_length`.
@@ -51,6 +65,11 @@ It can be initialized from the path, in-memory IR representation or obtained fro
     auto tokens = tokenizer.encode("The Sun is yellow because", ov::genai::add_special_tokens(false));
     ```
   </TabItemCpp>
+  <TabItemJS>
+    ```js
+    const tokens = tokenizer.encode("The Sun is yellow because", { add_special_tokens: false });
+    ```
+  </TabItemJS>
 </LanguageTabs>
 
 The `encode()` method returns a [`TokenizedInputs`](https://docs.openvino.ai/2025/api/genai_api/_autosummary/openvino_genai.TokenizedInputs.html) object containing `input_ids` and `attention_mask`, both stored as `ov::Tensor`.
@@ -121,4 +140,40 @@ If `pad_to_max_length` is set to true, then instead of padding to the longest se
     // out_shape: [1, 128]
     ```
   </TabItemCpp>
+  <TabItemJS>
+    ```js
+    import { Tokenizer } from 'openvino-genai-node';
+
+    const tokenizer = new Tokenizer(models_path);
+    const prompts = ["The Sun is yellow because", "The"];
+    let tokens;
+
+    // Since the prompts are shorter than the maximal length (taken from the IR), it does not affect the shape.
+    // The resulting shape is defined by the length of the longest token sequence.
+    // Equivalent of HuggingFace hf_tokenizer.encode(prompt, padding="longest", truncation=True)
+    tokens = tokenizer.encode(prompts);
+    // which is equivalent to
+    tokens = tokenizer.encode(prompts, { pad_to_max_length: false });
+    console.log(tokens.input_ids.getShape());
+    // out_shape: [2, 6]
+
+    // The resulting tokens tensor will be padded to 1024; sequences which exceed this length will be truncated.
+    // Equivalent of HuggingFace hf_tokenizer.encode(prompt, padding="max_length", truncation=True, max_length=1024)
+    tokens = tokenizer.encode([
+      "The Sun is yellow because",
+      "The",
+      "The longest string ever".repeat(2000),
+    ], {
+      pad_to_max_length: true,
+      max_length: 1024,
+    });
+    console.log(tokens.input_ids.getShape());
+    // out_shape: [3, 1024]
+
+    // For single string prompts, truncation and padding are also applied.
+    tokens = tokenizer.encode("The Sun is yellow because", { pad_to_max_length: true, max_length: 128 });
+    console.log(tokens.input_ids.getShape());
+    // out_shape: [1, 128]
+    ```
+  </TabItemJS>
 </LanguageTabs>
````
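The guide lists `skip_special_tokens` among the supported `decode()` arguments; a sketch of the decode side (feeding the `input_ids` tensor back into `decode()` is an assumption consistent with the PR description):

```js
import { Tokenizer } from 'openvino-genai-node';

const tokenizer = new Tokenizer('./model_dir'); // hypothetical model directory
const tokens = tokenizer.encode("The Sun is yellow because");
const text = tokenizer.decode(tokens.input_ids, { skip_special_tokens: true });
console.log(text);
```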

src/js/eslint.config.cjs

Lines changed: 7 additions & 0 deletions
```diff
@@ -53,6 +53,13 @@ module.exports = defineConfig([
           "json_schema",
           "structured_output_config",
           "structural_tags_config",
+          "skip_special_tokens",
+          "add_special_tokens",
+          "pad_to_max_length",
+          "max_length",
+          "padding_side",
+          "add_second_input",
+          "number_of_inputs",
         ],
       },
     ],
```
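These additions extend what is evidently a snake_case allowlist for the naming-convention rule: the new encode/decode option names mirror the C++/Python API and would otherwise trip a camelCase convention.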

src/js/include/addon.hpp

Lines changed: 1 addition & 0 deletions
```diff
@@ -12,6 +12,7 @@ struct AddonData {
     Napi::FunctionReference tokenizer;
     Napi::FunctionReference perf_metrics;
     Napi::FunctionReference chat_history;
+    Napi::ObjectReference openvino_addon;
};

void init_class(Napi::Env env,
```
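The new `openvino_addon` reference keeps the openvino-node addon's exports in the GenAI addon's per-instance data, so the native layer can look up core-class prototypes (see `get_prototype_from_ov_addon` in `helper.hpp` below) when wrapping and unwrapping tensors.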

src/js/include/helper.hpp

Lines changed: 14 additions & 0 deletions
```diff
@@ -37,9 +37,13 @@ ov::AnyMap js_to_cpp<ov::AnyMap>(const Napi::Env& env, const Napi::Value& value)
 /** @brief A template specialization for TargetType std::string */
 template <>
 std::string js_to_cpp<std::string>(const Napi::Env& env, const Napi::Value& value);
+template <>
+int64_t js_to_cpp<int64_t>(const Napi::Env& env, const Napi::Value& value);
 /** @brief A template specialization for TargetType std::vector<std::string> */
 template <>
 std::vector<std::string> js_to_cpp<std::vector<std::string>>(const Napi::Env& env, const Napi::Value& value);
+template <>
+std::vector<int64_t> js_to_cpp<std::vector<int64_t>>(const Napi::Env& env, const Napi::Value& value);
 /** @brief A template specialization for TargetType GenerateInputs */
 template <>
 GenerateInputs js_to_cpp<GenerateInputs>(const Napi::Env& env, const Napi::Value& value);
@@ -58,6 +62,8 @@ ov::genai::StructuredOutputConfig::Tag js_to_cpp<ov::genai::StructuredOutputConf
 /** @brief A template specialization for TargetType ov::genai::StructuredOutputConfig::StructuralTag */
 template <>
 ov::genai::StructuredOutputConfig::StructuralTag js_to_cpp<ov::genai::StructuredOutputConfig::StructuralTag>(const Napi::Env& env, const Napi::Value& value);
+template <>
+ov::Tensor js_to_cpp<ov::Tensor>(const Napi::Env& env, const Napi::Value& value);
 /**
  * @brief Unwraps a C++ object from a JavaScript wrapper.
  * @tparam TargetType The C++ class type to extract.
@@ -110,6 +116,12 @@ Napi::Value cpp_to_js<std::vector<size_t>, Napi::Value>(const Napi::Env& env, co
 
 template <>
 Napi::Value cpp_to_js<ov::genai::JsonContainer, Napi::Value>(const Napi::Env& env, const ov::genai::JsonContainer& json_container);
+
+template <>
+Napi::Value cpp_to_js<ov::Tensor, Napi::Value>(const Napi::Env& env, const ov::Tensor& tensor);
+
+template <>
+Napi::Value cpp_to_js<ov::genai::TokenizedInputs, Napi::Value>(const Napi::Env& env, const ov::genai::TokenizedInputs& tokenized_inputs);
 /**
  * @brief Template function to convert C++ map into Javascript Object. Map key must be std::string.
@@ -130,3 +142,5 @@ bool is_chat_history(const Napi::Env& env, const Napi::Value& value);
 std::string json_stringify(const Napi::Env& env, const Napi::Value& value);
 
 Napi::Value json_parse(const Napi::Env& env, const std::string& value);
+
+Napi::Function get_prototype_from_ov_addon(const Napi::Env& env, const std::string& ctor_name);
```
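On the JS side, the `cpp_to_js<ov::genai::TokenizedInputs>` conversion surfaces as a plain object holding two openvino-node tensors; a sketch of the expected shape, with field names taken from the docs above (the attention-mask comment is illustrative):

```js
import { Tokenizer } from 'openvino-genai-node';

const tokenizer = new Tokenizer('./model_dir'); // hypothetical model directory
const { input_ids, attention_mask } = tokenizer.encode(["The Sun is yellow because", "The"]);
console.log(input_ids.getShape());      // e.g. [2, 6]
console.log(attention_mask.getShape()); // same shape; zeros mark padded positions
```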

src/js/include/tokenizer.hpp

Lines changed: 6 additions & 0 deletions
```diff
@@ -15,6 +15,12 @@ class TokenizerWrapper : public Napi::ObjectWrap<TokenizerWrapper> {
     Napi::Value get_eos_token_id(const Napi::CallbackInfo& info);
     Napi::Value get_pad_token(const Napi::CallbackInfo& info);
     Napi::Value get_pad_token_id(const Napi::CallbackInfo& info);
+    Napi::Value get_chat_template(const Napi::CallbackInfo& info);
+    Napi::Value get_original_chat_template(const Napi::CallbackInfo& info);
+    Napi::Value set_chat_template(const Napi::CallbackInfo& info);
+    Napi::Value supports_paired_input(const Napi::CallbackInfo& info);
+    Napi::Value encode(const Napi::CallbackInfo& info);
+    Napi::Value decode(const Napi::CallbackInfo& info);
 private:
     ov::genai::Tokenizer _tokenizer;
 };
```
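Assuming the binding layer's usual snake_case-to-camelCase mapping, the new native methods would surface on the JS `Tokenizer` roughly as follows (the JS-side names here are an assumption, not confirmed by this diff):

```js
import { Tokenizer } from 'openvino-genai-node';

const tokenizer = new Tokenizer('./model_dir'); // hypothetical model directory
// Hypothetical camelCase projections of the new native methods.
const template = tokenizer.getChatTemplate();
tokenizer.setChatTemplate(template);          // e.g. restore or customize the template
console.log(tokenizer.supportsPairedInput()); // whether paired (two-string) input is supported
```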

src/js/lib/addon.ts

Lines changed: 7 additions & 2 deletions
```diff
@@ -2,6 +2,8 @@ import { createRequire } from "module";
 import { platform } from "node:os";
 import { join, dirname, resolve } from "node:path";
 import type { ChatHistory as IChatHistory } from "./chatHistory.js";
+import type { Tokenizer as ITokenizer } from "./tokenizer.js";
+import { addon as ovAddon } from "openvino-node";
 
 export type EmbeddingResult = Float32Array | Int8Array | Uint8Array;
 export type EmbeddingResults = Float32Array[] | Int8Array[] | Uint8Array[];
@@ -60,6 +62,8 @@ interface OpenVINOGenAIAddon {
   TextEmbeddingPipeline: TextEmbeddingPipelineWrapper;
   LLMPipeline: any;
   ChatHistory: IChatHistory;
+  Tokenizer: ITokenizer;
+  setOpenvinoAddon: (ovAddon: any) => void;
 }
 
 // We need to use delayed import to get an updated Path if required
@@ -78,7 +82,8 @@ function getGenAIAddon(): OpenVINOGenAIAddon {
 }
 
 const addon = getGenAIAddon();
+addon.setOpenvinoAddon(ovAddon);
 
-export const { ChatHistory } = addon;
+export const { TextEmbeddingPipeline, LLMPipeline, ChatHistory, Tokenizer } = addon;
 export type ChatHistory = IChatHistory;
-export default addon;
+export type Tokenizer = ITokenizer;
```
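Wiring the core addon in through `setOpenvinoAddon` is what allows GenAI methods to accept real openvino-node tensors; a sketch, assuming `decode()` takes an i64 tensor of ids (the ids themselves are illustrative):

```js
import { addon as ov } from 'openvino-node';
import { Tokenizer } from 'openvino-genai-node';

const tokenizer = new Tokenizer('./model_dir'); // hypothetical model directory
// Token ids are BigInt, so the backing buffer is a BigInt64Array.
const ids = new ov.Tensor(ov.element.i64, [1, 3], new BigInt64Array([1n, 2n, 3n]));
console.log(tokenizer.decode(ids));
```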

src/js/lib/index.ts

Lines changed: 2 additions & 0 deletions
```diff
@@ -40,3 +40,5 @@ export const { LLMPipeline, TextEmbeddingPipeline } = PipelineFactory;
 export { DecodedResults } from "./pipelines/llmPipeline.js";
 export * from "./utils.js";
 export * from "./addon.js";
+export type { TokenizedInputs, EncodeOptions, DecodeOptions } from "./tokenizer.js";
+export type { ChatMessage, ExtraContext, ToolDefinition } from "./chatHistory.js";
```
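With these re-exports, consumers can pull the tokenizer and chat types from the package root, e.g. `import type { TokenizedInputs, EncodeOptions, DecodeOptions } from 'openvino-genai-node'`, instead of reaching into internal module paths.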

src/js/lib/pipelines/llmPipeline.ts

Lines changed: 3 additions & 19 deletions
```diff
@@ -1,30 +1,14 @@
 import util from "node:util";
-import addon, { ChatHistory } from "../addon.js";
+import { ChatHistory, LLMPipeline as LLMPipelineWrap } from "../addon.js";
 import { GenerationConfig, StreamingStatus, LLMPipelineProperties } from "../utils.js";
+import { Tokenizer } from "../tokenizer.js";
 
 export type ResolveFunction = (arg: { value: string; done: boolean }) => void;
 export type Options = {
   disableStreamer?: boolean;
   max_new_tokens?: number;
 };
 
-interface Tokenizer {
-  /** Applies a chat template to format chat history into a prompt string. */
-  applyChatTemplate(
-    chatHistory: Record<string, any>[] | ChatHistory,
-    addGenerationPrompt: boolean,
-    chatTemplate?: string,
-    tools?: Record<string, any>[],
-    extraContext?: Record<string, any>,
-  ): string;
-  getBosToken(): string;
-  getBosTokenId(): number;
-  getEosToken(): string;
-  getEosTokenId(): number;
-  getPadToken(): string;
-  getPadTokenId(): number;
-}
-
 /** Structure with raw performance metrics for each generation before any statistics are calculated. */
 export type RawMetrics = {
   /** Durations for each generate call in milliseconds. */
@@ -167,7 +151,7 @@
   async init() {
     if (this.isInitialized) throw new Error("LLMPipeline is already initialized");
 
-    this.pipeline = new addon.LLMPipeline();
+    this.pipeline = new LLMPipelineWrap();
 
     const initPromise = util.promisify(this.pipeline.init.bind(this.pipeline));
     const result = await initPromise(this.modelPath, this.device, this.properties);
```
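With the inline interface gone, `getTokenizer()` is typed by the shared `Tokenizer` from `tokenizer.ts`. One observable behavior change, hedged since this diff does not show it directly: per the description's "Use BigInt for tokenId", the token-id getters are expected to return `BigInt`:

```js
import { LLMPipeline } from 'openvino-genai-node';

const pipe = await LLMPipeline('./model_dir', 'CPU'); // hypothetical model directory
const tokenizer = pipe.getTokenizer();
// Assumption based on "Use BigInt for tokenId" in the description.
console.log(typeof tokenizer.getEosTokenId()); // expected: 'bigint'
```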
