Sync MMKG pipeline docs with actual operator behavior #32
Open
W-RMSL wants to merge 1 commit into
Pull request overview
This PR updates the Multimodal KG pipeline guide (ZH/EN) to better reflect the current MMKG pipeline’s expected inputs/outputs and example data, including sample image paths and updated visual-triple examples.
Changes:
- Add sample image references and update `img_dict` guidance for local image paths.
- Update the documented visual triple (`vis_triple`) format and example IO payloads.
- Document `vis_url` alongside the input schema and update example JSON accordingly.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| docs/zh/notes/kg_guide/kg_pipelines_by_types/multimodal_kg_pipeline.md | Updates ZH multimodal pipeline guide’s sample assets, input schema, and visual triple examples. |
| docs/en/notes/kg_guide/kg_pipelines_by_types/multimodal_kg_pipeline.md | Updates EN multimodal pipeline guide’s sample assets, input schema, and visual triple examples. |
Comment on lines 74 to 79
  This pipeline requires at least the following fields:

  - **raw_chunk**: source text for entity and textual triple extraction
- - **img_dict**: an image dictionary where keys are image IDs and values are local image paths
+ - **img_dict**: an image dictionary where keys are image IDs (free-form strings that appear in `vis_triple`) and values are local image paths
+ - **vis_url**: a list of image paths the VLM opens during step 5 QA generation. Its order must match the order in which image IDs first appear in `img_dict`.

  - combine images and candidate entities to extract visual facts
- - typically output triples in the form `"<subj> entity <rel> depicted_in <obj> image_id"`
+ - output triples in the form `"<subj> entity <obj> image_id <rel> depicted_in "`
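The ordering rule for `vis_url` can be checked mechanically. The following is a minimal sketch, assuming `img_dict` is a plain Python dict (which preserves insertion order) and that each image ID maps to exactly one path. `build_vis_url` and `check_vis_url` are hypothetical helpers, not pipeline APIs, and the pairing of image IDs to sample files is illustrative only.

```python
def build_vis_url(img_dict: dict[str, str]) -> list[str]:
    """Hypothetical helper: list image paths in img_dict insertion order."""
    # Python dicts preserve insertion order, so the values come out in the
    # order in which their image IDs first appear.
    return list(img_dict.values())


def check_vis_url(img_dict: dict[str, str], vis_url: list[str]) -> bool:
    """Return True when vis_url follows the img_dict ordering."""
    return vis_url == build_vis_url(img_dict)


# Illustrative row; the text and the ID-to-file pairing are assumptions.
row = {
    "raw_chunk": "Elon Musk presented the Cybertruck on stage.",
    "img_dict": {
        "img_cybertruck": "example_data/MultimodalKGPipeline/images/cyber.jpg",
        "img_musk_stage": "example_data/MultimodalKGPipeline/images/musk.jpg",
    },
}
row["vis_url"] = build_vis_url(row["img_dict"])
```

Deriving `vis_url` from `img_dict` rather than writing it by hand is one way to keep the two fields from drifting apart when images are added or reordered.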
Comment on lines +174 to +176
+ "<subj> Cybertruck <obj> img_cybertruck <rel> depicted_in ",
+ "<subj> Elon Musk <obj> img_musk_stage <rel> depicted_in ",
+ "<subj> Cybertruck <obj> img_musk_stage <rel> depicted_in "
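For downstream consumers, the `<subj> … <obj> … <rel> …` layout shown in these examples can be split with a small regular expression. This is a sketch under the assumption that every visual triple uses exactly this tag order. `parse_vis_triple` is a hypothetical helper and is not part of the pipeline.

```python
import re

# Tag order follows the examples: <subj>, then <obj>, then <rel>.
# Trailing whitespace after the relation is tolerated.
TRIPLE_RE = re.compile(
    r"^<subj>\s*(?P<subj>.+?)\s*<obj>\s*(?P<obj>.+?)\s*<rel>\s*(?P<rel>.+?)\s*$"
)


def parse_vis_triple(triple: str) -> tuple[str, str, str]:
    """Return (subject, relation, image_id) for one vis_triple string."""
    match = TRIPLE_RE.match(triple)
    if match is None:
        raise ValueError(f"unexpected vis_triple layout: {triple!r}")
    return match.group("subj"), match.group("rel"), match.group("obj")


parsed = parse_vis_triple("<subj> Elon Musk <obj> img_musk_stage <rel> depicted_in ")
```

A string in the earlier `<subj> … <rel> … <obj> …` order does not match this pattern and raises `ValueError`, which makes stale data easy to spot.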
+ - Sample images: `example_data/MultimodalKGPipeline/images/cyber.jpg`, `example_data/MultimodalKGPipeline/images/musk.jpg`

- For real image-text workloads, values in `img_dict` must be valid local image paths. The default JSON file is provided as a runnable input structure.
+ Values in `img_dict` are the actual paths the VLM serving layer opens. Only **local paths** are supported today: the serving layer calls `open(path, "rb")` and base64-encodes the bytes into the request, and it does not fetch remote URLs. The default data ships with runnable example images, and the paths are written relative to the `api_pipelines/` directory (the CWD when you run `python multimodal_kg_pipeline.py`).
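The local-path behaviour described above amounts to reading the file and inlining it as base64. The following is a minimal sketch of that step, assuming nothing about the real serving code beyond the `open(path, "rb")` plus base64 description. `encode_local_image` and `missing_images` are hypothetical names, and the early URL check is an illustrative addition.

```python
import base64
import os


def encode_local_image(path: str) -> str:
    """Hypothetical helper: read a local image and return base64 text."""
    if path.startswith(("http://", "https://")):
        # Remote URLs are not fetched, so fail early with a clear message.
        raise ValueError(f"remote URLs are not supported: {path}")
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("ascii")


def missing_images(img_dict: dict[str, str]) -> list[str]:
    """Return image IDs whose paths do not exist relative to the CWD."""
    return [img_id for img_id, path in img_dict.items() if not os.path.isfile(path)]
```

Calling `missing_images` on a row's `img_dict` from the `api_pipelines/` directory before launching the pipeline would surface path problems early, since the sample paths are written relative to that directory.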
Comment on lines 74 to 79
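  This pipeline requires at least the following fields:

  - **raw_chunk**: raw text, used for entity and textual triple extraction.
- - **img_dict**: an image dictionary; keys are image IDs and values are local image paths.
+ - **img_dict**: an image dictionary; keys are image IDs (custom strings that appear in `vis_triple`) and values are local image paths.
+ - **vis_url**: a list of image paths that step 5 QA generation opens and passes to the VLM. The element order must match the order in which images first appear in `img_dict`.

  - combine images with candidate entities to extract visual facts
- - the output format is typically `"<subj> entity <rel> depicted_in <obj> image_id"`
+ - the output format is `"<subj> entity <obj> image_id <rel> depicted_in "`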
Comment on lines +174 to +176
+ "<subj> Cybertruck <obj> img_cybertruck <rel> depicted_in ",
+ "<subj> Elon Musk <obj> img_musk_stage <rel> depicted_in ",
+ "<subj> Cybertruck <obj> img_musk_stage <rel> depicted_in "
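+ - Sample images: `example_data/MultimodalKGPipeline/images/cyber.jpg`, `example_data/MultimodalKGPipeline/images/musk.jpg`

- In real image-text scenarios, the values of `img_dict` must be locally accessible image paths; the default data is in JSON format and can be used directly as a structural example.
+ The values of `img_dict` are the image paths the VLM actually opens. Only **local paths** are currently supported (the serving layer reads the bytes with `open(path, "rb")` and inlines them into the request as base64; it does not download remote URLs automatically). The default data ships with ready-to-run sample images, and the paths are given relative to the `api_pipelines/` directory, where `python multimodal_kg_pipeline.py` is run.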
No description provided.