From baa170fbe6151d029d7985accb2a989706cbd418 Mon Sep 17 00:00:00 2001
From: mokai
Date: Sat, 20 Jun 2026 00:35:56 +0800
Subject: [PATCH] docs: add offline packing attribution

Change-Id: If2f609f50daf94107ff627ceb4c60ab36d7951c3
---
 docs/source/features/offline_data_packing.md | 27 +++++++++++++++++++
 .../vlm/offline_packing/README.md            | 27 +++++++++++++++++++
 2 files changed, 54 insertions(+)

diff --git a/docs/source/features/offline_data_packing.md b/docs/source/features/offline_data_packing.md
index 400eb60f..e95d7cb8 100644
--- a/docs/source/features/offline_data_packing.md
+++ b/docs/source/features/offline_data_packing.md
@@ -116,3 +116,30 @@ Step 1’s token counts depend on the actual `AutoProcessor` logic, so you can c
 * Adjust image-token budget / resolution: add processor-supported arguments under `model.processor_kwargs` (e.g. Qwen-VL’s `min_pixels`/`max_pixels`).
 * Template alignment: if you add a new `model.model_type`, make sure `tools/data_preprocess/vlm/offline_packing/utils.py` contains the corresponding entry in `TEMPLATES[sample_type][model_type]`; otherwise Step 1 will raise “No template found for model_type ...”.
 * Media pre-processing: under `media_preprocess` you can assign pre-processing function names per modality (implementations in `tools/data_preprocess/vlm/offline_packing/media_preprocess_utils.py`) to control resize/crop/frame-reading behaviour.
+
+## Acknowledgements
+
+The WDS-native offline packing workflow in LoongForge is based on the multimodal
+offline packing framework originally developed for LLaVA-OneVision-1.5 and later
+migrated and upgraded for LLaVA-OneVision-2.
+
+LoongForge previously collaborated with the LLaVA-OneVision work. Some historical
+repository or package names may still use the older `aiak-*` naming, while the
+current LoongForge repository has migrated and adapted part of the
+LLaVA-OneVision offline packing capabilities.
+
+Upstream references:
+
+- LLaVA-OneVision-1.5 offline packing:
+  https://github.com/fdcp/LLaVA-OneVision-1.5/tree/main/tools/data_preprocess/offline_packing
+- LLaVA-OneVision-1.5 offline packing examples:
+  https://github.com/fdcp/LLaVA-OneVision-1.5/tree/main/examples_offline_packing
+- LLaVA-OneVision-2 offline packing:
+  https://github.com/EvolvingLMMs-Lab/LLaVA-OneVision-2/tree/main/offline_packing
+- LLaVA-OneVision-2 sample packing scripts:
+  https://github.com/EvolvingLMMs-Lab/LLaVA-OneVision-2/tree/main/examples/llava_onevision1_5/sample_packing
+
+LoongForge refactors this workflow for native WebDataset tar-shard input,
+manifest/SQLite-based sample indexing, media-type-specific packing, pack-plan
+generation, tar byte-offset based WebDataset writing, and runtime handling for
+packed text/image/video samples.
diff --git a/tools/data_preprocess/vlm/offline_packing/README.md b/tools/data_preprocess/vlm/offline_packing/README.md
index 0da78e08..3273ad09 100644
--- a/tools/data_preprocess/vlm/offline_packing/README.md
+++ b/tools/data_preprocess/vlm/offline_packing/README.md
@@ -190,3 +190,30 @@ Implementation notes:
 - Packed WDS JSON contains `_meta.pack_id`, `_meta.sample_ids`,
   `_meta.token_lens`, and `_meta.total_token_len` so a written packed sample can
   be audited without joining back to `pack_plan.jsonl`.
+
+## Acknowledgements
+
+The WDS-native offline packing workflow in LoongForge is based on the multimodal
+offline packing framework originally developed for LLaVA-OneVision-1.5 and later
+migrated and upgraded for LLaVA-OneVision-2.
+
+LoongForge previously collaborated with the LLaVA-OneVision work. Some historical
+repository or package names may still use the older `aiak-*` naming, while the
+current LoongForge repository has migrated and adapted part of the
+LLaVA-OneVision offline packing capabilities.
+
+Upstream references:
+
+- LLaVA-OneVision-1.5 offline packing:
+  https://github.com/fdcp/LLaVA-OneVision-1.5/tree/main/tools/data_preprocess/offline_packing
+- LLaVA-OneVision-1.5 offline packing examples:
+  https://github.com/fdcp/LLaVA-OneVision-1.5/tree/main/examples_offline_packing
+- LLaVA-OneVision-2 offline packing:
+  https://github.com/EvolvingLMMs-Lab/LLaVA-OneVision-2/tree/main/offline_packing
+- LLaVA-OneVision-2 sample packing scripts:
+  https://github.com/EvolvingLMMs-Lab/LLaVA-OneVision-2/tree/main/examples/llava_onevision1_5/sample_packing
+
+LoongForge refactors this workflow for native WebDataset tar-shard input,
+manifest/SQLite-based sample indexing, media-type-specific packing, pack-plan
+generation, tar byte-offset based WebDataset writing, and runtime handling for
+packed text/image/video samples.