robustonian/HY-Motion-1.0

 
 

Read in Chinese

HY-Motion 1.0: Scaling Flow Matching Models for 3D Motion Generation

Introduction

HY-Motion 1.0 is a series of text-to-3D human motion generation models based on Diffusion Transformer (DiT) and Flow Matching. It allows developers to generate skeleton-based 3D character animations from simple text prompts, which can be directly integrated into various 3D animation pipelines. This model series is the first to scale DiT-based text-to-motion models to the billion-parameter level, achieving significant improvements in instruction-following capabilities and motion quality over existing open-source models.

Key Features

  • State-of-the-Art Performance: Achieves state-of-the-art performance in both instruction-following capability and generated motion quality.

  • Billion-Scale Models: We are the first to successfully scale DiT-based models to the billion-parameter level for text-to-motion generation. This results in superior instruction understanding and following capabilities, outperforming comparable open-source models.

  • Advanced Three-Stage Training: Our models are trained using a comprehensive three-stage process:

    • Large-Scale Pre-training: Trained on over 3,000 hours of diverse motion data to learn a broad motion prior.

    • High-Quality Fine-tuning: Fine-tuned on 400 hours of curated, high-quality 3D motion data to enhance motion detail and smoothness.

    • Reinforcement Learning: Uses reinforcement learning from human feedback and reward models to further refine instruction following and motion naturalness.

System Overview

[Figure: Architecture]

[Figure: Comparison with SoTA]

🎁 Model Zoo

HY-Motion 1.0 Series

Model              | Description                                 | Date       | Size  | Hugging Face
HY-Motion-1.0      | Standard Text to Motion Generation Model    | 2025-12-30 | 1.0B  | Download
HY-Motion-1.0-Lite | Lightweight Text to Motion Generation Model | 2025-12-30 | 0.46B | Download

🤗 Get Started with HY-Motion 1.0

HY-Motion 1.0 supports macOS, Windows, and Linux.

1. Installation

First, install PyTorch via the official site. Then install the dependencies:

pip install -r requirements.txt

2. Download Model Weights

Please follow the instructions in ckpts/README.md to download the necessary model weights.

Code Usage (CLI)

We provide a script for local batch inference, suitable for processing large numbers of prompts.

# HY-Motion-1.0
python3 local_infer.py --model_path ckpts/tencent/HY-Motion-1.0

# HY-Motion-1.0-Lite
python3 local_infer.py --model_path ckpts/tencent/HY-Motion-1.0-Lite

Common Parameters:

  • --input_text_dir: Directory containing .txt or .json prompt files.
  • --output_dir: Directory to save results (default: output/local_infer).
  • --disable_duration_est: Disable LLM-based duration estimation.
  • --disable_rewrite: Disable LLM-based prompt rewriting.
  • --prompt_engineering_host / --prompt_engineering_model_path: (Optional) Host address / local checkpoint for the Duration Prediction & Prompt Rewrite Module.
    • Download: You can download the Duration Prediction & Prompt Rewrite Module from Here.
    • Note: If you do not set these parameters, you must also set --disable_duration_est and --disable_rewrite. Otherwise, the script raises an error because the host is unavailable.
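
The rule in the note above can be encoded in a small wrapper. The following is a minimal sketch and not part of the repository: build_infer_command is a hypothetical helper that assembles the argument list for local_infer.py from the flags listed above and adds both disable flags whenever neither a prompt-engineering host nor a local checkpoint is given. It assumes that --prompt_engineering_host and --prompt_engineering_model_path each take a single value.

```python
import sys
from typing import List, Optional


def build_infer_command(
    model_path: str,
    input_text_dir: Optional[str] = None,
    output_dir: Optional[str] = None,
    prompt_engineering_host: Optional[str] = None,
    prompt_engineering_model_path: Optional[str] = None,
) -> List[str]:
    """Hypothetical helper: assemble a local_infer.py command line."""
    cmd = [sys.executable, "local_infer.py", "--model_path", model_path]
    if input_text_dir:
        cmd += ["--input_text_dir", input_text_dir]
    if output_dir:
        cmd += ["--output_dir", output_dir]
    # Assumption: each prompt-engineering flag takes exactly one value.
    if prompt_engineering_host:
        cmd += ["--prompt_engineering_host", prompt_engineering_host]
    if prompt_engineering_model_path:
        cmd += ["--prompt_engineering_model_path", prompt_engineering_model_path]
    # Without a host or a local checkpoint, both LLM-based steps must be disabled.
    if not (prompt_engineering_host or prompt_engineering_model_path):
        cmd += ["--disable_duration_est", "--disable_rewrite"]
    return cmd


if __name__ == "__main__":
    cmd = build_infer_command(
        "ckpts/tencent/HY-Motion-1.0-Lite",
        input_text_dir="prompts",  # hypothetical directory name
    )
    print(" ".join(cmd))
```

The returned list can be passed to subprocess.run(cmd, check=True) from the repository root to start inference.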

Gradio App

You can host a Gradio web interface on your local machine for interactive visualization:

python3 gradio_app.py

After running the command, open your browser and visit http://localhost:7860

Custom Character Model (VRM)

You can use your own VRM character model for the web preview by setting the HYMOTION_PREVIEW_VRM environment variable:

export HYMOTION_PREVIEW_VRM="path/to/your/model.vrm"
python3 gradio_app.py

Notes:

  • The VRM file will be base64-encoded and embedded in the HTML preview. Large VRM files may increase initial loading time.
  • VRM's MToon shaders are automatically converted to standard materials for web compatibility.
  • Coordinate system differences between SMPL and VRM are automatically handled.
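
To get a feel for the loading-time note above, the size of a base64-encoded payload can be estimated up front. This is a minimal sketch that relies only on the standard base64 encoding, which emits 4 characters for every 3 input bytes; how gradio_app.py actually embeds the data is not shown here, and base64_size is a hypothetical helper name.

```python
import base64


def base64_size(num_bytes: int) -> int:
    """Length in characters of the padded base64 encoding of num_bytes bytes."""
    return 4 * ((num_bytes + 2) // 3)


if __name__ == "__main__":
    payload = bytes(3000)  # stand-in for the contents of a VRM file
    encoded = base64.b64encode(payload)
    # The closed-form estimate matches the real encoder.
    assert len(encoded) == base64_size(len(payload))
    print(len(payload), "bytes ->", len(encoded), "base64 characters")
```

In other words, the embedded model is roughly one third larger than the VRM file on disk.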

Custom FBX Template

You can override the default FBX template for motion retargeting by setting the HYMOTION_TEMPLATE_FBX environment variable:

export HYMOTION_TEMPLATE_FBX="path/to/your/template.fbx"
python3 local_infer.py --model_path ckpts/tencent/HY-Motion-1.0
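
When inference is launched from another Python program, the variable can be set for the child process only, without changing the parent environment. The following is a minimal sketch under that assumption; env_with_template is a hypothetical helper, and the template path is the placeholder from the command above.

```python
import os
import subprocess
import sys
from pathlib import Path
from typing import Dict, Mapping, Optional


def env_with_template(
    template_fbx: str, base_env: Optional[Mapping[str, str]] = None
) -> Dict[str, str]:
    """Return a copy of the environment with HYMOTION_TEMPLATE_FBX set."""
    env = dict(os.environ if base_env is None else base_env)
    env["HYMOTION_TEMPLATE_FBX"] = template_fbx
    return env


if __name__ == "__main__":
    template = "path/to/your/template.fbx"  # placeholder path
    if Path(template).is_file():
        # Only the child process sees the override.
        subprocess.run(
            [sys.executable, "local_infer.py", "--model_path", "ckpts/tencent/HY-Motion-1.0"],
            env=env_with_template(template),
            check=True,
        )
    else:
        print(f"Template not found: {template}")
```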

VRM to FBX Conversion

A utility script is provided to convert VRM files to FBX format:

python3 scripts/vrm_to_fbx.py input.vrm output.fbx

Supported backends (auto-detected in order of preference):

  1. Blender (requires blender on PATH)
  2. Assimp CLI (requires assimp on PATH)
  3. pyassimp (Python library)
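
The preference order above can be checked on a given machine before running the conversion. The following is a rough sketch of one way such detection could work, not the logic of scripts/vrm_to_fbx.py itself; detect_backend is a hypothetical helper that looks for the blender and assimp executables on PATH and then for an importable pyassimp module.

```python
import importlib.util
import shutil
from typing import Callable, Optional


def _module_available(name: str) -> bool:
    """True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


def detect_backend(
    which: Callable[[str], Optional[str]] = shutil.which,
    has_module: Callable[[str], bool] = _module_available,
) -> Optional[str]:
    """Return the first available backend in the documented order, or None."""
    if which("blender"):
        return "blender"
    if which("assimp"):
        return "assimp"
    if has_module("pyassimp"):
        return "pyassimp"
    return None


if __name__ == "__main__":
    backend = detect_backend()
    print("Detected backend:", backend if backend else "none found")
```

The which and has_module parameters exist only so the lookup can be replaced in tests.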

🔗 BibTeX

If you find this repository helpful, please cite our report:

@article{hymotion2025,
  title={HY-Motion 1.0: Scaling Flow Matching Models for Text-To-Motion Generation},
  author={Tencent Hunyuan 3D Digital Human Team},
  journal={arXiv preprint arXiv:2512.23464},
  year={2025}
}

Acknowledgements

We would like to thank the contributors to the FLUX, diffusers, HuggingFace, SMPL/SMPLH, CLIP, Qwen3, PyTorch3D, kornia, transforms3d, FBX-SDK, GVHMR, and HunyuanVideo repositories or tools, for their open research and exploration.
