A real-time facial animation system that drives a 3D FLAME head model using MediaPipe face tracking. It captures facial expressions from a video file or webcam and animates a 3D avatar live.
- 🎭 Real-time face tracking using MediaPipe (52 ARKit blendshapes)
- 🗿 3D avatar animation with FLAME model (100 expression coefficients)
- 🎯 Pre-trained mappings for accurate expression transfer
- 🔄 Head pose tracking with natural mirror effect
- 😊 Rich facial expressions: jaw movement, smiles, blinks, eyebrow raises, mouth shapes, and more
- 🛠️ Interactive tools for debugging and exploration
- 🍎 macOS/Apple Silicon optimized using PyVista for visualization
- ⚡ High performance: 30+ FPS on modern hardware
- Python 3.8+
- macOS, Windows, or Linux
- Webcam or video file for input
- ~150 MB disk space for models
```bash
git clone https://github.com/yourusername/flame-avatar-driver.git
cd flame-avatar-driver
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt
curl -o face_landmarker.task https://storage.googleapis.com/mediapipe-models/face_landmarker/face_landmarker/float16/1/face_landmarker.task
```

Or download the face landmarker manually from MediaPipe Models.
- Register at the FLAME Model Website
- Download FLAME 2020 (`generic_model.pkl`)
- Place it in the `models/` folder
Important: The pre-trained mappings require FLAME 2020 specifically.
The mapping files should already be in the `mappings/` folder. If not:

```bash
python tools/download_mappings.py
```

Then start the application:

```bash
python src/main.py
```

By default, this uses a video file specified in the code. Press `q` to quit.
Edit src/main.py and change line ~14 to use the webcam instead of a video file:

```python
VIDEO_PATH = 0  # change from 'example.mov' to 0 for the webcam
```

To adjust the camera zoom, modify the camera position in src/visualizer.py, line 17:

```python
# Change the z-coordinate (the third number in the first tuple).
self.plotter.camera_position = [(0, 0, 1.2), (0, 0, 0), (0, 1, 0)]
# Smaller value = closer camera = bigger face.
# Try: 0.8 (very close), 1.2 (recommended), 1.5 (far), 2.0 (very far)
```

To change expression intensity, edit src/translator.py, line ~125:

```python
expression = expression * 2.0  # adjust the multiplier
# 1.0 = normal, 2.0 = double, 3.0 = triple
```

Test individual FLAME expression components:
```bash
python tools/flame_explorer.py
```

Useful for understanding what each of the 100 expression coefficients does. Change the expression index in the file to explore different expressions.
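Conceptually, the explorer activates one coefficient at a time and rebuilds the mesh. A minimal sketch with random stand-ins for the FLAME data (the real template and expression basis come from `generic_model.pkl`; the index and magnitude below are illustrative):

```python
import numpy as np

# Toy stand-ins for FLAME data; real values are loaded from generic_model.pkl.
rng = np.random.default_rng(0)
template = rng.normal(size=(5023, 3))        # template mesh vertices
exp_basis = rng.normal(size=(5023, 3, 100))  # expression basis vectors

expression = np.zeros(100)
expression[3] = 2.0  # activate a single coefficient to see its effect

# Linear blendshape model: vertices = template + basis . coefficients
vertices = template + exp_basis @ expression
assert vertices.shape == (5023, 3)
```

Only the activated coefficient's basis column contributes here, which is exactly what makes per-coefficient exploration useful.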
See what MediaPipe detects in your video:
```bash
python tools/tracker_detailed.py
```

Shows active blendshapes with scores > 0.1 every 30 frames.
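The filtering it reports amounts to the following (the blendshape names below are a small illustrative subset of the 52 ARKit names, with made-up scores):

```python
import numpy as np

# Illustrative subset of the 52 ARKit blendshape names with sample scores.
names = ["jawOpen", "mouthSmileLeft", "mouthSmileRight", "eyeBlinkLeft"]
scores = np.array([0.45, 0.02, 0.03, 0.87])

# Keep only blendshapes with score > 0.1, as tracker_detailed.py reports.
active = {n: float(s) for n, s in zip(names, scores) if s > 0.1}
print(active)  # {'jawOpen': 0.45, 'eyeBlinkLeft': 0.87}
```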
Download pre-trained MediaPipe to FLAME mappings:
```bash
python tools/download_mappings.py
```

Repository layout:

```
flame-avatar-driver/
├── README.md                # This file
├── LICENSE                  # MIT License
├── requirements.txt         # Python dependencies
├── .gitignore               # Git ignore rules
│
├── src/                     # Main source code
│   ├── main.py              # Main application
│   ├── translator.py        # MediaPipe → FLAME translation
│   └── visualizer.py        # PyVista 3D rendering
│
├── tools/                   # Development tools
│   ├── flame_explorer.py    # Expression explorer
│   ├── tracker_detailed.py  # Blendshape debugger
│   └── download_mappings.py # Mapping downloader
│
├── mappings/                # Pre-trained mappings
│   ├── bs2exp.npy           # Blendshape → Expression
│   ├── bs2eye.npy           # Blendshape → Eye pose
│   └── bs2pose.npy          # Blendshape → Jaw pose (optional)
│
├── models/                  # Model files (download separately)
│   └── README.md            # Download instructions
│
└── examples/                # Example assets
    └── screenshots/         # Demo screenshots
```
- Face Tracking: MediaPipe detects 468 3D facial landmarks and computes 52 ARKit-compatible blendshape scores (values normalized to 0-1)
- Translation: Pre-trained linear transformation matrices convert the 52 MediaPipe blendshapes into 100 FLAME expression coefficients plus jaw and eye pose parameters
- Mesh Deformation: FLAME's expression basis vectors deform the template mesh vertices based on the expression coefficients
- Head Pose: Yaw and pitch angles are extracted from landmark positions and applied as rotation matrices (with a mirror effect for natural viewing)
- Real-time Rendering: PyVista displays the animated 3D mesh at 30+ FPS with smooth camera positioning
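The translation step above can be sketched with NumPy. The random matrices below stand in for the pre-trained `.npy` mappings in `mappings/`; shapes match the ones documented for this project:

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-ins for the pre-trained mapping matrices in mappings/.
bs2exp = rng.normal(size=(52, 100))   # blendshape -> expression
bs2pose = rng.normal(size=(52, 3))    # blendshape -> jaw pose
bs2eye = rng.normal(size=(52, 6))     # blendshape -> eye pose

# One frame of MediaPipe blendshape scores (normally from the tracker).
blendshapes = rng.uniform(0, 1, size=52)

# The translation is a plain linear map per output group.
expression = blendshapes @ bs2exp   # (100,) FLAME expression coefficients
jaw_pose = blendshapes @ bs2pose    # (3,) jaw pose parameters
eye_pose = blendshapes @ bs2eye     # (6,) eye pose parameters
```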
- Input Format: 52 MediaPipe ARKit blendshapes (normalized 0-1)
- Output Format: 100 FLAME expression coefficients + 3 jaw pose params + 6 eye pose params
- Mapping Type: Linear transformation matrices (52×100, 52×3, 52×6)
- 3D Model: FLAME 2020 (5,023 vertices, 9,976 triangles)
- Training Data: Mappings trained on NerSemble and IMAvatar datasets
- Performance: 30+ FPS on M1 MacBook, ~25 FPS on older Intel machines
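Given the shapes listed above, a small sanity check on loaded mapping files could look like this (the helper function and its usage are a sketch, not part of the project):

```python
import numpy as np

# Expected shapes for the pre-trained mapping matrices (hypothetical checker).
EXPECTED = {
    "bs2exp.npy": (52, 100),
    "bs2pose.npy": (52, 3),
    "bs2eye.npy": (52, 6),
}

def check_mapping(name, arr):
    """Raise if a loaded mapping does not have the documented shape."""
    if arr.shape != EXPECTED[name]:
        raise ValueError(f"{name}: got {arr.shape}, expected {EXPECTED[name]}")

# In the real project you would pass np.load('mappings/bs2exp.npy') here.
check_mapping("bs2exp.npy", np.zeros((52, 100)))  # passes silently
```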
Solution 1: Increase the global multiplier in src/translator.py:

```python
expression = expression * 3.0  # try 2.0 to 5.0
```

Solution 2: Check that the pre-trained mappings loaded successfully. You should see:

```
✓ Loaded expression mapping from: /path/to/mappings
```
Solution: Adjust the rotation matrices in src/main.py around line 93. The key rotations are:

```python
R_z = np.array([
    [np.cos(np.radians(180)), -np.sin(np.radians(180)), 0],
    [np.sin(np.radians(180)),  np.cos(np.radians(180)), 0],
    [0, 0, 1],
])
R_x = np.array([
    [1, 0, 0],
    [0, np.cos(np.radians(-35)), -np.sin(np.radians(-35))],
    [0, np.sin(np.radians(-35)),  np.cos(np.radians(-35))],
])
```

Try changing the 180 degrees to other values (0, 90, 270) to find the correct orientation.
Solution: Adjust the camera distance in src/visualizer.py, line 17:

```python
self.plotter.camera_position = [(0, 0, 0.8), (0, 0, 0), (0, 1, 0)]
```

Try values from 0.8 to 2.0 to find your preferred zoom.
Solution: Run the download script:

```bash
python tools/download_mappings.py
```

Or download them manually from PeizhiYan's repository.
Solutions:
- Reduce video resolution
- Close other applications
- Update graphics drivers
- Check CPU usage (should be < 80%)
Solutions:
- Verify you downloaded FLAME 2020 (not 2023 or other versions)
- Check the file is named exactly `generic_model.pkl`
- Ensure the file is in the `models/` folder
- Verify the file size is ~100 MB
Solutions:
- Ensure good lighting
- Face the camera directly
- Check video file is valid and readable
- Try the webcam instead: set `VIDEO_PATH = 0`
- MediaPipe: Google's ML solutions for face tracking
- FLAME Model: Max Planck Institute for Intelligent Systems
- PyVista: 3D visualization built on VTK
Based on the approach from PeizhiYan's MediaPipe-to-FLAME repository
```bibtex
@article{FLAME:SiggraphAsia2017,
  title   = {Learning a model of facial shape and expression from {4D} scans},
  author  = {Li, Tianye and Bolkart, Timo and Black, Michael J. and Li, Hao and Romero, Javier},
  journal = {ACM Transactions on Graphics (TOG), Proc. SIGGRAPH Asia},
  volume  = {36},
  number  = {6},
  year    = {2017}
}
```

This project is licensed under the MIT License - see the LICENSE file for details.
Important Notes on Third-Party Content:
- The FLAME model requires separate registration and licensing from MPI-IS
- MediaPipe is licensed under Apache 2.0 (Google LLC)
- Pre-trained mappings are for research and educational use only (based on public datasets)
Contributions are welcome! Here's how you can help:
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
- Add support for full-body SMPL-X model
- Implement temporal smoothing for more stable animations
- Add recording/export functionality
- Create GUI for parameter adjustment
- Add support for multiple faces
- Optimize for mobile/embedded devices
- Add more pre-trained mappings for different FLAME variants
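For the temporal-smoothing roadmap item, a simple exponential moving average over the per-frame parameter vectors is one lightweight starting point (a sketch; the `alpha` value is illustrative):

```python
import numpy as np

class EMASmoother:
    """Exponential moving average over per-frame parameter vectors."""
    def __init__(self, alpha=0.6):
        self.alpha = alpha   # higher = more responsive, lower = smoother
        self.state = None

    def __call__(self, params):
        params = np.asarray(params, dtype=float)
        if self.state is None:
            self.state = params          # first frame initializes the state
        else:
            self.state = self.alpha * params + (1 - self.alpha) * self.state
        return self.state

smooth = EMASmoother(alpha=0.5)
smooth(np.zeros(100))       # frame 1: state starts at zero
out = smooth(np.ones(100))  # frame 2: halfway between 0 and 1
assert np.allclose(out, 0.5)
```

The same smoother could be applied to the expression, jaw, and eye vectors independently before rendering.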
Special thanks to:
- PeizhiYan for the MediaPipe-to-FLAME mapping approach
- FLAME authors for the parametric head model
- MediaPipe team for the face tracking solution
- The open-source community for invaluable tools and libraries
For questions, issues, or suggestions:
- Open an issue on GitHub
- Email: [your-email@example.com]
- Twitter: [@yourusername]
⭐ If you find this project useful, please consider giving it a star on GitHub!