InsTaG: Learning Personalized 3D Talking Head from Few-Second Video

1Beihang University, 2Griffith University, 3RIKEN AIP, 4The University of Tokyo
CVPR 2025

InsTaG learns lifelike personalized 3D talking heads from only a few seconds of training data, attaining fast adaptation, fast inference, and realistic reenactment, built upon 3DGS person-specific synthesizers.

Demo

Abstract

This paper introduces InsTaG, a 3D talking head synthesis framework that enables fast learning of a realistic personalized 3D talking head from little training data. Built upon a lightweight 3DGS person-specific synthesizer with universal motion priors, InsTaG achieves high-quality and fast adaptation while preserving a high level of personalization and efficiency.

As preparation, we first propose an Identity-Free Pre-training strategy that enables pre-training of the person-specific model and encourages the collection of universal motion priors from a long-video data corpus. To fully exploit the universal motion priors when learning an unseen new identity, we then present a Motion-Aligned Adaptation strategy that adaptively aligns the target head to the pre-trained field and constrains a robust dynamic head structure under limited training data.

Extensive experiments demonstrate InsTaG's outstanding performance and efficiency across various data scenarios in rendering high-quality personalized talking head videos.

Video

Framework

As preparation, InsTaG collects common knowledge of talking motion from a long-video corpus via Identity-Free Pre-training, storing it as a motion field. Given a short video of a new identity, the Motion-Aligned Adaptation strategy builds a robust and fast person-specific synthesizer on top of the pre-trained motion field to learn a high-quality personalized 3D talking head.
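The two-stage idea above (learn a shared, identity-free motion prior from many speakers, then fit only the identity-specific part from a few frames) can be illustrated with a deliberately toy sketch. This is not the paper's actual 3DGS pipeline; the linear "motion field" coefficient, the per-identity offsets, and the alternating least-squares updates are all hypothetical stand-ins chosen only to show the freeze-the-prior, fit-the-identity pattern:

```python
import math

def pretrain_motion_prior(corpus, iters=50):
    """Identity-free pre-training analogue: jointly fit one shared
    motion coefficient `a` and per-identity offsets b_i from a corpus
    of long videos, so `a` absorbs only identity-agnostic motion."""
    a = 0.0
    offsets = {i: 0.0 for i in corpus}
    for _ in range(iters):
        # update per-identity offsets with the motion coefficient fixed
        for i, samples in corpus.items():
            offsets[i] = sum(y - a * math.sin(t) for t, y in samples) / len(samples)
        # update the shared motion coefficient with offsets fixed
        num = sum(math.sin(t) * (y - offsets[i])
                  for i, s in corpus.items() for t, y in s)
        den = sum(math.sin(t) ** 2 for s in corpus.values() for t, _ in s)
        a = num / den
    return a

def adapt_new_identity(a, few_samples):
    """Adaptation analogue: with the pre-trained motion prior frozen,
    only the new identity's offset is fitted from a handful of frames."""
    return sum(y - a * math.sin(t) for t, y in few_samples) / len(few_samples)

# synthetic corpus: shared motion coefficient 2.0, identity offsets differ
true_a = 2.0
corpus = {
    i: [(t * 0.1, true_a * math.sin(t * 0.1) + off) for t in range(100)]
    for i, off in [("id0", 0.5), ("id1", -1.2), ("id2", 3.0)]
}
a = pretrain_motion_prior(corpus)

# a new identity (offset 0.8) seen for only two frames
few = [(0.3, true_a * math.sin(0.3) + 0.8), (1.1, true_a * math.sin(1.1) + 0.8)]
b_new = adapt_new_identity(a, few)
print(round(a, 3), round(b_new, 3))
```

The point of the sketch is the split of parameters: the corpus stage recovers the shared coefficient regardless of each speaker's offset, so the few-shot stage has only a tiny number of identity parameters left to fit, which is why two samples suffice here.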

Comparison

Comparison with current state-of-the-art baselines. Zoom in for better visualization.

BibTeX

@inproceedings{li2025instag,
    title={InsTaG: Learning Personalized 3D Talking Head from Few-Second Video}, 
    author={Li, Jiahe and Zhang, Jiawei and Bai, Xiao and Zheng, Jin and Zhou, Jun and Gu, Lin},
    booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
    year={2025}
}