CVPR 2026

AvatarPointillist: Autoregressive 4D Gaussian Avatarization

An autoregressive framework for generating dynamic 4D Gaussian avatars from a single portrait image.

Hongyu Liu¹,² Xuan Wang² Yating Wang² Zijian Wu² Ziyu Wan³ Yue Ma¹ Runtao Liu¹ Boyao Zhou² Yujun Shen² Qifeng Chen¹

¹HKUST ²Ant Group ³City University of Hong Kong

AvatarPointillist generates dynamic 4D Gaussian avatars from a single portrait image. The method adopts a decoder-only autoregressive Transformer to generate Gaussian point clouds, jointly predicts per-point binding for animation, and refines renderable attributes with a dedicated Gaussian decoder.

Abstract

We introduce AvatarPointillist, a novel framework for generating dynamic 4D Gaussian avatars from a single portrait image. At the core of our method is a decoder-only Transformer that autoregressively generates a point cloud for 3D Gaussian Splatting. This sequential design enables precise and adaptive construction by adjusting point density and total point count based on subject complexity. During generation, the autoregressive model jointly predicts per-point binding information for realistic animation. A dedicated Gaussian decoder then converts the generated points into complete, renderable Gaussian attributes. Conditioning the decoder on latent features from the autoregressive generator substantially improves fidelity, leading to high-quality, photorealistic, and controllable avatars.

Gallery

Our gallery showcases representative animation results produced by AvatarPointillist. In each video, the left panel shows the input portrait, the middle panel shows the generated avatar, and the right panel shows the target expression and camera.

Method Overview

The framework consists of two components: an autoregressive model for Gaussian geometry generation and a Gaussian decoder for rendering attributes. The autoregressive model takes image features from DINOv2 together with point-cloud features, predicts quantized tokens for coordinates and binding, and enables animation through the predicted binding information and linear blend skinning.
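To make the token and skinning steps concrete, here is a minimal NumPy sketch of two generic building blocks the overview mentions: quantizing point coordinates into discrete tokens, and animating points with linear blend skinning. It is an illustration under stated assumptions, not the authors' implementation. The bin count, the coordinate range, the function names, and the choice to treat each binding as an index into a set of rigid transforms (that is, one-hot skinning weights) are all hypothetical.

```python
import numpy as np


def quantize(points, n_bins=128, lo=-1.0, hi=1.0):
    """Map float coordinates in [lo, hi] to integer tokens in [0, n_bins - 1].

    n_bins, lo, and hi are illustrative values, not taken from the paper.
    """
    scaled = (np.clip(points, lo, hi) - lo) / (hi - lo)
    tokens = (scaled * n_bins).astype(np.int64)
    return np.minimum(tokens, n_bins - 1)  # keep the value hi inside the last bin


def dequantize(tokens, n_bins=128, lo=-1.0, hi=1.0):
    """Map tokens back to the centers of their bins."""
    return lo + (tokens + 0.5) / n_bins * (hi - lo)


def linear_blend_skinning(points, weights, transforms):
    """Standard linear blend skinning.

    points:     (N, 3) rest-pose positions
    weights:    (N, J) skinning weights, each row summing to 1
    transforms: (J, 4, 4) homogeneous transforms, one per driving element
    """
    ones = np.ones((points.shape[0], 1))
    homogeneous = np.concatenate([points, ones], axis=1)          # (N, 4)
    blended = np.einsum("nj,jab->nab", weights, transforms)       # (N, 4, 4)
    moved = np.einsum("nab,nb->na", blended, homogeneous)         # (N, 4)
    return moved[:, :3]


# Two example points and a hypothetical binding index per point.
rest_points = np.array([[0.0, 0.0, 0.0], [0.5, -0.5, 0.25]])
coord_tokens = quantize(rest_points)
decoded_points = dequantize(coord_tokens)

binding = np.array([0, 1])             # assumed: index of the driving transform
num_transforms = 2
weights = np.eye(num_transforms)[binding]  # one-hot weights from the binding

transforms = np.tile(np.eye(4), (num_transforms, 1, 1))
transforms[1, :3, 3] = [1.0, 0.0, 0.0]  # translate transform 1 along x

animated_points = linear_blend_skinning(decoded_points, weights, transforms)
```

In this sketch, the quantization error is bounded by half a bin width, and a point bound to a transform simply follows it. A soft binding would replace the one-hot rows with fractional weights that sum to 1.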

Results

Qualitative Comparisons

Each of the three clips below is one qualitative comparison segment extracted from the consolidated comparison video.

Comparison 1

Comparison 2

Comparison 3

BibTeX

@inproceedings{liu2026avatarpointillist,
  title     = {AvatarPointillist: Autoregressive 4D Gaussian Avatarization},
  author    = {Hongyu Liu and Xuan Wang and Yating Wang and Zijian Wu and Ziyu Wan and Yue Ma and Runtao Liu and Boyao Zhou and Yujun Shen and Qifeng Chen},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2026}
}
