Abstract: We study the problem of 3D-aware full-body human generation, aiming at creating animatable human avatars with high-quality textures and geometries. Generally, two challenges remain in this field: i) existing methods struggle to generate geometries with rich realistic details such as the wrinkles of garments; ii) they typically utilize volumetric radiance fields and neural renderers in the synthesis process, making high-resolution rendering non-trivial. To overcome these problems, we propose GETAvatar, a Generative model that directly generates Explicit Textured 3D meshes for animatable human Avatar, with photo-realistic appearance and fine geometric details. Specifically, we first design an articulated 3D human representation with explicit surface modeling, and enrich the generated humans with realistic surface details by learning from the 2D normal maps of 3D scan data. Second, with the explicit mesh representation, we can use a rasterization-based renderer to perform surface rendering, allowing us to achieve high-resolution image generation efficiently. Extensive experiments demonstrate that GETAvatar achieves state-of-the-art performance on 3D-aware human generation both in appearance and geometry quality. Notably, GETAvatar can generate images at 512x512 resolution with 17FPS and 1024x1024 resolution with 14FPS, improving upon previous methods by 2x.
📢 News
[2023-10-19]: Code and pretrained model on THuman2.0 released! Check more details here
⚒️ Requirements
We recommend Linux for performance and compatibility reasons.
1 – 8 high-end NVIDIA GPUs. We have done all testing and development using V100 GPUs.
64-bit Python 3.8 and PyTorch 1.9.0. See https://pytorch.org for PyTorch install instructions.
CUDA toolkit 11.1 or later. (Why is a separate CUDA toolkit installation required? We use the custom CUDA extensions from the StyleGAN3 repo. Please see Troubleshooting.)
Blender. Download Blender from the official link. We used blender-3.2.2-linux; we haven't tested other versions, but newer versions should be OK.
We also recommend installing Nvdiffrast following the instructions from the official repo, and installing Kaolin.
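Before installing the heavier dependencies, it can help to confirm that the interpreter and PyTorch build roughly match the versions listed above. The following is a minimal sketch, not part of the GETAvatar codebase: the helper names (`parse_version`, `meets_minimum`, `check_environment`) are hypothetical, and treating the listed Python and PyTorch versions as minimums is an assumption. Note that `torch.version.cuda` reports the CUDA version PyTorch was built against, not the separately installed CUDA toolkit, which still has to be checked by other means (for example `nvcc --version`).

```python
import re
import sys

# Versions taken from the requirements list; treating them as minimums is an assumption.
MIN_PYTHON = (3, 8)
MIN_TORCH = (1, 9, 0)
MIN_CUDA = (11, 1)


def parse_version(text):
    """Turn a string such as '1.9.0+cu111' into (1, 9, 0), dropping build suffixes."""
    parts = []
    for piece in text.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(found, minimum):
    """Return True if the version string `found` is at least `minimum`."""
    version = parse_version(found)
    # Pad with zeros so that '1.9' compares equal to (1, 9, 0).
    padded = version + (0,) * (len(minimum) - len(version))
    return padded >= tuple(minimum)


def check_environment():
    """Report which of the listed requirements appear to be satisfied."""
    report = {"python": sys.version_info[:2] >= MIN_PYTHON}
    try:
        import torch
    except ImportError:
        report["torch"] = False
        report["cuda"] = False
        return report
    report["torch"] = meets_minimum(torch.__version__, MIN_TORCH)
    cuda_version = torch.version.cuda  # None for CPU-only builds
    report["cuda"] = (
        cuda_version is not None
        and torch.cuda.is_available()
        and meets_minimum(cuda_version, MIN_CUDA)
    )
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'ok' if ok else 'check installation'}")
```

A `False` entry only indicates that something is worth a closer look; Blender, Nvdiffrast, and Kaolin are not covered by this sketch.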
🔥 🔥 🔥GETAvatar: Generative Textured Meshes for Animatable Human Avatars (ICCV 2023)🔥 🔥 🔥
Official PyTorch implementation
GETAvatar: Generative Textured Meshes for Animatable Human Avatars
Xuanmeng Zhang*, Jianfeng Zhang*, Rohan Chacko, Hongyi Xu, Guoxian Song, Yi Yang, Jiashi Feng
Paper, Project Page
🏃♂️ Getting Started
Clone the gitlab code and necessary files:
SMPL models
Download the SMPL human models (male, female, and neutral) from this link and the Mixamo motion sequences from here.
Place them as follows:
📝 Preparing datasets
We train GETAvatar on 3D human scan datasets (THuman2.0 and RenderPeople). Here we use THuman2.0 as an example because it is free. The same pipeline also works for the commercial dataset RenderPeople.
Download the THuman2.0 dataset and the fitted SMPL results.
Place them as follows:
First, run the pre-processing script prepare_thuman_scans_smpl.py to align the human scans:
You can run multiple instances of the script in parallel by specifying --tot as the total number of instances and --id as the rank of the current instance.
Second, render the RGB images with Blender:
You can run multiple instances of the script in parallel by specifying --device_id as the device ID, --tot as the total number of instances, and --id as the rank of the current instance.
Next, generate the camera poses and SMPL labels:
Finally, render the normal images with pytorch3d:
You can run multiple instances of the script in parallel by specifying --tot as the total number of instances and --id as the rank of the current instance.
The final structure of the training dataset is as follows:
🙉 Inference
Download the pretrained model from here and save it into ./pretrained_model.
You can generate the multi-view visualization with gen_multi_view_3d.py. For example:
You can specify the image resolution with --img_res and the path of the checkpoint with --resume_pretrained.
You can generate the animation with gen_animation_view_3d.py. For example:
You can specify the image resolution with --img_res, the path of the checkpoint with --resume_pretrained, and the type of the motion sequence with --action_type.
🙀 Train the model
You can train new models using train_3d.py. For example:
For distributed training, run the script dist_train.sh:
🙏 Credit
GETAvatar builds upon several previous works:
We would like to thank the authors for their contribution to the community!
🎓 Citation
If you find this codebase useful for your research, please use the following entry.
@inproceedings{zhang2023getavatar,
  title={GETAvatar: Generative Textured Meshes for Animatable Human Avatars},
  author={Zhang, Xuanmeng and Zhang, Jianfeng and Rohan, Chacko and Xu, Hongyi and Song, Guoxian and Yang, Yi and Feng, Jiashi},
  booktitle={ICCV},
  year={2023}
}