OHTA: One-shot Hand Avatar via Data-driven Implicit Priors
OHTA is a novel approach that creates implicit, animatable hand avatars from just a single image. It supports 1) text-to-avatar conversion, 2) hand texture and geometry editing, and 3) interpolation and sampling within the latent space.
Updates
[06/2024] :star_struck: Code released!
[02/2024] :partying_face: OHTA is accepted to CVPR 2024! Working on code release!
:desktop_computer: Installation
Environment
Create the conda environment for OHTA with the given script:
SMPL-X
You should accept the SMPL-X Model License and install SMPL-X.
MANO
You should accept the MANO License and download the MANO model from the official website.
PairOF and MANO-HD
Download the pre-trained PairOF and MANO-HD weights from here; they are provided by HandAvatar. Our MANO-HD implementation follows HandAvatar.
🔥 Pre-trained Model
We provide the pre-trained model after prior learning, which can be used for one-shot creation. Please download the weights from link.
Data Preparation
Training and evaluation on InterHand2.6M
To train the prior model or evaluate one-shot performance on InterHand2.6M, you should download the dataset from the official website.
After downloading the pre-trained models and data, you should organize the folder as follows:
For training and evaluation, you also need to generate hand segmentations.
First, you should follow HandAvatar to generate masks by MANO rendering.
Please refer to scripts/seg_interhand2.6m_from_mano.py for generating the MANO segmentation:
python scripts/seg_interhand2.6m_from_mano.py
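Conceptually, the MANO-rendering step produces masks by binarizing the rendered hand silhouette. A minimal sketch of that idea (the threshold value is an assumption, not the script's exact setting):

```python
import numpy as np

def silhouette_to_mask(alpha, thresh=0.5):
    """Binarize a rendered MANO silhouette (float alpha in [0, 1]) into a
    0/255 uint8 hand mask. The 0.5 threshold is an assumed default."""
    return (np.asarray(alpha) >= thresh).astype(np.uint8) * 255
```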
To better train the prior model, we further use SAM to generate more hand-aligned segmentations with joint and bounding-box prompts.
We strongly recommend using segmentations that are as accurate as possible for prior learning.
Please refer to scripts/seg_with_sam.py for more details:
python scripts/seg_with_sam.py
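As an illustration of the bounding-box prompt, one can derive it from the projected 2D hand joints; the padding ratio and input layout here are assumptions, not the script's exact logic:

```python
import numpy as np

def joints_to_box_prompt(joints_2d, img_w, img_h, pad=0.15):
    """Build a SAM box prompt [x0, y0, x1, y1] from 2D hand joints.

    joints_2d: (N, 2) pixel coordinates (e.g., 21 MANO joints).
    pad: fractional padding around the tight joint bounding box (assumed value).
    """
    joints_2d = np.asarray(joints_2d, dtype=np.float32)
    x0, y0 = joints_2d.min(axis=0)
    x1, y1 = joints_2d.max(axis=0)
    w, h = x1 - x0, y1 - y0
    return np.array([
        max(0.0, x0 - pad * w),
        max(0.0, y0 - pad * h),
        min(img_w - 1.0, x1 + pad * w),
        min(img_h - 1.0, y1 + pad * h),
    ], dtype=np.float32)
```

The 2D joints themselves can likewise serve as positive point prompts for SAM.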
Data for One-shot Creation
For one-shot creation, you should use a hand pose estimator to predict the MANO parameters of the input image and then convert the data to the input format.
We provide a tool for obtaining the hand mesh through fitting, along with metadata in the required format; see HandMesh for the data preparation tools. Our method is not tied to HandMesh: other hand mesh estimators such as HaMeR also work. You can also refer to scripts/seg_with_sam.py for generating hand masks for in-the-wild hand images.
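For orientation, a one-shot sample roughly bundles the image with its estimated MANO parameters and camera. The field names and shapes below are a hypothetical sketch, not the repository's actual schema; check example_data for the real format:

```python
import numpy as np

def pack_one_shot_sample(image, mano_pose, mano_shape, cam_K):
    """Bundle one input image with its hand-estimator outputs.

    All field names here are hypothetical; shapes follow common MANO
    conventions: 48-dim axis-angle pose (global rotation + 15 joints)
    and 10 shape betas.
    """
    mano_pose = np.asarray(mano_pose, dtype=np.float32)
    mano_shape = np.asarray(mano_shape, dtype=np.float32)
    cam_K = np.asarray(cam_K, dtype=np.float32)
    assert mano_pose.shape == (48,), "expected axis-angle pose of 16 joints"
    assert mano_shape.shape == (10,), "expected 10 MANO shape betas"
    assert cam_K.shape == (3, 3), "expected a 3x3 intrinsic matrix"
    return {
        "image": image,
        "mano_pose": mano_pose,
        "mano_shape": mano_shape,
        "cam_intrinsics": cam_K,
    }
```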
We provide the processing script scripts/process_interhand2.6m.py, which converts InterHand2.6M data to the format required for one-shot creation.
python scripts/process_interhand2.6m.py
We also provide some processed samples in example_data.
Avatar Creation
One-shot creation
After processing the image to the input format, you can use the create.py script to create the hand avatar as below:
Texture editing
You can also edit the avatar with the given content and a corresponding mask:
Text-to-avatar
If you want to generate hand avatars from text prompts, you can use image generation tools (e.g., ControlNet) conditioned on text and a depth map obtained by MANO rendering. Afterwards, convert the result to the input format described above for avatar creation.
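To feed a MANO-rendered depth map into a depth-conditioned generator, it first needs to be normalized into an image; the near-is-bright convention and background handling below are assumptions:

```python
import numpy as np

def depth_to_controlnet_image(depth, bg_value=0.0):
    """Normalize a rendered depth map to an 8-bit RGB image for a
    depth-conditioned model (e.g., a depth ControlNet). Background pixels
    (== bg_value) stay black; nearer surfaces map to brighter values.
    Both conventions are assumptions, not the tools' fixed requirements."""
    depth = np.asarray(depth, dtype=np.float32)
    mask = depth != bg_value
    norm = np.zeros_like(depth)
    if mask.any():
        near, far = depth[mask].min(), depth[mask].max()
        norm[mask] = 1.0 - (depth[mask] - near) / max(far - near, 1e-6)
    img = (norm * 255.0).astype(np.uint8)
    return np.stack([img, img, img], axis=-1)
```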
:running_woman: Evaluation on InterHand2.6M
After creating the one-shot avatar using InterHand2.6M, you can evaluate the performance on the subset.
Training
You can use the script to train the prior model on InterHand2.6M:
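Image quality on such subsets is typically reported with metrics like PSNR restricted to the hand region; the masked-PSNR sketch below is an illustration under that assumption, not the repository's exact evaluation protocol:

```python
import numpy as np

def masked_psnr(pred, gt, mask, peak=255.0):
    """PSNR over pixels inside a binary hand mask (8-bit images assumed)."""
    mask = np.asarray(mask, dtype=bool)
    diff = pred[mask].astype(np.float64) - gt[mask].astype(np.float64)
    mse = np.mean(diff ** 2)
    return 10.0 * np.log10(peak ** 2 / max(mse, 1e-12))
```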
:love_you_gesture: Citation
If you find our work useful for your research, please consider citing the paper:
@inproceedings{zheng2024ohta,
  title={OHTA: One-shot Hand Avatar via Data-driven Implicit Priors},
  author={Zheng, Xiaozheng and Wen, Chao and Su, Zhuo and Xu, Zeran and Li, Zhaohu and Zhao, Yang and Xue, Zhou},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2024}
}
:newspaper_roll: License
Distributed under the MIT License. See LICENSE for more information.
Acknowledgements
This project is built on source code shared by HandAvatar and PyTorch3D. We thank the authors for their great work!