CVPR 2023 Highlight

High-fidelity 3D Human Digitization from Single 2K Resolution Images

Sang-Hun Han1, Min-Gyu Park2, Ju Hong Yoon2,
Ju-Mi Kang2, Young-Jae Park1, and Hae-Gon Jeon1

1Gwangju Institute of Science and Technology, 2Korea Electronics Technology Institute


High-quality 3D human body reconstruction requires high-fidelity, large-scale training data and a network design that effectively exploits high-resolution input images. To tackle these problems, we propose a simple yet effective 3D human digitization method called 2K2K, which constructs a large-scale 2K human dataset and infers 3D human models from 2K resolution images. The proposed method separately recovers the global shape of a human and its details. The low-resolution depth network predicts the global structure from a low-resolution image, and the part-wise image-to-normal network predicts the details of the 3D human body structure. The high-resolution depth network merges the global 3D shape and the detailed structures to infer the high-resolution front and back side depth maps. Finally, an off-the-shelf mesh generator reconstructs the full 3D human model. In addition, we provide 2,050 3D human models, including texture maps, 3D joints, and SMPL parameters, for research purposes. In experiments, we demonstrate competitive performance against recent works on various datasets.
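The abstract treats depth maps (global shape) and normal maps (surface detail) as two views of the same surface. The sketch below illustrates that geometric relation only: it derives a normal map from a depth map with finite differences. It is not the 2K2K image-to-normal network, and the function name `normals_from_depth`, the `pixel_size` parameter, and the axis convention are assumptions made for illustration.

```python
import numpy as np

def normals_from_depth(depth, pixel_size=1.0):
    """Estimate unit surface normals from a depth map z(row, col).

    Assumption (illustrative only): x runs along columns, y along rows,
    and the surface is the graph z = depth(y, x), so an unnormalized
    normal is (-dz/dx, -dz/dy, 1).
    """
    # np.gradient returns derivatives along axis 0 (rows) then axis 1 (cols).
    dz_dy, dz_dx = np.gradient(depth, pixel_size)
    normals = np.stack([-dz_dx, -dz_dy, np.ones_like(depth)], axis=-1)
    # Normalize each per-pixel vector to unit length.
    normals /= np.linalg.norm(normals, axis=-1, keepdims=True)
    return normals

# A flat surface: every normal points straight along +z.
flat_normals = normals_from_depth(np.zeros((4, 4)))

# A ramp whose depth grows by one unit per column: normals tilt toward -x.
ramp = np.tile(np.arange(4, dtype=np.float64), (4, 1))
ramp_normals = normals_from_depth(ramp)
```

Because normals encode local depth gradients, a high-resolution normal map carries fine detail that a coarse depth map lacks, which is one way to read the role of normal guidance in the method.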




Overall framework of the 2K2K method. The first phase predicts the low-resolution front/back view depth maps and, by merging the part-wise normal maps, the high-resolution front/back view normal maps. In the second phase, the high-resolution depth network upsamples the low-resolution depth maps with the guidance of the high-resolution normal maps. Finally, mesh generation via screened Poisson surface reconstruction produces the full 3D model.
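The last step of the framework turns front and back depth maps into a 3D model. As a minimal sketch of the step just before meshing, the code below lifts a pair of depth maps into one point cloud that a surface reconstruction tool could consume. It assumes an orthographic camera for simplicity, which may differ from the camera model used by the authors, and the names `depth_maps_to_points`, `mask`, and `pixel_size` are hypothetical.

```python
import numpy as np

def depth_maps_to_points(front_depth, back_depth, mask, pixel_size=1.0):
    """Lift front/back depth maps into a single (N, 3) point cloud.

    Assumption (illustrative only): orthographic projection, so pixel
    (row, col) maps to x = col * pixel_size and y = -row * pixel_size,
    and the stored depth value is used directly as z.
    """
    # Keep only pixels that belong to the person.
    rows, cols = np.nonzero(mask)
    x = cols.astype(np.float64) * pixel_size
    y = -rows.astype(np.float64) * pixel_size
    front = np.stack([x, y, front_depth[rows, cols]], axis=1)
    back = np.stack([x, y, back_depth[rows, cols]], axis=1)
    # Front-surface points first, then back-surface points.
    return np.concatenate([front, back], axis=0)

# Toy input: a 2x2 foreground region inside a 4x4 image.
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True
front_depth = np.full((4, 4), 1.0)
back_depth = np.full((4, 4), 2.0)

points = depth_maps_to_points(front_depth, back_depth, mask)
```

Screened Poisson reconstruction also expects a normal per point; in this setting the predicted front and back normal maps could supply them, sampled at the same masked pixels.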



The 2K2K dataset consists of 2,050 3D human models captured in scan booths, together with high-quality public human scan models. The dataset covers subjects of various genders and ages, with varied objects, poses, and clothing, all as high-resolution scans.

Qualitative Results


Internet Photo Results

Single-image 3D human reconstruction results in the wild. The images were downloaded from the internet.

Video Frame Results

Reconstruction results on video frames taken from YouTube, a Samsung Galaxy mobile phone, and a DSLR camera, respectively.


title={High-fidelity 3D Human Digitization from Single 2K Resolution Images},
author={Han, Sang-Hun and Park, Min-Gyu and Yoon, Ju Hong and Kang, Ju-Mi and Park, Young-Jae and Jeon, Hae-Gon},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},