Complete 3D Human Reconstruction from a Single Incomplete Image

¹University of Southern California, ²Adobe Research

Abstract

We present a method to reconstruct the complete geometry and texture of a human from an image in which only part of the body is observed, e.g., a torso. The core challenge arises from occlusion: the invisible parts have no pixels to reconstruct from, and many existing single-view human reconstruction methods are not designed to handle them, which leads to missing data in 3D. To address this challenge, we introduce a novel coarse-to-fine human reconstruction framework. For coarse reconstruction, explicit volumetric features are learned by 3D convolutional neural networks, conditioned on a 3D body model and the style features from visible parts, to generate a complete human geometry. For fine reconstruction, an implicit network combines the learned 3D features with high-quality surface normals enhanced from multiple views to produce fine local details, e.g., high-frequency wrinkles. Finally, we perform progressive texture inpainting to reconstruct a complete appearance of the person in a view-consistent way, which is not possible without first reconstructing a complete geometry. In experiments, we demonstrate that our method reconstructs high-quality 3D humans and is robust to occlusion.

Overview

Given an image I of a person with occlusion and a guiding 3D body pose P, we reconstruct a complete 3D human model Gf in a coarse-to-fine manner. We first build a volume of image features F by extracting 2D image features and copying them along the depth direction. This image feature volume is concatenated with the 3D body pose P recorded on the volume. Our 3D CNN G3d generates complete and coherent volumetric features; its generative power is enabled by joint learning with a 3D discriminator D3d and explicit shape prediction S3d. The coarse MLP C produces a coarse yet complete occupancy for continuously sampled 3D points, along with their intermediate global features F*; we represent the 3D surface as the 0.5 level set of the occupancy field. The fine MLP Cf combines F* with surface normals enhanced from multiple views to output fine-grained occupancy. We also complete the appearance by performing view-progressive texture inpainting.
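The first step of the overview, lifting 2D image features into a volume and attaching the body pose, can be illustrated with a small NumPy sketch. This is only a toy illustration under stated assumptions, not the authors' implementation: the function names `build_feature_volume` and `inside_surface`, the tensor layouts, the channel counts, and the box-shaped stand-in for the pose volume are all hypothetical choices made for the example.

```python
import numpy as np


def build_feature_volume(feat_2d, pose_volume):
    """Lift 2D features into 3D and attach a voxelized pose (assumed layouts).

    feat_2d:     (C, H, W) image feature map.
    pose_volume: (P, D, H, W) body pose recorded on a voxel grid.
    Returns an array of shape (C + P, D, H, W).
    """
    depth = pose_volume.shape[1]
    # Copy the 2D features along a new depth axis: (C, H, W) -> (C, D, H, W).
    feat_3d = np.repeat(feat_2d[:, None, :, :], depth, axis=1)
    # Concatenate image features and pose along the channel axis.
    return np.concatenate([feat_3d, pose_volume], axis=0)


def inside_surface(occupancy, level=0.5):
    """Classify points as inside the surface using the 0.5 level set."""
    return occupancy > level


# Hypothetical sizes: 8 feature channels on a 16x16 map, 16 depth slices.
rng = np.random.default_rng(0)
feat_2d = rng.standard_normal((8, 16, 16))

# Stand-in pose volume: a filled box instead of a real body model.
pose_volume = np.zeros((1, 16, 16, 16))
pose_volume[0, 4:12, 4:12, 4:12] = 1.0

volume = build_feature_volume(feat_2d, pose_volume)
print(volume.shape)  # (9, 16, 16, 16)

# Occupancy values as an MLP might predict them for three query points.
print(inside_surface(np.array([0.2, 0.7, 0.5])))
```

Because every depth slice carries the same image features, the volume by itself cannot tell visible from occluded regions; in the pipeline described above, that is where the pose channel and the 3D CNN come in.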

[Result galleries: Full Body Image; Partial Body Image; Image with Natural Occlusion]

BibTeX

@inproceedings{wang2023complete,
  title={Complete 3D Human Reconstruction From a Single Incomplete Image},
  author={Wang, Junying and Yoon, Jae Shin and Wang, Tuanfeng Y and Singh, Krishna Kumar and Neumann, Ulrich},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={8748--8758},
  year={2023}
}