Abstract
We present a method for 3D face reconstruction from multi-view images with different expressions. We formulate this problem from the perspective of non-rigid multi-view stereo (NRMVS). Unlike previous learning-based methods, which often regress the face shape directly, our method optimizes the 3D face shape by explicitly enforcing multiview appearance consistency, which is known to be effective in recovering shape details according to conventional multi-view stereo methods. Furthermore, by estimating face shape through optimization based on multi-view consistency, our method can potentially have better generalization to unseen data. However, this optimization is challenging since each input image has a different expression. We facilitate it with a CNN network that learns to regularize the non-rigid 3D face according to the input image and preliminary optimization results. Extensive experiments show that our method achieves the state-of-the-art performance on various datasets and generalizes well to the data in the wild.
Type
Publication
In Conference on Computer Vision and Pattern Recognition 2020