The appearance of the human face varies significantly as the facial expression changes; the human face is therefore not a rigid object. In addition, a person can move the head freely in front of the camera. The motion of the face is thus the sum of two independent motions: rigid motion and nonrigid motion. The rigid motion is the global motion of the face, namely the rotation and translation of the head (the face pose), while the nonrigid motion is the local motion of the face caused by the contraction of facial muscles (the facial expression). When captured by the camera, the two motions are mixed together to form a single 2D face motion in the image plane.
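To make the mixing of the two motions concrete, the following is a minimal sketch rather than the model used in this work. It assumes a scaled orthographic camera, a rotation about the vertical axis only, and a single hypothetical mouth-corner feature; the names `rotation_y` and `project_feature` and all numeric values are illustrative.

```python
import math


def rotation_y(angle_rad):
    """Return a 3x3 rotation about the vertical (y) axis as nested lists."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, 0.0, s],
            [0.0, 1.0, 0.0],
            [-s, 0.0, c]]


def project_feature(neutral, displacement, rotation, translation, scale=1.0):
    """Project one 3D facial feature onto the image plane.

    Nonrigid motion: the expression displaces the feature from its neutral
    position. Rigid motion: the head pose rotates and translates the
    displaced point. Projection: scaled orthographic (an assumption here),
    which keeps only the x and y coordinates.
    """
    deformed = [n + d for n, d in zip(neutral, displacement)]
    rotated = [sum(rotation[r][k] * deformed[k] for k in range(3))
               for r in range(3)]
    posed = [rotated[k] + translation[k] for k in range(3)]
    return (scale * posed[0], scale * posed[1])


# Hypothetical feature and expression displacement (illustrative values).
neutral = [1.0, 0.0, 0.0]
smile = [0.2, 0.1, 0.0]
still = [0.0, 0.0, 0.0]
frontal = rotation_y(0.0)
turned = rotation_y(math.radians(30))

no_motion = project_feature(neutral, still, frontal, still)
expression_only = project_feature(neutral, smile, frontal, still)
pose_only = project_feature(neutral, still, turned, still)
both = project_feature(neutral, smile, turned, still)
```

In this toy setting the x coordinate of the feature is 1.2 with the expression alone, about 0.866 with the head turn alone, and about 1.039 with both. The displacement is multiplied by the rotation before projection, so the pose and expression contributions cannot be read off a single 2D observation independently.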
In our research, we propose a novel technique to recover the 3D face pose and facial expression simultaneously from a monocular video sequence in real time. First, twenty-eight salient facial features are detected and tracked robustly under various face orientations and significant facial expressions. Second, the coupling between face pose and facial expression in the 2D image is modelled as a nonlinear function, and a normalized SVD (N-SVD) technique is proposed to recover the pose and expression parameters analytically. The solution obtained from the N-SVD technique is then refined by a nonlinear technique that imposes the orthonormality constraints on the pose parameters. As a result, the proposed method recovers the face pose and facial expression from face images robustly and accurately.
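As an illustration of what the orthonormality constraints on the pose parameters mean, the sketch below takes a slightly perturbed 3x3 pose estimate and turns it into a valid rotation matrix. It uses Gram-Schmidt orthonormalization as a simple stand-in; it is neither the N-SVD technique nor the nonlinear refinement described above, and the helper names and the `noisy` matrix are hypothetical.

```python
import math


def dot(a, b):
    """Inner product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def orthonormalize_pose(estimate):
    """Return a rotation matrix close to a 3x3 pose estimate.

    Gram-Schmidt on the first two rows, then a cross product for the third
    row, so the rows are orthonormal and the determinant is +1. This is an
    assumed stand-in, not the refinement used in this work.
    """
    r1 = normalize(estimate[0])
    overlap = dot(estimate[1], r1)
    r2 = normalize([y - overlap * x for x, y in zip(r1, estimate[1])])
    r3 = cross(r1, r2)
    return [r1, r2, r3]


# Hypothetical pose estimate that is close to, but not exactly, a rotation.
noisy = [[0.98, 0.05, 0.21],
         [-0.03, 1.02, 0.04],
         [-0.19, -0.02, 0.97]]
pose = orthonormalize_pose(noisy)
```

Gram-Schmidt favours the first row and discards the third row of the estimate. A refinement that minimizes the reprojection error subject to the same constraints would spread the correction across all pose parameters.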