E-book, English, Volume 2, 158 pages
Colmenarez / Xiong / Huang: Facial Analysis from Continuous Video with Applications to Human-Computer Interface
1st edition, 2005
Series: International Series on Biometrics
ISBN: 978-1-4020-7803-3
Publisher: Springer US
Format: PDF
Copy protection: PDF watermark
Target audience: Research

Further information & material
Contents:
Information-based Maximum Discrimination
Face and Facial Feature Detection
Face and Facial Feature Tracking
Face and Facial Expression Recognition
3-D Model-based Image Communication
Implementations, Experiments and Results
Application in an Audio-visual Person Recognition System
Conclusions
2. Previous Approaches (pp. 27-28)
In early tracking systems [26, 27, 28], features were matched from one frame to the next using optical flow computations, so drift errors accumulated over long image sequences. In later techniques, feature texture information is gathered during initialization, and the feature matching step is carried out with respect to the initialization frame to overcome drift.
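A minimal Python/OpenCV sketch may make the two matching strategies concrete. The single-feature setup, window handling, and function names are illustrative assumptions rather than the systems of [26, 27, 28]: frame-to-frame Lucas-Kanade flow compounds small per-frame errors, while matching against a template cut from the initialization frame keeps a fixed reference.

```python
import cv2
import numpy as np

def track_frame_to_frame(prev_gray, gray, point):
    """Lucas-Kanade flow from the previous frame; per-frame errors compound."""
    p0 = np.array([[point]], dtype=np.float32)          # shape (1, 1, 2)
    p1, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, p0, None)
    return tuple(p1[0, 0]) if status[0, 0] else point   # keep old point if lost

def track_against_init(init_template, gray, search_window):
    """Match the initialization-frame template inside a search window;
    the reference never changes, so errors do not accumulate."""
    x, y, w, h = search_window
    scores = cv2.matchTemplate(gray[y:y + h, x:x + w],
                               init_template, cv2.TM_CCOEFF_NORMED)
    _, _, _, best = cv2.minMaxLoc(scores)               # location of best score
    return (x + best[0], y + best[1])                   # top-left of match
```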
In order to deal with large out-of-plane rotations, a 3D model of the face geometry has been used together with the texture obtained at initialization to estimate 3D pose simultaneously with face tracking in an analysis-by-synthesis scheme [29, 30]. In this approach, the 3D model creates the templates by rendering the texture at the hypothesized head pose, so the feature matching step performs well under large out-of-plane rotations. However, this system requires a 3D model of the person's head/face.
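The analysis-by-synthesis loop of [29, 30] can be sketched as a search over head pose that minimizes the difference between the rendered template and the observed frame. In the sketch below, `render_face` is a hypothetical stand-in for projecting the textured 3D head model at a given pose, and the greedy coordinate search is only one possible optimizer:

```python
import numpy as np

def estimate_pose(frame, render_face, pose0, step=0.01, iters=50):
    """Greedy coordinate search over a 6-DOF pose (rx, ry, rz, tx, ty, tz)
    minimizing the sum-of-squared-differences between the synthesized
    template and the observed frame."""
    pose = np.asarray(pose0, dtype=float)
    best = np.sum((render_face(pose) - frame) ** 2)
    for _ in range(iters):
        improved = False
        for i in range(6):                       # perturb one pose parameter
            for delta in (-step, step):
                cand = pose.copy()
                cand[i] += delta
                err = np.sum((render_face(cand) - frame) ** 2)
                if err < best:
                    best, pose, improved = err, cand, True
        if not improved:                         # local minimum reached
            break
    return pose
```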
A wire-frame model capable of nonrigid motion has also been used to analyze facial expressions together with the global position of the face [31]; however, the templates used for feature matching do not adapt to the nonrigid deformation or the global head position, so accuracy degrades on extreme expressions and large out-of-plane rotations, where the templates and the input images no longer match well.
In this approach [31], a piecewise linear deformation model constrains the nonrigid motion to a subspace of deformations established beforehand. In a more complex scheme [32], optical flow constraints are used together with a wire-frame model to track rigid and nonrigid motion and to adapt the wire-frame model to the person's head. One of the most serious limitations of the wire-frame approaches is fitting the wire-frame model to the face in the initialization frame; this task involves accurately locating many facial feature points and is carried out by hand.
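To illustrate the subspace constraint, the sketch below uses a plain linear deformation basis (the model in [31] is piecewise linear); the mean shape, basis, and coefficient bounds are assumed to have been established beforehand:

```python
import numpy as np

def deform(mean_shape, basis, coeffs, c_min, c_max):
    """Nonrigid shape = mean + basis @ coefficients, with the coefficients
    clipped to precomputed bounds so the deformation stays inside the
    admissible subspace.
    mean_shape: (2N,) stacked x/y vertex coordinates
    basis:      (2N, K) deformation modes
    coeffs:     (K,) mode activations estimated by the tracker
    """
    c = np.clip(coeffs, c_min, c_max)
    return mean_shape + basis @ c
```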
Other approaches of current interest are those based on "blobs," where the face and other body parts are modelled as 2D or 3D Gaussian distributions of pixels. Pixels are clustered by their intensity [33] or color [34], or even by disparity maps from stereo images [35]. Although these techniques fail to capture nonrigid facial motion, they are easily initialized and operate very efficiently, even in sequences with moderate occlusion.
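In the spirit of the color-blob trackers [34], the following sketch models the face as a Gaussian in color space and re-estimates the blob's spatial mean and covariance from the matching pixels each frame; the Mahalanobis threshold is an illustrative assumption:

```python
import numpy as np

def fit_color_model(pixels):
    """pixels: (N, 3) color samples from the initialization region."""
    mean = pixels.mean(axis=0)
    cov = np.cov(pixels.T) + 1e-6 * np.eye(3)   # regularize for inversion
    return mean, np.linalg.inv(cov)

def update_blob(frame, color_mean, color_inv_cov, max_dist=3.0):
    """Classify pixels by Mahalanobis distance in color space, then return
    the spatial mean and covariance of the matching pixels (the 2D blob)."""
    h, w, _ = frame.shape
    diff = frame.reshape(-1, 3).astype(float) - color_mean
    d2 = np.einsum('ij,jk,ik->i', diff, color_inv_cov, diff)
    idx = np.nonzero(d2 < max_dist ** 2)[0]
    if idx.size < 2:
        return None, None                        # blob lost this frame
    ys, xs = np.divmod(idx, w)                   # flat index -> (row, col)
    pts = np.stack([xs, ys], axis=1).astype(float)
    return pts.mean(axis=0), np.cov(pts.T)
```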
In general, algorithms that use complex wire-frame models provide a framework for high-level analysis of nonrigid facial motion. However, these complex models must be customized to the face being tracked through a similarly complex initialization procedure. At the other end of the spectrum, algorithms based on simple models, such as blobs, have proven practical: their simple initialization procedures and low computational requirements allow them to run in real time on portable computers, but they extract only limited information from the object parts.
The natural next step would be to combine these two schemes in a hierarchical approach that benefits from both; however, the gap between them is too wide to bridge as long as the complex models lack a person-independent procedure for accurately locating the facial features needed for initialization. The face and facial feature tracking algorithm described here stands between these two schemes. Faces and facial features are detected and tracked using person-independent appearance and geometry models that are easily initialized and efficiently implemented to run in real time. Nine facial features are tracked on multiple people, accounting for the global position of each head as well as for nonrigid facial deformations.