
E-book, English, 302 pages

Series: Advances in Computer Vision and Pattern Recognition

Shao / Shan / Luo: Multimedia Interaction and Intelligent User Interfaces

Principles, Methods and Applications
1st edition, 2010
ISBN: 978-1-84996-507-1
Publisher: Springer
Format: PDF
Copy protection: PDF watermark




Consumer electronics (CE) devices, which provide multimedia entertainment and enable communication, have become ubiquitous in daily life. However, interaction with such equipment currently relies on devices such as remote controls and keyboards, which are often inconvenient, ambiguous and non-interactive. An important challenge for the modern CE industry is therefore to design user interfaces that make interaction with CE products natural, intuitive and fun. As many CE products are equipped with microphones and cameras, the exploitation of both audio and visual information for interactive multimedia is a growing field of research.

Collecting contributions from an international selection of experts, including leading researchers in industry, this unique text presents the latest advances in applications of multimedia interaction and user interfaces for consumer electronics. Covering issues of both multimedia content analysis and human-machine interaction, the book examines a wide range of techniques from computer vision, machine learning, audio and speech processing, communications, artificial intelligence and media technology.

Topics and features:
- introduces novel, computationally efficient algorithms to extract semantically meaningful audio-visual events;
- investigates modality allocation in intelligent multimodal presentation systems, taking into account the cognitive impact of modality on human information processing;
- provides an overview of gesture-control technologies for CE;
- presents systems for natural human-computer interaction, virtual content insertion, and human action retrieval;
- examines techniques for 3D face pose estimation, physical activity recognition, and video summary quality evaluation;
- discusses the features that characterize the new generation of CE and examines how web services can be integrated with CE products for an improved user experience.

This book is an essential resource for researchers and practitioners from both academia and industry working in the areas of multimedia analysis, human-computer interaction and interactive user interfaces. Graduate students studying computer vision, pattern recognition and multimedia will also find it a useful reference.


Further Information & Material


1  Preface  4
2  Contents  7
3  Retrieving Human Actions Using Spatio-Temporal Features and Relevance Feedback  9
  3.1  Introduction  9
  3.2  Action Retrieval Scheme  12
    3.2.1  Action Retrieval Framework  12
    3.2.2  Spatio-Temporal Interest Point Detection  13
    3.2.3  Feature Description  14
    3.2.4  Codebook Formation and Action Video Representation  16
    3.2.5  Similarity Matching Scheme  16
  3.3  Action Retrieval on the KTH Dataset  16
    3.3.1  Dataset Processing  16
    3.3.2  Performance Evaluation  17
    3.3.3  Summary for Experiments on the KTH Dataset  21
  3.4  Realistic Action Retrieval in Movies  22
    3.4.1  Challenges of This Task  22
    3.4.2  Implementation  24
    3.4.3  Result Demonstration  26
    3.4.4  Discussion  28
    3.4.5  Application  29
  3.5  Conclusions  29
  3.6  References  30
4  Computationally Efficient Clustering of Audio-Visual Meeting Data  32
  4.1  Introduction  32
  4.2  Background  33
    4.2.1  Challenges in Meeting Analysis  35
    4.2.2  Background on Speaker Diarization  37
    4.2.3  Background on Audio-Visual Synchrony  38
    4.2.4  Human Body Motions in Conversations  39
  4.3  Approach  40
  4.4  The Augmented MultiParty Interaction (AMI) Corpus  41
  4.5  Audio Speaker Diarization  43
    4.5.1  Traditional Offline Speaker Diarization  43
      4.5.1.1  Feature Extraction  43
      4.5.1.2  Speech/Nonspeech Detection  43
      4.5.1.3  Speaker Segmentation and Clustering  44
    4.5.2  Online Speaker Diarization  45
      4.5.2.1  Unsupervised Bootstrapping of Speaker Models  45
      4.5.2.2  Speaker Recognition  46
      4.5.2.3  A Note on Model Order Selection  46
    4.5.3  Summary of the Diarization Performance  47
  4.6  Extracting Computationally Efficient Video Features  48
    4.6.1  Estimating Personal Activity Levels in the Compressed Domain  49
    4.6.2  Finding Personal Head and Hand Activity Levels  50
    4.6.3  Estimating Speakers Using Video Only  53
  4.7  Associating Speaker Clusters with Video Channels  55
  4.8  Audio-Visual Clustering Results  57
    4.8.1  Using Raw Visual Activity  57
    4.8.2  Using Estimates of Speaking Activity from Video  58
  4.9  Discussion  60
  4.10  References  62
5  Cognitive-Aware Modality Allocation in Intelligent Multimodal Information Presentation  67
  5.1  Introduction  67
  5.2  Modality and Human Information Processing  69
    5.2.1  Modality and Sensory Processing  70
    5.2.2  Modality and Perception  71
      5.2.2.1  Visual Attention  71
      5.2.2.2  Auditory Attention  71
      5.2.2.3  Cross-Modal Attention  72
    5.2.3  Modality and Working Memory  72
      5.2.3.1  Working Memory Theory  73
      5.2.3.2  Dual Coding Theory  73
      5.2.3.3  Relating the Two Theories  74
  5.3  Experiment on Modality Effects in High-Load HCI  75
    5.3.1  Presentation Material  76
    5.3.2  Task and Procedure  77
    5.3.3  Measurements  77
    5.3.4  Hypotheses  78
  5.4  Results on Performance, Cognitive Load and Stress  78
    5.4.1  Performance  78
    5.4.2  Cognitive Load and Stress  80
  5.5  Discussion  81
    5.5.1  Text vs. Image  81
    5.5.2  Visual Aid vs. Auditory Aid  81
    5.5.3  Verbal Aid vs. Nonverbal Aid  82
    5.5.4  Additional Aid vs. No Aid  83
    5.5.5  Low Load vs. High Load  83
  5.6  A Modality Suitability Prediction Model  84
  5.7  Conclusions  86
  5.8  References  86
6  Natural Human-Computer Interaction  90
  6.1  Introduction  90
    6.1.1  From Ergonomics to Human-Computer Interaction  90
    6.1.2  Multimodal Interfaces  91
    6.1.3  Natural Human-Computer Interaction  92
  6.2  Natural Interaction Systems  92
    6.2.1  Human-Centered Design  93
    6.2.2  Intuitive Interaction  93
    6.2.3  Natural Language and Tangible User Interfaces  94
  6.3  Sensing Human Behavior  95
    6.3.1  Sensed Spaces and Sensors Categories  95
    6.3.2  Optical Sensors and Computer Vision Technologies  96
      6.3.2.1  Image Analysis Techniques  96
      6.3.2.2  Tracking Techniques  96
    6.3.3  Observing Human Activity  96
      6.3.3.1  People Detection  97
      6.3.3.2  People Tracking  97
      6.3.3.3  Gaze Estimation  98
  6.4  State of the Art  99
    6.4.1  Interactive Tabletop  99
    6.4.2  Tangible User Interface  100
    6.4.3  Smart Room  101
  6.5  Smart Room with Tangible Natural Interaction  102
    6.5.1  TANGerINE Smart Room: a Case Study  102
    6.5.2  TANGerINE Smart Cube  104
      6.5.2.1  Manipulation State Awareness  105
      6.5.2.2  Gesture Detection Algorithm  105
      6.5.2.3  Bluetooth-Based Proximity Awareness  105
    6.5.3  Computer Vision Applied to the TANGerINE Platform  106
    6.5.4  Observing Human Activity in TANGerINE Smart Room  107
  6.6  References  108
7  Gesture Control for Consumer Electronics  112
  7.1  Introduction  112
  7.2  Sensing Technologies  113
    7.2.1  Haptics  114
    7.2.2  Handhold Sensors  114
    7.2.3  Vision  114
    7.2.4  Ultrasound  114
    7.2.5  Infrared Proximity Sensing  115
  7.3  Vision-Based Gesture Recognition  115
    7.3.1  Body Part Detection  117
    7.3.2  Gesture Tracking  119
    7.3.3  Gesture Recognition  122
  7.4  Gesture Control: Products and Applications  125
    7.4.1  GestureTek  125
    7.4.2  Toshiba  125
    7.4.3  Mgestyk  126
    7.4.4  Fraunhofer  126
    7.4.5  TVs or Displays  126
    7.4.6  Gaming  127
    7.4.7  Mobile Phones  127
    7.4.8  Automobiles  127
  7.5  Conclusions  128
  7.6  References  129
8  Empirical Study of a Complete System for Real-Time Face Pose Estimation  134
  8.1  Introduction  134
  8.2  Problem Definition  136
    8.2.1  Problem Statement  136
    8.2.2  Pose Estimation Algorithm  137
    8.2.3  3D Mesh  138
    8.2.4  Texture Extraction  138
  8.3  Automatic Initialization  138
    8.3.1  Face and Feature Detection  138
    8.3.2  Mesh Initialization  140
  8.4  Tracking  141
    8.4.1  Overview of Method  141
    8.4.2  2D Feature Tracking  142
    8.4.3  Adaptation Step  142
    8.4.4  Matching Criterion  143
      8.4.4.1  Detection on Mesh Texture  145
      8.4.4.2  Reconstruction Error of Mesh Texture  145
    8.4.5  Detection of Failed Tracking  149
  8.5  Results  150
    8.5.1  Stability Analysis for Static Images  150
    8.5.2  Accuracy on Videos  152
      8.5.2.1  Semi-Automatic Annotation  152
      8.5.2.2  Performance of Different Search Strategies: Angular Error  154
      8.5.2.3  Performance of Different Search Strategies: MSE  155
      8.5.2.4  Performance of Different Search Strategies: Computation Time  155
      8.5.2.5  Influence of Texture Representation  156
      8.5.2.6  Influence of Training Size  156
      8.5.2.7  Benefits of the Proposed System  160
    8.5.3  Analysis of Typical Results  162
    8.5.4  Examples of Tracking Failure  164
  8.6  Conclusions  165
  8.7  References  165
9  Evolution-based Virtual Content Insertion with Visually Virtual Interactions in Videos  168
  9.1  Introduction  168
  9.2  System Overview  170
    9.2.1  Essential Ideas  170
    9.2.2  System Overview  171
  9.3  Video Content Analysis  172
    9.3.1  Frame Profiling  172
      9.3.1.1  Motion Estimation  173
      9.3.1.2  Region Segmentation  173
    9.3.2  ROI Estimation  174
    9.3.3  Aural Saliency Analysis  174
  9.4  Virtual Content Analysis  175
    9.4.1  Virtual Content Characterization  176
    9.4.2  Behavior Modeling  178
      9.4.2.1  The Cell Phase  179
      9.4.2.2  The Microbe Phase  179
      9.4.2.3  The Creature Phase  180
  9.5  Virtual Content Insertion  182
    9.5.1  Animation Generation  182
    9.5.2  Layer Composition  184
  9.6  Experimental Results  185
  9.7  Summary  188
  9.8  References  189
10  Physical Activity Recognition with Mobile Phones: Challenges, Methods, and Applications  190
  10.1  Introduction  191
    10.1.1  Background of Physical Activity Recognition  191
    10.1.2  Practical Challenges on Mobile Devices  193
  10.2  Accelerometer Based Physical Activity Recognition Methods  194
    10.2.1  Data Format  194
    10.2.2  Accelerometer Sensor Calibration  196
    10.2.3  Signal Projection  200
    10.2.4  Data Collection  201
    10.2.5  Feature Extraction and Selection  202
    10.2.6  Classification Algorithms  204
    10.2.7  Smoothing Algorithms  206
  10.3  System Design and Implementation  207
  10.4  Applications and Use Cases  209
    10.4.1  Physical Activity Diary  209
    10.4.2  Mobile Healthcare and Wellness  212
    10.4.3  Human-Centric Sensing and Sharing in Mobile Social Networks  213
    10.4.4  User Interfaces  213
  10.5  Conclusion and Future Work  215
  10.6  References  217
11  Gestures in an Intelligent User Interface  219
  11.1  Two Sides of the Same Coin  219
  11.2  Related Work  221
    11.2.1  A Human's Perspective  221
    11.2.2  A System's Perspective  222
  11.3  Experiment 1: Intuitive Gesturing  223
    11.3.1  Method  224
      11.3.1.1  Setup  224
    11.3.2  Results  226
      11.3.2.1  Condition Qx  226
      11.3.2.2  Condition Xp  227
      11.3.2.3  Sample Summary  227
      11.3.2.4  Commands: Pointing  228
      11.3.2.5  Commands: Selecting  229
      11.3.2.6  Commands: Deselecting  230
      11.3.2.7  Commands: Resizing  231
    11.3.3  Conclusion  233
  11.4  Experiment 2: Gesturing in the Interface  234
    11.4.1  Method  234
      11.4.1.1  Out-of-Range and Tracking  236
      11.4.1.2  Select and Deselect  236
      11.4.1.3  Rotate  236
      11.4.1.4  Resizing  236
      11.4.1.5  Restore and Remove  237
    11.4.2  Results  237
      11.4.2.1  Questionnaire  238
      11.4.2.2  Observations  240
    11.4.3  Conclusion  240
  11.5  Conclusion and Discussion  242
  11.6  References  243
12  Video Summary Quality Evaluation Based on 4C Assessment and User Interaction  247
  12.1  Introduction  247
  12.2  Related Work  249
  12.3  Uniform Framework for Video Summary Quality Evaluation  251
    12.3.1  Summary Unit Sequence Generation  252
    12.3.2  Frame Alignment-Based Summary Unit Matching  252
  12.4  Similarity-Based Automatic 4C Assessment  255
    12.4.1  Coverage Assessment  255
    12.4.2  Conciseness Assessment  257
    12.4.3  Coherence Assessment  257
    12.4.4  Context Assessment  259
  12.5  User Interaction Based Individual Evaluation  260
    12.5.1  User Interaction Based Requirement Gathering  261
    12.5.2  Transformation of 4C Assessment Scores  261
    12.5.3  Incremental User Interaction  264
  12.6  Experiments  264
    12.6.1  Validation of 4C Assessment Algorithm  265
    12.6.2  Validation of Incremental User Interaction  269
    12.6.3  Validation of Evaluation Result Transformation  270
  12.7  Conclusions  271
  12.8  References  271
13  Multimedia Experience on Web-Connected CE Devices  274
  13.1  Introduction  275
  13.2  Digital Photography Ecosystem  277
  13.3  AutoPhotobook System  279
    13.3.1  Design-Driven Photo Selection and Pagination  283
      13.3.1.1  Blurry Image Removal  283
      13.3.1.2  Duplicate Photo Detection  284
      13.3.1.3  Theme-Based Pagination and Layout  287
    13.3.2  Artistic Background Resizing and Assignment  288
      13.3.2.1  STArt Design for Automatic Resizable Artwork  289
        13.3.2.1.1  Transformation Algorithm  290
      13.3.2.2  Dynamic Photo Layout Region on the Page  292
      13.3.2.3  Theme Grammar for Photobook  292
    13.3.3  Automatic Layout  292
      13.3.3.1  Prior Related Work  293
      13.3.3.2  The AutoPhotobook Layout Engine  293
      13.3.3.3  Results Illustrating Text Support  294
    13.3.4  User Interface Design  295
  13.4  Powering CE 2.0 with AutoPhotobook  298
  13.5  Conclusion  301
  13.6  References  301
14  Index  304


