Eskenazi / Levow / Meng Crowdsourcing for Speech Processing
1st edition 2013
ISBN: 978-1-118-54127-2
Publisher: John Wiley & Sons
Format: PDF
Copy protection: Adobe DRM
Applications to Data Collection, Transcription and Assessment
E-book, English, 360 pages
Provides an insightful and practical introduction to crowdsourcing as a means of rapidly processing speech data
Intended for those who want to get started in the domain and learn how to set up a task, what interfaces are available, and how to assess the work, as well as for those who have already used crowdsourcing and want to create better tasks and obtain better assessments of the work of the crowd. It includes screenshots showing examples of good and poor interfaces, and case studies of speech processing tasks that walk through the task creation process, review the options in the interface and in the choice of medium (MTurk or other), and explain the choices made.
* Provides an insightful and practical introduction to crowdsourcing as a means of rapidly processing speech data.
* Addresses important aspects of this new technique that should be mastered before attempting a crowdsourcing application.
* Offers speech researchers the hope that they can spend much less time dealing with the data gathering/annotation bottleneck, leaving them to focus on the scientific issues.
* Readers will directly benefit from the book's successful examples of how crowdsourcing was implemented for speech processing, discussions of interface and processing choices that worked and choices that didn't, and guidelines on how to play and record speech over the internet, how to design tasks, and how to assess workers.
Essential reading for researchers and practitioners in speech research groups involved in speech processing
Further Information & Material
Contributors
Preface
1 An Overview
1.1 Growing Needs for Speech Data
1.1.1 Origins of Crowdsourcing
1.1.2 Operational Definition of Crowdsourcing
1.1.3 Functional Definition of Crowdsourcing
1.2 Some Issues
1.3 Some Terminology
1.4 Acknowledgements
References
2 The Basics
2.1 An Overview of the Literature on Crowdsourcing for Speech Processing
2.1.1 Evolution of the Use of Crowdsourcing for Speech
2.1.2 Geographic Locations of Crowdsourcing for Speech
2.1.3 Specific Areas of Research
2.2 Alternate Solutions
2.3 Some Ready-Made Platforms for Crowdsourcing
2.4 Making Task Creation Easier
2.5 Getting Down to Brass Tacks
2.5.1 Hearing and Being Heard Over the Web
2.5.2 Prequalification
2.5.3 Native Language of the Workers
2.5.4 Payment
2.5.5 Choice of Platform in the Literature
2.5.6 The Complexity of the Task
2.6 Quality Control
2.6.1 Was that Worker a Bot?
2.6.2 Quality Control in the Literature
2.7 Judging the Quality of the Literature
2.8 Some Quick Tips
References
13 Collecting Speech from Crowds
13.1 A Short History of Speech Collection
13.1.1 Speech Corpora
13.1.2 Spoken Language Systems
13.1.3 User-Configured Recording Environments
13.2 Technology for Web-based Audio Collection
13.2.1 Silverlight
13.2.2 Java
13.2.3 Flash
13.2.4 HTML and JavaScript
13.3 Example: WAMI Recorder
13.3.1 The JavaScript API
13.3.2 Audio Formats
13.4 Example: The WAMI Server
13.4.1 PHP Script
13.4.2 Google App Engine
13.4.3 Server Configuration Details
13.5 Example: Speech Collection on Amazon Mechanical Turk
13.5.1 Server Setup
13.5.2 Deploying to Amazon Mechanical Turk
13.5.3 The Command Line Interface
13.6 Using the Platform Purely for Payment
13.7 Advanced Methods of Crowdsourced Audio Collection
13.7.1 Collecting Dialogue Interactions
13.7.2 Human Computation
13.8 Summary
13.9 Acknowledgements
References
Index