Mehler / Sharoff / Santini | Genres on the Web | E-Book | sack.de

E-book, English, Volume 42, 362 pages

Series: Text, Speech and Language Technology

Mehler / Sharoff / Santini Genres on the Web

Computational Models and Empirical Studies

ISBN: 978-90-481-9178-9
Publisher: Springer Netherlands
Format: PDF
Copy protection: 1 - PDF Watermark



The volume “Genres on the Web” has been designed for a wide audience, from the expert to the novice. It is essential reading for scholars, researchers and students who want to become acquainted with the latest theoretical, empirical and computational advances in the expanding field of web genre research. The study of web genre is a novel, interdisciplinary area of research that ranges from corpus linguistics, computational linguistics, NLP and text technology to web mining, webometrics, social network analysis and information studies. This book gives readers a thorough grounding in the latest research on web genres and emerging document types.
The book covers a wide range of subjects focused on web genres, such as:

• The identification of the sources of web genres

• Automatic web genre identification

• The presentation of structure-oriented models

• Empirical case studies
One of the driving forces behind genre research is the idea of a genre-sensitive information system, which uses genre cues to complement current keyword-based search and retrieval applications.

Target audience


Research

Further information & material


1;Foreword;6
2;Personal Note;9
3;Contents;10
4;Contributors;12
5;Part I Introduction;14
5.1;1 Riding the Rough Waves of Genre on the Web ;15
5.1.1;1.1 Why Is Genre Important?;15
5.1.1.1;1.1.1 Zooming In: Information on the Web;16
5.1.2;1.2 Trying to Grasp the Ungraspable?;18
5.1.2.1;1.2.1 In Quest of a Definition of Web Genre for Empirical Studies and Computational Applications;20
5.1.3;1.3 Empirical and Computational Approaches to Genre: Open Issues;21
5.1.3.1;1.3.1 Web Documents;21
5.1.3.2;1.3.2 Corpora, Genres and the Web;26
5.1.3.3;1.3.3 Empirical and Computational Models of Web Genres;30
5.1.4;1.4 Conclusions;34
5.1.5;1.5 Outline of the Volume;35
5.1.6;References;37
6;Part II Identifying the Sources of Web Genres;43
6.1;2 Conventions and Mutual Expectations ;44
6.1.1;2.1 Genres Are Not Rule-Bound;44
6.1.2;2.2 So, Let's Ask the Readers;46
6.1.3;2.3 An Editorial, Third Party, View of Genres on the Web;51
6.1.4;2.4 Data Source: Observation of User Actions;53
6.1.5;2.5 Conclusions;56
6.1.6;References;56
6.2;3 Identification of Web Genres by User Warrant ;58
6.2.1;3.1 Introduction;58
6.2.2;3.2 Criteria for the Identification of Web Genre;60
6.2.3;3.3 Operationalizing Traditional Genre Theory for the World Wide Web;61
6.2.3.1;3.3.1 A Genre's User Group;61
6.2.3.2;3.3.2 Genre: Function, Form and Substance;63
6.2.3.3;3.3.3 Genres on the Web: Further Implications for Research;66
6.2.4;3.4 Developing a Web Genre Palette;66
6.2.4.1;3.4.1 Collecting Genre Terminology in the Users' Own Words;67
6.2.4.2;3.4.2 Users Choose the Best of the Collected Genre Terminology;69
6.2.4.3;3.4.3 User Validation of the Genre Palette;72
6.2.4.4;3.4.4 A Fourth Study: Determining the Genres' Usefulness for Web Search;75
6.2.5;3.5 Conclusion;76
6.2.6;References;77
6.3;4 Problems in the Use-Centered Development of a Taxonomy of Web Genres ;79
6.3.1;4.1 Introduction;79
6.3.1.1;4.1.1 What Is the Purpose of a Genre Taxonomy?;80
6.3.2;4.2 Why Is It Hard to Develop a Web Genre Taxonomy?;81
6.3.2.1;4.2.1 Difficulties in Defining Genres;81
6.3.2.2;4.2.2 Difficulties in Developing the Scope and Expressiveness of the Taxonomy;83
6.3.3;4.3 A Use-Centered Development of a Taxonomy of Web Genres;85
6.3.3.1;4.3.1 Research Design: Naturalistic Field Study;85
6.3.3.2;4.3.2 Research Informants;85
6.3.3.3;4.3.3 Data Elicitation;86
6.3.3.4;4.3.4 Data Analysis;87
6.3.4;4.4 Results;88
6.3.5;4.5 Discussion;89
6.3.6;4.6 Conclusions;92
6.3.7;References;93
7;Part III Automatic Web Genre Identification;95
7.1;5 Cross-Testing a Genre Classification Model for the Web ;96
7.1.1;5.1 Introduction;96
7.1.2;5.2 Approximating Genre Population on the Web;99
7.1.2.1;5.2.1 Noise;100
7.1.2.2;5.2.2 Description of the Corpora Used for Cross-Testing;101
7.1.3;5.3 The Web as Communication;105
7.1.3.1;5.3.1 Genre Palette;105
7.1.3.2;5.3.2 Linguistically- and Functionally-Motivated Features;107
7.1.4;5.4 The Genre Model;107
7.1.4.1;5.4.1 Methodology;110
7.1.4.2;5.4.2 Flow and Hypotheses;111
7.1.5;5.5 Results;113
7.1.5.1;5.5.1 Cross-Testing Performance on Single Labels: BBC and 7-Webgenre Collections;114
7.1.5.2;5.5.2 Performances of Other Single-Label Models on the 7-Webgenre Collection;117
7.1.5.3;5.5.3 Cross-Testing Performance on Single Labels: Mapped Web Genres;120
7.1.5.4;5.5.4 Cross-Testing Performance on Single Labels: HCG and MCG in Isolation;122
7.1.5.5;5.5.5 The SPIRIT Sample: An Attempt to Assess Multilabelling;122
7.1.6;5.6 Discussion;126
7.1.7;5.7 Conclusion and Future Work;127
7.1.8;References;135
7.2;6 Formulating Representative Features with Respect to Genre Classification;138
7.2.1;6.1 Introduction;138
7.2.2;6.2 Defining Genre Classification;141
7.2.2.1;6.2.1 Document Representation in Conventional Text Classification;141
7.2.2.2;6.2.2 Harmonic Descriptor Representation (HDR) of Documents;141
7.2.2.3;6.2.3 Defining Genre;145
7.2.3;6.3 Classifiers;146
7.2.4;6.4 Dataset;147
7.2.5;6.5 Features;149
7.2.6;6.6 Results;151
7.2.6.1;6.6.1 Overall Accuracy;151
7.2.6.2;6.6.2 Precision and Recall;152
7.2.7;6.7 Conclusions;154
7.2.8;References;155
7.3;7 In the Garden and in the Jungle ;157
7.3.1;7.1 Introduction;157
7.3.2;7.2 Text Typology for the Web;159
7.3.3;7.3 An Experiment in Automatic Classification of the Web;163
7.3.4;7.4 Analysis of Results;167
7.3.4.1;7.4.1 Qualitative Assessment of Texts in Each Category;167
7.3.4.2;7.4.2 Assessing the Composition of ukWac;169
7.3.5;7.5 Conclusions and Future Research;170
7.3.6;References;173
7.4;8 Web Genre Analysis: Use Cases, Retrieval Models, and Implementation Issues ;175
7.4.1;8.1 Introduction;175
7.4.1.1;8.1.1 Contributions;176
7.4.2;8.2 Use Cases: Genre Analysis in the Retrieval Practice;176
7.4.2.1;8.2.1 Genre-Enabled Web Search;177
7.4.2.2;8.2.2 Information Extraction Based on Genre Information;177
7.4.2.3;8.2.3 Organizing Collections in Both Topic and Genre Dimensions;179
7.4.2.4;8.2.4 Empower Web Page Abstraction with Genre Information;180
7.4.3;8.3 Construction of Genre Retrieval Models;181
7.4.3.1;8.3.1 Problems of Genre Retrieval Models and Lessons Learned;182
7.4.3.2;8.3.2 New Elements for Genre Retrieval Models;184
7.4.4;8.4 Evaluation;186
7.4.4.1;8.4.1 Improving Generalization Capability;187
7.4.4.2;8.4.2 Measuring Generalization Capability;187
7.4.4.3;8.4.3 Experiments;188
7.4.5;8.5 Implementing Genre-Enabled Web Search;191
7.4.6;8.6 Conclusion;194
7.4.7;References;195
7.5;9 Marrying Relevance and Genre Rankings: An Exploratory Study ;198
7.5.1;9.1 Introduction;198
7.5.2;9.2 Related Work;200
7.5.2.1;9.2.1 Genre Classification;200
7.5.2.2;9.2.2 Readability Scores;201
7.5.2.3;9.2.3 Genres in Relevance Ranking;202
7.5.3;9.3 Data;203
7.5.3.1;9.3.1 Functional Styles Sample;203
7.5.3.2;9.3.2 ROMIP Collection;204
7.5.4;9.4 Formality Score;205
7.5.5;9.5 Results;208
7.5.5.1;9.5.1 Genre-Related Rankings;208
7.5.5.2;9.5.2 Merged Rankings;210
7.5.6;9.6 Conclusion;212
7.5.7;References;213
8;Part IV Structure-Oriented Models of Web Genres;216
8.1;10 Classification of Web Sites at Super-Genre Level ;217
8.1.1;10.1 Introduction;217
8.1.2;10.2 Related Work;220
8.1.3;10.3 Dataset;221
8.1.4;10.4 Features for Classification;224
8.1.4.1;10.4.1 Features Derived from Structure;224
8.1.4.2;10.4.2 Features Derived from Content;231
8.1.5;10.5 Classification of Web Sites;232
8.1.5.1;10.5.1 Classification by Structure;233
8.1.5.2;10.5.2 Classification by Content;235
8.1.5.3;10.5.3 Classification by Structure and Content;236
8.1.6;10.6 Conclusion;239
8.1.7;References;239
8.2;11 Mining Graph Patterns in Web-Based Systems: A Conceptual View ;242
8.2.1;11.1 Introduction;242
8.2.2;11.2 Mathematical Preliminaries;244
8.2.3;11.3 Structural Graph Measures;246
8.2.4;11.4 Graph Similarity Measures for Web Mining;247
8.2.4.1;11.4.1 Classical Similarity and Distance Measures for Graphs;247
8.2.4.2;11.4.2 Graph Similarity Measures Based on Trees;249
8.2.4.3;11.4.3 Structural Similarity of Generalized Trees;249
8.2.5;11.5 Applications;253
8.2.6;11.6 Conclusion;254
8.2.7;References;255
8.3;12 Genre Connectivity and Genre Drift in a Web of Genres ;259
8.3.1;12.1 Introduction;259
8.3.2;12.2 Methodology;260
8.3.2.1;12.2.1 Source Pages and Target Pages;262
8.3.2.2;12.2.2 Genre Categorization;263
8.3.3;12.3 Results and Discussion;266
8.3.3.1;12.3.1 Source Genres, Target Genres and Genre Pairs;266
8.3.3.2;12.3.2 Web of Genres;273
8.3.3.3;12.3.3 “Hook” Genres and “Lug” Genres;274
8.3.3.4;12.3.4 Genre Drift, Topic Drift and Small-World Implications;274
8.3.4;12.4 Conclusion;276
8.3.5;References;277
9;Part V Case Studies of Web Genres;279
9.1;13 Genre Emergence in Amateur Flash ;280
9.1.1;13.1 Genres, Multimedia and the Web;280
9.1.2;13.2 Flash and Newgrounds in Amateur Multimedia;283
9.1.3;13.3 Method;285
9.1.3.1;13.3.1 Sampling;285
9.1.3.2;13.3.2 Identifying Potential Emergent Genres;286
9.1.3.3;13.3.3 Cultural References and Message Content;288
9.1.4;13.4 Results;291
9.1.4.1;13.4.1 Network Analysis;291
9.1.4.2;13.4.2 Genre Features;293
9.1.4.3;13.4.3 Cultural References;297
9.1.4.4;13.4.4 Genre, Emergence and Social Network;300
9.1.5;13.5 Discussion and Conclusions;302
9.1.6;References;304
9.2;14 Variation Among Blogs: A Multi-Dimensional Analysis ;306
9.2.1;14.1 Introduction;306
9.2.2;14.2 Corpus Compilation and Analysis;308
9.2.3;14.3 Factor Analysis;309
9.2.3.1;14.3.1 Method;310
9.2.3.2;14.3.2 Results;310
9.2.3.3;14.3.3 Interpretation of Factors;311
9.2.4;14.4 Text Type Analysis;318
9.2.4.1;14.4.1 Method;318
9.2.4.2;14.4.2 Results;319
9.2.4.3;14.4.3 Interpretation of Clusters;320
9.2.5;14.5 Summary of Findings;323
9.2.6;References;324
9.3;15 Evolving Genres in Online Domains: The Hybrid Genre of the Participatory News Article ;326
9.3.1;15.1 Introduction;326
9.3.1.1;15.1.1 The Systemic Functional Approach to Genre;328
9.3.1.2;15.1.2 The English for Specific Purposes Approach to Genre;329
9.3.1.3;15.1.3 Problems with these Existing Approaches to Genre;331
9.3.1.4;15.1.4 A Solution: Social Genre and Cognitive Genre;332
9.3.1.5;15.1.5 A Web Genre: The Participatory News Article;336
9.3.2;15.2 Methodology;337
9.3.3;15.3 Results;340
9.3.3.1;15.3.1 The News Article;340
9.3.3.2;15.3.2 Reader Comments;343
9.3.4;15.4 Discussion;345
9.3.5;15.5 Conclusion;347
9.3.6;References;348
10;Part VI Prospect;352
10.1;16 Any Land in Sight? ;353
10.1.1;16.1 Web Genre Benchmarks;353
10.1.1.1;16.1.1 Genre Labels;354
10.1.1.2;16.1.2 Annotation;354
10.1.1.3;16.1.3 Representativeness;355
10.1.2;16.2 Work Plan;355
10.1.2.1;16.2.1 Benefits;355
11;Index;357

