Srivastava / Sahami Text Mining

Classification, Clustering, and Applications
Year of publication: 2010
ISBN: 978-1-4200-5945-8
Publisher: Taylor & Francis
Format: PDF
Copy protection: Adobe DRM

E-Book, English, 328 pages

Series: Chapman & Hall/CRC Data Mining and Knowledge Discovery Series

The Definitive Resource on Text Mining Theory and Applications from Foremost Researchers in the Field
Giving a broad perspective of the field from numerous vantage points, Text Mining: Classification, Clustering, and Applications focuses on statistical methods for text mining and analysis. It examines methods to automatically cluster and classify text documents and applies these methods in a variety of areas, including adaptive information filtering, information distillation, and text search.

The book begins with chapters on the classification of documents into predefined categories. It presents state-of-the-art algorithms and their use in practice. The next chapters describe novel methods for clustering documents into groups that are not predefined. These methods seek to automatically determine topical structures that may exist in a document corpus. The book concludes by discussing various text mining applications that have significant implications for future research and industrial use.

There is no doubt that text mining will continue to play a critical role in the development of future information systems, and advances in research will be instrumental to their success. This book captures the technical depth and immense practical potential of text mining, guiding readers to a sound appreciation of this burgeoning field.

Further Information and Material


Analysis of Text Patterns Using Kernel Methods
Marco Turchi, Alessia Mammone, and Nello Cristianini
Introduction
General Overview on Kernel Methods
Kernels for Text
Example
Conclusion and Further Reading

Detection of Bias in Media Outlets with Statistical Learning Methods
Blaz Fortuna, Carolina Galleguillos, and Nello Cristianini
Introduction
Overview of the Experiments
Data Collection and Preparation
News Outlet Identification
Topic-Wise Comparison of Term Bias
News Outlets Map
Related Work
Conclusion
Appendix A: Support Vector Machines
Appendix B: Bag of Words and Vector Space Models
Appendix C: Kernel Canonical Correlation Analysis
Appendix D: Multidimensional Scaling

Collective Classification for Text Classification
Galileo Namata, Prithviraj Sen, Mustafa Bilgic, and Lise Getoor
Introduction
Collective Classification: Notation and Problem Definition
Approximate Inference Algorithms for Approaches Based on Local Conditional Classifiers
Approximate Inference Algorithms for Approaches Based on Global Formulations
Learning the Classifiers
Experimental Comparison
Related Work
Conclusion

Topic Models
David M. Blei and John D. Lafferty
Introduction
Latent Dirichlet Allocation (LDA)
Posterior Inference for LDA
Dynamic Topic Models and Correlated Topic Models
Discussion

Nonnegative Matrix and Tensor Factorization for Discussion Tracking
Brett W. Bader, Michael W. Berry, and Amy N. Langville
Introduction
Notation
Tensor Decompositions and Algorithms
Enron Subset
Observations and Results
Visualizing Results of the NMF Clustering
Future Work

Text Clustering with Mixture of von Mises–Fisher Distributions
Arindam Banerjee, Inderjit Dhillon, Joydeep Ghosh, and Suvrit Sra
Introduction
Related Work
Preliminaries
EM on a Mixture of vMFs (moVMF)
Handling High-Dimensional Text Datasets
Algorithms
Experimental Results
Discussion
Conclusions and Future Work

Constrained Partitional Clustering of Text Data: An Overview
Sugato Basu and Ian Davidson
Introduction
Uses of Constraints
Text Clustering
Partitional Clustering with Constraints
Learning Distance Function with Constraints
Satisfying Constraints and Learning Distance Functions
Experiments
Conclusions

Adaptive Information Filtering
Yi Zhang
Introduction
Standard Evaluation Measures
Standard Retrieval Models and Filtering Approaches
Collaborative Adaptive Filtering
Novelty and Redundancy Detection
Other Adaptive Filtering Topics

Utility-Based Information Distillation
Yiming Yang and Abhimanyu Lad
Introduction
A Sample Task
Technical Cores
Evaluation Methodology
Data
Experiments and Results
Concluding Remarks

Text Search Enhanced with Types and Entities
Soumen Chakrabarti, Sujatha Das, Vijay Krishnan, and Kriti Puniyani
Entity-Aware Search Architecture
Understanding the Question
Scoring Potential Answer Snippets
Indexing and Query Processing
Conclusion

Index


Ashok N. Srivastava is the Principal Investigator of the Integrated Vehicle Health Management research project in the NASA Aeronautics Research Mission Directorate. Dr. Srivastava also leads the Intelligent Data Understanding group at NASA Ames Research Center.
Mehran Sahami is an Associate Professor and Associate Chair for Education in the computer science department at Stanford University.


