Galbrun / Miettinen Redescription Mining
1st edition 2017
ISBN: 978-3-319-72889-6
Publisher: Springer International Publishing
Format: PDF
Copy protection: 1 - PDF Watermark
E-book, English, 88 pages
Series: Computer Science (R0)
This book provides a gentle introduction to redescription mining, a versatile data mining tool for finding distinct common characterizations of the same objects and, conversely, for identifying sets of objects that admit multiple shared descriptions. It is intended for readers who are familiar with basic data analysis techniques such as clustering, frequent itemset mining, and classification. Redescription mining is defined in a general way, making it applicable to different types of data. The general framework is made concrete through many practical examples that show the versatility of redescription mining. The book also introduces the main algorithmic ideas for mining redescriptions, together with applications from various domains. The final part of the book covers variations and extensions of the basic redescription mining problem and discusses some future directions and open questions.
Target audience
Research
Further information & material
Preface
Contents
List of Figures
List of Symbols
1 What Is Redescription Mining
  1.1 First Examples of Redescriptions
  1.2 Formal Definitions
    1.2.1 The Data
    1.2.2 The Descriptions
    1.2.3 The Redescriptions
    1.2.4 Other Constraints
    1.2.5 Distance Functions: Why Jaccard?
    1.2.6 Sets of Redescriptions
  1.3 Related Data Mining Problems
  1.4 A Short History
  References
2 Algorithms for Redescription Mining
  2.1 Finding Queries Using Itemset Mining
    2.1.1 The MID Algorithm
    2.1.2 Mining Redescriptions with the CHARM-L Algorithm
  2.2 Queries Based on Decision Trees and Forests
    2.2.1 The CARTwheels Algorithm
    2.2.2 The SplitT and LayeredT Algorithms
    2.2.3 The CLUS-RM Algorithm
  2.3 Growing the Queries Greedily
    2.3.1 The ReReMi Algorithm
  2.4 A Comparative Discussion
  2.5 Handling Missing Values
  References
3 Applications, Variants, and Extensions of Redescription Mining
  3.1 Applications of Redescription Mining
    3.1.1 In Biology
    3.1.2 In Ecology
    3.1.3 In Social and Political Sciences and in Economics
    3.1.4 In Engineering
  3.2 Relational Redescription Mining
    3.2.1 An Example of Relational Redescriptions
    3.2.2 Formal Definition
  3.3 Storytelling
    3.3.1 Definition and Algorithms
    3.3.2 Applications
  3.4 Future Work: Richer Query Languages
    3.4.1 Time-Series Redescriptions
    3.4.2 Subgraph Redescriptions
    3.4.3 Multi-Query and Multimodal Redescriptions
  References




