Gkoulalas-Divanis / Verykios | Association Rule Hiding for Data Mining | E-Book | sack.de
E-book, English, Volume 41, 138 pages

Series: Advances in Database Systems

Gkoulalas-Divanis / Verykios Association Rule Hiding for Data Mining


1st edition 2010
ISBN: 978-1-4419-6569-1
Publisher: Springer US
Format: PDF
Copy protection: 1 - PDF Watermark




Privacy and security risks arising from the application of different data mining techniques to large institutional data repositories are investigated by a new research domain, so-called privacy preserving data mining. Association rule hiding is a new technique in data mining that studies the problem of hiding sensitive association rules within the data.

Association Rule Hiding for Data Mining addresses the problem of "hiding" sensitive association rules and introduces a number of heuristic solutions. It presents recently proposed exact solutions of increased time complexity, along with computationally efficient (parallel) approaches that alleviate their time complexity problems, and discusses closely related problems (inverse frequent itemset mining, data reconstruction approaches, etc.) in depth. Unsolved problems, future directions and specific examples are provided throughout the book to help the reader study, assimilate and appreciate the important aspects of this challenging problem.

Association Rule Hiding for Data Mining is designed for researchers, professors and advanced-level students in computer science studying privacy preserving data mining, association rule mining, and data mining. This book is also suitable for practitioners working in this industry.


Target audience


Research

Further information & material


Fundamental Concepts: Background; Classes of Association Rule Hiding Methodologies; Other Knowledge Hiding Methodologies; Summary.
Heuristic Approaches: Distortion Schemes; Blocking Schemes; Summary.
Border Based Approaches: Border Revision for Knowledge Hiding; BBA Algorithm; Max-Min Algorithms; Summary.
Exact Hiding Approaches: Menon's Algorithm; Inline Algorithm; Two-Phase Iterative Algorithm; Hybrid Algorithm; Parallelization Framework for Exact Hiding; Quantifying the Privacy of Exact Hiding Algorithms; Summary.
Epilogue: Conclusions; Roadmap to Future Work.


"Chapter 10 BBA Algorithm (pp. 47-48)

In 2005, Sun & Yu [66, 67] proposed the first frequent itemset hiding methodology that relies on the notion of the border [46] of the nonsensitive frequent itemsets to track the impact of altering transactions in the original database. By evaluating the impact of each candidate item modification on the itemsets of the revised positive border, the algorithm greedily applies those modifications (item deletions) that cause the least impact on the border itemsets. As already covered in the previous chapter, the border itemsets implicitly dictate the status (i.e., frequent vs. infrequent) of every itemset in the database. Consequently, the quality of the borders directly affects the quality of the sanitized database that is produced by the hiding algorithm.
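To illustrate how a border can dictate the status of every itemset, the following is a minimal sketch that assumes the positive border is the set of maximal frequent itemsets, so that an itemset is frequent exactly when it is contained in some border itemset. The brute-force miner, the function names and the toy database are assumptions made for illustration and are not taken from [46] or [66, 67].

```python
from itertools import combinations


def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration of frequent itemsets (toy databases only)."""
    items = sorted(set().union(*transactions))
    frequent = set()
    for size in range(1, len(items) + 1):
        level = {
            frozenset(c)
            for c in combinations(items, size)
            if support(frozenset(c), transactions) >= min_support
        }
        if not level:
            # No frequent itemset of this size means none of a larger size.
            break
        frequent |= level
    return frequent


def positive_border(frequent):
    """Maximal frequent itemsets: those with no frequent proper superset."""
    return {x for x in frequent if not any(x < y for y in frequent)}


def is_frequent_by_border(itemset, border):
    """An itemset is frequent iff it is a subset of some border itemset."""
    return any(itemset <= b for b in border)


# Hypothetical toy database with an absolute support threshold of 2.
transactions = [frozenset("abc"), frozenset("abc"), frozenset("ab"), frozenset("cd")]
frequent = frequent_itemsets(transactions, min_support=2)
border = positive_border(frequent)
```

In this toy database the single border itemset {a, b, c} is enough to decide the status of all seven frequent itemsets, which is why tracking only the border keeps the bookkeeping small.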

The heuristic strategy proposed in [66, 67] assigns a weight to each itemset of the revised positive border (the original positive border after the sensitive itemsets have been removed from it) in an attempt to quantify its vulnerability to being affected by an item deletion. The assigned weights are dynamically computed during the sanitization process as a function of the current support of the corresponding itemsets in the database.

To hide a sensitive itemset, the algorithm calculates the expected impact of each candidate item deletion on the itemsets of the revised positive border by computing the sum of the weights of the revised positive border itemsets that will be affected. The algorithm then determines the optimal deletion candidate, which is the item whose deletion has minimal impact on the revised positive border, and deletes this item from a set of carefully selected transactions.
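The selection step described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the algorithm of [66, 67]: the weight formula is hypothetical (it only grows as a border itemset's support approaches the threshold), ties are broken by scan order, and one (transaction, item) pair is deleted per iteration. The names `weight` and `hide_itemset` and the toy database are likewise invented for this example.

```python
def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def weight(border_itemset, transactions, min_support):
    """Hypothetical weight (NOT the formula of [66, 67]): larger when the
    itemset's current support is closer to the threshold."""
    slack = max(support(border_itemset, transactions) - min_support, 0)
    return 1.0 / (slack + 1)


def hide_itemset(sensitive, revised_border, transactions, min_support):
    """Greedily delete items until `sensitive` drops below `min_support`."""
    if min_support < 1:
        raise ValueError("min_support must be at least 1")
    db = [set(t) for t in transactions]
    while support(sensitive, db) >= min_support:
        best = None
        for tid, t in enumerate(db):
            if not sensitive <= t:
                continue  # only transactions supporting the itemset help
            for item in sorted(sensitive):
                # Border itemsets that lose support if `item` leaves `t`.
                impact = sum(
                    weight(b, db, min_support)
                    for b in revised_border
                    if item in b and b <= t
                )
                if best is None or impact < best[0]:
                    best = (impact, tid, item)
        _, tid, item = best
        db[tid].discard(item)  # weights are recomputed on the next pass
    return [frozenset(t) for t in db]


# Hypothetical toy database, absolute support threshold 3.
transactions = [frozenset(s) for s in ("abc", "abc", "ab", "ab", "ac", "bc")]
sensitive = frozenset("ab")
revised_border = [frozenset("ac"), frozenset("bc")]
sanitized = hide_itemset(sensitive, revised_border, transactions, 3)
```

In this toy run the two deletions land on the transactions that contain only a and b, so the supports of {a, c} and {b, c} are left untouched while the support of {a, b} drops from 4 to 2.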

The proposed strategy aims to minimize the number of nonsensitive frequent itemsets that are affected by the hiding of the sensitive knowledge, and it attempts to maintain the relative support of the nonsensitive frequent itemsets in the sanitized database. The rest of this chapter is organized as follows. In Section 10.1, we explicitly state the objectives that drive the hiding process of the BBA algorithm. Following that, Section 10.2 provides the strategy that is employed for hiding a sensitive itemset in a way that bears minimal impact on the revised positive border, and presents the order in which the sensitive itemsets are selected to be hidden. Finally, Section 10.3 delivers the pseudocode of the algorithm.

10.1 Hiding Goals

In Chapter 2 (Section 2.2.1) we presented the main goals of association rule hiding methodologies. In the context of frequent itemset hiding, these goals can be restated as follows: (i) all the sensitive itemsets that are mined from the original database should be hidden so that they cannot be mined from its sanitized counterpart under the same (or a higher) threshold of support, (ii) all the nonsensitive frequent itemsets of the original database should be preserved in the sanitized database so that they can be mined under the same threshold of support, and (iii) no ghost itemsets should be introduced into the sanitized database, i.e., no itemset that was not among the nonsensitive frequent ones mined from the original database can be mined from the sanitized database under the same or a higher threshold of support."
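The three goals quoted above can be checked mechanically once the frequent itemsets of both databases are known. The sketch below assumes they have already been mined at the same support threshold and are passed in as sets of frozensets. The function name, the report keys and the example sets are hypothetical. It also assumes that frequent supersets of a sensitive itemset count as necessarily lost rather than as nonsensitive, and it reports as ghosts only itemsets that were not frequent in the original database at all.

```python
def check_hiding_goals(original_frequent, sensitive, sanitized_frequent):
    """Report how a sanitized database fares against goals (i)-(iii)."""
    # Supersets of a sensitive itemset cannot stay frequent once it is hidden.
    unavoidable = {
        x for x in original_frequent if any(s <= x for s in sensitive)
    }
    nonsensitive = original_frequent - unavoidable
    return {
        # (i) no sensitive itemset is still frequent
        "sensitive_hidden": sensitive.isdisjoint(sanitized_frequent),
        # (ii) nonsensitive frequent itemsets that were lost as a side effect
        "lost_itemsets": nonsensitive - sanitized_frequent,
        # (iii) itemsets that are frequent only after sanitization
        "ghost_itemsets": sanitized_frequent - original_frequent,
    }


def fs(s):
    """Shorthand for building an itemset from a string of item letters."""
    return frozenset(s)


# Hypothetical mining results for a toy database.
original_frequent = {fs("a"), fs("b"), fs("c"), fs("ab"), fs("ac"), fs("bc")}
sensitive = {fs("ab")}
sanitized_frequent = {fs("a"), fs("b"), fs("c"), fs("ac"), fs("bc")}

report = check_hiding_goals(original_frequent, sensitive, sanitized_frequent)
```

An ideal sanitization yields `sensitive_hidden` set to true with both `lost_itemsets` and `ghost_itemsets` empty. A method that only deletes items can never raise an itemset's support, so for such a method the ghost set is always empty.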


