Silvers | Building and Maintaining a Data Warehouse | E-Book | www2.sack.de
E-Book

E-Book, Englisch, 328 Seiten

Silvers Building and Maintaining a Data Warehouse


1. Auflage 2008
ISBN: 978-1-4200-6463-6
Verlag: Taylor & Francis
Format: PDF
Kopierschutz: Adobe DRM (»Systemvoraussetzungen)

E-Book, Englisch, 328 Seiten

ISBN: 978-1-4200-6463-6
Verlag: Taylor & Francis
Format: PDF
Kopierschutz: Adobe DRM (»Systemvoraussetzungen)



As it is with building a house, most of the work necessary to build a data warehouse is neither visible nor obvious when looking at the completed product. While it may be easy to plan for a data warehouse that incorporates all the right concepts, taking the steps needed to create a warehouse that is as functional and user-friendly as it is theoretically sound, is not especially easy. That’s the challenge that Building and Maintaininga Data Warehouse answers.

Based on a foundation of industry-accepted principles, this work provides an easy-to-follow approach that is cohesive and holistic. By offering the perspective of a successful data warehouse, as well as that of a failed one, this workdetails those factors that must be accomplished and those that are best avoided.

Organized to logically progress from more general to specific information, this valuable guide:

- Presents areas of a data warehouse individually and in sequence, showing how each piece becomes a working part of the whole

- Examines the concepts and principles that are at the foundation of every successful data warehouse

- Explains how to recognize and attend to problematic gaps in an established data warehouse

- Provides the big picture perspective that planners and executives require

Those considering the planning and creation of a data warehouse, as well as those who’ve already built one will profit greatly from the insights garnered by the author during his years of creating and gathering information on state-of-the-art data warehouses that are accessible, convenient, and reliable.

Silvers Building and Maintaining a Data Warehouse jetzt bestellen!

Zielgruppe


data managers and information specialists, as well as those businesses and organizations dependent on maintaining significant databases


Autoren/Hrsg.


Weitere Infos & Material


The Big Picture: An Introduction to Data Warehousing

Decision Support Systems; Dimensional and Third Normal Form Data Models; Storing the Data; Data Availability; Monitoring Data Quality.

Data Warehouse Philosophy

Enterprise Data; Subject Orientation; Data Integration; Form; Function; Grain; Nonvolatility; Time Variant; One Version of the Truth; Long-Term Investment.

Source System Analysis

Source System Analysis Principles; System of Record; Entity Data Arithmetic Data: Absolute, Relative, Numeric Data That Isn’t Arithmetic; Alphanumeric Data; Granularity; Latency; Transaction Data; Snapshot Data; Source System Analysis Methods; Data Profile; Data Flow Diagram; Data State Diagram; System of Record; Business Rules.

Relational Database Management System (RDBMS)

Relational Set Theory; RDBMS Product Offerings; Residual Costs; Licensing; Support and Maintenance; Extensibility; Connective Capacity.

Database Design

Data Modeling Methodology; Conceptual and Logical Data Models; Logical (Primary) Key; Attribute; Primary Key/Foreign Key Relation; Cardinality; Super Types and Subtypes; Physical Data Model; Dimensional Data Model; Third Normal Form Data Model; Recursive Data Model; Physical Data Model Summary; Data Architecture; Enterprise Data Warehouse; Data Mart; Operational Data Store; Summaries and Aggregates.

Data Acquisition and Integration

Source System and Target System Analysis; Direct and Indirect Requirements; Language; Data Profile; Data State; Data Mapping; Business Rules; Architecture; Extract, Transform, and Load (ETL); Extract, Load, and Transform (ELT); ETL Design and Process Principles.

Eleven Principles; Staging Principles Conclusion; ETL Functions; Extract Data from a Contiguous Dataset and from a Data Flow; Row-Level and Dataset-Level Transformation.

Surrogate Key Generation: Intradataset, Data Warehouse-Level Transformation, Intra-Data Warehouse, Changed Data Capture, ETL Key, Universe to Universe and Candidate to Universe, Load Data from a Stable and Contiguous Dataset, Load Data from a Data Flow.

Transaction Summary; Dimension Aggregation.

Common Problems: Source Data Anomalies, Incomplete, Redundant, and Misstated Source Data; Business Rule Changes, Obsolete, Redefined, and Unrecorded Data.

Business Intelligence (BI) Reporting

Success Factors; Performance; User Interface; Presentation of the Data Architecture; Alignment with the Data Model; Ability to Answer Questions; Mobility; Flexibility; Availability; Customer Success Factors.

Processes: Proactive, Reactive, Predefined, Ad Hoc; Data, Information, and Analytic Needs; BI Reporting Application and Architecture.

BI Reporting Methods: Predefined and Interactive Reports, Online Analytical Process (OLAP) Reports, MOLAP, ROLAP, HOLAP; Drilling; Push versus Pull; Printed on Paper; Report Archives; Web-Based BI Reporting; Operational BI Reporting: From an ODS, From an Operational System (Real-Time), EDI, Partnerships, and Data Sharing.

BI Reporting: Customer Relationship Management (CRM), Business Metrics Measure the Enterprise, Decisions and Decision Making Closer to the Action; Reporting around the Event; BI Search; Sarbanes–Oxley and BI Reporting; Data Mining; Statistics Concepts; Random Error; Statistical Significance.

Variables: Dependent and Independent.

Hypothesis; Data Mining Tools and Activities; Data Cleansing; Data Inspection; Compound, Lag, Numeric, and Categorical Variables; Hypothesis; Data Mining Algorithms; Neural Network; Decision Tree; CHAID; Nearest Neighbor; Rule Induction; Genetic Algorithm; Rule Validation and Testing; Overfitting.

Data Quality

Deming’s Definition of Quality; Data Quality Service Level Agreement (SLA); Deming’s Statistical Process Control; Process Measurement; Methods and Strategies; Data Stewardship; Post-Load Audit and Report Errant Data.

Plug in a Default Value and Report Errant Data; Reject a Record and Report the Errant Record; Reject a Dataset and Report the Errant Dataset.

Recycle the Data: In Place and Report Errant Data, Recycle Wheel and Report Errant Data, Data Quality Repository; Data Quality Fact Table: Dimensional Data Model, Third Normal Form Data Model; Data Quality Reporting.

Metadata

Types of Metadata; Static and Dynamic Metadata; Metadata Service Level Agreement (SLA); Metadata Repository; Central Metadata Repository: Dimensional Data Model; Third Normal Form; Distributed Metadata Repository; Real-Time Metadata; Data Quality as Metadata; Make or Buy a Metadata Repository.

Data Warehouse Customers

Strategic Decision Makers; Tactical Decision Makers; Knowledge Workers; Operational Applications; External Partners; Electronic Data Interchange (EDI) Partners; Data Warehouse Plan.

Future of Data Warehousing: An Epilogue

Scalability and Performance; Real-Time Data Warehousing; Increased Corporate Presence; Back to the Basics; Data Quality.

Short TOC

The Big Picture: An Introduction to Data Warehousing

Data Warehouse Philosophy

Source System Analysis

Relational Database Management System (RDBMS)

Database Design

Data Acquisition and Integration

Business Intelligence Reporting

Data Quality

Metadata

Data Warehouse Customers

Future of Data Warehousing: An Epilogue

Bibliography

Index

Toc to post to abstract

Preface

Acknowledgments

The Author

Introduction

The Big Picture: An Introduction to Data Warehousing

Introduction

Decision Support Systems

Dimensional and Third Normal Form Data Models

Storing the Data

Data Availability

Monitoring Data Quality

Data Warehouse Philosophy

Introduction

Enterprise Data

Subject Orientation

Data Integration

Form

Function

Grain

Nonvolatility

Time Variant

One Version of the Truth

Long-Term Investment

References

Source System Analysis

Introduction

Source System Analysis Principles

System of Record

Entity Data

Arithmetic Data

Absolute Arithmetic Data

Relative Arithmetic Data

Numeric Data That Isn’t Arithmetic

Alphanumeric Data

Granularity

Latency

Transaction Data

Snapshot Data

Source System Analysis Methods

Data Profile

Data Flow Diagram

Data State Diagram

System of Record

Business Rules

Closing Remarks

References

Relational Database Management System (RDBMS)

Introduction

Relational Set Theory

RDBMS Product Offerings

Residual Costs

Licensing

Support and Maintenance

Extensibility

Connective Capacity

Closing Remarks

References

Database Design

Introduction

Data Modeling Methodology

Conceptual Data Model

Logical Data Model

Logical (Primary) Key

Attribute

Primary Key/Foreign Key Relation

Cardinality

Super Types and Subtypes

Putting It All Together

Physical Data Model

Dimensional Data Model

Third Normal Form Data Model

Recursive Data Model

Physical Data Model Summary

Data Architecture

Enterprise Data Warehouse

Data Mart

Operational Data Store

Summaries and Aggregates

Closing Remarks

References

Data Acquisition and Integration

Introduction

Source System Analysis

Target System Analysis

Direct Requirements

Indirect Requirements

Direct and Indirect Requirements

Language

Data Profile

Data State

Data Mapping

Business Rules

Architecture

Extract, Transform, and Load (ETL)

Extract, Load, and Transform (ELT)

ETL Design Principles

ETL Process Principles

Principle 01: One Thing at a Time

Principle 02: Know When to Begin

Principle 03: Know When to End

Principle 04: Large to Medium to Small

Principle 05: Stage Data Integrity

Principle 06: Know What You Have

Process Principles Conclusion

ETL Staging Principles

Principle 07: Name the Data

Principle 08: Own the Data

Principle 09: Build the Data

Principle 10: Type the Data

Principle 11: Land the Data

Staging Principles Conclusion

ETL Functions

Extract Data from a Contiguous Dataset

Extract Data from a Data Flow

Row-Level Transformation

Dataset-Level Transformation

Surrogate Key Generation: Intradataset

Data Warehouse-Level Transformation

Surrogate Key Generation: Intra-Data Warehouse

Look-Up

Changed Data Capture

ETL Key

Universe to Universe and Candidate to Universe

Load Data from a Stable and Contiguous Dataset

Load Data from a Data Flow

Transaction Summary

Dimension Aggregation

Common Problems

Source Data Anomalies

Incomplete Source Data

Redundant Source Data

Misstated Source Data

Business Rule Changes

Obsolete Data

Redefined Data

Unrecorded Data

Closing Remarks

References

Business Intelligence Reporting

Introduction

BI Reporting Success Factors

Performance

User Interface

Presentation of the Data Architecture

Alignment with the Data Model

Ability to Answer Questions

Mobility

Flexibility

Availability

BI Customer Success Factors

Proactive Processes

Reactive Processes

Predefined Processes

Ad Hoc Processes

Data Needs.

Information Needs

Analytic Needs

BI Reporting Application

Architecture

BI Reporting Methods.

Predefined Reports

Interactive Reports

Online Analytical Process (OLAP) Reports

MOLAP

ROLAP

HOLAP

Drilling

Push versus Pull

Push

Pull

Printed on Paper

Report Archives

Web-Based BI Reporting

Operational BI Reporting: From an ODS

Operational BI Reporting: From an Operational System (Real-Time)

Operational BI Reporting: EDI, Partnerships, and Data Sharing.

BI Reporting: Thus Far.

Customer Relationship Management (CRM)

Business Metrics Measure the Enterprise

Decisions and Decision Making Closer to the Action

BI Reporting: Coming Soon

Reporting around the Event

BI Search

Sarbanes–Oxley and BI Reporting

Data Mining

Statistics Concepts

Random Error

Statistical Significance

Variables: Dependent and Independent

Hypothesis

Data Mining Tools

Data Mining Activities

Data Cleansing

Data Inspection

Compound Variables

Lag Variables

Numeric Variables

Categorical Variables

Hypothesis

Data Mining Algorithms

Neural Network

Decision Tree

CHAID

Nearest Neighbor

Rule Induction

Genetic Algorithm

Rule Validation and Testing

Overfitting

Closing Remarks

References

Data Quality

Introduction

Deming’s Definition of Quality

Data Quality Service Level Agreement (SLA)

Deming’s Statistical Process Control

Process Measurement

Methods and Strategies

Data Stewardship

Post-Load Audit and Report Errant Data

Plug in a Default Value and Report Errant Data

Reject a Record and Report the Errant Record

Reject a Dataset and Report the Errant Dataset

Recycle the Data: In Place and Report Errant Data

Recycle the Data: Recycle Wheel and Report Errant Data

Data Quality Repository

Data Quality Fact Table: Dimensional Data Model

Data Quality Fact Table: Third Normal Form Data Model

Data Quality Reporting

Follow Through

Closing Remarks

References

Metadata

Introduction

Types of Metadata

Static Metadata

Dynamic Metadata

Metadata Service Level Agreement (SLA)

Metadata Repository

Central Metadata Repository: Dimensional Data Model

Central Metadata Repository: Third Normal Form

Distributed Metadata Repository

Real-Time Metadata

Data Quality as Metadata

Make or Buy a Metadata Repository

Closing Remarks

References

Data Warehouse Customers

Introduction

Strategic Decision Makers

Tactical Decision Makers

Knowledge Workers

Operational Applications

External Partners

Electronic Data Interchange (EDI) Partners

Data Warehouse Plan

Strategic Decision Makers

Tactical Decision Makers

Knowledge Workers

Operational Applications

External Partners

Electronic Data Interchange (EDI) Partners

Closing Remarks

Future of Data Warehousing: An Epilogue

Introduction

Scalability and Performance

Real-Time Data Warehousing

Increased Corporate Presence

Back to the Basics

Data Quality

Bibliography

Index



Ihre Fragen, Wünsche oder Anmerkungen
Vorname*
Nachname*
Ihre E-Mail-Adresse*
Kundennr.
Ihre Nachricht*
Lediglich mit * gekennzeichnete Felder sind Pflichtfelder.
Wenn Sie die im Kontaktformular eingegebenen Daten durch Klick auf den nachfolgenden Button übersenden, erklären Sie sich damit einverstanden, dass wir Ihr Angaben für die Beantwortung Ihrer Anfrage verwenden. Selbstverständlich werden Ihre Daten vertraulich behandelt und nicht an Dritte weitergegeben. Sie können der Verwendung Ihrer Daten jederzeit widersprechen. Das Datenhandling bei Sack Fachmedien erklären wir Ihnen in unserer Datenschutzerklärung.