Building Open Source ETL Solutions with Pentaho Data Integration
E-book, English, 720 pages
ISBN: 978-0-470-94242-0
Publisher: John Wiley & Sons
Format: PDF
Copy protection: Adobe DRM
This practical book is a complete guide to installing, configuring, and managing Pentaho Kettle. If you're a database administrator or developer, you'll first get up to speed on Kettle basics and how to apply Kettle to create ETL solutions--before progressing to specialized concepts such as clustering, extensibility, and data vault models. Learn how to design and build every phase of an ETL solution.
* Shows developers and database administrators how to use the open-source Pentaho Kettle for enterprise-level ETL processes (Extracting, Transforming, and Loading data)
* Assumes no prior knowledge of Kettle or ETL, and brings beginners thoroughly up to speed at their own pace
* Explains how to get Kettle solutions up and running, then follows the 34 ETL subsystems model, as created by the Kimball Group, to explore the entire ETL lifecycle, including all aspects of data warehousing with Kettle
* Goes beyond routine tasks to explore how to extend Kettle and scale Kettle solutions using a distributed "cloud"
Get the most out of Pentaho Kettle and your data warehousing with this detailed guide--from simple single-table data migration to complex multisystem clustered data integration tasks.
Further Information & Material
Introduction.
Part I Getting Started.
Chapter 1 ETL Primer.
Chapter 2 Kettle Concepts.
Chapter 3 Installation and Configuration.
Chapter 4 An Example ETL Solution--Sakila.
Part II ETL.
Chapter 5 ETL Subsystems.
Chapter 6 Data Extraction.
Chapter 7 Cleansing and Conforming.
Chapter 8 Handling Dimension Tables.
Chapter 9 Loading Fact Tables.
Chapter 10 Working with OLAP Data.
Part III Management and Deployment.
Chapter 11 ETL Development Lifecycle.
Chapter 12 Scheduling and Monitoring.
Chapter 13 Versioning and Migration.
Chapter 14 Lineage and Auditing.
Part IV Performance and Scalability.
Chapter 15 Performance Tuning.
Chapter 16 Parallelization, Clustering, and Partitioning.
Chapter 17 Dynamic Clustering in the Cloud.
Chapter 18 Real-Time Data Integration.
Part V Advanced Topics.
Chapter 19 Data Vault Management.
Chapter 20 Handling Complex Data Formats.
Chapter 21 Web Services.
Chapter 22 Kettle Integration.
Chapter 23 Extending Kettle.
Appendix A The Kettle Ecosystem.
Appendix B Kettle Enterprise Edition Features.
Appendix C Built-in Variables and Properties Reference.
Index.