E-book, English, 244 pages
Celko: Joe Celko's Complete Guide to NoSQL
1st edition, 2013
ISBN: 978-0-12-407220-6
Publisher: Elsevier Science & Techn.
Format: EPUB
Copy protection: 6 - ePub Watermark
What Every SQL Professional Needs to Know about Non-Relational Databases
Joe Celko's Complete Guide to NoSQL provides a complete overview of non-relational technologies so that you can become more nimble in meeting the needs of your organization. As data continues to explode and grow more complex, SQL is becoming less useful for querying data and extracting meaning. In this new world of bigger and faster data, you will need to leverage non-relational technologies to get the most out of the information you have. Learn where, when, and why the benefits of NoSQL outweigh those of SQL with Joe Celko's Complete Guide to NoSQL.

This book covers three areas that make today's new data different from the data of the past: velocity, volume, and variety. When information is changing faster than you can collect and query it, it simply cannot be treated the same as static data. Celko will help you understand velocity and equip you with the tools to drink from a fire hose. Old storage and access models do not work for big data. Celko will help you understand volume, as well as different ways to store and access data at petabyte and exabyte scale. Not all data fits into a relational model; examples include genetic data, semantic data, and data generated by social networks. Celko will help you understand variety, as well as the alternative storage, query, and management frameworks needed by certain kinds of data.

- Gain a complete understanding of the situations in which SQL has more drawbacks than benefits so that you can better determine when to use NoSQL technologies for maximum benefit
- Recognize the pros and cons of columnar, streaming, and graph databases
- Make the transition to NoSQL with the expert guidance of best-selling SQL expert Joe Celko
Joe Celko served 10 years on the ANSI/ISO SQL Standards Committee and contributed to the SQL-89 and SQL-92 Standards. Mr. Celko is the author of a series of books on SQL and RDBMS for Elsevier/MKP. He is an independent consultant based in Austin, Texas. He has written over 1,200 columns in the computer trade and academic press, mostly dealing with data and databases.
Further Information & Material
1;Front Cover;1
2;Joe Celko's Complete Guide to NoSQL: What Every SQL Professional Needs to Know about Non-Relational Databases;4
3;Copyright;5
4;Dedication;6
5;Contents;8
6;About the Author;16
7;Introduction;18
8;Chapter 1: NoSQL and Transaction Processing;22
8.1;Introduction;22
8.2;1.1. Databases Transaction Processing in the Batch Processing World;22
8.3;1.2. Transaction Processing in the Disk Processing World;23
8.4;1.3. ACID;24
8.5;1.4. Pessimistic Concurrency in Detail;26
8.5.1;1.4.1. Isolation Levels;27
8.5.2;1.4.2. Proprietary Isolation Levels;29
8.6;1.5. CAP Theorem;31
8.7;1.6. BASE;32
8.8;1.7. Server-side Consistency;34
8.9;1.8. Error Handling;34
8.10;1.9. Why SQL Does Not Work Here;35
8.11;Concluding Thoughts;35
8.12;References;35
9;Chapter 2: Columnar Databases;36
9.1;Introduction;36
9.2;2.1. History;37
9.3;2.2. How It Works;42
9.4;2.3. Query Optimizations;43
9.5;2.4. Multiple Users and Hardware;43
9.6;2.5. Doing an ALTER Statement;45
9.7;2.6. Data Warehouses and Columnar Databases;45
9.8;Concluding Thoughts;46
9.9;Reference;46
10;Chapter 3: Graph Databases;48
10.1;Introduction;48
10.2;3.1. Graph Theory Basics;49
10.2.1;3.1.1. Nodes;49
10.2.2;3.1.2. Edges;50
10.2.3;3.1.3. Graph Structures;51
10.3;3.2. RDBMS Versus Graph Database;52
10.4;3.3. Six Degrees of Kevin Bacon Problem;52
10.4.1;3.3.1. Adjacency List Model for General Graphs;52
10.4.2;3.3.2. Covering Paths Model for General Graphs;56
10.4.3;3.3.3. Real-World Data Has Mixed Relationships;59
10.5;3.4. Vertex Covering;61
10.6;3.5. Graph Programming Tools;63
10.6.1;3.5.1. Graph Databases;63
10.6.2;3.5.2. Graph Languages;64
10.6.2.1;SPARQL;64
10.6.2.2;SPASQL;65
10.6.2.3;Gremlin;65
10.6.2.4;Cypher (Neo4j);65
10.6.2.5;Trends;67
10.7;Concluding Thoughts;67
10.8;References;67
11;Chapter 4: MapReduce Model;68
11.1;Introduction;68
11.2;4.1. Hadoop Distributed File System;70
11.3;4.2. Query Languages;71
11.3.1;4.2.1. Pig Latin;71
11.3.2;4.2.2. Hive and Other Tools;81
11.4;Concluding Thoughts;83
11.5;References;83
12;Chapter 5: Streaming Databases and Complex Events;84
12.1;Introduction;84
12.2;5.1. Generational Concurrency Models;85
12.2.1;5.1.1. Optimistic Concurrency;85
12.2.2;5.1.2. Isolation Levels in Optimistic Concurrency;86
12.3;5.2. Complex Event Processing;88
12.3.1;5.2.1. Terminology for Event Processing;89
12.3.2;5.2.2. Event Processing versus State Change Constraints;91
12.3.3;5.2.3. Event Processing versus Petri Nets;92
12.4;5.3. Commercial Products;94
12.4.1;5.3.1. StreamBase;94
12.4.2;5.3.2. Kx;97
12.5;Concluding Thoughts;100
12.6;References;100
13;Chapter 6: Key–Value Stores;102
13.1;Introduction;102
13.2;6.1. Schema Versus No Schema;102
13.3;6.2. Query Versus Retrieval;103
13.4;6.3. Handling Keys;103
13.4.1;6.3.1. Berkeley DB;104
13.4.2;6.3.2. Access by Tree Indexing or Hashing;105
13.5;6.4. Handling Values;105
13.5.1;6.4.1. Arbitrary Byte Arrays;105
13.5.2;6.4.2. Small Files of Known Structure;106
13.6;6.5. Products;107
13.7;Concluding Thoughts;109
14;Chapter 7: Textbases;110
14.1;Introduction;110
14.2;7.1. Classic Document Management Systems;110
14.2.1;7.1.1. Document Indexing and Storage;111
14.2.2;7.1.2. Keyword and Keyword in Context;111
14.2.3;7.1.3. Industry Standards;113
14.2.3.1;Contextual Query Language;113
14.2.3.2;Commercial Services and Products;115
14.2.3.3;Regular Expressions;116
14.3;7.2. Text Mining and Understanding;117
14.3.1;7.2.1. Semantics versus Syntax;118
14.3.2;7.2.2. Semantic Networks;119
14.4;7.3. Language Problem;120
14.4.1;7.3.1. Unicode and ISO Standards;121
14.4.2;7.3.2. Machine Translation;121
14.5;Concluding Thoughts;122
14.6;References;123
15;Chapter 8: Geographical Data;124
15.1;Introduction;124
15.2;8.1. GIS Queries;126
15.2.1;8.1.1. Simple Location;126
15.2.2;8.1.2. Simple Distance;126
15.2.3;8.1.3. Find Quantities, Densities, and Contents within an Area;126
15.2.4;8.1.4. Proximity Relationships;127
15.2.5;8.1.5. Temporal Relationships;127
15.3;8.2. Locating Places;127
15.3.1;8.2.1. Longitude and Latitude;128
15.3.2;8.2.2. Hierarchical Triangular Mesh;129
15.3.3;8.2.3. Street Addresses;132
15.3.4;8.2.4. Postal Codes;133
15.3.5;8.2.5. ZIP Codes;133
15.3.6;8.2.6. Canadian Postal Codes;134
15.3.7;8.2.7. Postcodes in the United Kingdom;135
15.3.7.1;Postcode Formats;135
15.3.7.2;Greater London Postcodes;136
15.4;8.3. SQL Extensions for GIS;137
15.5;Concluding Thoughts;137
15.6;References;138
16;Chapter 9: Big Data and Cloud Computing;140
16.1;Introduction;140
16.2;9.1. Objections to Big Data and the Cloud;142
16.2.1;9.1.1. Cloud Computing is a Fad;142
16.2.2;9.1.2. Cloud Computing is Not as Secure as In-House Data Servers;143
16.2.3;9.1.3. Cloud Computing is Costly;143
16.2.4;9.1.4. Cloud Computing is Complicated;143
16.2.5;9.1.5. Cloud Computing is Meant for Big Companies;143
16.2.6;9.1.6. Changes Are Only Technical;144
16.2.7;9.1.7. If the Internet Goes Down, the Cloud Becomes Useless;145
16.3;9.2. Big Data and Data Mining;145
16.3.1;9.2.1. Big Data for Nontraditional Analysis;146
16.3.2;9.2.2. Big Data for Systems Consolidation;147
16.4;Concluding Thoughts;148
16.5;References;149
17;Chapter 10: Biometrics, Fingerprints, and Specialized Databases;150
17.1;Introduction;150
17.2;10.1. Naive Biometrics;151
17.3;10.2. Fingerprints;153
17.3.1;10.2.1. Classification;153
17.3.2;10.2.2. Matching;154
17.3.3;10.2.3. NIST Standards;155
17.4;10.3. DNA Identification;157
17.4.1;10.3.1. Basic Principles and Technology;158
17.5;10.4. Facial Databases;159
17.5.1;10.4.1. History;160
17.5.2;10.4.2. Who Is Using Facial Databases;162
17.5.3;10.4.3. How Good Is It?;163
17.6;Concluding Thoughts;165
17.7;References;165
18;Chapter 11: Analytic Databases;166
18.1;Introduction;166
18.2;11.1. Cubes;166
18.3;11.2. Dr. Codd’s OLAP Rules;167
18.3.1;11.2.1. Dr. Codd’s Basic Features;168
18.3.2;11.2.2. Special Features;170
18.3.3;11.2.3. Reporting Features;171
18.3.4;11.2.4. Dimension Control;171
18.4;11.3. MOLAP;172
18.5;11.4. ROLAP;172
18.6;11.5. HOLAP;173
18.7;11.6. OLAP Query Languages;173
18.8;11.7. Aggregation Operators in SQL;174
18.8.1;11.7.1. GROUP BY GROUPING SET;174
18.8.2;11.7.2. ROLLUP;175
18.8.3;11.7.3. CUBE;177
18.8.4;11.7.4. Notes about Usage;178
18.9;11.8. OLAP Operators in SQL;178
18.9.1;11.8.1. OLAP Functionality;179
18.9.1.1;Row Numbering;179
18.9.1.2;RANK and DENSE_RANK;181
18.9.1.3;Window Clause;183
18.9.2;11.8.2. NTILE(n);185
18.9.3;11.8.3. Nesting OLAP Functions;186
18.9.4;11.8.4. Sample Queries;186
18.10;11.9. Sparseness in Cubes;187
18.10.1;11.9.1. Hypercube;188
18.10.2;11.9.2. Dimensional Hierarchies;189
18.10.3;11.9.3. Drilling and Slicing;191
18.11;Concluding Thoughts;191
18.12;References;192
19;Chapter 12: Multivalued or NFNF Databases;194
19.1;Introduction;194
19.2;12.1. Nested File Structures;194
19.3;12.2. Multivalued Systems;197
19.4;12.3. NFNF Databases;199
19.5;12.4. Existing Table-Valued Extensions;203
19.5.1;12.4.1. Microsoft SQL Server;203
19.5.2;12.4.2. Oracle Extensions;203
19.6;Concluding Thoughts;205
20;Chapter 13: Hierarchical and Network Database Systems;206
20.1;Introduction;206
20.2;13.1. Types of Databases;206
20.3;13.2. Database History;207
20.3.1;13.2.1. DL/I;208
20.3.2;13.2.2. Control Blocks;209
20.3.3;13.2.3. Data Communications;209
20.3.4;13.2.4. Application Programs;209
20.3.5;13.2.5. Hierarchical Databases;210
20.3.6;13.2.6. Strengths and Weaknesses;210
20.4;13.3. Simple Hierarchical Database;211
20.4.1;13.3.1. Department Database;213
20.4.2;13.3.2. Student Database;213
20.4.3;13.3.3. Design Considerations;213
20.4.4;13.3.4. Example Database Expanded;214
20.4.5;13.3.5. Data Relationships;215
20.4.6;13.3.6. Hierarchical Sequence;216
20.4.7;13.3.7. Hierarchical Data Paths;217
20.4.8;13.3.8. Database Records;218
20.4.9;13.3.9. Segment Format;219
20.4.10;13.3.10. Segment Definitions;220
20.5;13.4. Summary;220
20.6;Concluding Thoughts;221
20.7;References;222
21;Glossary;224
22;Index;238