Resch / Roller / Benkert: High Performance Computing on Vector Systems 2009

E-Book, English, 250 pages


1st edition, 2009
ISBN: 978-3-642-03913-3
Publisher: Springer
Format: PDF
Copy protection: watermark



This book covers the results of the Teraflop Workbench, other projects related to High Performance Computing, and the usage of HPC installations at HLRS. The Teraflop Workbench project is a collaboration between the High Performance Computing Center Stuttgart (HLRS) and NEC Deutschland GmbH (NEC-HPCE) to support users in achieving their research goals using High Performance Computing.

The first stage of the Teraflop Workbench project (2004-2008) concentrated on users' applications and their optimization for the former flagship of HLRS, an NEC SX-8 installation. During this stage, numerous individual codes, developed and maintained by researchers or commercial organizations, were analyzed and optimized. Within the project, several of the codes proved able to exceed the threshold of one TFlop/s of sustained performance. This created the possibility for new science and a deeper understanding of the underlying physics.

The second stage of the Teraflop Workbench project (2008-2012) focuses on current and future trends in hardware and software development. We observe a strong tendency toward heterogeneous environments at the hardware level, while at the same time applications become increasingly heterogeneous by including multi-physics or multi-scale effects. The goal of the current studies of the Teraflop Workbench is to gain insight into the development of both components. The overall target is to help scientists run their applications in the most efficient and most convenient way on the hardware best suited for their purposes.



Preface  5
Contents  7
I  Petaflop/s Computing  14
  Lessons Learned from 1-Year Experience with SX-9 and Toward the Next Generation Vector Computing  15
    Introduction  15
    SX-9 System Overview  16
    HPC Challenge Benchmark Results  18
    Case Study Analysis of Memory-Conscious Tuning for SX-9  25
    Multi-Vector Cores Processor Design  30
    Summary  33
    References  34
  BSC-CNS Research and Supercomputing Resources  35
    Overview  35
    Supercomputing Resources at BSC  36
      MareNostrum  36
      MareNostrum Performance 2008  38
      Shared Memory System  38
      Backup and HSM Service  39
      Spanish Supercomputing Network  39
      PRACE Prototype  40
    Research at BSC  42
  Challenges and Opportunities of Hybrid Computing Systems  43
    Introduction  43
    European Context  45
    Validation Scenario  46
    Initial Results  47
    Operational Requirements  49
    Conclusions and Future Work  51
    References  51
  Going Forward with GPU Computing  52
    Computing Needs at CEA  52
    Starting the Process  54
      Available Hardware  54
      Choosing a Programming Language  57
        CUDA  57
        OpenCL  58
        RapidMind  58
        HMPP  58
        A Remark on Languages  60
        Training Sessions  60
      The System Administration Side  60
        The Grand Challenges Strategy  61
        Foreseen Problems  61
    First Results  62
    Conclusion  63
  Optical Interconnection Technology for the Next Generation Supercomputers  64
    Introduction  64
    Components and Structure  66
    Performance  67
    Conclusions  69
    References  69
  HPC Architecture from Application Perspectives  70
    Introduction  70
    Trend of CPU Performance  72
    Architectural Challenges  74
    SIMD-based Approaches  75
    Conclusions  77
    References  78
II  Strategies  79
  A Language for Fortran Source to Source Transformation  80
    Compiler  80
    Self Defined Transformations  81
    The Transformation Language  81
      Transformation Variables  82
      Transformation Constructs  82
      Self Defined Procedures in the Transformation Code  83
      Intrinsic Procedures  83
      Parsing Primitives in Parsing Mode  84
    Examples  85
    Concluding Remarks  87
  The SX-Linux Project: A Progress Report  88
    Introduction  89
    Project Paths  89
    Progress and Status  91
      The GNU Toolchain  91
        Binutils  91
        GCC  92
        Current Toolchain Status  93
        Future Work  94
      User Space and I/O Forwarding  94
        Newlib  94
          Future of Newlib  95
        Virtualization Layer  96
        I/O Forwarding  97
          I/O Forwarding: Current Implementation  98
          I/O Forwarding Library Status  99
      Kernel  99
        Kitten LWK  100
        Implementation and Status  100
          Bootstrapping  100
          Early Introspection  100
          Stack and Memory Layout  101
          Interrupts  102
          System Calls  102
          Context Switch  102
          User Space  103
          Status  103
    Outlook  104
    References  105
  Development of APIs for Desktop Supercomputing  106
    Introduction  106
    Client APIs for GDS  108
      Client APIs  108
      Script Generator API  108
      Implementation of Script Generator API in AEGIS  110
    Development of GDS Application of Three-dimensional Virtual Plant Vibration Simulator  112
      Three-dimensional Virtual Plant Vibration Simulator  112
      Development of GDS Application of Three-dimensional Virtual Plant Vibration Simulator  113
    Summary  114
    References  115
  The Grid Middleware on SX and Its Operation for Nation-Wide Service  117
    Introduction  117
    Structure of NAREGI Grid Middleware  118
      Managing Resources by Using Web Services: IS, SS, and GridVM  118
      Virtualizing the Computing Resources: GridVM Scheduler and GridVM Engines  119
    Features and Issues of NAREGI Grid Middleware  119
      Reservation-Type Job Scheduling  120
      Virtualization and Overheads  120
      Load Concentration on Management Nodes  121
      Scheduling of Non-reserved Jobs  121
      Maintaining Coherency and Consistency in the Web Services on the Grid  122
    Features of NEC's NQS-II/JobManipulator Local Scheduler and Its Use at the Cybermedia Center of Osaka University  122
    GridVM for SX  123
      Creating a System to Verify the Coherence and Consistency of Web Services  123
      Delegation of Reservation Table Administration Authorization by Synchronization of Tables  125
      Co-existence with GRAM/MDS Interface  126
      Enabling MPI/SX Job Execution  126
    Future Issues  126
    References  127
III  Applications  128
  From Static Domains to Graph Decomposition for Heterogeneous Cluster Programming  129
    Introduction  129
    Epitaxial Surface Growth  130
      Introduction to Physical Model  130
      Simulation  131
      Domain Decomposition  132
      Atomic Interaction  134
      Results  135
    Potts Model Simulations  136
      Domain Decomposition  139
      Results  141
    Graph Domain Decomposition  143
      Model  144
      Workbalance  147
      Domain Decomposition: Grouping Algorithm  148
      Programming  149
      Results  150
    References  152
IV  Computational Fluid Dynamics  154
  Direct Numerical Simulations of Turbulent Shear Flows  155
    Introduction  155
    Numerical Method  157
    Performance on Distributed Memory Systems  157
    Performance on a Vector System  158
    The 'Virtual Wind Tunnel'  159
      Supersonic Axisymmetric Wakes  159
      Turbulent Flow over Airfoil Trailing Edges  161
      Compressible Mixing Layer  162
      Jet Noise  163
      Turbulent Spots in Supersonic Boundary Layers  164
      Turbulent Breakdown of Vortex Rings  165
      Wing Tip Vortex Breakdown and Far Wakes  166
    Summary  167
    References  168
  Large-Scale Flow Computation of Complex Geometries by Building-Cube Method  170
    Introduction  170
    Building-Cube Method  172
      Overview  172
      Flow Solver  172
    Code Optimization (Vectorization and Parallelization)  173
      Vectorization  173
      Parallelization  174
    Large-Scale Flow Computation  175
    Conclusion  180
    References  181
  A New Parallel SPH Method for 3D Free Surface Flows  182
    The SPH Approach  182
    The MPI Parallelization with Dynamic Load-Balancing  183
    3D Dam Break and Impact Test Problem  185
    Mesh-Convergence Test  187
    Application to a Realistic Mudflow  187
    References  191
V  Climate Modeling  192
  The Agulhas System as a Prime Example for the Use of Nesting Capabilities in Ocean Modelling  193
    Motivation  193
    Modelling Environment  195
    Scientific Achievements  197
    Conclusion  199
    References  199
  Seamless Simulations in Climate Variability and HPC  201
    Introduction  201
    Model Description  203
      The Atmosphere Component: MSSG-A  203
      The Ocean Component: MSSG-O  204
      Grid Configuration of MSSG  205
      Differencing Schemes  205
      Algebraic Multigrid Method in a Poisson Solver  206
      Coupling Between MSSG-A and MSSG-O  206
    Implementation of MSSG on the Earth Simulator  207
      Coding Style  207
      Distribution Architecture and Communications  207
      Inter-/Intra-node Parallel Architectures and Vector Processing  208
      Memory and Cost Reductions for Land Area in MSSG-O  209
      Overlapped Computations in the Ocean Component  210
      Coupling Scheme with High Computational Performance in MSSG  210
    Computational Performance of MSSG on the Earth Simulator  211
      Performance and Scalability  211
      Cost Balance and Communication Cost  213
      Efficiency of Overlapped Computation in the Oceanic Component  214
    Simulation Results  215
      Global Simulation with MSSG-A  215
      Stand-Alone Oceanic Component  215
      Prediction of Typhoon Tracking with MSSG  216
    Conclusions and Perspectives  220
    References  220
VI  Computational Physics  222
  Construction of Vibration Table in an Extended World for Safety Assessment of Nuclear Power Plants  223
    Introduction  223
    Overview of Seismic Simulation  224
    Seismic Simulation of Mechanical Components  226
      Governing Equations  226
      Balancing Domain Decomposition Method  227
      Optimization of Number of Subdomains in Balancing Domain Decomposition Method  229
        Computation Cost for Each Iteration  229
        Prediction Curve of Total Computation Cost  229
    Numerical Validation on a Parallel Computer  230
    Concluding Remarks  232
    References  232
  Understanding Electron Transport in Atomic Nanowires from Large-Scale Numerical Calculations  233
    Introduction  233
    Computational Method  234
    Results  236
    Summary  241
    References  241
  Multi-scale Simulations for Laser Plasma Physics  243
    Introduction  244
    Numerical Methods  245
    Radiation Hydrodynamics Code (PINOCO)  246
    Collective PIC Code (FISCOF1D and 2D)  247
    Relativistic Fokker-Planck Hydrodynamic Code (FIBMET)  247
    Distributed Computing Collaboration Protocol (DCCP)  248
    Fully Integrated Simulation of Fast Ignition  249
    Summary  250
    References  250


