Preface    xix
Editor    xxiii

1 Resilient HPC for 24x7x365 Weather Forecast Operations at the Australian Government Bureau of Meteorology    1
1.3 Applications and Workloads    7
1.3.1 Highlights of Main Applications    8
1.3.2 2017 Case Study: From Nodes to News, TC Debbie    10
1.3.4 SSP - Monitoring System Performance    11
1.4.1 System Design Decisions    13
1.5 Hardware Architecture    14
1.5.1 Australis Processors    14
1.5.2 Australis Node Design    15
1.5.2.1 Australis Service Node    15
1.5.2.2 Australis Compute Node    16
1.5.5 Australis Interconnect    17
1.5.6 Australis Storage and Filesystem    17
1.6.2 Operating System Upgrade Procedure    18
1.8.1 Oracle Hierarchical Storage Manager (SAM-QFS)    23
1.10.1 Systems Usage Patterns    25
1.11.1 Failover Scenarios    26
1.11.3 Data Mover Failover    26
1.11.5 SSH File Transfer Failover    27
1.12 Implementing a Product Generation Platform    28

2 Theta and Mira at Argonne National Laboratory    31
Alvaro Vazquez-Mayagoitia
2.1.1 Argonne Leadership Computing Facility    32
2.1.3 Organization of This Chapter    34
2.2.1 Mira Facility Improvements    34
2.2.2 Theta Facility Improvements    35
2.3.2.1 Systems Administration of the Cray Linux Environment    42
2.3.3.1 Programming Models    42
2.3.3.2 Languages and Compilers    43
2.3.4 Deployment and Acceptance    44
2.3.5 Early Science and Transition to Operations    46
2.4.1 Architecture and Software Summary    49
2.4.2 Evolution of Ecosystem    50
2.4.3 Notable Science Accomplishments    52

3 Zuse Institute Berlin (ZIB)    63
3.1.1 Research Center for Many-Core HPC    64
3.2 Applications and Workloads    65
3.3 System Hardware Architecture    67
3.3.1 Cray TDS at ZIB with Intel Xeon Phi Processors    68
3.3.2 Intel Xeon Phi 71xx    69
3.3.3 Intel Xeon Phi 72xx    69
3.4 Many-Core in HPC: The Need for Code Modernization    70
3.4.1 High-level SIMD Vectorization    71
3.4.2 Offloading over Fabric    77
3.4.3 Runtime Kernel Compilation with KART    84

4 The Mont-Blanc Prototype    93
4.1.1 Project Context and Challenges    94
4.1.2 Objectives and Timeline    96
4.2 Hardware Architecture    96
4.2.4 Performance Summary    99
4.3.1 Development Tools Ecosystem    101
4.4 Applications and Workloads    103
4.4.4 Node Power Profiling    106
4.5 Deployment and Operational Information    108
4.5.1 Thermal Experiments    109
4.6 Highlights of Mont-Blanc    110
4.6.1 Reliability Study of an Unprotected RAM System    111
4.6.2 Network Retransmission and OS Noise Study    114
4.6.3 The Power Monitoring Tool of the Mont-Blanc System    117

5 Chameleon    123
5.1.1 A Case for a Production Testbed    124
5.2 Hardware Architecture    126
5.2.1 Projected Use Cases    127
5.2.2 Phase 1 Chameleon Deployment    127
5.2.3 Experience with Phase 1 Hardware and Future Plans    129
5.6.1 University of Chicago Facility    137
5.6.3 Wide-Area Connectivity    137
5.7 System Management and Policies    138
5.8 Statistics and Lessons Learned    138
5.9 Research Projects Highlights    141
5.9.1 Chameleon Slices for Wide-Area Networking Research    141
5.9.2 Machine Learning Experiments on Chameleon    142

6 CSCS and the Piz Daint System    149
6.1.1 Program and Sponsor    150
6.2 Co-designing Piz Daint    153
6.3 Hardware Architecture    155
6.3.1 Overview of the Cray XC50 Architecture    155
6.3.2 Cray XC50 Hybrid Compute Node and Blade    155
6.3.4 Scratch File System Configuration    157
6.4 Innovative Features of Piz Daint    159
6.4.1 New Cray Linux Environment (CLE 6.0)    160
6.4.4 System Management and Monitoring    162
6.5.1 Design Criteria for the Facility    163
6.5.3 Cooling Distribution    165
6.5.4 Electrical Distribution    166
6.5.5 Siting the Current Piz Daint System    166
6.6 Consolidation of Services    167
6.6.1 High Performance Computing Service    167
6.6.2 Visualization and Data Analysis Service    168
6.6.5 Cray Urika-XC Analytics Software Suite Services    170
6.6.6 Worldwide Large Hadron Collider (LHC) Computing Grid (WLCG) Services    170

7 Facility Best Practices    175
7.2 Forums That Discuss Best Practices in HPC    176
7.3 Relevant Standards for Data Centres    176
7.4 Most Frequently Encountered Infrastructure Challenges    177
7.5 Compilation of Best Practices    178
7.5.2 Tendering Processes    179
7.5.4 Power Density and Capacity    180
7.5.6 Electrical Infrastructure    182
7.5.9 Measuring and Monitoring    184
7.6 Limitations and Implications    185

8 Jetstream    189
8.1.1 Jetstream Motivation and Sponsor Background    192
8.1.3 Hardware Acceptance    196
8.1.5 Cloud Functionality Tests    198
8.1.6 Gateway Functionality Tests    199
8.1.7 Data Movement, Storage, and Dissemination    199
8.2 Applications and Workloads    200
8.2.1 Highlights of Main Applications    201
8.4 Hardware Architecture    203
8.4.1 Node Design and Processor Elements    203
8.5.2 System Administration    206
8.5.3 Schedulers and Virtualization    206
8.5.6 User Authentication    208
8.5.7 Allocation Software and Processes    209
8.6.2 Jetstream Plugins for the Atmosphere Platform    211
8.6.2.2 Allocation Sources and Special Allocations    211
8.6.3 Globus Authentication and Data Access    212
8.6.4 The Jetstream OpenStack API    212
8.7 Data Center Facilities    213
8.9.1 Jupyter and Kubernetes    216
8.10 Artificial Intelligence Technology Education    217
8.11 Jetstream VM Image Use for Scientific Reproducibility - Bioinformatics as an Example    217
8.12 Running a Virtual Cluster on Jetstream    218

9 Modular Supercomputing Architecture: From Idea to Production    223
9.1 The Jülich Supercomputing Centre (JSC)    224
9.2 Supercomputing Architectures at JSC    224
9.2.1 The Dual Supercomputer Strategy    225
9.2.2 The Cluster-Booster Concept    227
9.2.3 The Modular Supercomputing Architecture    228
9.3 Applications and Workloads    229
9.3.1 Co-design Applications in the DEEP Projects    231
9.5 Hardware Implementation    234
9.5.1 First Generation (DEEP) Prototype    235
9.5.2 Second Generation (DEEP-ER) Prototype    238
9.6.1 System Administration    241
9.6.2 Schedulers and Resource Management    242
9.6.3 Network-bridging Protocol    244
9.6.4 I/O Software and File System    244
9.7.1 Inter-module MPI Offloading    245
9.7.2 OmpSs Abstraction Layer    246
9.7.3 Resiliency Software    247
9.8 Cooling and Facility Infrastructure    249
9.9 Conclusions and Next Steps    250

10 SuperMUC at LRZ    257
10.3 Applications and Workloads    261
10.5 Data Center/Facility    266
10.6 R&D on Energy-Efficiency at LRZ    268

11 The NERSC Cori HPC System    275
Katie Antypas, Brian Austin
11.1.1 Sponsor and Program Background    276
11.2 Applications and Workloads    277
11.4 Hardware Architecture    280
11.4.1 Node Types and Design    280
11.4.1.1 Xeon Phi "Knights Landing" Compute Nodes    280
11.4.1.2 Xeon "Haswell" Compute Nodes    280
11.4.3 Storage - Burst Buffer and Lustre Filesystem    281
11.5.1 System Software Overview    281
11.5.2 System Management Stack    282
11.5.3 Resource Management    282
11.5.4 Storage Resources and Software    283
11.5.5 Networking Resources and Software    284
11.5.6 Containers and User-Defined Images    284
11.6 Programming Environment    285
11.6.1 Programming Models    285
11.6.2 Languages and Compilers    285
11.6.3 Libraries and Tools    286
11.6.4 Building Software for a Heterogeneous System    286
11.6.5 Default Mode Selection Considerations    287
11.7.2 Optimization Strategy and Tools    288
11.7.3 Most Effective Optimizations    290
11.7.4 NESAP Result Overview    291
11.7.5 Application Highlights    291
11.7.5.1 Quantum ESPRESSO    291
11.8.1 IO Improvement: Burst Buffer    295
11.8.2.1 Network Connectivity to External Nodes    298
11.8.2.2 Burst Buffer Filesystem for In-situ Workflows    298
11.8.2.3 Real-time and Interactive Queues for Time Sensitive Analyses    298
11.8.2.4 Scheduler and Queue Improvements to Support Data-intensive Computing    299
11.9.1 System Utilizations    299
11.9.2 Job Completion Statistics    299

12 Lomonosov-2    305
12.1.1 HPC History of MSU    305
12.1.2 Lomonosov-2 Supercomputer: Timeline    308
12.2 Applications and Workloads    309
12.2.1 Main Applications Highlights    309
12.2.2 Benchmark Results and Rating Positions    309
12.2.3 Users and Workloads    310
12.4 System Software and Programming Systems    313
12.5.1 Communication Network    315
12.5.2 Auxiliary InfiniBand Network    315
12.5.3 Management and Service Network    316
12.7 Engineering Infrastructure    318
12.7.1 Infrastructure Support    318
12.7.2 Power Distribution    318
12.7.3 Engineering Equipment    320
12.7.4 Overall Cooling System    320
12.7.5 Cooling Auxiliary IT Equipment    322
12.8 Efficiency of the Supercomputer Center    323

13 Electra    331
13.2 NASA Requirements for Supercomputing    333
13.3 Supercomputing Capabilities: Conventional Facilities    333
13.3.3 Network Connectivity    334
13.3.5 Visualization and Hyperwall    336
13.3.6 Primary NAS Facility    336
13.4 Modular Supercomputing Facility    337
13.4.1 Limitations of the Primary NAS Facility    337
13.4.2 Expansion and Integration Strategy    337
13.4.5 Power, Cooling, Network    339
13.4.6 Facility Operations and Maintenance    340
13.4.7 Environmental Impact    341
13.5 Electra Supercomputer    342
13.5.2 I/O Subsystem Architecture    343
13.6.2 Resource Allocation and Scheduling    344
13.7 Application Benchmarking and Performance    345
13.8 Utilization Statistics of HECC Resources    347
13.9 System Operations and Maintenance    348
13.9.1 Administration Tools    348
13.9.2 Monitoring, Diagnosis, and Repair Tools    349
13.9.3 System Enhancements and Maintenance    350
13.10 Featured Application    350

14 Bridges: Converging HPC, AI, and Big Data for Enabling Discovery    355
14.1.1 Sponsor/Program Background    357
14.2 Applications and Workloads    359
14.2.1 Highlights of Main Applications and Data    360
14.2.2 Artificial Intelligence    361
14.4 Hardware Architecture    366
14.4.1 Processors and Accelerators    366
14.5.3 System Administration    371
14.6.1 Virtualization and Containers    372
14.7.1 User Environment Customization    373
14.7.2 Programming Models    374
14.7.3 Languages and Compilers    374
14.7.7 Domain-Specific Frameworks and Libraries    375
14.7.8 Gateways, Workflows, and Distributed Applications    375
14.8 Storage, Visualization, and Analytics    376
14.8.1 Community Datasets and Big Data as a Service    376
14.10.1 Reliability and Uptime    377
14.11 Science Highlights: Bridges-Enabled Breakthroughs    377
14.11.1 Artificial Intelligence and Big Data    377

15 Stampede at TACC    385
15.1.1 Program Background    386
15.1.2 Lessons Learned on the Path to Stampede 2    386
15.2 Workload and the Design of Stampede 2    388
15.2.1 Science Highlights    389
15.3 System Configuration    390
15.3.1 Processors and Memory    390
15.3.3 Disk I/O Subsystem    391
15.3.4 Non-volatile Memory    392
15.4.1 System Performance Monitoring and Administration    392
15.4.2 Job Submission and System Health    393
15.4.3 Application Development Tools    393
15.5 Visualization and Analytics    394
15.5.1 Visualization on Stampede 2    394
15.6 Datacenter, Layout, and Cybersecurity    395
15.6.1 System Layout and Phased Deployment    396
15.6.2 Cybersecurity and Identity Management    396

16 Oakforest-PACS    401
16.3 Applications and Workloads    403
16.5 Hardware Architecture    408
16.6.1 Basic System Software    409
16.7.1 Basic Programming Environment    412
16.7.2 XcalableMP: A PGAS Parallel Programming Language for Parallel Many-core Processor System    413
16.7.2.1 Overview of XcalableMP    413
16.7.2.2 OpenMP and XMP Tasklet Directive    414
16.7.2.3 Multi-tasking Execution Model in XcalableMP between Nodes    415
16.7.2.4 Preliminary Performance Evaluation on Oakforest-PACS    416
16.7.2.5 Communication Optimization for Many-Core Clusters    417
16.9 Data Center/Facility    419

17 CHPC in South Africa    423
Werner Janse Van Rensburg
17.1.1 Sponsor/Program Background    423
17.1.2 Business Case of the Installation of Lengau    424
17.2 Applications and Workloads    426
17.2.1 Highlights of Main Applications    426
17.2.2.1 Computational Mechanics    428
17.2.2.3 Computational Chemistry    430
17.4 Storage, Visualisation and Analytics    438
17.5 Data Center/Facility    438
17.7 Square Kilometer Array    447

Index    451