Klientu atbalsts: 27018494

Grāmatu iegāde | Jauns profils | Ienākt

E-grāmata: Advanced Backend Code Optimization

5.00/5 (2 ratings by Goodreads)

Sid Touati, Benoit Dupont de Dinechin

Citi formāti

Other digital carrier (Cena: 166,55 €) - 04-Jun-2014

Formāts: PDF+DRM
Izdošanas datums: 03-Jun-2014
Izdevniecība: ISTE Ltd and John Wiley & Sons Inc
Valoda: eng
ISBN-13: 9781118648940

Citas grāmatas par šo tēmu:

Computer programming / software development

Formāts - PDF+DRM
Cena: 165,30 €*
* ši ir gala cena, t.i., netiek piemērotas nekādas papildus atlaides
Ielikt grozā
Pievienot vēlmju sarakstam
Šī e-grāmata paredzēta tikai personīgai lietošanai. E-grāmatas nav iespējams atgriezt un nauda par iegādātajām e-grāmatām netiek atmaksāta.
Bibliotēkām

Formāts: PDF+DRM
Izdošanas datums: 03-Jun-2014
Izdevniecība: ISTE Ltd and John Wiley & Sons Inc
Valoda: eng
ISBN-13: 9781118648940

Citas grāmatas par šo tēmu:

Computer programming / software development

DRM restrictions

Kopēšana (kopēt/ievietot):

nav atļauts
Drukāšana:

nav atļauts
Lietošana:

Digitālo tiesību pārvaldība (Digital Rights Management (DRM))
Izdevējs ir piegādājis šo grāmatu šifrētā veidā, kas nozīmē, ka jums ir jāinstalē bezmaksas programmatūra, lai to atbloķētu un lasītu. Lai lasītu šo e-grāmatu, jums ir jāizveido Adobe ID. Vairāk informācijas šeit. E-grāmatu var lasīt un lejupielādēt līdz 6 ierīcēm (vienam lietotājam ar vienu un to pašu Adobe ID).

Nepieciešamā programmatūra
Lai lasītu šo e-grāmatu mobilajā ierīcē (tālrunī vai planšetdatorā), jums būs jāinstalē šī bezmaksas lietotne: PocketBook Reader (iOS / Android)

Lai lejupielādētu un lasītu šo e-grāmatu datorā vai Mac datorā, jums ir nepieciešamid Adobe Digital Editions (šī ir bezmaksas lietotne, kas īpaši izstrādāta e-grāmatām. Tā nav tas pats, kas Adobe Reader, kas, iespējams, jau ir jūsu datorā.)

Jūs nevarat lasīt šo e-grāmatu, izmantojot Amazon Kindle.

This book is a summary of more than a decade of research in the area of backend optimization. It contains the latest fundamental research results in this field. While existing books are often more oriented toward Masters students, this book is aimed more towards professors and researchers as it contains more advanced subjects.
It is unique in the sense that it contains information that has not previously been covered by other books in the field, with chapters on phase ordering in optimizing compilation; register saturation in instruction level parallelism; code size reduction for software pipelining; memory hierarchy effects and instruction level parallelism.
Other chapters provide the latest research results in well-known topics such as register need, and software pipelining and periodic register allocation.

Introduction

xiii

Part 1 Prolog: Optimizing Compilation

(22)

Chapter 1 On the Decidability of Phase Ordering in Optimizing Compilation

(20)

1.1 Introduction to the phase ordering problem

(2)

1.2 Background on phase ordering

(2)

1.2.1 Performance modeling and prediction

(1)

1.2.2 Some attempts in phase ordering

(1)

1.3 Toward a theoretical model for the phase ordering problem

(5)

1.3.1 Decidability results

(3)

1.3.2 Another formulation of the phase ordering problem

(1)

1.4 Examples of decidable simplified cases

(4)

1.4.1 Models with compilation costs

(1)

1.4.2 One-pass generative compilers

(3)

1.5 Compiler optimization parameter" space exploration

(4)

1.5.1 Toward a theoretical model

(2)

1.5.2 Examples of simplified decidable cases

(1)

1.6 Conclusion on phase ordering in optimizing compilation

(3)

Part 2 Instruction Scheduling

(96)

Chapter 2 Instruction Scheduling Problems and Overview

(14)

2.1 VLIW instruction scheduling problems

(4)

2.1.1 Instruction scheduling and register allocation in a code generator

(2)

2.1.2 The block and pipeline VLIW instruction scheduling problems

(2)

2.2 Software pipelining

(6)

2.2.1 Cyclic, periodic and pipeline scheduling problems

(3)

2.2.2 Modulo instruction scheduling problems and techniques

(3)

2.3 Instruction scheduling and register allocation

(4)

2.3.1 Register instruction scheduling problem solving approaches

(4)

Chapter 3 Applications of Machine Scheduling to Instruction Scheduling

(12)

3.1 Advances in machine scheduling

(4)

3.1.1 Parallel machine scheduling problems

(2)

3.1.2 Parallel machine scheduling extensions and relaxations

(2)

3.2 List scheduling algorithms

(4)

3.2.1 List scheduling algorithms and list scheduling priorities

(2)

3.2.2 The scheduling algorithm of Leung, Palem and Pnueli

(2)

3.3 Time-indexed scheduling problem formulations

(4)

3.3.1 The non-preemptive time-indexed RCPSP formulation

(1)

3.3.2 Time-indexed formulation for the modulo RPISP

(3)

Chapter 4 Instruction Scheduling Before Register Allocation

(26)

4.1 Instruction scheduling for an ILP processor: case of a VLIW architecture

(16)

4.1.1 Minimum cumulative register lifetime modulo scheduling

(3)

4.1.2 Resource modeling in instruction scheduling problems

(2)

4.1.3 The modulo insertion scheduling theorems

(2)

4.1.4 Insertion scheduling in a backend compiler

(2)

4.1.5 Example of an industrial production compiler from STMicroelectronics

(4)

4.1.6 Time-indexed formulation of the modulo RCISP

(3)

4.2 Large neighborhood search for the resource-constrained modulo scheduling problem

(1)

4.3 Resource-constrained modulo scheduling problem

(3)

4.3.1 Resource-constrained cyclic scheduling problems

(1)

4.3.2 Resource-constrained modulo scheduling problem statement

(1)

4.3.3 Solving resource-constrained modulo scheduling problems

(1)

4.4 Time-indexed integer programming formulations

(3)

4.4.1 The non-preemptive time-indexed RCPSP formulation

(1)

4.4.2 The classic modulo scheduling integer programming formulation

(1)

4.4.3 A new time-indexed formulation for modulo scheduling

(1)

4.5 Large neighborhood search heuristic

(2)

4.5.1 Variables and constraints in time-indexed formulations

(1)

4.5.2 A large neighborhood search heuristic for modulo scheduling

(1)

4.5.3 Experimental results with a production compiler

(1)

4.6 Summary and conclusions

(1)

Chapter 5 Instruction Scheduling After Register Allocation

(14)

5.1 Introduction

(2)

5.2 Local instruction scheduling

(5)

5.2.1 Acyclic instruction scheduling

(1)

5.2.2 Scoreboard Scheduling principles

(2)

5.2.3 Scoreboard Scheduling implementation

(2)

5.3 Global instruction scheduling

(3)

5.3.1 Postpass inter-region scheduling

(2)

5.3.2 Inter-block Scoreboard Scheduling

(1)

5.3.3 Characterization of fixed points

(1)

5.4 Experimental results

(2)

5.5 Conclusions

(2)

Chapter 6 Dealing in Practice with Memory Hierarchy Effects and Instruction Level Parallelism

(28)

6.1 The problem of hardware memory disambiguation at runtime

(12)

6.1.1 Introduction

(1)

6.1.2 Related work

(1)

6.1.3 Experimental environment

(1)

6.1.4 Experimentation methodology

(1)

6.1.5 Precise experimental study of memory hierarchy performance

(5)

6.1.6 The effectiveness of load/store vectorization

100

(3)

6.1.7 Conclusion on hardware memory disambiguation mechanisms

103

(1)

6.2 Data preloading and prefetching

104

(15)

6.2.1 Introduction

104

(1)

6.2.2 Related work

105

(2)

6.2.3 Problems of optimizing cache effects at the instruction level

107

(2)

6.2.4 Target processor description

109

(1)

6.2.5 Our method of instruction-level code optimization

110

(6)

6.2.6 Experimental results

116

(1)

6.2.7 Conclusion on prefetching and preloading at instruction level

117

(2)

Part 3 Register Optimization

119

(112)

Chapter 7 The Register Need of a Fixed Instruction Schedule

121

(20)

7.1 Data dependence graph and processor model for register optimization

122

(1)

7.1.1 NUAL and UAL semantics

122

(1)

7.2 The acyclic register need

123

(2)

7.3 The periodic register need

125

(4)

7.3.1 Software pipelining, periodic scheduling and cyclic scheduling

125

(2)

7.3.2 The circular lifetime intervals

127

(2)

7.4 Computing the periodic register need

129

(3)

7.5 Some theoretical results on the periodic register need

132

(7)

7.5.1 Minimal periodic register need versus initiation interval

133

(1)

7.5.2 Computing the periodic register sufficiency

133

(1)

7.5.3 Stage scheduling under register constraints

134

(5)

7.6 Conclusion on the register requirement

139

(2)

Chapter 8 The Register Saturation

141

(18)

8.1 Motivations on the register saturation concept

141

(3)

8.2 Computing the acyclic register saturation

144

(9)

8.2.1 Characterizing the register saturation

146

(3)

8.2.2 Efficient algorithmic heuristic for register saturation computation

149

(2)

8.2.3 Experimental efficiency of Greedy-k

151

(2)

8.3 Computing the periodic register saturation

153

(4)

8.3.1 Basic integer linear variables

154

(1)

8.3.2 Integer linear constraints

154

(2)

8.3.3 Linear objective function

156

(1)

8.4 Conclusion on the register saturation

157

(2)

Chapter 9 Spill Code Reduction

159

(18)

9.1 Introduction to register constraints in software pipelining

159

(1)

9.2 Related work in periodic register allocation

160

(2)

9.3 SIRA: schedule independant register allocation

162

(7)

9.3.1 Reuse graphs

162

(2)

9.3.2 DDG associated with reuse graph

164

(2)

9.3.3 Exact SIRA with integer linear programming

166

(2)

9.3.4 SIRA-with fixed reuse edges

168

(1)

9.4 SIRALINA: an efficient polynomial heuristic for SIRA

169

(4)

9.4.1 Integer variables for the linear problem

170

(1)

9.4.2 Step 1: the scheduling problem

170

(2)

9.4.3 Step 2: the linear assignment problem

172

(1)

9.5 Experimental results with SIRA

173

(2)

9.6 Conclusion on spill code reduction

175

(2)

Chapter 10 Exploiting the Register Access Delays Before Instruction Scheduling

177

(14)

10.1 Introduction

177

(2)

10.2 Problem description of DDG circuits with non-positive distances

179

(1)

10.3 Necessary and sufficient condition to avoid non-positive circuits

180

(2)

10.4 Application to the SIRA framework

182

(5)

10.4.1 Recall on SIRALINA heuristic

183

(1)

10.4.2 Step 1: the scheduling problem for a fixed II

183

(1)

10.4.3 Step 2: the linear assignment problem

184

(1)

10.4.4 Eliminating non-positive circuits in SIRALINA

184

(2)

10.4.5 Updating reuse distances

186

(1)

10.5 Experimental results on eliminating non-positive circuits

187

(1)

10.6 Conclusion on non-positive circuit elimination

188

(3)

Chapter 11 Loop Unrolling Degree Minimization for Periodic Register Allocation

191

(40)

11.1 Introduction

191

(4)

11.2 Background

195

(9)

11.2.1 Loop unrolling after SWP with modulo variable expansion

196

(1)

11.2.2 Meeting graphs (MG)

197

(3)

11.2.3 SIRA, reuse graphs and loop unrolling

200

(4)

11.3 Problem description of unroll factor minimization for unscheduled loops

204

(1)

11.4 Algorithmic solution for unroll factor minimization: single register type

205

(8)

11.4.1 Fixed loop unrolling problem

206

(1)

11.4.2 Solution for the fixed loop unrolling problem

207

(2)

11.4.3 Solution for LCM-MIN problem

209

(4)

11.5 Unroll factor minimization in the presence of multiple register types

213

(8)

11.5.1 Search space for minimal kernel loop unrolling

217

(1)

11.5.2 Generalization of the fixed loop unrolling problem in the presence of multiple register types

218

(1)

11.5.3 Algorithmic solution for the loop unrolling minimization (LUM, problem 11.1)

219

(2)

11.6 Unroll factor reduction for already scheduled loops

221

(3)

11.6.1 Improving algorithm 11.4 (LCM-MIN) for the meeting graph framework

224

(1)

11.7 Experimental results

224

(2)

11.8 Related work

226

(2)

11.8.1 Rotating register files

226

(1)

11.8.2 Inserting move operations

227

(1)

11.8.3 Loop unrolling after software pipelining

228

(1)

11.8.4 Code generation for multidimensional loops

228

(1)

11.9 Conclusion on loop unroll degree minimization

228

(3)

Part 4 Epilog: Performance, Open Problems

231

(26)

Chapter 12 Statistical Performance Analysis: The Speedup-Test Protocol

233

(24)

12.1 Code performance variation

233

(3)

12.2 Background and notations

236

(3)

12.3 Analyzing the statistical significance of the observed speedups

239

(5)

12.3.1 The speedup of the observed average execution time

239

(2)

12.3.2 The speedup of the observed median execution time, as well as individual runs

241

(3)

12.4 The Speedup-Test software

244

(2)

12.5 Evaluating the proportion of accelerated benchmarks by a confidence interval

246

(2)

12.6 Experiments and applications

248

(5)

12.6.1 Comparing the performances of compiler optimization levels

249

(1)

12.6.2 Testing the performances of parallel executions of OpenMP applications

250

(1)

12.6.3 Comparing the efficiency of two compilers

251

(2)

12.6.4 The impact of the Speedup-Test protocol over some observed speedups

253

(1)

12.7 Related work

253

(2)

12.7.1 Observing execution times variability

253

(1)

12.7.2 Program performance evaluation in presence of variability

254

(1)

12.8 Discussion and conclusion on the Speedup-Test protocol

255

(2)

Conclusion

257

(6)

Appendix 1 Presentation of the Benchmarks used in our Experiments

263

(8)

Appendix 2 Register Saturation Computation on Stand-alone DDG

271

(8)

Appendix 3 Efficiency of SIRA on the Benchmarks

279

(14)

Appendix 4 Efficiency of Non-Positive Circuit Elimination in the SIRA Framework

293

(10)

Appendix 5 Loop Unroll Degree Minimization: Experimental Results

303

(12)

Appendix 6 Experimental Efficiency of Software Data Preloading and Prefetching for Embedded VLIW

315

(4)

Appendix 7 Appendix of the Speedup-Test Protocol

319

(2)

Bibliography

321

(24)

Lists of Figures, Tables and Algorithms

345

(8)

Index

353

Sid Touati is Full professor at the university of Nice Sophia-Antipolis, France.

Biežāk uzdotie jautājumi par e-grāmatām

Permanent link: https://www.kriso.lv/db/97811186489402e.html

E-grāmata: Advanced Backend Code Optimization

DRM restrictions

Kopēšana (kopēt/ievietot):

Drukāšana:

Lietošana:

Konts un iestatījumi

Meklēšana

Meklēt datubāzē

Refine By

Tēmas Ebook Subjects

Izvēlieties iepirkumu grozu