
E-book: Dependable Computing: Design and Assessment

  • Format: PDF+DRM
  • Series: IEEE Press
  • Publication date: 16-Apr-2024
  • Publisher: Standards Information Network
  • Language: eng
  • ISBN-13: 9781119743446
  • Price: 135,62 €*
  • * this is the final price, i.e., no additional discounts apply
  • This e-book is intended for personal use only. E-books cannot be returned, and payments for purchased e-books are not refunded.

DRM restrictions

  • Copying (copy/paste):

    not allowed

  • Printing:

    not allowed

  • Usage:

    Digital Rights Management (DRM)
    The publisher has supplied this book in encrypted form, which means that you need to install free software to unlock and read it. To read this e-book, you need to create an Adobe ID. The e-book can be read and downloaded on up to 6 devices (by a single user with the same Adobe ID).

    Required software
    To read this e-book on a mobile device (phone or tablet), you will need to install this free app: PocketBook Reader (iOS / Android)

    To download and read this e-book on a PC or Mac, you need Adobe Digital Editions (a free app designed specifically for e-books. It is not the same as Adobe Reader, which is probably already on your computer.)

    You cannot read this e-book using an Amazon Kindle.

Dependable Computing

Covering dependability from software and hardware perspectives

Dependable Computing: Design and Assessment looks at both the software and hardware aspects of dependability.

This book:

  • Provides an in-depth examination of dependability/fault tolerance topics
  • Describes the dependability taxonomy and briefly contrasts classical techniques with their modern counterparts or extensions
  • Walks up the system stack, from hardware logic through operating systems to software applications, showing how each layer is hardened for dependability
  • Describes the use of measurement-based analysis of computing systems
  • Illustrates technology through real-life applications
  • Discusses security attacks and unique dependability requirements for emerging applications, e.g., smart electric power grids and cloud computing
  • Uses critical societal applications such as autonomous vehicles, large-scale clouds, and engineering solutions for healthcare to illustrate the emerging challenges in making artificial intelligence (AI) and its applications dependable and trustworthy

This book is suitable for students of computer engineering and computer science. Professionals working to ensure dependable computing will also find helpful information to support their efforts. Drawing on practical case studies and use cases from both academia and real-world deployments, the book traces developments in the field, including the impact of artificial intelligence and machine learning. It offers a single compendium that spans the many areas in which dependability has been applied, providing theoretical concepts and applied knowledge with content that will excite a beginner and rigor that will satisfy an expert. Accompanying the book is an online repository of problem sets and solutions, as well as slides for instructors, that span the chapters of the book.

About the Authors

Preface

Acknowledgments

About the Companion Website

1 Dependability Concepts and Taxonomy

1.1 Introduction

1.2 Placing Classical Dependability Techniques in Perspective

1.3 Taxonomy of Dependable Computing

1.3.1 Faults, Errors, and Failures

1.4 Fault Classes

1.5 The Fault Cycle and Dependability Measures

1.6 Fault and Error Classification

1.7 Mean Time Between Failures

1.8 User-perceived System Dependability

1.9 Technology Trends and Failure Behavior

1.10 Issues at the Hardware Level

1.11 Issues at the Platform Level

1.12 What is Unique About this Book?

1.13 Overview of the Book

References

2 Classical Dependability Techniques and Modern Computing Systems: Where and How Do They Meet?

2.1 Illustrative Case Studies of Design for Dependability

2.2 Cloud Computing: A Rapidly Expanding Computing Paradigm

2.3 New Application Domains

2.4 Insights

References

3 Hardware Error Detection and Recovery Through Hardware-Implemented Techniques

3.1 Introduction

3.2 Redundancy Techniques

3.3 Watchdog Timers

3.4 Information Redundancy

3.5 Capability and Consistency Checking

3.6 Insights

References

4 Processor Level Error Detection and Recovery

4.1 Introduction

4.2 Logic-level Techniques

4.3 Error Protection in the Processors

4.4 Academic Research on Hardware-level Error Protection

4.5 Insights

References

5 Hardware Error Detection Through Software-Implemented Techniques

5.1 Introduction

5.2 Duplication-based Software Detection Techniques

5.3 Control-Flow Checking

5.4 Heartbeats

5.5 Assertions

5.6 Insights

References

6 Software Error Detection and Recovery Through Software Analysis

6.1 Introduction

6.2 Diverse Programming

6.3 Static Analysis Techniques

6.4 Error Detection Based on Dynamic Program Analysis

6.5 Processor-Level Selective Replication

6.6 Runtime Checking for Residual Software Bugs

6.7 Data Audit

6.8 Application of Data Audit Techniques

6.9 Insights

References

7 Measurement-based Analysis of System Software: Operating System Failure Behavior

7.1 Introduction

7.2 MVS (Multiple Virtual Storage)

7.3 Experimental Analysis of OS Dependability

7.4 Behavior of the Linux Operating System in the Presence of Errors

7.5 Evaluation of Process Pairs in Tandem GUARDIAN

7.6 Benchmarking Multiple Operating Systems: A Case Study Using Linux on Pentium, Solaris on SPARC, and AIX on POWER

7.7 Dependability Overview of the Cisco Nexus Operating System

7.8 Evaluating Operating Systems: Related Studies

7.9 Insights

References

8 Reliable Networked and Distributed Systems

8.1 Introduction

8.2 System Model

8.3 Failure Models

8.4 Agreement Protocols

8.5 Reliable Broadcast

8.6 Reliable Group Communication

8.7 Replication

8.8 Replication of Multithreaded Applications

8.9 Atomic Commit

8.10 Opportunities and Challenges in Resource-Disaggregated Cloud Data Centers

References

9 Checkpointing and Rollback Error Recovery

9.1 Introduction

9.2 Hardware-Implemented Cache-Based Schemes Checkpointing

9.3 Memory-Based Schemes

9.4 Operating-System-Level Checkpointing

9.5 Compiler-Assisted Checkpointing

9.6 Error Detection and Recovery in Distributed Systems

9.7 Checkpointing Latency Modeling

9.8 Checkpointing in Main Memory Database Systems (MMDB)

9.9 Checkpointing in Distributed Database Systems

9.10 Multithreaded Checkpointing

References

10 Checkpointing Large-Scale Systems

10.1 Introduction

10.2 Checkpointing Techniques

10.3 Checkpointing in Selected Existing Systems

10.4 Modeling-Coordinated Checkpointing for Large-Scale Supercomputers

10.5 Checkpointing in Large-Scale Systems: A Simulation Study

10.6 Cooperative Checkpointing

References

11 Internals of Fault Injection Techniques

11.1 Introduction

11.2 Historical View of Software Fault Injection

11.3 Fault Model Attributes

11.4 Compile-Time Fault Injection

11.5 Runtime Fault Injection

11.6 Simulation-Based Fault Injection

11.7 Dependability Benchmark Attributes

11.8 Architecture of a Fault Injection Environment: NFTAPE Fault/Error Injection Framework Configured to Evaluate Linux OS

11.9 ML-Based Fault Injection: Evaluating Modern Autonomous Vehicles

11.10 Insights and Concluding Remarks

References

12 Measurement-Based Analysis of Large-Scale Clusters: Methodology

12.1 Introduction

12.2 Related Research

12.3 Steps in Field Failure Data Analysis

12.4 Failure Event Monitoring and Logging

12.5 Data Processing

12.6 Data Analysis

12.7 Estimation of Empirical Distributions

12.8 Dependency Analysis

References

13 Measurement-Based Analysis of Large Systems: Case Studies

13.1 Introduction

13.2 Case Study I: Failure Characterization of a Production Software-as-a-Service Cloud Platform

13.3 Case Study II: Analysis of Blue Waters System Failures

13.4 Case Study III: Autonomous Vehicles: Analysis of Human-Generated Data

References

14 The Future: Dependable and Trustworthy AI Systems

14.1 Introduction

14.2 Building Trustworthy AI Systems

14.3 Offline Identification of Deficiencies

14.4 Online Detection and Mitigation

14.5 Trust Model Formulation

14.6 Modeling the Trustworthiness of Critical Applications

14.7 Conclusion: How Can We Make AI Systems Trustworthy?

References

Index

Ravishankar K. Iyer is a Professor of Engineering and Interim Vice Chancellor for Research at the University of Illinois at Urbana-Champaign. He holds appointments in the Department of Electrical and Computer Engineering and the Department of Computer Science, and he is Co-Director of the Center for Reliable and High-Performance Computing at the Coordinated Science Laboratory (CSL) and Chief Scientist at the Information Trust Institute. Professor Iyer is a Fellow of the AAAS, the IEEE, and the ACM.

Zbigniew T. Kalbarczyk is a Research Professor at CSL at the University of Illinois at Urbana-Champaign. His major research areas are the design and validation of reliable and secure computing systems, and the development of methods and tools for the design and experimental assessment of such systems. Professor Kalbarczyk is a member of the IEEE and the Computer Society.

Nithin M. Nakka earned his MS and PhD in Electrical and Computer Engineering from the University of Illinois at Urbana-Champaign. He is a Software Engineer at Nextest Systems in San Jose, CA. Dr. Nakka was a research assistant professor at the Center for Ultrascale Computing and Information Security at Northwestern University.