Forewords |
|
xv | |
Preface |
|
xix | |
Part I. Tenet 1: Availability: Maintaining Availability in Modern Applications |
|
|
1 Understanding, Measuring, and Improving Your Availability |
|
|
3 | (20) |
|
Availability Versus Reliability |
|
|
4 | (1) |
|
What Causes Poor Availability? |
|
|
5 | (1) |
|
|
6 | (2) |
|
|
7 | (1) |
|
Planned Outages Are Still Outages |
|
|
8 | (1) |
|
Availability by the Numbers |
|
|
8 | (1) |
|
Improving Your Availability When It Slips |
|
|
8 | (7) |
|
Measure and Track Your Current Availability |
|
|
9 | (1) |
|
Automate Your Manual Processes |
|
|
10 | (4) |
|
|
14 | (1) |
|
Keep on Top of Availability in Your Changing and Growing Application |
|
|
14 | (1) |
|
Five Focuses to Improve Application Availability |
|
|
15 | (7) |
|
Focus #1: Build with Failure in Mind |
|
|
16 | (1) |
|
Focus #2: Always Think About Scaling |
|
|
17 | (1) |
|
|
18 | (2) |
|
Focus #4: Monitor Availability |
|
|
20 | (1) |
|
Focus #5: Respond to Availability Issues in a Predictable and Defined Way |
|
|
21 | (1) |
|
|
22 | (1) |
|
2 Two Mistakes High - Having Room to Recover from Mistakes |
|
|
23 | (14) |
|
|
24 | (8) |
|
Scenario #1: Losing a Node |
|
|
25 | (2) |
|
Scenario #2: Problems During Upgrades |
|
|
27 | (1) |
|
Scenario #3: Data Center Resiliency |
|
|
28 | (2) |
|
Scenario #4: Hidden Shared Failure Types |
|
|
30 | (1) |
|
Scenario #5: Failure Loops |
|
|
31 | (1) |
|
Managing Your Applications |
|
|
32 | (1) |
|
|
32 | (5) |
Part II. Tenet 2: Modern Application Architecture: Using Services |
|
|
|
37 | (16) |
|
The Monolith Application Versus the Service-Based Application |
|
|
37 | (6) |
|
|
40 | (2) |
|
|
42 | (1) |
|
|
43 | (1) |
|
What Should Be a Service? |
|
|
43 | (1) |
|
|
44 | (5) |
|
Guideline #1: Specific Business Requirements |
|
|
44 | (1) |
|
Guideline #2: Distinct and Separable Team Ownership |
|
|
45 | (2) |
|
Guideline #3: Naturally Separable Data |
|
|
47 | (1) |
|
Guideline #4: Shared Capabilities/Data |
|
|
48 | (1) |
|
|
49 | (1) |
|
|
49 | (1) |
|
Finding the Right Balance |
|
|
50 | (3) |
|
|
53 | (6) |
|
Stateless Services - Services Without Data |
|
|
53 | (1) |
|
Stateful Services - Services with Data |
|
|
53 | (1) |
|
|
54 | (3) |
|
Timely Handling of Growing Pains |
|
|
57 | (2) |
|
5 Dealing with Service Failures |
|
|
59 | (14) |
|
Cascading Service Failures |
|
|
59 | (2) |
|
Responding to a Service Failure |
|
|
61 | (2) |
|
|
61 | (1) |
|
|
62 | (1) |
|
|
62 | (1) |
|
|
63 | (3) |
|
|
66 | (3) |
|
|
66 | (1) |
|
|
66 | (1) |
|
Fail as Early as Possible |
|
|
67 | (1) |
|
|
68 | (1) |
|
|
69 | (4) |
Part III. Tenet 3: Organization: Scaling Your Organization for Modern Applications |
|
|
6 Service Ownership - STOSA |
|
|
73 | (8) |
|
Single Team Owned Service Architecture |
|
|
73 | (2) |
|
Advantages of a STOSA Application and Organization |
|
|
75 | (1) |
|
What Does It Mean to "Own" a Service? |
|
|
76 | (2) |
|
Using Core Teams and Services |
|
|
78 | (1) |
|
|
79 | (2) |
|
|
81 | (12) |
|
|
81 | (1) |
|
|
82 | (3) |
|
Assigning Service Tier Labels to Services |
|
|
83 | (2) |
|
|
85 | (2) |
|
|
87 | (4) |
|
|
88 | (1) |
|
|
88 | (1) |
|
|
89 | (2) |
|
|
91 | (2) |
|
8 Service-Level Agreements |
|
|
93 | (14) |
|
|
94 | (2) |
|
External Versus Internal SLAs |
|
|
96 | (2) |
|
Why Are Internal SLAs Important? |
|
|
96 | (2) |
|
SLAs for Problem Diagnosis |
|
|
98 | (1) |
|
Performance Measurements for SLAs |
|
|
99 | (4) |
|
|
99 | (1) |
|
|
100 | (3) |
|
|
103 | (1) |
|
How Many and Which Internal SLAs? |
|
|
103 | (1) |
|
Why Internal SLAs Are Important |
|
|
104 | (3) |
Part IV. Tenet 4: Risk: Risk Management for Modern Applications |
|
|
9 Using Risk Management When Architecting for Scale |
|
|
107 | (20) |
|
|
107 | (3) |
|
|
108 | (1) |
|
|
108 | (1) |
|
|
109 | (1) |
|
|
109 | (1) |
|
Likelihood Versus Severity |
|
|
110 | (4) |
|
The Top 10 List: Low Likelihood, Low Severity Risk |
|
|
111 | (1) |
|
The Order Database: Low Likelihood, High Severity Risk |
|
|
111 | (1) |
|
Custom Fonts: High Likelihood, Low Severity Risk |
|
|
112 | (1) |
|
T-Shirt Photos: High Likelihood, High Severity Risk |
|
|
113 | (1) |
|
|
114 | (8) |
|
|
116 | (1) |
|
|
117 | (3) |
|
Using the Risk Matrix for Planning |
|
|
120 | (1) |
|
Maintaining the Risk Matrix |
|
|
120 | (2) |
|
|
122 | (2) |
|
|
124 | (1) |
|
|
125 | (1) |
|
Improving Our Risk Situation |
|
|
125 | (2) |
|
|
127 | (6) |
|
Staging Versus Production Environments |
|
|
127 | (2) |
|
Staging/Test Environments |
|
|
127 | (2) |
|
|
129 | (1) |
|
Concerns with Running Game Days in Production |
|
|
129 | (2) |
|
|
131 | (2) |
|
11 Building Systems with Reduced Risk |
|
|
133 | (12) |
|
Technique #1: Introduce Redundancy |
|
|
134 | (2) |
|
|
134 | (2) |
|
Redundancy Improvements That Increase Complexity |
|
|
136 | (1) |
|
Technique #2: Understand Independence |
|
|
136 | (2) |
|
Technique #3: Manage Security |
|
|
138 | (1) |
|
Technique #4: Encourage Simplicity |
|
|
138 | (1) |
|
Technique #5: Build in Self-Repair |
|
|
139 | (1) |
|
Technique #6: Standardize on Operational Processes |
|
|
140 | (1) |
|
|
141 | (4) |
Part V. Tenet 5: Cloud: Utilizing the Cloud |
|
|
12 Getting Started Architecting for Scale with the Cloud |
|
|
145 | (12) |
|
Six Levels of Cloud Maturity |
|
|
146 | (3) |
|
Level 1: Experimenting with the Cloud |
|
|
147 | (1) |
|
Level 2: Securing the Cloud |
|
|
147 | (1) |
|
Level 3: Using Servers and Applications in the Cloud |
|
|
147 | (1) |
|
Level 4: Enabling Value-Added Managed Services |
|
|
148 | (1) |
|
Level 5: Enabling Cloud-Unique Services |
|
|
148 | (1) |
|
|
149 | (1) |
|
Organization Versus Application Maturity Level |
|
|
149 | (1) |
|
|
149 | (2) |
|
Trap #1: Not Trusting Cloud Security |
|
|
150 | (1) |
|
Trap #2: Performing Cloud Migration via Lift-and-Shift |
|
|
150 | (1) |
|
Trap #3: The Lure of Serverless - Depending Too Much on the Hype |
|
|
151 | (1) |
|
When and How to Use Multiple Clouds |
|
|
151 | (5) |
|
Defining What We Mean by Multiple Clouds |
|
|
152 | (3) |
|
Which Model? Which Cloud? |
|
|
155 | (1) |
|
|
156 | (1) |
|
13 Five Industry Trends Changed by the Cloud |
|
|
157 | (4) |
|
What Has Changed in the Cloud? |
|
|
157 | (2) |
|
Change #1: Acceptance of Microservice-Based Architectures |
|
|
157 | (1) |
|
Change #2: Smaller, More Specialized Cloud Services |
|
|
158 | (1) |
|
Change #3: Greater Focus on the Application |
|
|
158 | (1) |
|
Change #4: The Micro Startup |
|
|
158 | (1) |
|
Change #5: Security and Compliance Has Matured |
|
|
159 | (1) |
|
|
159 | (2) |
|
14 Types of SaaS and Tenancy |
|
|
161 | (6) |
|
Comparing Managed Hosting and Different Types of SaaS |
|
|
161 | (4) |
|
|
162 | (1) |
|
|
163 | (1) |
|
|
164 | (1) |
|
Mixing Different Types of SaaS |
|
|
165 | (1) |
|
Common SaaS Characteristics |
|
|
165 | (1) |
|
SaaS Versus Managed Hosting |
|
|
165 | (1) |
|
|
166 | (1) |
|
15 Distributing Your Application in the AWS Cloud |
|
|
167 | (10) |
|
|
168 | (1) |
|
|
168 | (1) |
|
|
169 | (1) |
|
|
169 | (1) |
|
|
169 | (4) |
|
Availability Zones Are Not Data Centers |
|
|
173 | (1) |
|
Maintaining Location Diversity for Availability Reasons |
|
|
174 | (2) |
|
AWS - Mapping Availability Zones in Multiple Accounts |
|
|
175 | (1) |
|
Distributing Your Application |
|
|
176 | (1) |
|
16 Managed Infrastructure |
|
|
177 | (8) |
|
Structure of Cloud-Based Services |
|
|
177 | (6) |
|
|
178 | (2) |
|
Server-Based Managed Resource |
|
|
180 | (1) |
|
Serverless Managed Resource |
|
|
181 | (2) |
|
Implications of Using Managed Versus Non-Managed Resources |
|
|
183 | (1) |
|
|
184 | (1) |
|
17 Cloud Resource Allocation |
|
|
185 | (10) |
|
Usage-Based Resource Allocation |
|
|
186 | (2) |
|
Allocated-Capacity Resource Allocation |
|
|
188 | (5) |
|
|
189 | (1) |
|
Automated Allocation of Resource Capacity |
|
|
190 | (1) |
|
Issues with Automatic Allocation |
|
|
190 | (2) |
|
Dynamic Allocation, Dynamic Cost |
|
|
192 | (1) |
|
Pros and Cons of Usage-Based Versus Allocated-Capacity |
|
|
193 | (2) |
|
18 Serverless and Functions as a Service |
|
|
195 | (6) |
|
Example Application #1: Event Processing |
|
|
196 | (1) |
|
Example Application #2: Mobile Backend |
|
|
197 | (1) |
|
Example Application #3: Internet of Things Data Intake |
|
|
197 | (1) |
|
Advantages and Disadvantages of FaaS |
|
|
198 | (1) |
|
Serverless Hype and the Future of FaaS |
|
|
199 | (2) |
|
|
201 | (12) |
|
|
202 | (1) |
|
|
203 | (1) |
|
What Should Be in the Edge Versus the Cloud? |
|
|
203 | (3) |
|
How Do We Decide? The Driverless Car |
|
|
204 | (2) |
|
Edge Scaling Isn't the Same as Cloud Scaling |
|
|
206 | (3) |
|
Criteria for Using Edge Versus Cloud |
|
|
208 | (1) |
|
Eight Keys to Success in the Edge |
|
|
209 | (3) |
|
#1: Be Smart About What Goes on the Edge |
|
|
209 | (1) |
|
#2: Don't Ignore DevOps Principles in the Edge |
|
|
209 | (1) |
|
#3: Nail a Highly Distributed Deployment Strategy |
|
|
209 | (1) |
|
#4: Reduce Versioning as Much as Possible |
|
|
210 | (1) |
|
#5: Reduce Per Node Provisioning and Configuration Options |
|
|
210 | (1) |
|
#6: Scaling Is an Edge Issue, Not Just a Cloud Issue |
|
|
211 | (1) |
|
#7: Nail Monitoring and Analytics |
|
|
211 | (1) |
|
#8: The Edge Is Not Magic |
|
|
211 | (1) |
|
|
212 | (1) |
|
20 Geographic Impact on Using the Cloud |
|
|
213 | (8) |
|
Cloud Matters Everywhere, But at Different Levels |
|
|
213 | (1) |
|
Replacement Mentality Impacts How You Adopt Cloud |
|
|
214 | (1) |
|
Which Cloud Is Most Important? |
|
|
215 | (1) |
|
Important Technologies Differ |
|
|
216 | (1) |
|
Data Sovereignty Is Universal |
|
|
216 | (1) |
|
|
217 | (4) |
Part VI. Conclusion |
|
|
21 Putting It All Together |
|
|
221 | (6) |
|
|
221 | (1) |
|
|
222 | (1) |
|
|
222 | (1) |
|
|
222 | (1) |
|
|
223 | (1) |
|
|
223 | (4) |
Index |
|
227 | |