Acknowledgements |
|
ix | |
About the Authors |
|
xi | |
Preface |
|
xiii | |
|
|
1 | (4) |
|
1.1 Scientific Applications in the Cloud |
|
|
1 | (2) |
|
1.2 Key Issues of This Research |
|
|
3 | (1) |
|
1.3 Overview of This Book |
|
|
3 | (2) |
|
|
5 | (10) |
|
2.1 Data Management of Scientific Applications in Traditional Distributed Systems |
|
|
5 | (5) |
|
2.1.1 Data Management in Grid |
|
|
6 | (2) |
|
2.1.2 Data Management in Grid Workflows |
|
|
8 | (1) |
|
2.1.3 Data Management in Other Distributed Systems |
|
|
9 | (1) |
|
2.2 Cost-Effectiveness of Scientific Applications in the Cloud |
|
|
10 | (2) |
|
2.2.1 Cost-Effectiveness of Deploying Scientific Applications in the Cloud |
|
|
10 | (1) |
|
2.2.2 Trade-Off Between Computation and Storage in the Cloud |
|
|
11 | (1) |
|
2.3 Data Provenance in Scientific Applications |
|
|
12 | (1) |
|
|
12 | (3) |
|
3 Motivating Example and Research Issues |
|
|
15 | (8) |
|
|
15 | (2) |
|
|
17 | (2) |
|
3.2.1 Requirements and Challenges of Deploying Scientific Applications in the Cloud |
|
|
17 | (1) |
|
3.2.2 Bandwidth Cost of Deploying Scientific Applications in the Cloud |
|
|
18 | (1) |
|
|
19 | (2) |
|
3.3.1 Cost Model for Data Set Storage in the Cloud |
|
|
19 | (1) |
|
3.3.2 Minimum Cost Benchmarking Approaches |
|
|
20 | (1) |
|
3.3.3 Cost-Effective Storage Strategies |
|
|
20 | (1) |
|
|
21 | (2) |
|
4 Cost Model of Data Set Storage in the Cloud |
|
|
23 | (6) |
|
4.1 Classification of Application Data in the Cloud |
|
|
23 | (1) |
|
4.2 Data Provenance and DDG |
|
|
23 | (2) |
|
4.3 Data Set Storage Cost Model in the Cloud |
|
|
25 | (2) |
|
|
27 | (2) |
|
5 Minimum Cost Benchmarking Approaches |
|
|
29 | (36) |
|
5.1 Static On-Demand Minimum Cost Benchmarking Approach |
|
|
30 | (13) |
|
5.1.1 CTT-SP Algorithm for Linear DDG |
|
|
30 | (2) |
|
5.1.2 Minimum Cost Benchmarking Algorithm for DDG with One Block |
|
|
32 | (1) |
|
5.1.2.1 Constructing CTT for DDG with One Block |
|
|
33 | (1) |
|
5.1.2.2 Setting Weights to Different Types of Edges |
|
|
34 | (2) |
|
5.1.2.3 Steps of Finding MCSS for DDG with One Sub-Branch in One Block |
|
|
36 | (2) |
|
5.1.3 Minimum Cost Benchmarking Algorithm for General DDG |
|
|
38 | (1) |
|
5.1.3.1 General CTT-SP Algorithm for Different Situations |
|
|
38 | (1) |
|
5.1.3.2 Pseudo-Code of General CTT-SP Algorithm |
|
|
39 | (4) |
|
5.2 Dynamic On-the-Fly Minimum Cost Benchmarking Approach |
|
|
43 | (21) |
|
|
44 | (1) |
|
5.2.1.1 Different MCSSs of a DDG_LS in a Solution Space |
|
|
44 | (1) |
|
5.2.1.2 Range of MCSSs' Cost Rates for a DDG_LS |
|
|
45 | (2) |
|
5.2.1.3 Distribution of MCSSs in the PSS of a DDG_LS |
|
|
47 | (3) |
|
5.2.2 Algorithms for Calculating PSS of a DDG_LS |
|
|
50 | (3) |
|
5.2.3 PSS for a General DDG (or DDG Segment) |
|
|
53 | (1) |
|
5.2.3.1 Three-Dimensional PSS of DDG Segment with Two Branches |
|
|
54 | (2) |
|
5.2.3.2 High-Dimensional PSS of a General DDG |
|
|
56 | (2) |
|
5.2.4 Dynamic On-the-Fly Minimum Cost Benchmarking |
|
|
58 | (1) |
|
5.2.4.1 Minimum Cost Benchmarking by Merging and Saving PSSs in a Hierarchy |
|
|
58 | (3) |
|
5.2.4.2 Updating of the Minimum Cost Benchmark on the Fly |
|
|
61 | (3) |
|
|
64 | (1) |
|
6 Cost-Effective Data Set Storage Strategies |
|
|
65 | (10) |
|
6.1 Data-Accessing Delay and Users' Preferences in Storage Strategies |
|
|
65 | (1) |
|
6.2 Cost-Rate-Based Storage Strategy |
|
|
66 | (3) |
|
6.2.1 Algorithms for the Strategy |
|
|
67 | (1) |
|
6.2.1.1 Algorithm for Deciding Newly Generated Data Sets' Storage Status |
|
|
67 | (1) |
|
6.2.1.2 Algorithm for Deciding Stored Data Sets' Storage Status Due to Usage Frequencies Change |
|
|
68 | (1) |
|
6.2.1.3 Algorithm for Deciding Regenerated Data Sets' Storage Status |
|
|
68 | (1) |
|
6.2.2 Cost-Effectiveness Analysis |
|
|
69 | (1) |
|
6.3 Local-Optimisation-Based Storage Strategy |
|
|
69 | (5) |
|
6.3.1 Algorithms and Rules for the Strategy |
|
|
70 | (1) |
|
6.3.1.1 Enhanced CTT-SP Algorithm for Linear DDG |
|
|
70 | (2) |
|
6.3.1.2 Rules in the Strategy |
|
|
72 | (1) |
|
6.3.2 Cost-Effectiveness Analysis |
|
|
73 | (1) |
|
|
74 | (1) |
|
7 Experiments and Evaluations |
|
|
75 | (16) |
|
7.1 Experiment Environment |
|
|
75 | (1) |
|
7.2 Evaluation of Minimum Cost Benchmarking Approaches |
|
|
75 | (7) |
|
7.2.1 Cost-Effectiveness Evaluation of the Minimum Cost Benchmark |
|
|
76 | (1) |
|
7.2.2 Efficiency Evaluation of Two Benchmarking Approaches |
|
|
77 | (5) |
|
7.3 Evaluation of Cost-Effective Storage Strategies |
|
|
82 | (4) |
|
7.3.1 Cost-Effectiveness of Two Storage Strategies |
|
|
82 | (2) |
|
7.3.2 Efficiency Evaluation of Two Storage Strategies |
|
|
84 | (2) |
|
7.4 Case Study of Pulsar Searching Application |
|
|
86 | (4) |
|
7.4.1 Utilisation of Minimum Cost Benchmarking Approaches |
|
|
86 | (1) |
|
7.4.2 Utilisation of Cost-Effective Storage Strategies |
|
|
87 | (3) |
|
|
90 | (1) |
|
8 Conclusions and Contributions |
|
|
91 | (4) |
|
|
91 | (1) |
|
8.2 Key Contributions of This Book |
|
|
92 | (3) |
Appendix A Notation Index |
|
95 | (2) |
Appendix B Proofs of Theorems, Lemmas and Corollaries |
|
97 | (10) |
Appendix C Method of Calculating and X Based on Users' Extra Budget |
|
107 | (2) |
Bibliography |
|
109 | |