Preface |
|
xv | |
About the Editors |
|
xxiii | |
|
|
xxv | |
|
|
1 | (108) |
|
1 Anonymity Technologies for Privacy-Preserving Data Publishing and Mining |
|
|
3 | (32) |
|
|
|
|
|
3 | (5) |
|
1.1.1 Privacy vs. Utility |
|
|
5 | (1) |
|
1.1.2 Attacks and Countermeasures |
|
|
5 | (1) |
|
1.1.3 Privacy-Preserving Data Mining and Statistical Disclosure Control |
|
|
6 | (1) |
|
1.1.4 Anonymity in Data Protection Laws |
|
|
6 | (1) |
|
1.1.5 Anonymity in Data Protection Laws |
|
|
6 | (2) |
|
|
8 | (1) |
|
1.2 Anonymity for Data Publishing and Mining |
|
|
8 | (5) |
|
1.2.1 Anonymity by Randomization |
|
|
8 | (3) |
|
1.2.2 Anonymity by Indistinguishability |
|
|
11 | (2) |
|
1.3 Statistical Disclosure Control |
|
|
13 | (8) |
|
1.3.1 Non-Perturbative Masking Techniques |
|
|
14 | (1) |
|
1.3.2 Perturbative Masking Techniques |
|
|
15 | (3) |
|
1.3.3 Fully Synthetic Techniques |
|
|
18 | (2) |
|
1.3.4 Partially Synthetic Techniques |
|
|
20 | (1) |
|
1.4 Anonymity in Privacy Regulations |
|
|
21 | (3) |
|
1.4.1 Privacy Laws in Canada |
|
|
21 | (1) |
|
1.4.2 Privacy Laws in the United States |
|
|
22 | (1) |
|
1.4.3 Privacy Laws in the European Union |
|
|
23 | (1) |
|
1.5 Anonymity in Complex Data |
|
|
24 | (3) |
|
|
27 | (1) |
|
|
28 | (7) |
|
2 Privacy Preservation in the Publication of Sparse Multidimensional Data |
|
|
35 | (24) |
|
|
|
|
|
35 | (3) |
|
|
38 | (16) |
|
2.2.1 Multirelational k-Anonymity |
|
|
40 | (4) |
|
2.2.2 l-Diversity for Sparse Multidimensional Data |
|
|
44 | (4) |
|
|
48 | (5) |
|
|
53 | (1) |
|
2.3 Privacy Preservation in Web Logs |
|
|
54 | (1) |
|
|
55 | (1) |
|
|
56 | (3) |
|
3 Knowledge Hiding in Emerging Application Domains |
|
|
59 | (30) |
|
|
|
59 | (2) |
|
|
61 | (3) |
|
|
64 | (6) |
|
3.3.1 Frequent Itemset Hiding |
|
|
66 | (4) |
|
3.4 Association Rule Hiding |
|
|
70 | (2) |
|
3.4.1 Association Rule Hiding |
|
|
70 | (2) |
|
3.4.2 Variations of Association Rule Mining |
|
|
72 | (1) |
|
3.5 Sequential Pattern Hiding |
|
|
72 | (3) |
|
|
73 | (1) |
|
3.5.2 Spatio-Temporal Sequences |
|
|
74 | (1) |
|
3.6 Classification Rule Hiding |
|
|
75 | (2) |
|
3.7 Other Knowledge-Hiding Domains |
|
|
77 | (4) |
|
|
77 | (1) |
|
|
78 | (1) |
|
|
78 | (1) |
|
|
79 | (1) |
|
|
80 | (1) |
|
3.8 Meta-Knowledge Hiding |
|
|
81 | (1) |
|
|
82 | (1) |
|
|
83 | (6) |
|
4 Condensation-Based Methods in Emerging Application Domains |
|
|
89 | (20) |
|
|
|
|
89 | (3) |
|
|
92 | (1) |
|
4.3 An Overview of Mining and Privacy-Preserving Publishing of Sequence Data |
|
|
93 | (1) |
|
4.4 Condensation-Based Methods |
|
|
94 | (9) |
|
4.4.1 Privacy Metrics and Attack Models |
|
|
95 | (1) |
|
4.4.2 Comparison of Condensation-Based and Other Privacy Approaches |
|
|
96 | (1) |
|
4.4.3 Condensation-Based Methods for Tabular Data |
|
|
97 | (2) |
|
4.4.4 Condensation-Based Methods for Strings |
|
|
99 | (1) |
|
4.4.5 Condensation-Based Methods for Sequential Patterns |
|
|
100 | (1) |
|
4.4.6 Condensation-Based Methods for Trajectory Data |
|
|
100 | (2) |
|
4.4.7 Comparison of the Methods |
|
|
102 | (1) |
|
4.5 Conclusions and Open Research Problems |
|
|
103 | (1) |
|
|
104 | (5) |
|
|
109 | (52) |
|
5 Catch, Clean, and Release: A Survey of Obstacles and Opportunities for Network Trace Sanitization |
|
|
111 | (32) |
|
|
|
|
|
|
111 | (7) |
|
5.1.1 Challenges for Trace Collection and Sharing |
|
|
112 | (1) |
|
5.1.2 Real-World Network Trace Sharing Efforts |
|
|
113 | (1) |
|
|
114 | (1) |
|
5.1.4 Database Sanitization and Privacy-Preserving Data Mining |
|
|
115 | (1) |
|
5.1.5 Chapter Organization |
|
|
116 | (2) |
|
|
118 | (5) |
|
5.2.1 Sanitization Techniques |
|
|
118 | (3) |
|
|
121 | (2) |
|
|
123 | (2) |
|
5.4 Evaluation of Sanitization |
|
|
125 | (3) |
|
5.5 Challenges and Open Problems |
|
|
128 | (2) |
|
5.5.1 Quantifying Sensitive Information |
|
|
128 | (1) |
|
5.5.2 Metrics for Evaluating Sanitization Results |
|
|
129 | (1) |
|
5.5.3 Interpreting Sanitization Results |
|
|
130 | (1) |
|
5.6 Case Study: Dartmouth Internet Security Testbed |
|
|
130 | (3) |
|
5.7 Summary and Conclusion |
|
|
133 | (2) |
|
|
135 | (8) |
|
6 Output Privacy in Stream Mining |
|
|
143 | (18) |
|
|
|
|
143 | (2) |
|
6.2 Grand Map of Privacy-Preserving Data Mining |
|
|
145 | (3) |
|
|
145 | (1) |
|
|
146 | (2) |
|
6.3 From Statistical Databases to Data Mining |
|
|
148 | (3) |
|
6.3.1 Reactive Approaches |
|
|
149 | (1) |
|
6.3.2 Proactive Approaches |
|
|
149 | (2) |
|
6.4 Case Study: Frequent-Pattern Mining over Streams |
|
|
151 | (6) |
|
6.4.1 Concepts and Models |
|
|
151 | (1) |
|
6.4.2 Breaches and Attacks |
|
|
152 | (2) |
|
6.4.3 Butterfly: A Proactive Solution |
|
|
154 | (2) |
|
|
156 | (1) |
|
6.5 Roadmap for Future Research |
|
|
157 | (1) |
|
|
157 | (1) |
|
|
158 | (3) |
|
III Spatio-Temporal and Mobility Data |
|
|
161 | (78) |
|
7 Privacy Issues in Spatio-Temporal Data Mining |
|
|
163 | (20) |
|
|
|
|
163 | (2) |
|
|
165 | (1) |
|
7.3 A Taxonomy of the Privacy Methodologies |
|
|
166 | (12) |
|
7.3.1 Data Perturbation and Obfuscation |
|
|
167 | (6) |
|
7.3.2 Privacy-Aware Knowledge Sharing |
|
|
173 | (2) |
|
7.3.3 Sequential Pattern Hiding |
|
|
175 | (3) |
|
7.4 Roadmap and Future Trends |
|
|
178 | (1) |
|
7.5 Summary and Conclusions |
|
|
179 | (1) |
|
|
180 | (3) |
|
8 Probabilistic Grid-Based Approaches for Privacy-Preserving Data Mining on Moving Object Trajectories |
|
|
183 | (28) |
|
|
|
|
|
183 | (2) |
|
|
185 | (4) |
|
8.2.1 General Privacy Concepts |
|
|
185 | (1) |
|
|
186 | (1) |
|
8.2.3 Spatio-Temporal Data Mining |
|
|
187 | (2) |
|
8.3 Spatio-Temporal Anonymization |
|
|
189 | (4) |
|
8.3.1 Definition of Location Privacy |
|
|
189 | (2) |
|
8.3.2 Practical "Cut-Enclose" Implementation |
|
|
191 | (1) |
|
8.3.3 Problems with Existing Methods |
|
|
192 | (1) |
|
8.4 Grid-Based Anonymization Framework |
|
|
193 | (9) |
|
8.4.1 Grid-Based Anonymization |
|
|
193 | (2) |
|
8.4.2 System Architecture |
|
|
195 | (2) |
|
8.4.3 Finding Dense Spatio-Temporal Areas |
|
|
197 | (1) |
|
8.4.4 Frequent Route Mining |
|
|
198 | (2) |
|
8.4.5 Multi-Grid Extension |
|
|
200 | (2) |
|
|
202 | (5) |
|
8.5.1 Experimental Setup and Evaluation Criteria |
|
|
202 | (1) |
|
8.5.2 Dense ST-Area Queries |
|
|
203 | (1) |
|
8.5.3 Frequent Route Queries |
|
|
204 | (3) |
|
8.6 Conclusions and Future Work |
|
|
207 | (1) |
|
|
207 | (4) |
|
9 Privacy and Anonymity in Location Data Management |
|
|
211 | (28) |
|
|
|
|
|
|
9.1 Illustration of the Problem |
|
|
211 | (4) |
|
9.1.1 Typical Application Scenarios and Intuitive Threats |
|
|
213 | (1) |
|
9.1.2 The Need for Specialized Defense Techniques |
|
|
214 | (1) |
|
|
215 | (10) |
|
|
215 | (3) |
|
|
218 | (2) |
|
|
220 | (5) |
|
9.3 Emerging Techniques for Online Privacy Preservation in Location Data Management |
|
|
225 | (7) |
|
9.3.1 Historical Anonymization |
|
|
225 | (3) |
|
9.3.2 Location Obfuscation in Friend Finder Services |
|
|
228 | (4) |
|
9.4 Conclusions and Open Issues |
|
|
232 | (2) |
|
|
234 | (5) |
|
|
239 | (42) |
|
10 Privacy Preservation on Time Series |
|
|
241 | (24) |
|
|
|
|
|
|
241 | (2) |
|
|
243 | (3) |
|
10.2.1 Discrete Wavelet Decomposition |
|
|
244 | (1) |
|
10.2.2 Compressibility and Shrinkage |
|
|
245 | (1) |
|
10.3 Privacy and Compression |
|
|
246 | (5) |
|
10.3.1 Intuition and Motivation |
|
|
246 | (2) |
|
|
248 | (3) |
|
10.4 Compressible Perturbation |
|
|
251 | (7) |
|
|
251 | (1) |
|
10.4.2 Batch Perturbation |
|
|
252 | (4) |
|
10.4.3 Streaming Perturbation |
|
|
256 | (2) |
|
|
258 | (2) |
|
|
260 | (1) |
|
|
261 | (4) |
|
11 A Segment-Based Approach to Preserve Privacy in Time Series Data Mining |
|
|
265 | (16) |
|
|
|
|
265 | (1) |
|
11.2 Preserving Privacy in Time Series Data Mining |
|
|
266 | (4) |
|
11.2.1 Properties of Time Series |
|
|
267 | (1) |
|
11.2.2 Attacks on Time Series Privacy |
|
|
267 | (1) |
|
11.2.3 Methods for Preserving Privacy |
|
|
268 | (2) |
|
11.3 A Segment-Based Approach for Preserving Privacy |
|
|
270 | (1) |
|
11.4 Experimental Results and Performance Evaluation |
|
|
271 | (6) |
|
11.4.1 Privacy Attacks and Classification Algorithm |
|
|
271 | (1) |
|
11.4.2 Performance Metrics |
|
|
272 | (1) |
|
11.4.3 Experimental Results in Privacy Preservation |
|
|
273 | (2) |
|
11.4.4 Experimental Results on Classification |
|
|
275 | (2) |
|
|
277 | (1) |
|
|
277 | (4) |
|
|
281 | (70) |
|
12 A Survey of Challenges and Solutions for Privacy in Clinical Genomics Data Mining |
|
|
283 | (32) |
|
|
|
|
|
283 | (2) |
|
12.2 Data Sharing, Policy, and Potential Misuse |
|
|
285 | (3) |
|
12.2.1 Policies and Protections |
|
|
286 | (2) |
|
12.2.2 Implications for Family Members |
|
|
288 | (1) |
|
12.3 Technical Protections and Weaknesses |
|
|
288 | (8) |
|
12.3.1 A Model of Re-Identification |
|
|
289 | (1) |
|
12.3.2 From Genotype to Phenotype and Back Again |
|
|
290 | (2) |
|
|
292 | (1) |
|
12.3.4 Following the Bread Crumbs |
|
|
292 | (2) |
|
12.3.5 Vulnerability Is a Matter of Context |
|
|
294 | (1) |
|
12.3.6 I Am My Sister's Keeper: Familial Inference of SNP Genotypes |
|
|
294 | (2) |
|
12.4 Achieving Formal Protection |
|
|
296 | (5) |
|
12.4.1 Protecting Privacy by Thwarting Linkage |
|
|
297 | (2) |
|
12.4.2 Protecting Privacy by Preventing Uniqueness |
|
|
299 | (2) |
|
12.5 Secure Multiparty Computation in the Biomedical Realm |
|
|
301 | (1) |
|
12.6 Discussion and the Future |
|
|
301 | (3) |
|
|
304 | (1) |
|
|
305 | (10) |
|
13 Privacy-Aware Health Information Sharing |
|
|
315 | (36) |
|
|
|
|
|
|
|
315 | (2) |
|
|
317 | (2) |
|
|
319 | (9) |
|
|
320 | (5) |
|
|
325 | (3) |
|
13.4 Privacy-Aware Information Sharing for Classification Analysis |
|
|
328 | (13) |
|
13.4.1 The Problem: Multi-QID k-Anonymity for Classification Analysis |
|
|
328 | (2) |
|
13.4.2 Masking Operations |
|
|
330 | (1) |
|
13.4.3 The Algorithm: Top-Down Refinement (TDR) |
|
|
331 | (2) |
|
13.4.4 The HL7-Compliant Data Structure for the Blood Usage Record |
|
|
333 | (1) |
|
13.4.5 Experimental Results |
|
|
334 | (6) |
|
13.4.6 Comparing with Other Algorithms |
|
|
340 | (1) |
|
13.5 Privacy-Aware Information Sharing for Cluster Analysis |
|
|
341 | (4) |
|
13.5.1 The Problem: Multi-QID k-Anonymity for Cluster Analysis |
|
|
342 | (1) |
|
13.5.2 The Solution Framework |
|
|
342 | (2) |
|
|
344 | (1) |
|
13.6 Conclusions and Extensions |
|
|
345 | (1) |
|
|
346 | (5) |
|
|
351 | (42) |
|
14 Issues with Privacy Preservation in Query Log Mining |
|
|
353 | (16) |
|
|
|
|
|
|
353 | (1) |
|
14.2 Private Information in Query Logs: Setting the Scene |
|
|
354 | (5) |
|
14.2.1 Which Information Is Private? |
|
|
355 | (2) |
|
14.2.2 Characteristics of a Query Log |
|
|
357 | (1) |
|
14.2.3 Sharing a Query Log |
|
|
358 | (1) |
|
14.2.4 The Risk of External Sources |
|
|
358 | (1) |
|
14.3 k-Anonymity in Query Log Privacy Protection |
|
|
359 | (3) |
|
14.4 Anonymization Techniques for Query Logs |
|
|
362 | (3) |
|
|
362 | (1) |
|
14.4.2 Aggregation and Generalization |
|
|
363 | (1) |
|
|
364 | (1) |
|
14.4.4 Adding Noise to Query Log Data |
|
|
364 | (1) |
|
14.5 Conclusions and Future Work |
|
|
365 | (1) |
|
|
365 | (4) |
|
15 Preserving Privacy in Web Recommender Systems |
|
|
369 | (24) |
|
|
|
|
|
|
|
369 | (2) |
|
15.2 Taxonomy of Web Personalization and Recommendation |
|
|
371 | (10) |
|
15.2.1 Content-Based Filtering |
|
|
373 | (1) |
|
15.2.2 Collaborative Filtering |
|
|
374 | (2) |
|
15.2.3 Item-Based Collaborative Filtering |
|
|
376 | (1) |
|
15.2.4 Recommending by Clustering Unordered User Sessions |
|
|
376 | (2) |
|
15.2.5 Recommending through Association Analysis of Unordered User Sessions/Profiles |
|
|
378 | (1) |
|
15.2.6 Recommending by Clustering Ordered User Sessions |
|
|
378 | (2) |
|
15.2.7 Recommending through Sequential Analysis of Ordered User Sessions/Profiles |
|
|
380 | (1) |
|
|
381 | (2) |
|
15.3.1 Privacy-Preserving Features of πSUGGEST |
|
|
382 | (1) |
|
15.4 πSUGGEST and Privacy |
|
|
383 | (4) |
|
|
386 | (1) |
|
|
387 | (1) |
|
|
388 | (1) |
|
|
389 | (4) |
|
|
393 | (106) |
|
16 The Social Web and Privacy: Practices, Reciprocity and Conflict Detection in Social Networks |
|
|
395 | (38) |
|
|
|
|
395 | (3) |
|
16.2 Approaching Privacy in Social Networks |
|
|
398 | (8) |
|
16.2.1 Data I: Personal Data |
|
|
398 | (1) |
|
16.2.2 Privacy as Hiding: Confidentiality |
|
|
398 | (2) |
|
16.2.3 Privacy as Control: Informational Self-Determination |
|
|
400 | (1) |
|
16.2.4 Privacy as Practice: Identity Construction |
|
|
401 | (2) |
|
16.2.5 Privacy in Social Network Sites: Deriving Requirements from Privacy Concerns |
|
|
403 | (1) |
|
16.2.6 Data II: Relational Information and Transitive Access Control |
|
|
404 | (2) |
|
16.3 Relational Information, Transitive Access Control and Conflicts |
|
|
406 | (8) |
|
16.3.1 Transitive Access Control and Relational Information |
|
|
406 | (1) |
|
16.3.2 Inconsistency and Reciprocity Conflicts with TAC and RI |
|
|
407 | (2) |
|
16.3.3 Formal Definitions |
|
|
409 | (5) |
|
16.4 Social Network Construction and Conflict Analysis |
|
|
414 | (8) |
|
16.4.1 Constructing the Graph with Tokens for Permissions |
|
|
414 | (3) |
|
16.4.2 Relationship Building and Information Discovery in Different Types of Social Networks |
|
|
417 | (4) |
|
|
421 | (1) |
|
16.5 Data Mining and Feedback for Awareness Tools |
|
|
422 | (5) |
|
16.5.1 Toward Conflict Avoidance and Resolution: Feedback and Trust Mechanisms |
|
|
425 | (1) |
|
16.5.2 Design Choices in Feedback Mechanisms Based on Data Mining |
|
|
426 | (1) |
|
16.6 Conclusions and Outlook |
|
|
427 | (2) |
|
|
429 | (4) |
|
17 Privacy Protection of Personal Data in Social Networks |
|
|
433 | (26) |
|
|
|
|
|
|
433 | (1) |
|
17.2 Privacy Issues in Online Social Networks |
|
|
434 | (3) |
|
17.3 Access Control for Online Social Networks |
|
|
437 | (7) |
|
17.3.1 Challenges in Access Control for Online Social Networks |
|
|
437 | (2) |
|
17.3.2 Overview of the Literature |
|
|
439 | (2) |
|
17.3.3 Semantic-Based Access Control in Online Social Networks |
|
|
441 | (3) |
|
17.4 Privacy Issues in Relationship-Based Access Control Enforcement |
|
|
444 | (7) |
|
17.4.1 Challenges in Privacy-Aware Access Control in Online Social Networks |
|
|
444 | (2) |
|
17.4.2 Privacy-Aware Access Control in Online Social Networks |
|
|
446 | (5) |
|
17.5 Preventing Private Information Inference |
|
|
451 | (3) |
|
17.5.1 Overview of the Literature |
|
|
451 | (1) |
|
17.5.2 Overview of a Typical Inference Attack on Social Networks |
|
|
452 | (2) |
|
17.6 Conclusion and Research Challenges |
|
|
454 | (2) |
|
|
456 | (3) |
|
18 Analyzing Private Network Data |
|
|
459 | (40) |
|
|
|
|
|
459 | (5) |
|
18.1.1 How Are Networks Analyzed? |
|
|
462 | (1) |
|
18.1.2 Why Should Network Data Be Kept Private? |
|
|
463 | (1) |
|
18.1.3 Are Privacy and Utility Compatible? |
|
|
464 | (1) |
|
|
464 | (1) |
|
18.2 Attacks on Anonymized Networks |
|
|
464 | (11) |
|
18.2.1 Threats: Re-Identification and Edge Disclosure |
|
|
466 | (1) |
|
18.2.2 Adversary Knowledge |
|
|
467 | (1) |
|
|
468 | (5) |
|
18.2.4 Attack Effectiveness |
|
|
473 | (2) |
|
18.3 Algorithms for Private Data Publication |
|
|
475 | (10) |
|
18.3.1 Directed Alteration of Networks |
|
|
476 | (3) |
|
18.3.2 Network Generalization |
|
|
479 | (4) |
|
18.3.3 Randomly Altering Networks |
|
|
483 | (2) |
|
18.4 Algorithms for Private Query Answering |
|
|
485 | (7) |
|
18.4.1 Differential Privacy |
|
|
486 | (1) |
|
18.4.2 Differential Privacy for Networks |
|
|
487 | (1) |
|
18.4.3 Algorithm for Differentially Private Query Answering |
|
|
488 | (1) |
|
18.4.4 Network Analysis under Differential Privacy |
|
|
489 | (3) |
|
18.5 Conclusion and Future Issues |
|
|
492 | (1) |
|
|
493 | (6) |
Index |
|
499 | |