Foreword |
|
xiii | |
Preface |
|
xv | |
Editors |
|
xix | |
Contributors |
|
xxi | |
List of Reviewers |
|
xxxi | |
List of Figures |
|
xxxiii | |
List of Tables |
|
xxxvii | |
I Introduction |
|
1 | (18) |
|
1 Mining User Generated Content and Its Applications |
|
|
3 | (16) |
|
|
|
|
1.1 The Web and Web Trends |
|
|
3 | (4) |
|
1.1.1 The Emergence of the World Wide Web (WWW): From Connected Computers to Linked Documents |
|
|
3 | (2) |
|
1.1.2 The Prevailingness of Web 2.0: From "Read-Only" to Read-and-Write-Interaction |
|
|
5 | (1) |
|
|
6 | (1) |
|
1.2 Defining User Generated Content |
|
|
7 | (2) |
|
1.3 A Brief History of Creating, Searching, and Mining User Generated Content |
|
|
9 | (1) |
|
|
10 | (1) |
|
1.5 User Generated Content: Concepts and Bottlenecks |
|
|
11 | (3) |
|
1.6 Organization of the Book |
|
|
14 | (3) |
|
1.7 Mining User Generated Content: Broader Context |
|
|
17 | (2) |
II Mining Different Media |
|
19 | (108) |
|
|
21 | (22) |
|
|
|
|
|
|
2.1 Research on Social Annotations |
|
|
22 | (1) |
|
2.2 Techniques in Social Annotations |
|
|
23 | (8) |
|
2.2.1 Problem Formulation |
|
|
24 | (1) |
|
2.2.2 Social Annotation Propagation |
|
|
25 | (1) |
|
2.2.2.1 Social Propagation-Multiple Annotations |
|
|
26 | (1) |
|
2.2.2.2 Social Propagation-Multiple Link Types |
|
|
27 | (1) |
|
2.2.2.3 Social Propagation-Constraint |
|
|
27 | (1) |
|
|
28 | (1) |
|
|
29 | (1) |
|
2.2.3.1 Scalability of the Propagation |
|
|
29 | (1) |
|
2.2.3.2 Propagation through More Links |
|
|
30 | (1) |
|
2.2.3.3 Propagation with More Constraints |
|
|
30 | (1) |
|
2.2.3.4 Propagating More Information |
|
|
31 | (1) |
|
2.3 Application of Social Annotations |
|
|
31 | (10) |
|
2.3.1 Social Annotation for Personalized Search |
|
|
31 | (1) |
|
2.3.1.1 Analysis of Folksonomy |
|
|
32 | (1) |
|
2.3.1.2 A Personalized Search Framework |
|
|
33 | (1) |
|
2.3.1.3 Topic Space Selection |
|
|
34 | (1) |
|
2.3.1.4 Interest and Topic Adjusting via a Bipartite Collaborative Link Structure |
|
|
35 | (2) |
|
2.3.2 Hierarchical Semantics from Social Annotations |
|
|
37 | (1) |
|
2.3.2.1 Algorithm Overview |
|
|
39 | (2) |
|
|
41 | (2) |
|
3 Sentiment Analysis in UGC |
|
|
43 | (24) |
|
|
|
43 | (1) |
|
|
44 | (2) |
|
|
44 | (1) |
|
3.2.2 Levels of Granularity |
|
|
45 | (1) |
|
3.3 Major Issues in Sentiment Analysis |
|
|
46 | (18) |
|
|
46 | (1) |
|
3.3.2 Important Sentiment Features |
|
|
47 | (1) |
|
3.3.2.1 Single Word Features |
|
|
48 | (1) |
|
3.3.2.2 Part-of-Speech Based Features |
|
|
50 | (1) |
|
3.3.2.3 N-Grams, Phrases, and Patterns |
|
|
52 | (1) |
|
3.3.2.4 Other Sentiment Features |
|
|
55 | (1) |
|
3.3.2.5 Recommendation for Selecting Sentiment Features |
|
|
57 | (1) |
|
3.3.3 Sentiment Scoring and Classification |
|
|
57 | (1) |
|
3.3.3.1 Ad Hoc Rule-Based Approach |
|
|
57 | (1) |
|
3.3.3.2 Supervised Learning Approach |
|
|
58 | (1) |
|
3.3.3.3 Semisupervised Learning (Bootstrapping) |
|
|
60 | (4) |
|
|
64 | (3) |
|
4 Mining User Generated Data for Music Information Retrieval |
|
|
67 | (30) |
|
|
|
|
|
4.1 Introduction to Music Information Retrieval (MIR) |
|
|
68 | (2) |
|
4.1.1 User Generated Content in MIR Research |
|
|
68 | (2) |
|
4.1.2 Organization of the Chapter |
|
|
70 | (1) |
|
|
70 | (6) |
|
4.2.1 Similarity Measurement |
|
|
72 | (3) |
|
4.2.2 Information Extraction |
|
|
75 | (1) |
|
|
76 | (3) |
|
4.3.1 Similarity Measurement |
|
|
76 | (2) |
|
4.3.2 Popularity Estimation |
|
|
78 | (1) |
|
4.4 Explicit User Ratings |
|
|
79 | (4) |
|
4.4.1 Characteristics of Explicit Rating Datasets |
|
|
80 | (1) |
|
4.4.2 Matrix Factorization Models |
|
|
81 | (2) |
|
4.5 Peer-to-Peer Networks |
|
|
83 | (5) |
|
|
84 | (3) |
|
4.5.2 Peer Similarity Measurement |
|
|
87 | (1) |
|
4.5.3 Recommendation Systems |
|
|
87 | (1) |
|
4.5.4 Popularity Estimation |
|
|
88 | (1) |
|
|
88 | (5) |
|
4.6.1 Similarity Measurement |
|
|
90 | (2) |
|
4.6.2 Use of Social Tags in MIR |
|
|
92 | (1) |
|
|
93 | (1) |
|
4.7.1 Music Recommendation |
|
|
93 | (1) |
|
4.7.2 Playlist Generation |
|
|
93 | (1) |
|
|
94 | (3) |
|
5 Graph and Network Pattern Mining |
|
|
97 | (30) |
|
|
|
Mostafa Haghir Chehreghani |
|
|
|
|
98 | (1) |
|
|
99 | (3) |
|
5.3 Transactional Graph Pattern Mining |
|
|
102 | (15) |
|
5.3.1 The Graph Pattern Mining Problem |
|
|
102 | (2) |
|
5.3.2 Basic Pattern Mining Techniques |
|
|
104 | (3) |
|
5.3.3 Graph Mining Settings |
|
|
107 | (2) |
|
|
109 | (1) |
|
5.3.4.1 Enumeration Complexity |
|
|
109 | (1) |
|
5.3.4.2 Complexity Results |
|
|
110 | (1) |
|
5.3.4.3 Optimization Techniques |
|
|
111 | (1) |
|
5.3.5 Condensed Representations |
|
|
112 | (1) |
|
5.3.5.1 Free and Closed Patterns |
|
|
113 | (1) |
|
5.3.5.2 Selection of Informative Patterns |
|
|
115 | (1) |
|
5.3.6 Transactional Graph Mining Systems |
|
|
115 | (2) |
|
5.4 Single Network Mining |
|
|
117 | (8) |
|
|
117 | (1) |
|
5.4.1.1 Network Property Measures |
|
|
118 | (1) |
|
|
118 | (2) |
|
5.4.2 Pattern Matching in a Single Network |
|
|
120 | (1) |
|
5.4.2.1 Matching Small Patterns |
|
|
120 | (1) |
|
5.4.2.2 Exact Pattern Matching |
|
|
121 | (1) |
|
5.4.2.3 Approximative Algorithms for Pattern Matching |
|
|
121 | (1) |
|
5.4.2.4 Algorithms for Approximate Pattern Matching |
|
|
122 | (1) |
|
5.4.3 Pattern Mining Support Measures |
|
|
122 | (2) |
|
|
124 | (1) |
|
|
125 | (1) |
|
|
125 | (1) |
|
|
126 | (1) |
|
|
126 | (1) |
III Mining and Searching Different Types of UGC |
|
127 | (96) |
|
6 Knowledge Extraction from Wikis/BBS/Blogs/News Web Sites |
|
|
129 | (38) |
|
|
|
|
|
|
|
|
130 | (5) |
|
|
132 | (1) |
|
6.1.2 Important Challenges |
|
|
133 | (1) |
|
6.1.3 Organization of the Chapter |
|
|
134 | (1) |
|
6.2 Entity Recognition and Expansion |
|
|
135 | (8) |
|
|
135 | (1) |
|
6.2.2 Entity Set Expansion |
|
|
136 | (1) |
|
|
137 | (1) |
|
6.2.2.2 Entity Extraction |
|
|
140 | (1) |
|
6.2.2.3 Result Refinement |
|
|
143 | (1) |
|
|
143 | (1) |
|
|
143 | (11) |
|
|
143 | (1) |
|
6.3.2 Predefined Relation Extraction |
|
|
144 | (1) |
|
6.3.2.1 Identify Relations between the Given Entities |
|
|
144 | (1) |
|
6.3.2.2 Identify Entity Pairs for Given Relation Types |
|
|
145 | (1) |
|
6.3.2.3 Evaluations on Predefined Relation Extraction |
|
|
146 | (1) |
|
6.3.3 Open Domain Relation Extraction |
|
|
147 | (1) |
|
6.3.3.1 Relation Extraction in Structured/Semistructured Web Pages |
|
|
148 | (1) |
|
6.3.3.2 Relation Extraction from Unstructured Texts |
|
|
150 | (3) |
|
|
153 | (1) |
|
|
154 | (1) |
|
6.4 Named Entity Disambiguation |
|
|
154 | (12) |
|
|
154 | (2) |
|
6.4.2 Evaluation of Entity Disambiguation |
|
|
156 | (1) |
|
|
156 | (1) |
|
6.4.2.2 TAC KBP Evaluation |
|
|
156 | (1) |
|
6.4.3 Clustering-Based Entity Disambiguation |
|
|
157 | (1) |
|
6.4.3.1 Entity Mention Similarity Computation Based on Textual Features |
|
|
157 | (1) |
|
6.4.3.2 Entity Mention Similarity Computation Based on Social Networks |
|
|
158 | (1) |
|
6.4.3.3 Entity Mention Similarity Computation Based on Background Knowledge |
|
|
158 | (3) |
|
6.4.4 Entity-Linking Based Entity Disambiguation |
|
|
161 | (1) |
|
6.4.4.1 Independent Entity Linking |
|
|
161 | (1) |
|
6.4.4.2 Collective Entity Linking |
|
|
163 | (2) |
|
6.4.5 Summary and Future Work |
|
|
165 | (1) |
|
|
166 | (1) |
|
7 User Generated Content Search |
|
|
167 | (22) |
|
|
|
|
|
168 | (1) |
|
7.2 Overview of State-of-the-Art |
|
|
168 | (7) |
|
|
168 | (1) |
|
|
169 | (1) |
|
7.2.1.2 Ranking Blog Posts |
|
|
170 | (1) |
|
7.2.1.3 Blog-Specific Features |
|
|
170 | (1) |
|
7.2.1.4 Blog Representations |
|
|
171 | (1) |
|
|
171 | (1) |
|
7.2.2.1 Microblog Expansion |
|
|
172 | (1) |
|
7.2.2.2 Microblog Search Engines |
|
|
173 | (1) |
|
7.2.2.3 Microblogs as Aids to Standard Searches |
|
|
173 | (1) |
|
|
173 | (1) |
|
7.2.3.1 Social Tags for Text Search |
|
|
174 | (1) |
|
7.2.3.2 Social Tags for Image Search |
|
|
175 | (1) |
|
7.3 Social Tags for Query Expansion |
|
|
175 | (11) |
|
7.3.1 Problem Formulation |
|
|
176 | (1) |
|
|
177 | (2) |
|
7.3.2 Experimental Evaluation |
|
|
179 | (1) |
|
7.3.2.1 Methodology and Settings |
|
|
179 | (1) |
|
|
180 | (1) |
|
7.3.2.3 Findings and Discussion |
|
|
183 | (3) |
|
|
186 | (3) |
|
8 Annotating Japanese Blogs with Syntactic and Affective Information |
|
|
189 | (34) |
|
|
|
|
|
|
|
|
190 | (1) |
|
|
191 | (8) |
|
8.2.1 Large-Scale Corpora |
|
|
191 | (4) |
|
|
195 | (4) |
|
8.3 YACIS Corpus Compilation |
|
|
199 | (4) |
|
8.4 YACIS Corpus Annotation |
|
|
203 | (14) |
|
|
203 | (1) |
|
8.4.1.1 Syntactic Information Annotation Tools |
|
|
203 | (1) |
|
8.4.1.2 Affective Information Annotation Tools |
|
|
204 | (4) |
|
8.4.2 YACIS Corpus Statistics |
|
|
208 | (1) |
|
8.4.2.1 Syntactic Information |
|
|
208 | (1) |
|
8.4.2.2 Affective Information |
|
|
211 | (6) |
|
|
217 | (2) |
|
8.5.1 Emotion Object Ontology Generation |
|
|
217 | (1) |
|
8.5.2 Moral Consequence Retrieval |
|
|
218 | (1) |
|
|
219 | (1) |
|
8.7 Conclusions and Future Work |
|
|
219 | (2) |
|
|
221 | (2) |
IV Applications |
|
223 | (106) |
|
9 Question-Answering of UGC |
|
|
225 | (34) |
|
|
|
225 | (4) |
|
9.2 Question-Answering by Searching Questions |
|
|
229 | (2) |
|
|
231 | (12) |
|
9.3.1 Query Likelihood Language Models |
|
|
232 | (3) |
|
9.3.2 Exploiting Category Information |
|
|
235 | (3) |
|
9.3.3 Structured Question Search |
|
|
238 | (1) |
|
9.3.3.1 Topic-Focus Mixture Model |
|
|
239 | (1) |
|
9.3.3.2 Entity-Based Translation Model |
|
|
239 | (1) |
|
9.3.3.3 Syntactic Tree Matching |
|
|
240 | (3) |
|
9.4 Question Quality, Answer Quality, and User Expertise |
|
|
243 | (12) |
|
9.4.1 Defining Quality and Expertise |
|
|
244 | (2) |
|
9.4.2 Indicators of Quality and Expertise |
|
|
246 | (3) |
|
9.4.3 Modeling Quality and Expertise |
|
|
249 | (1) |
|
9.4.3.1 Question Utility-Aware Retrieval Model |
|
|
249 | (1) |
|
9.4.3.2 Answer Quality-Aware Retrieval Model |
|
|
250 | (1) |
|
9.4.3.3 Expertise-Aware Retrieval Model |
|
|
250 | (1) |
|
9.4.3.4 Quality and Expertise-Aware Retrieval Model |
|
|
252 | (3) |
|
|
255 | (4) |
|
|
259 | (28) |
|
|
|
|
259 | (1) |
|
10.2 Automatic Text Summarization: A Brief Overview |
|
|
260 | (2) |
|
10.3 Why Is User Generated Content a Challenge? |
|
|
262 | (3) |
|
10.4 Text Summarization of UGC |
|
|
265 | (6) |
|
10.4.1 Summarizing Online Reviews |
|
|
265 | (2) |
|
10.4.2 Blog Summarization |
|
|
267 | (1) |
|
10.4.3 Summarizing Very Short UGC |
|
|
268 | (3) |
|
10.5 Structured, Sentiment-Based Summarization of UGC |
|
|
271 | (2) |
|
10.6 Keyword-Based Summarization of UGC |
|
|
273 | (1) |
|
10.7 Visual Summarization of UGC |
|
|
274 | (4) |
|
10.8 Evaluating UGC Summaries |
|
|
278 | (2) |
|
10.8.1 Training Data, Evaluation, and Crowdsourcing |
|
|
279 | (1) |
|
10.9 Outstanding Challenges |
|
|
280 | (4) |
|
10.9.1 Spatio-Temporal Summaries |
|
|
280 | (2) |
|
10.9.2 Exploiting Implicit UGC Semantics |
|
|
282 | (1) |
|
|
282 | (1) |
|
|
283 | (1) |
|
|
283 | (1) |
|
|
284 | (1) |
|
|
285 | (2) |
|
|
287 | (32) |
|
|
|
|
|
|
|
11.1 Recommendation Techniques |
|
|
289 | (9) |
|
11.1.1 Collaborative Filtering-Based Recommendations |
|
|
290 | (2) |
|
11.1.2 Demographic Recommendations |
|
|
292 | (1) |
|
11.1.3 Content-Based Recommendations |
|
|
293 | (1) |
|
11.1.4 Knowledge-Based Recommendations |
|
|
294 | (2) |
|
|
296 | (2) |
|
11.2 Exploiting Query Logs for Recommending Related Queries |
|
|
298 | (5) |
|
11.2.1 Query Logs as Sources of Information |
|
|
299 | (1) |
|
11.2.2 A Graph-Based Model for Query Suggestion |
|
|
300 | (3) |
|
11.3 Exploiting Photo Sharing and Wikipedia for Touristic Recommendations |
|
|
303 | (4) |
|
11.3.1 Flickr and Wikipedia as Sources of Information |
|
|
303 | (1) |
|
11.3.2 From Flickr and Wikipedia to Touristic Recommendations via Center-Piece Computation |
|
|
304 | (3) |
|
11.4 Exploiting Twitter and Wikipedia for News Recommendations |
|
|
307 | (6) |
|
11.4.1 The Blogosphere as a Source of Information |
|
|
308 | (2) |
|
11.4.2 Using the Real-Time Web for Personalized News Recommendations |
|
|
310 | (3) |
|
11.5 Recommender Systems for Tags |
|
|
313 | (4) |
|
11.5.1 Social Tagging Platforms as Sources of Information |
|
|
313 | (2) |
|
11.5.2 Recommending Correctly Spelled Tags |
|
|
315 | (2) |
|
|
317 | (2) |
|
12 Conclusions and a Road Map for Future Developments |
|
|
319 | (10) |
|
|
|
|
12.1 Summary of the Main Findings |
|
|
319 | (2) |
|
|
321 | (8) |
|
12.2.1 Processing Community Languages |
|
|
321 | (1) |
|
12.2.2 Image and Video Processing |
|
|
322 | (1) |
|
|
323 | (1) |
|
12.2.4 Aggregation and Linking of UGC |
|
|
323 | (2) |
|
12.2.5 Legal Considerations |
|
|
325 | (2) |
|
12.2.6 Information Credibility |
|
|
327 | (2) |
Bibliography |
|
329 | (68) |
Index |
|
397 | |