E-book: Experiment and Evaluation in Information Retrieval Models [Taylor & Francis e-book]

  • Format: 282 pages, 42 Tables, black and white; 32 Line drawings, black and white; 20 Halftones, black and white; 52 Illustrations, black and white
  • Publication date: 27-Jul-2017
  • Publisher: CRC Press
  • ISBN-13: 9781315392622
  • Taylor & Francis e-book
  • Price: 195,66 €*
  • * This price gives unlimited concurrent access for an unlimited time
  • Standard price: 279,51 €
  • Save 30%
Experiment and Evaluation in Information Retrieval Models explores different algorithms for applying evolutionary computation to the field of information retrieval (IR). As well as examining existing approaches to some of the problems in this field, the book critically evaluates results obtained by researchers in order to give readers a clear view of the topic.

In addition, this book covers algorithmic solutions to problems in advanced IR concepts, including feature selection for document ranking, web page classification and recommendation, facet generation for document retrieval, duplication detection, and seeker satisfaction in community question answering portals.

Written with students and researchers in the field of information retrieval in mind, this book is also a useful tool for researchers in the natural and social sciences interested in the latest developments in this fast-moving subject area.

Key features:

Focusing on recent topics in Information Retrieval research, Experiment and Evaluation in Information Retrieval Models explores the following topics in detail:

  • Searching in social media
  • Using semantic annotations
  • Ranking documents based on facets
  • Evaluating IR systems offline and online
  • The role of evolutionary computation in IR
  • Document and term clustering
  • Image retrieval
  • Design of user profiles for IR
  • Web page classification and recommendation
  • Relevance feedback approach for document and image retrieval

Preface xiii
Acknowledgments xvii
About the Author xix
Section I Foundations
1 Introduction
3(8)
1.1 Motivation
3(1)
1.1.1 Web Search
4(1)
1.2 Evolutionary Search and IR
4(1)
1.3 Applications of IR
5(6)
1.3.1 Other Search Applications
7(4)
Section II Preliminaries
2 Preliminaries
11(8)
2.1 Information Retrieval
11(1)
2.2 Information Retrieval versus Data Retrieval
12(1)
2.3 Information Retrieval (IR) versus Information Extraction (IE)
12(1)
2.4 Components of an Information Retrieval System
13(6)
2.4.1 Document Processing
13(2)
2.4.2 Query Processing
15(1)
2.4.3 Retrieval and Feedback Generation Component
15(4)
3 Contextual and Conceptual Information Retrieval
19(8)
3.1 Context Search
19(4)
3.1.1 Need for Contextual Search
19(1)
3.1.2 Graphical Representation of Context-Based Search
19(1)
3.1.3 Architecture of Context-Based Indexing
20(2)
3.1.4 Approaches for Context Search
22(1)
3.1.4.1 Searching Based on Explicitly Specifying User Context
22(1)
3.1.4.2 Searching Based on Automatically Derived Context
22(1)
3.1.5 Traditional Method for Context-Based Search: User Profile-Based Context Search
22(1)
3.2 Conceptual Search
23(4)
3.2.1 The Semantic Web
23(1)
3.2.2 Ontology
23(1)
3.2.3 Approaches to Conceptual Search
24(1)
3.2.4 Types of Conceptual Structures
24(1)
3.2.5 Features of Conceptual Structures
25(1)
3.2.6 Framework for Concept-Based Search
25(1)
3.2.7 Concept Chain Graphs
26(1)
4 Information Retrieval Models
27(12)
4.1 Boolean Model
27(1)
4.2 Vector Model
28(1)
4.2.1 The Vector Space Model
28(1)
4.2.2 Similarity Measures
28(1)
4.2.2.1 Cosine Similarity
28(1)
4.2.2.2 Jaccard Coefficient
29(1)
4.2.2.3 Dice Coefficient
29(1)
4.3 Fixing the Term Weights
29(2)
4.3.1 Term Frequency
30(1)
4.3.2 Inverse Document Frequency
30(1)
4.3.3 tf-idf
30(1)
4.4 Probabilistic Models
31(2)
4.4.1 Probabilistic Ranking Principle (PRP)
31(1)
4.4.2 Binary Independence Retrieval (BIR) Model
32(1)
4.4.3 The Probabilistic Indexing Model
33(1)
4.5 Language Model
33(6)
4.5.1 Multinomial Distributions Model
34(1)
4.5.2 The Query Likelihood Model
35(1)
4.5.3 Extended Language Modeling Approaches
36(1)
4.5.4 Translation Model
36(1)
4.5.5 Comparisons with Traditional Probabilistic IR Approaches
37(2)
5 Evaluation of Information Retrieval Systems
39(8)
5.1 Ranked and Unranked Results
39(1)
5.1.1 Relevance
39(1)
5.2 Unranked Retrieval System
39(4)
5.2.1 Precision
39(1)
5.2.2 Recall
40(1)
5.2.3 Accuracy
40(1)
5.2.4 F-Measure
41(1)
5.2.5 G-Measure
41(1)
5.2.6 Prevalence
42(1)
5.2.7 Error Rate
42(1)
5.2.8 Fallout
43(1)
5.2.9 Miss Rate
43(1)
5.3 Ranked Retrieval System
43(4)
5.3.1 Precision and Recall Curves
43(1)
5.3.2 Average Precision
44(1)
5.3.3 Precision at k
44(1)
5.3.4 R-Precision
44(1)
5.3.5 Mean Average Precision (MAP)
45(1)
5.3.6 Breakeven Point
45(1)
5.3.7 ROC Curve
46(1)
5.3.7.1 Relationship between PR and ROC Curves
46(1)
6 Fundamentals of Evolutionary Algorithms
47(12)
6.1 Combinatorial Optimization Problems
47(1)
6.1.1 Heuristics
47(1)
6.1.2 Metaheuristics
48(1)
6.1.3 Case-Based Reasoning (CBR)
48(1)
6.2 Evolutionary Programming
48(1)
6.3 Evolutionary Computation
49(1)
6.3.1 Single-Objective Optimization
50(1)
6.3.2 Multi-Objective Optimization
50(1)
6.4 Role of Evolutionary Algorithms in Information Retrieval
50(1)
6.5 Evolutionary Algorithms
51(8)
6.5.1 Firefly Algorithm
51(1)
6.5.2 Particle Swarm Optimization
52(1)
6.5.3 Genetic Algorithms
52(1)
6.5.4 Genetic Programming
53(1)
6.5.5 Applications of Genetic Programming
54(1)
6.5.6 Simulated Annealing
54(1)
6.5.7 Harmony Search
55(1)
6.5.8 Differential Evolution
55(1)
6.5.9 Tabulated Search
56(3)
Section III Demand of Evolutionary Algorithms in IR
7 Demand of Evolutionary Algorithms in Information Retrieval
59(32)
7.1 Document Ranking
59(1)
7.1.1 Retrieval Effectiveness
59(1)
7.2 Relevance Feedback Approach
60(4)
7.2.1 Relevance Feedback in Text IR
61(1)
7.2.1.1 Query Expansion
62(1)
7.2.2 Relevance Feedback in Content-Based Image Retrieval
62(1)
7.2.3 Relevance Feedback in Region-Based Image Retrieval
63(1)
7.3 Term-Weighting Approaches
64(1)
7.3.1 Term Frequency
65(1)
7.3.2 Inverse Document Frequency
65(1)
7.4 Document Retrieval
65(1)
7.5 Feature Selection Approach
66(2)
7.5.1 Filter Method for Feature Selection
67(1)
7.5.2 Wrapper Method for Feature Selection
67(1)
7.5.3 Embedded Method for Feature Selection
67(1)
7.6 Image Retrieval
68(12)
7.6.1 Content-Based Image Retrieval
69(4)
7.6.1.1 Feature Extraction
71(1)
7.6.1.2 Color Descriptor
71(1)
7.6.1.3 Texture Descriptor
72(1)
7.6.1.4 Shape Descriptor
73(1)
7.6.1.5 Similarity Measure
73(1)
7.6.2 Region-Based Image Retrieval
73(2)
7.6.2.1 Image Segmentation
74(1)
7.6.2.2 Similarity Measure
75(1)
7.6.3 Image Summarization
75(16)
7.6.3.1 Multimodal Image Collection Summarization
76(1)
7.6.3.2 Bag of Words
77(2)
7.6.3.3 Dictionary Learning for Calculating Sparse Approximately
79(1)
7.7 Web-Based Recommendation System
80(1)
7.8 Web Page Classification
81(2)
7.9 Facet Generation
83(1)
7.10 Duplicate Detection System
84(2)
7.11 Improvisation of Seeker Satisfaction in Community Question Answering Systems
86(1)
7.12 Abstract Generation
87(4)
Section IV Model Formulations of Information Retrieval Techniques
8 TABU Annealing: An Efficient and Scalable Strategy for Document Retrieval
91(8)
8.1 Simulated Annealing
91(2)
8.1.1 The Simulated Annealing Algorithm
92(1)
8.1.2 Cooling Schedules
92(1)
8.2 TABU Annealing Algorithm
93(1)
8.3 Empirical Results and Discussion
94(5)
9 Efficient Latent Semantic Indexing-Based Information Retrieval Framework Using Particle Swarm Optimization and Simulated Annealing
99(14)
9.1 Architecture of Proposed Information Retrieval System
99(1)
9.2 Methodology and Solutions
100(6)
9.2.1 Text Preprocessing
100(1)
9.2.2 Dimensionality Reduction
101(2)
9.2.2.1 Dimensionality Reduction Using Latent Semantic Indexing
101(1)
9.2.2.2 Query Conversion Using LSI
102(1)
9.2.3 Clustering of Dimensionally Reduced Documents
103(3)
9.2.3.1 Background of Particle Swarm Optimization (PSO) Algorithm
103(2)
9.2.3.2 Background of K-Means
105(1)
9.2.3.3 Hybrid PSO + K-Means Algorithm
106(1)
9.2.4 Simulated Annealing for Document Retrieval
106(1)
9.3 Experimental Results and Discussion
106(7)
9.3.1 Performance Evaluation for Clustering
106(2)
9.3.2 Performance Evaluation for Document Retrieval
108(5)
10 Music-Inspired Optimization Algorithm: Harmony-TABU for Document Retrieval Using Rhetorical Relations and Relevance Feedback
113(12)
10.1 The Basic Harmony Search Clustering Algorithm
113(3)
10.1.1 Basic Structure of Harmony Search Algorithm
113(1)
10.1.2 Representation of Documents and Queries
113(1)
10.1.3 Representation of Solutions
114(1)
10.1.4 Features of Harmony Search
114(1)
10.1.5 Initialize the Problem and HS Parameters
115(1)
10.1.6 Harmony Memory Initialization
115(1)
10.1.7 New Harmony Improvisation
115(1)
10.1.8 Hybridization
116(1)
10.1.9 Evaluation of Solutions
116(1)
10.2 Harmony-TABU Algorithm
116(2)
10.3 Relevance Feedback and Query Expansion in IR
118(3)
10.3.1 Presentation Term Selection
118(1)
10.3.2 Direct Term Feedback (TFB)
119(1)
10.3.3 Cluster Feedback (CFB)
120(1)
10.3.4 Term-Cluster Feedback (TCFB)
120(1)
10.4 Empirical Results and Discussion
121(2)
10.4.1 Document Collections
121(1)
10.4.2 Experimental Setup
121(2)
10.5 Rhetorical Structure
123(1)
10.6 Abstract Generation
123(2)
11 Evaluation of Light Inspired Optimization Algorithm-Based Image Retrieval
125(10)
11.1 Query Selection and Distance Calculation
126(1)
11.2 Optimization Using a Stochastic Firefly Algorithm
127(2)
11.2.1 Agents Initialization and Fitness Evaluation
127(1)
11.2.2 Variation in Brightness of Firefly
127(1)
11.2.3 Strategy for Searching New Swarms
127(2)
11.3 Experimental Setup
129(1)
11.4 Visual Signature
129(1)
11.5 Performance Measures
130(1)
11.6 Parameter Settings of Firefly Algorithm
130(1)
11.7 Performance Evaluation
131(4)
12 An Evolutionary Approach for Optimizing Content-Based Image Retrieval Using Support Vector Machine
135(8)
12.1 Relevance Feedback Learning via Support Vector Machine
136(1)
12.2 Optimization Using a Stochastic Firefly Algorithm
137(2)
12.3 Image Database
139(1)
12.4 Baselines
139(1)
12.5 Comparison Methods
140(3)
13 An Application of Firefly Algorithm to Region-Based Image Retrieval
143(8)
13.1 Image Retrieval
144(2)
13.1.1 Image Segmentation
144(1)
13.1.2 Image Representation
144(1)
13.1.3 Similarity Measure
144(2)
13.2 Optimization Using a Stochastic Firefly Algorithm
146(1)
13.2.1 Firefly Agent's Initialization and Fitness Evaluation
146(1)
13.2.2 Attraction toward New Firefly
146(1)
13.2.3 Movement of Fireflies
147(1)
13.3 Image Databases
147(1)
13.4 Performance Evaluation
148(3)
14 An Evolutionary Approach for Optimizing Region-Based Image Retrieval Using Support Vector Machine
151(10)
14.1 Region-Based Image Retrieval
151(2)
14.2 Behavior of Fireflies
153(1)
14.3 Why Is the Firefly Algorithm So Efficient?
153(1)
14.4 Machine Learning
154(1)
14.5 Support Vector Machines
155(1)
14.6 Optimization of SVM by PSO
155(2)
14.6.1 SVM-Based RF
156(1)
14.7 Optimization Using a Stochastic Firefly Algorithm
157(1)
14.8 Image Databases
157(1)
14.8.1 COIL Database
157(1)
14.8.2 The Corel Database
158(1)
14.9 Baselines
158(1)
14.9.1 The Proposed SVM: FA Approach
158(1)
14.10 Discussion
159(2)
14.10.1 Comparison of FA with PSO and GA
160(1)
15 Optimization of Sparse Dictionary Model for Multimodal Image Summarization Using Firefly Algorithm
161(12)
15.1 Image Representation
162(1)
15.2 Problem Formulation
163(2)
15.3 Optimization of Dictionary Learning
165(1)
15.4 Sparse Coding
166(1)
15.5 Iterative Dictionary Selection Stage
167(1)
15.6 Performance Analysis
167(6)
15.6.1 Experiment Setup
167(1)
15.6.2 Experimental Specification
168(1)
15.6.3 Baseline Algorithms
168(1)
15.6.4 Mean Square Error Performance
168(5)
Section V Algorithmic Solutions to the Problems in Advanced IR Concepts
16 A Dynamic Feature Selection Method for Document Ranking with Relevance Feedback Approach
173(12)
16.1 Overview
173(1)
16.2 Feature Selection Procedures
173(4)
16.2.1 Markov Random Field (MRF) Model for Feature Selection
175(1)
16.2.2 Correlation-Based Feature Selection
175(1)
16.2.3 Count Difference-Based Feature Selection
176(1)
16.3 Proposed Approach for Feature Selection
177(2)
16.3.1 Feature Generalization with Association Rule Induction
178(1)
16.3.2 Ranking
178(1)
16.3.2.1 Document Ranking Using BM25 Weighting Function
179(1)
16.3.2.2 Expectation Maximization for Relevance Feedback
179(1)
16.4 Empirical Results and Discussion
179(6)
16.4.1 Dataset Used for Feature Selection
179(1)
16.4.2 n-Gram Generation
180(1)
16.4.3 Evaluation
180(5)
17 TDCCREC: An Efficient and Scalable Web-Based Recommendation System
185(12)
17.1 Recommendation Methodologies
185(5)
17.1.1 Learning Automata (LA)
186(1)
17.1.2 Weighted Association Rule
187(1)
17.1.3 Content-Based Recommendation
188(1)
17.1.4 Collaborative Filtering-Based Recommendation
189(1)
17.2 Proposed Approach: Truth Discovery-Based Content and Collaborative Recommender System (TDCCREC)
190(3)
17.3 Empirical Results and Discussion
193(4)
18 An Automatic Facet Generation Framework for Document Retrieval
197(8)
18.1 Baseline Approach
198(1)
18.1.1 Drawbacks
198(1)
18.2 Greedy Algorithm
198(1)
18.2.1 Drawbacks
199(1)
18.3 Feedback Language Model
199(1)
18.4 Proposed Method: Automatic Facet Generation Framework (AFGF)
200(2)
18.5 Empirical Results and Discussion
202(3)
19 ASPDD: An Efficient and Scalable Framework for Duplication Detection
205(8)
19.1 Duplication Detection Techniques
205(5)
19.1.1 Prior Work
207(1)
19.1.1.1 Similarity Measures
207(1)
19.1.1.2 Shingling Techniques
207(1)
19.1.2 Proposed Approach (ASPDD)
208(2)
19.2 Empirical Results and Discussion
210(3)
20 Improvisation of Seeker Satisfaction in Yahoo! Community Question Answering Using Automatic Ranking, Abstract Generation, and History Updation
213(18)
20.1 The Asker Satisfaction Problem
214(1)
20.2 Community Question Answering Problems
214(2)
20.3 Methodologies
216(4)
20.4 Experimental Setup
220(5)
20.5 Empirical Results and Discussion
225(6)
Section VI Findings and Summary
21 Findings and Summary of Text Information Retrieval Chapters
231(4)
21.1 Findings and Summary
231(2)
21.2 Future Directions
233(2)
22 Findings and Summary of Image Retrieval and Assessment of Image Mining Systems Chapters
235(14)
22.1 Experimental Setup
235(1)
22.2 Results and Discussions
236(1)
22.3 Findings 1: Average Precision-Recall Curves of Proposed Image Retrieval Systems for Pascal Database
237(1)
22.4 Findings 2: Average Precision and Average Recall of Proposed Methods for Different Semantic Classes
238(2)
22.5 Findings 3: Average Precision and Average Recall of Top-Ranked Results after the Ninth Feedback for Corel Database
240(1)
22.6 Findings 4: Average Precision of Top-Ranked Results after the Ninth Feedback for IR with Summarization and IR without Summarization
241(1)
22.7 Findings 5: Average Execution Time of Proposed Methods
242(1)
22.8 Findings 6: Performance Analysis of Top Retrieval Results Obtained with the Proposed Image Retrieval Systems
243(2)
22.9 Summary
245(1)
22.10 Future Scope
246(3)
Appendix: Abbreviations, Acronyms and Symbols 249(8)
Bibliography 257(22)
Index 279
Dr. K. Latha is an Assistant Professor in the Department of Computer Science and Engineering, Anna University, Tiruchirappalli, Tamil Nadu, India. She holds a B.E. (ECE) from Bharathidasan University and an M.E. (CSE) from Madurai Kamaraj University, and earned her doctorate from Anna University, Chennai. Her areas of interest include information retrieval, data mining, text mining, web mining, cloud computing, and network security. She has 16 years of teaching experience and has produced several doctorates at Anna University, Chennai. She has published around 80 papers in international journals and conferences and received best paper awards for 5 papers at conferences conducted by IEEE. She has delivered special lectures, keynotes, and presidential addresses, and has acted as a resource person and chairperson at many international conferences. She received the Leaders Charity Award for academic excellence at Madurai Kamaraj University for her Master of Engineering in CSE. She is closely associated with IIT Kanpur. She has received grants from AICTE and TEQIP and has organized several workshops, seminars, and conferences. She has been appointed as a Research Advisory Council member by Anna University, Chennai. She has generated over 5 lakhs in research funding, including support under a TEQIP II-sponsored Young Faculty Research Support Scheme (YFRSS). She is a Principal Investigator of the research proposal entitled "Sustaining Ecosystem from Species Extinction using Data Mining Techniques" under YFRSS. She has authored two engineering books, Data Warehousing and Data Mining for Engineering Applications and System Programming and Organization, which have been accepted by IK International Publishing House Pvt Ltd and Narosa Publishing House International Private Limited, publishers of science, technology, and medicine, New Delhi.
She has acted as an expert on the oral examination board constituted to conduct the viva voce of research scholars in the Ph.D. programme of Anna University, Chennai. She has also been appointed as a member of the Inspection Committee (Inspection-CAI, Anna University, Chennai) that considers granting provisional or permanent affiliation to engineering colleges.