Preface to the 2nd edition |
|
IX | |
CHAPTER 1 Natural language processing |
|
1 | |
|
|
2 | |
|
|
5 | |
|
1.2.1 Syntax and semantics |
|
|
5 | |
|
1.2.2 Pragmatics and context |
|
|
6 | |
|
|
7 | |
|
1.2.4 Tasks and supertasks |
|
|
8 | |
|
|
11 | |
|
1.3.1 Sentence delimiters and tokenizers |
|
|
11 | |
|
1.3.2 Stemmers and taggers |
|
|
13 | |
|
1.3.3 Noun phrase and name recognizers |
|
|
16 | |
|
1.3.4 Parsers and grammars |
|
|
17 | |
|
|
20 | |
CHAPTER 2 Document retrieval |
|
23 | |
|
2.1 Information retrieval |
|
|
24 | |
|
|
25 | |
|
|
27 | |
|
|
27 | |
|
|
30 | |
|
2.3.3 Probabilistic retrieval |
|
|
33 | |
|
|
40 | |
|
2.4 Evaluating search engines |
|
|
45 | |
|
|
45 | |
|
|
46 | |
|
2.4.3 Relevance judgments |
|
|
48 | |
|
2.4.4 Total system evaluation |
|
|
51 | |
|
2.5 Attempts to enhance search performance |
|
|
52 | |
|
2.5.1 Query expansion and thesauri |
|
|
52 | |
|
2.5.2 Query expansion from relevance information* |
|
|
55 | |
|
2.6 The future of Web searching |
|
|
59 | |
|
2.6.1 Leveraging link structure |
|
|
60 | |
|
2.6.2 Ranking and reranking documents |
|
|
63 | |
|
2.6.3 The future of online search |
|
|
64 | |
CHAPTER 3 Information extraction |
|
69 | |
|
3.1 The message understanding conferences |
|
|
70 | |
|
|
73 | |
|
|
73 | |
|
|
74 | |
|
3.3 Finite automata in FASTUS |
|
|
75 | |
|
3.3.1 Finite state machines and regular languages |
|
|
76 | |
|
3.3.2 Finite state machines as parsers |
|
|
78 | |
|
|
88 | |
|
3.4 Context-free grammars |
|
|
92 | |
|
3.4.1 Analyzing case reports |
|
|
92 | |
|
3.4.2 Pushdown automata and context-free grammars |
|
|
94 | |
|
3.4.3 Parsing with a dynamic programming algorithm |
|
|
97 | |
|
3.4.4 Coping with incompleteness and ambiguity |
|
|
102 | |
|
3.4.5 Template filling and conflict detection |
|
|
103 | |
|
3.5 Limitations of current technology and future research |
|
|
104 | |
|
3.5.1 Explicit versus implicit statements |
|
|
106 | |
|
3.5.2 Machine learning for information extraction |
|
|
107 | |
|
3.5.3 Statistical language models for information extraction |
|
|
108 | |
|
3.6 Summary of information extraction |
|
|
110 | |
CHAPTER 4 Text categorization |
|
113 | |
|
4.1 Overview of categorization tasks |
|
|
115 | |
|
4.2 Handcrafted rule based methods |
|
|
120 | |
|
4.3 Inductive learning for text classification |
|
|
122 | |
|
4.3.1 Naive Bayes classifiers |
|
|
123 | |
|
4.3.2 Linear classifiers* |
|
|
129 | |
|
4.3.3 Decision trees and decision lists |
|
|
137 | |
|
4.4 Nearest neighbor algorithms |
|
|
144 | |
|
4.5 Combining classifiers |
|
|
147 | |
|
4.6 Evaluation of text categorization systems |
|
|
154 | |
|
|
154 | |
|
|
156 | |
|
4.6.3 Evaluating effectiveness |
|
|
161 | |
CHAPTER 5 Text mining |
|
163 | |
|
|
164 | |
|
5.2 Resolving reference and coreference |
|
|
168 | |
|
5.2.1 Named entity recognition |
|
|
170 | |
|
5.2.2 The coreference task |
|
|
178 | |
|
5.3 Automatic summarization |
|
|
183 | |
|
5.3.1 Summarization tasks |
|
|
184 | |
|
5.3.2 Constructing summaries from document fragments |
|
|
188 | |
|
5.3.3 Multi-document summarization (MDS) |
|
|
196 | |
|
5.3.4 Topic detection and tracking |
|
|
199 | |
|
5.3.5 Multimedia and multilingual summarization |
|
|
204 | |
|
5.4 Testing of automatic summarization programs |
|
|
204 | |
|
5.4.1 Evaluation issues in summarization research |
|
|
205 | |
|
5.4.2 Building a corpus for training and testing |
|
|
207 | |
|
5.4.3 Summarization meets question answering at DUC |
|
|
208 | |
|
5.5 Prospects for text mining and NLP |
|
|
210 | |
References |
|
215 | |
Index |
|
227 | |