Multimodal Interactive Pattern Recognition and Applications [Hardback]

  • Format: Hardback, 274 pages, height x width: 235x155 mm, weight: 606 g, XVI, 274 p., 1 Hardback
  • Publication date: 28-May-2011
  • Publisher: Springer London Ltd
  • ISBN-10: 0857294784
  • ISBN-13: 9780857294784
  • Price: 91,53 €*
  • * This is the final price, i.e., no additional discounts apply
  • Standard price: 107,69 €
  • Save 15%
  • Delivery takes 3-4 weeks if the book is in stock at the publisher's warehouse. If the publisher needs to print a new run, delivery may be delayed.
This book takes a different approach to pattern recognition systems, involving users during the recognition process, and examines a range of advanced multimodal interactions between machine and users, including handwriting, speech and gesture.

This book presents a different approach to pattern recognition (PR) systems, in which users of a system are involved during the recognition process. This can help to avoid later errors and reduce the costs associated with post-processing. The book also examines a range of advanced multimodal interactions between the machine and the users, including handwriting, speech and gestures.

Features:
  • presents an introduction to the fundamental concepts and general PR approaches for multimodal interaction modeling and search (or inference)
  • provides numerous examples and a helpful Glossary
  • discusses approaches for computer-assisted transcription of handwritten and spoken documents
  • examines systems for computer-assisted language translation, interactive text generation and parsing, relevance-based image retrieval, and interactive document layout analysis
  • reviews several full working prototypes of multimodal interactive PR applications, including live demonstrations that can be publicly accessed on the Internet

1 General Framework
1.1 Introduction
1.2 Classical Pattern Recognition Paradigm
1.2.1 Decision Theory and Pattern Recognition
1.3 Interactive Pattern Recognition and Multimodal Interaction
1.3.1 Using the Human Feedback Directly
1.3.2 Explicitly Taking Interaction History into Account
1.3.3 Interaction with Deterministic Feedback
1.3.4 Interactive Pattern Recognition and Decision Theory
1.3.5 Multimodal Interaction
1.3.6 Feedback Decoding and Adaptive Learning
1.4 Interaction Protocols and Assessment
1.4.1 General Types of Interaction Protocols
1.4.2 Left-to-Right Interactive-Predictive Processing
1.4.3 Active Interaction
1.4.4 Interaction with Weaker Feedback
1.4.5 Interaction Without Input Data
1.4.6 Assessing IPR Systems
1.4.7 User Effort Estimation
1.5 IPR Search and Confidence Estimation
1.5.1 "Word" Graphs
1.5.2 Confidence Estimation
1.6 Machine Learning Paradigms for IPR
1.6.1 Online Learning
1.6.2 Active Learning
1.6.3 Semi-Supervised Learning
1.6.4 Reinforcement Learning
References
2 Computer Assisted Transcription: General Framework
2.1 Introduction
2.2 Common Statistical Framework for HTR and ASR
2.3 Common Statistical Framework for CATTI and CATS
2.4 Adapting the Language Model
2.5 Search and Decoding Methods
2.5.1 Viterbi-Based Implementation
2.5.2 Word-Graph Based Implementation
2.6 Assessment Measures
References
3 Computer Assisted Transcription of Text Images
3.1 Computer Assisted Transcription of Text Images: CATTI
3.2 CATTI Search Problem
3.2.1 Word-Graph-Based Search Approach
3.2.2 Word Graph Error-Correcting Parsing
3.3 Increasing Interaction Ergonomics in CATTI: PA-CATTI
3.3.1 Language Model and Search
3.4 Multimodal Computer Assisted Transcription of Text Images: MM-CATTI
3.4.1 Language Model and Search for MM-CATTI
3.5 Non-interactive HTR Systems
3.5.1 Main Off-Line HTR System Overview
3.5.2 On-Line HTR Subsystem Overview
3.6 Tasks, Experiments and Results
3.6.1 HTR Corpora
3.6.2 Results
3.7 Conclusions
References
4 Computer Assisted Transcription of Speech Signals
4.1 Computer Assisted Transcription of Audio Streams
4.2 Foundations of CATS
4.3 Introduction to Automatic Speech Recognition
4.3.1 Speech Acquisition
4.3.2 Pre-process and Feature Extraction
4.3.3 Statistical Speech Recognition
4.4 Search in CATS
4.5 Word-Graph-Based CATS
4.5.1 Error Correcting Prefix Parsing
4.5.2 A General Model for Probabilistic Prefix Parsing
4.6 Experimental Results
4.6.1 Corpora
4.6.2 Error Measures
4.6.3 Experiments
4.6.4 Results
4.7 Multimodality in CATS
4.8 Experimental Results
4.8.1 Corpora
4.8.2 Experiments
4.9 Conclusions
References
5 Active Interaction and Learning in Handwritten Text Transcription
5.1 Introduction
5.2 Confidence Measures
5.3 Adaptation from Partially Supervised Transcriptions
5.4 Active Interaction and Active Learning
5.5 Balancing Error and Supervision Effort
5.6 Experiments
5.6.1 User Interaction Model
5.6.2 Sequential Transcription Tasks
5.6.3 Adaptation from Partially Supervised Transcriptions
5.6.4 Active Interaction and Learning
5.6.5 Balancing User Effort and Recognition Error
5.7 Conclusions
References
6 Interactive Machine Translation
6.1 Introduction
6.1.1 Statistical Machine Translation
6.2 Interactive Machine Translation
6.2.1 Interactive Machine Translation with Confidence Estimation
6.3 Search in Interactive Machine Translation
6.3.1 Word-Graph Generation
6.3.2 Error-Correcting Parsing
6.3.3 Search for n-Best Completions
6.4 Tasks, Experiments and Results
6.4.1 Pre- and Post-processing
6.4.2 Tasks
6.4.3 Evaluation Measures
6.4.4 Results
6.4.5 Results Using Confidence Information
6.5 Conclusions
References
7 Multi-Modality for Interactive Machine Translation
7.1 Introduction
7.2 Making Use of Weaker Feedback
7.2.1 Non-explicit Positioning Pointer Actions
7.2.2 Interaction-Explicit Pointer Actions
7.3 Correcting Errors with Speech Recognition
7.3.1 Unconstrained Speech Decoding (DEC)
7.3.2 Prefix-Conditioned Speech Decoding (DEC-PREF)
7.3.3 Prefix-Conditioned Speech Decoding (IMT-PREF)
7.3.4 Prefix Selection (IMT-SEL)
7.4 Correcting Errors with Handwritten Text Recognition
7.5 Tasks, Experiments and Results
7.5.1 Results when Incorporating Weaker Feedback
7.5.2 Results for Speech as Input Feedback
7.5.3 Results for Handwritten Text as Input Feedback
7.6 Conclusions
References
8 Incremental and Adaptive Learning for Interactive Machine Translation
8.1 Introduction
8.2 On-Line Learning
8.2.1 Concept of On-Line Learning
8.2.2 Basic IMT System
8.2.3 Online IMT System
8.3 Related Topics
8.3.1 Active Learning on IMT via Confidence Measures
8.3.2 Bayesian Adaptation
8.4 Results
8.5 Conclusions
References
9 Interactive Parsing
9.1 Introduction
9.2 Interactive Parsing Framework
9.3 Confidence Measures in IP
9.4 IP in Left-to-Right Depth-First Order
9.4.1 Efficient Calculation of the Next Best Tree
9.5 IP Experimentation
9.5.1 User Simulation Subsystem
9.5.2 Evaluation Metrics
9.5.3 Experimental Results
9.6 Conclusions
References
10 Interactive Text Generation
10.1 Introduction
10.1.1 Interactive Text Generation and Interactive Pattern Recognition
10.2 Interactive Text Generation at the Word Level
10.2.1 N-Gram Language Modeling
10.2.2 Searching for a Suffix
10.2.3 Optimal Greedy Prediction of Suffixes
10.2.4 Dealing with Sentence Length
10.2.5 Word-Level Experiments
10.3 Predicting at Character Level
10.3.1 Character-Level Experiments
10.4 Conclusions
References
11 Interactive Image Retrieval
11.1 Introduction
11.2 Relevance Feedback for Image Retrieval
11.2.1 Probabilistic Interaction Model
11.2.2 Greedy Approximation Relevance Feedback Algorithm
11.2.3 A Simplified Version of GARF
11.2.4 Experiments
11.2.5 Image Feature Extraction
11.2.6 Baseline Methods
11.2.7 Discussion
11.3 Multimodal Relevance Feedback
11.3.1 Fusion by Refining
11.3.2 Early Fusion
11.3.3 Late Fusion
11.3.4 Proposed Approach: Dynamic Linear Fusion
11.3.5 Experiments
11.3.6 Discussion
References
12 Prototypes and Demonstrators
12.1 Introduction
12.1.1 Passive, Left-to-Right Protocol
12.1.2 Passive, Desultory Protocol
12.1.3 Active Protocol
12.1.4 Prototype Evaluation
12.2 MM-IHT: Multimodal Interactive Handwritten Transcription
12.2.1 Prototype Description
12.2.2 Technology
12.2.3 Evaluation
12.3 IST: Interactive Speech Transcription
12.3.1 Prototype Description
12.3.2 Technology
12.3.3 Evaluation
12.4 IMT: Interactive Machine Translation
12.4.1 Prototype Description
12.4.2 Technology
12.4.3 Evaluation
12.5 ITG: Interactive Text Generation
12.5.1 Prototype Description
12.5.2 Technology
12.5.3 Evaluation
12.6 MM-IP: Multimodal Interactive Parsing
12.6.1 Prototype Description
12.6.2 Technology
12.6.3 Evaluation
12.7 GIDOC: GIMP-Based Interactive Document Transcription
12.7.1 Prototype Description
12.7.2 Technology
12.7.3 Evaluation
12.8 RISE: Relevant Image Search Engine
12.8.1 Prototype Description
12.8.2 Technology
12.8.3 Evaluation
12.9 Conclusions
References
Glossary
Index