E-book: Similar Languages, Varieties, and Dialects: A Computational Perspective

3.00/5 (2 ratings by Goodreads)

Edited by Marcos Zampieri (Rochester Institute of Technology, New York) and Preslav Nakov

Format: EPUB+DRM
Series: Studies in Natural Language Processing
Pub. Date: 02-Sep-2021
Publisher: Cambridge University Press
Language: eng
ISBN-13: 9781108584753

Other books in subject:

Artificial intelligence

Format - EPUB+DRM
Price: 83,97 €*
* the price is final i.e. no additional discount will apply
This ebook is for personal use only. E-Books are non-refundable.

DRM restrictions

Copying (copy/paste): not allowed
Printing: not allowed
Usage:

Digital Rights Management (DRM)
The publisher has supplied this book in encrypted form, which means that you need to install free software in order to unlock and read it. To read this e-book you have to create an Adobe ID. The ebook can be downloaded and read on up to 6 devices (single user with the same Adobe ID).

Required software
To read this ebook on a mobile device (phone or tablet) you'll need to install this free app: PocketBook Reader (iOS / Android)

To download and read this eBook on a PC or Mac you need Adobe Digital Editions (This is a free app specially developed for eBooks. It's not the same as Adobe Reader, which you probably already have on your computer.)

You can't read this ebook with Amazon Kindle

Language resources and computational models are becoming increasingly important for the study of language variation. A main challenge of this interdisciplinary field is that linguistics researchers may not be familiar with these helpful computational tools and many NLP researchers are often not familiar with language variation phenomena. This essential reference introduces researchers to the necessary computational models for processing similar languages, varieties, and dialects. In this book, leading experts tackle the inherent challenges of the field by balancing a thorough discussion of the theoretical background with a meaningful overview of state-of-the-art language technology. The book can be used in a graduate course, or as a supplementary text for courses on language variation, dialectology, and sociolinguistics or on computational linguistics and NLP. Part 1 covers the linguistic fundamentals of the field such as the question of status and language variation. Part 2 discusses data collection and pre-processing methods. Finally, Part 3 presents NLP applications such as speech processing, machine translation, and language-specific issues in Arabic and Chinese.

Reviews

'Variation is a key aspect of human language, and yet it has been too often overlooked in computational linguistics. The book edited by Marcos Zampieri and Preslav Nakov is an important step towards filling this gap with top-level contributions that offer a new alliance between natural language processing and linguistic theory to understand this complex phenomenon and its impact on applications.' Alessandro Lenci, University of Pisa

More info

Introduces core topics in language variation and the computational methods applied to similar languages, varieties, and dialects.

List of Contributors
Foreword
Introduction

Part I Fundamentals

1 Language Variation
James A. Walker
1.1 Introduction: Defining Language Variation
1.2 Types of Linguistic Variables
1.3 Dimensions of Variation
1.4 Conclusion

2 Phonetic Variation in Dialects
Rachael Tatman
2.1 Introduction
2.2 Vowels
2.3 Consonants
2.4 Suprasegmentals
2.5 Conclusion

3 Similar Languages, Varieties, and Dialects: Status and Variation
Miriam Meyerhoff, Steffen Klaere
3.1 Introduction
3.2 Language: More Than Communication
3.3 Language and Dialect
3.4 Creoles as a Class of Natural Languages
3.5 Introducing the Corpus of Spoken Bequia Creole English
3.6 Transforming Spoken Word into Categorical Data
3.7 Feature Interrelationship
3.8 Graphical Models to Visualise Interactions
3.9 Feature Relations at a Marginalised Level
3.10 Feature Interrelationship within Communities
3.11 Community Distinction
3.12 Speaker-Specific Creole Frequency
3.13 Conclusions: Principled Methods for Exploring Systematic Dialects within Languages

4 Mutual Intelligibility
Charlotte Gooskens, Vincent J. Van Heuven
4.1 Introduction
4.2 How to Measure Intelligibility
4.3 Extra-linguistic and Para-linguistic Factors Influencing Intelligibility
4.4 Linguistic Determinants of Intelligibility
4.5 Relationship between Intelligibility and Language Trees
4.6 Conclusions, Discussion, and Desiderata for Future Research

5 Dialectology for Computational Linguists
John Nerbonne, Wilbert Heeringa, Jelena Prokić, Martijn Wieling
5.1 Introduction
5.2 Dialectology
5.3 Dialectometry
5.4 Edit Distance on Phonetic Transcriptions
5.5 Geography of Distributions
5.6 Validation
5.7 Emerging Opportunities and Issues
5.8 Conclusions

Part II Methods and Resources

6 Data Collection and Representation for Similar Languages, Varieties and Dialects
Tanja Samardžić, Nikola Ljubešić
6.1 Representing Language Variability in Corpora
6.2 Types of Micro-Variation and the Corresponding Data Collection Procedures
6.3 Privacy and Linguistic Micro-Variation
6.4 Conclusion

7 Adaptation of Morphosyntactic Taggers
Yves Scherrer
7.1 Introduction
7.2 Model Transfer Methods
7.3 Normalization and Other Data Transfer Methods
7.4 Tagger Adaptation and Multilingual Models
7.5 Conclusions

8 Sharing Dependency Parsers between Similar Languages
Željko Agić
8.1 Introduction
8.2 A New Hope? Notable Exceptions
8.3 To Conclude: A Glimpse of the Future

Part III Applications and Language Specific Issues

9 Dialect and Similar Language Identification
Marcos Zampieri
9.1 Introduction
9.2 A Supervised Text Classification Problem
9.3 Collecting Data
9.4 Competitions
9.5 DSL Shared Task 2015
9.6 Conclusion and Future Perspectives

10 Dialect Variation on Social Media
Dong Nguyen
10.1 Introduction
10.2 Social Media for Dialect Research
10.3 Processing Data
10.4 Patterns in Social Media
10.5 Future Outlook

11 Machine Translation between Similar Languages
Preslav Nakov, Jörg Tiedemann
11.1 Introduction
11.2 Models and Approaches
11.3 Character-Level Machine Translation
11.4 Closely Related Languages as MT Pivots
11.5 Bitext Combination
11.6 Language Adaptation
11.7 Other Approaches
11.8 Applications and Future Directions
11.9 Conclusions

12 Automatic Spoken Dialect Identification
Pedro A. Torres-Carrasquillo, Bengt J. Borgström
12.1 Introduction
12.2 Background
12.3 Resources for Dialect Identification
12.4 State of the Art
12.5 Standardized Evaluations and Recent Performance
12.6 Challenges and Future Outlook

13 Arabic Dialect Processing
Nizar Habash
13.1 Introduction
13.2 Arabic and Its Variants
13.3 Challenges of Arabic Dialect Processing
13.4 Arabic Dialect Resources
13.5 Arabic Dialect Processing Tools and Applications
13.6 Conclusion and Outlook

14 Computational Processing of Varieties of Chinese: Comparable Corpus-Driven Approaches to Light Verb Variation
Menghan Jiang, Hongzhi Xu, Jingxia Lin, Dingxu Shi, Chu-Ren Huang
14.1 Introduction
14.2 Computational Approaches to Language Variations in Chinese and Other Languages
14.3 Classification of Varieties of Mandarin Chinese
14.4 Conclusion

Dr. Marcos Zampieri is an assistant professor at the Rochester Institute of Technology, where he teaches courses in linguistics and natural language processing. He received his PhD from Saarland University in Germany with a thesis on computational models applied to pluricentric languages. Dr. Zampieri is one of the organizers of the well-established VarDial workshop series on NLP for Similar Languages, Varieties, and Dialects. His research deals with the application of computational models to large collections of texts. He has worked on a variety of topics including language acquisition and variation, (machine) translation and post-editing, and social media mining.

Dr. Preslav Nakov is Principal Scientist at Qatar Computing Research Institute at Hamad Bin Khalifa University. He leads the Tanbih mega-project, developed in collaboration with MIT. He co-authored a book on Semantic Relations between Nominals, two books on computer algorithms, and many research papers in top-tier conferences and journals. He received the Young Researcher Award at RANLP'2011. He was also the first to receive the Bulgarian President's John Atanasoff award, named after the inventor of the first automatic electronic digital computer. Dr. Nakov's research was featured in over 100 news outlets, including Forbes, Boston Globe, and MIT Technology Review.

Permanent link: https://www.kriso.lv/db/97811085847536e.html
