E-book: Analyzing Textual Information: From Words to Meanings through Numbers

4.50/5 (4 ratings by Goodreads)

Johannes Ledolter, Lea S. VanderVelde

Format: PDF+DRM
Series: Quantitative Applications in the Social Sciences
Publication date: 05-May-2021
Publisher: SAGE Publications Inc
Language: English
ISBN-13: 9781544390048

Price: 34,49 €*
* This is the final price; no additional discounts apply.
This e-book is intended for personal use only. E-books cannot be returned, and money paid for purchased e-books is not refunded.

DRM restrictions

Copying (copy/paste): not allowed

Printing: not allowed

Usage: Digital Rights Management (DRM)

The publisher has supplied this book in encrypted form, which means that you need to install free software to unlock and read it. To read this e-book, you must create an Adobe ID. The e-book can be downloaded and read on up to 6 devices (by a single user with the same Adobe ID).

Required software

To read this e-book on a mobile device (phone or tablet), you will need to install this free app: PocketBook Reader (iOS / Android).

To download and read this e-book on a PC or Mac, you need Adobe Digital Editions, a free app designed specifically for e-books. It is not the same as Adobe Reader, which you may already have on your computer.

You cannot read this e-book on an Amazon Kindle.

Researchers in the social sciences and beyond are dealing more and more with massive quantities of text data requiring analysis, from historical letters to the constant stream of content in social media. Traditional texts on statistical analysis have focused on numbers, but this book will provide a practical introduction to the quantitative analysis of textual data. Using up-to-date R methods, this book will take readers through the text analysis process, from text mining and pre-processing the text to final analysis. It includes two major case studies using historical and more contemporary text data to demonstrate the practical applications of these methods. Currently, there is no introductory how-to book on textual data analysis with R that is up-to-date and applicable across the social sciences. Code and a variety of additional resources are available on an accompanying website for the book.

Reviews

The authors balance sophisticated analysis in R with the fundamentals of text mining so that all readers can understand and apply to their own analysis of text data. -- Matthew Eshbaugh-Soha

If you have a little experience with R, Ledolter and Vandervelde have created an accessible book for learning to analyze text. They provide a scaffolded experience with concrete examples and access to the text and code. They also provide technical information for those interested in a deeper dive of the material. Readers will feel comfortable analyzing their own text as they use the provided material and progress through the book. I will be adding this book to my applied practicum course. -- James B. Schreiber

Series Editor's Introduction

Preface

Acknowledgments

About the Authors

Chapter 1 Introduction

1.1 Text Data

1.1.1 Introducing the Definitions

1.1.2 Types of Text Data

1.1.3 File Formats to Save and Store Text Information

1.2 The Two Applications Considered in This Book

1.3 Introductory Example and Its Analysis Using the R Statistical Software

1.4 The Introductory Example Revisited, Illustrating Concordance and Collocation Using Alternative Software

1.5 Concluding Remarks

1.6 References

Chapter 2 A Description of the Studied Text Corpora and a Discussion of Our Modeling Strategy

2.1 Introduction to the Corpora: Selecting the Texts

2.2 Debates of the 39th U.S. Congress, as Recorded in the Congressional Globe

2.3 The Territorial Papers of the United States

2.4 Analyzing Text Data: Bottom-Up or Top-Down Analysis

2.5 References

Appendix to Chapter 2: The Complete Congressional Record

Chapter 3 Preparing Text for Analysis: Text Cleaning and Formatting

3.1 Text Cleaning

3.1.1 Compacting Multiple Word Sets Into a Single Word

3.2 Text Formatting

3.2.1 Formatting by Marking Versus Formatting by Deleting

3.2.2 Formatting Beyond Metavariables: Telling the Computer What Sections to Skip When Running the Analysis

3.3 Concluding Remarks

3.4 References

Chapter 4 Word Distributions: Document-Term Matrices of Word Frequencies and the "Bag of Words" Representation

4.1 Document-Term Matrices of Frequencies

4.1.1 Creating the Document-Term Matrix in R

4.1.2 Dropping Sparse Words That Do Not Occur in Many Documents

4.2 Displaying Word Frequencies

4.3 Co-Occurrence of Terms in the Same Document

4.4 The Zipf Law: An Interesting Fact About the Distribution of Word Frequencies

4.5 References

Chapter 5 Metavariables and Text Analysis Stratified on Metavariables

5.1 The Significance of Stratification and the Importance of Metavariables

5.2 Analysis of the Territorial Papers

5.2.1 Territorial Papers: Visualization of the Metavariables

5.2.2 Territorial Papers: Stratified Text Analysis

5.3 Analysis of Speeches From the 39th Congress

5.3.1 Speeches From the 39th Congress: Visualization of the Metavariables

5.3.2 Speeches From the 39th Congress: Stratified Text Analysis

5.4 References

Chapter 6 Sentiment Analysis

6.1 Lexicons of Sentiment-Charged Words

6.1.1 Attaching Sentiment to a Document

6.1.2 Sentiment Analysis for the Corpus and Its Documents

6.1.3 Importance of Sentiment Analysis

6.2 Applying Sentiment Analysis to the Letters of the Territorial Papers

6.3 Using Other Sentiment Dictionaries and the R Software tidytext for Sentiment Analysis

6.4 Concluding Remarks: An Alternative Approach for Sentiment Analysis

6.5 References

Chapter 7 Clustering of Documents

7.1 Clustering Documents

7.2 Measures for the Closeness and the Distance of Documents

7.3 Methods for Clustering Documents

7.3.1 Hierarchical Agglomerative Clustering and Dendrograms

7.3.2 k-Means Clustering

7.3.3 Additional Remarks

7.4 Illustrating Clustering Methods on a Simulated Example

7.5 References

Chapter 8 Classification of Documents

8.1 Introduction

8.2 Classification Procedures

8.2.1 The k-Nearest Neighbor Algorithm

8.2.2 Naive Bayesian Analysis

8.2.3 Fisher Linear Discriminant Method and Linear Scoring (SVM) Methods

8.2.4 Evaluating Classification Rules on Hold-Out Samples

8.3 Two Examples Using the Congressional Speech Database

8.4 Concluding Remarks on Authorship Attribution: Commenting on the Field of Stylometry

8.5 References

Chapter 9 Modeling Text Data: Topic Models

9.1 Topic Models

9.1.1 Some More Technical Details and a Brief Primer on Dirichlet Distributions

9.1.2 Model Extensions and Useful Software, With a Tip of the Hat to Their Developers

9.1.3 Further Comments

9.2 Fitting Topic Models to the Two Corpora Studied in This Book

9.2.1 Topic Models for the Corpus of the Territorial Papers

9.2.2 Topic Models for the Corpus of Speeches From the 39th U.S. Congress

9.3 References

Chapter 10 n-Grams and Other Ways of Analyzing Adjacent Words

10.1 Analysis of Bigrams

10.2 Text Windows to Measure Word Associations Within a Neighborhood of Words and a Discussion of the R Package text2vec

10.3 Illustrating the Use of n-Grams: Speeches of the 39th Congress

Chapter 11 Concluding Remarks

Appendix: Listing of Website Resources

Index

JOHANNES LEDOLTER holds professorships in both the Business School, where he is Robert Thomas Holmes Professor of Business Analytics, and the Department of Statistics and Actuarial Science at the University of Iowa. He is a Fellow of the American Statistical Association and the American Society for Quality, and an Elected Member of the International Statistical Institute. He is the author of several books, including Statistical Methods for Forecasting, Introduction to Regression Modeling, Testing 1-2-3: Experimental Design with Applications in Marketing and Service Operations, and Data Mining and Business Analytics with R. He was Professor of Statistics at the Vienna University of Economics and Business from 1997 to 2015, and held visiting professorships at Princeton, Yale, Stanford and the University of Chicago. Since 2011, he has been Associate Investigator at the Center for Prevention and Treatment of Vision Loss at the Iowa City VA Health Care System, which studies optic nerve and retinal disorders in relation to traumatic brain injury. Professor Ledolter enjoys working on multi-disciplinary projects that involve both numeric and text information.

LEA VANDERVELDE is Josephine Witte Professor of Law at the University of Iowa. She is an award-winning author in the fields of law and legal history. She is the author of several casebooks, dozens of articles in the nation's leading law journals, and two historical works, Mrs. Dred Scott and Redemption Songs: Suing for Freedom before Dred Scott. She has been the Guggenheim Fellow for Constitutional Studies and the May Brodbeck Humanities Fellow, and has held visiting professorships at Yale, the University of Pennsylvania, and the American Bar Foundation. She is director of the RAOS project, Reconstruction Amendment Optical Scanning, and principal investigator of the Law of the Frontier project at Stanford's CESTA. She has given professional lectures all over the world.

Permanent link: https://www.kriso.lv/db/97815443900482e.html
