E-book: Population Genomics with R

4.50/5 (2 ratings by Goodreads)

Emmanuel Paradis

Format: 394 pages
Publication date: 05-May-2020
Publisher: CRC Press
ISBN-13: 9780429882432

Other books on this subject:

Biology, life sciences

Format: PDF+DRM
Price: 62,60 €*
* This is the final price; no additional discounts apply.
This e-book is for personal use only. E-books cannot be returned, and payments for purchased e-books are not refunded.

DRM restrictions

Copying (copy/paste): not allowed
Printing: not allowed
Usage:

Digital Rights Management (DRM)
The publisher has supplied this book in encrypted form, which means that you need to install free software to unlock and read it. To read this e-book, you must create an Adobe ID. The e-book can be read and downloaded on up to 6 devices (by a single user with the same Adobe ID).

Required software
To read this e-book on a mobile device (phone or tablet), you will need to install this free app: PocketBook Reader (iOS / Android)

To download and read this e-book on a PC or Mac, you need Adobe Digital Editions (a free app designed specifically for e-books. It is not the same as Adobe Reader, which you may already have on your computer.)

You cannot read this e-book on an Amazon Kindle.

Population Genomics With R presents a multidisciplinary approach to the analysis of population genomics. The methods treated cover a large number of topics from traditional population genetics to large-scale genomics with high-throughput sequencing data. Several dozen R packages are examined and integrated to provide a coherent software environment with a wide range of computational, statistical, and graphical tools. Small examples are used to illustrate the basics and published data are used as case studies. Readers are expected to have a basic knowledge of biology, genetics, and statistical inference methods. Graduate students and postdoctoral researchers will find resources to analyze their population genetic and genomic data and to help them design new studies.

The first four chapters review the basics of population genomics, data acquisition, and the use of R to store and manipulate genomic data. Chapter 5 treats the exploration of genomic data, an important issue when analysing large data sets. The other five chapters cover linkage disequilibrium, population genomic structure, geographical structure, past demographic events, and natural selection. These chapters include supervised and unsupervised methods, admixture analysis, an in-depth treatment of multivariate methods, and advice on how to handle GIS data. The analysis of natural selection, a traditional issue in evolutionary biology, has seen a revival with modern population genomic data. All chapters include exercises. Supplemental materials are available online (http://ape-package.ird.fr/PGR.html).

Reviews

"The author has taken good care of including several important as well as emerging topics (data acquisition, next generation sequencing) that would be extremely useful for the readers. I suggest that this book be targeted to graduate students and researchers who have some background in basic genetics or are taking a graduate level population genetics course. The data acquisition chapter, descriptions of DNA sample quality, and file formats are the strengths. Case studies are very valuable and would provide more "hands-on" training on working on specific population genetics problems." -Santhosh Girirajan, Pennsylvania State University

"The strength of those chapters is to provide a global coverage of the field of population genetics based on a broad spectrum of statistical methods. The author proposes to deal with population genetic analyses in a unified programming framework that uses specific classes of the R packages ape/pegas and adegenet, and I was impressed by the work done." -Oliver Francois, University Grenoble Alpes

"This book could serve as both a reference book and a textbook. Population genetics, applied bioinformatics, genomics, molecular ecology, and conservation genetic classes with a lab component at both undergraduate and graduate levels could teach from this text. Graduate students and possible postdocs in evolutionary biology and applied bioinformatics could use this as a reference. Additionally, government and non-profit organizations that process genetic samples for conservation and management purposes would find this instruction useful. What this text offers is unique in that it is focused on practical steps to analyze data using already available programs that users can install. Given the variety of subjects and types of analyses, I think it could be a valuable resource for many students." -Sarah Hendricks, San Diego Zoo Institute for Conservation Research

"The book is a pearl among the plethora of related books: it is easy to follow, with well-motivated sections, and competently written. Each chapter is rounded up with a set of tasks/exercises to deepen and check the understanding of the skills and knowledge transferred in the corresponding chapter and all the required and important literature is cited to help the reader to dive deeper into the field of personal preference. The book has the right ratio of theoretical background information with references to relevant articles and has exceptionally good to follow examples and coding sections that can be used also by beginners to start analysing their own datasets by just following the complete examples and case studies. Overall, this is a highly recommended book to everyone in the field, beginners and advanced readers alike." -Daniel Fischer, in International Statistical Review, March 2022

"This book provides a complete and detailed step-by-step analysis of genotype research. It presents a fairly complete list of packages designed to process data in this field of research. The authors consider mainly the following packages for working with genetic data: pegas, adegenet, snpStats, ape, Biostrings, and flashpcaR." -Igor Malyk, International Society for Clinical Biostatistics, 71 June 2021

Preface
Symbol Description

1 Introduction
1.1 Heredity, Genetics, and Genomics
1.2 Principles of Population Genomics
1.2.1 Units
1.2.2 Genome Structures
1.2.3 Mutations
1.2.4 Drift and Selection
1.3 R Packages and Conventions
1.4 Required Knowledge and Other Readings

2 Data Acquisition
2.1 Samples and Sampling Designs
2.1.1 How Much DNA in a Sample?
2.1.2 Degraded Samples
2.1.3 Sampling Designs
2.2 Low-Throughput Technologies
2.2.1 Genotypes From Phenotypes
2.2.2 DNA Cleavage Methods
2.2.3 Repeat Length Polymorphism
2.2.4 Sanger and Shotgun Sequencing
2.2.5 DNA Methylation and Bisulfite Sequencing
2.3 High-Throughput Technologies
2.3.1 DNA Microarrays
2.3.2 High-Throughput Sequencing
2.3.3 Restriction Site Associated DNA
2.3.4 RNA Sequencing
2.3.5 Exome Sequencing
2.3.6 Sequencing of Pooled Individuals
2.3.7 Designing a Study With HTS
2.3.8 The Future of DNA Sequencing
2.4 File Formats
2.4.1 Data Files
2.4.2 Archiving and Compression
2.5 Bioinformatics and Genomics
2.5.1 Processing Sanger Sequencing Data With sangerseqR
2.5.2 Read Mapping With Rsubread
2.5.3 Managing Read Alignments With Rsamtools
2.6 Simulation of High-Throughput Sequencing Data
2.7 Exercises

3 Genomic Data in R
3.1 What is an R Data Object?
3.2 Data Classes for Genomic Data
3.2.1 The Class "loci" (pegas)
3.2.2 The Class "genind" (adegenet)
3.2.3 The Classes "SNPbin" and "genlight" (adegenet)
3.2.4 The Class "SnpMatrix" (snpStats)
3.2.5 The Class "DNAbin" (ape)
3.2.6 The Classes "XString" and "XStringSet" (Biostrings)
3.2.7 The Package SNPRelate
3.3 Data Input and Output
3.3.1 Reading Text Files
3.3.2 Reading Spreadsheet Files
3.3.3 Reading VCF Files
3.3.4 Reading PED and BED Files
3.3.5 Reading Sequence Files
3.3.6 Reading Annotation Files
3.3.7 Writing Files
3.4 Internet Databases
3.5 Managing Files and Projects
3.6 Exercises

4 Data Manipulation
4.1 Basic Data Manipulation in R
4.1.1 Subsetting, Replacement, and Deletion
4.1.2 Commonly Used Functions
4.1.3 Recycling and Coercion
4.1.4 Logical Vectors
4.2 Memory Management
4.3 Conversions
4.4 Case Studies
4.4.1 Mitochondrial Genomes of the Asiatic Golden Cat
4.4.2 Complete Genomes of the Fruit Fly
4.4.3 Human Genomes
4.4.4 Influenza H1N1 Virus Sequences
4.4.5 Jaguar Microsatellites
4.4.6 Bacterial Whole Genome Sequences
4.4.7 Metabarcoding of Fish Communities
4.5 Exercises

5 Data Exploration and Summaries
5.1 Genotype and Allele Frequencies
5.1.1 Allelic Richness
5.1.2 Missing Data
5.2 Haplotype and Nucleotide Diversity
5.2.1 The Class "haplotype"
5.2.2 Haplotype and Nucleotide Diversity From DNA Sequences
5.3 Genetic and Genomic Distances
5.3.1 Theoretical Background
5.3.2 Hamming Distance
5.3.3 Distances From DNA Sequences
5.3.4 Distances From Allele Sharing
5.3.5 Distances From Microsatellites
5.4 Summary by Groups
5.5 Sliding Windows
5.5.1 DNA Sequences
5.5.2 Summaries With Genomic Positions
5.5.3 Package SNPRelate
5.6 Multivariate Methods
5.6.1 Matrix Decomposition
5.6.1.1 Eigendecomposition
5.6.1.2 Singular Value Decomposition
5.6.1.3 Power Method and Random Matrices
5.6.2 Principal Component Analysis
5.6.2.1 adegenet
5.6.2.2 SNPRelate
5.6.2.3 flashpcaR
5.6.3 Multidimensional Scaling
5.7 Case Studies
5.7.1 Mitochondrial Genomes of the Asiatic Golden Cat
5.7.2 Complete Genomes of the Fruit Fly
5.7.3 Human Genomes
5.7.4 Influenza H1N1 Virus Sequences
5.7.5 Jaguar Microsatellites
5.7.6 Bacterial Whole Genome Sequences
5.7.7 Metabarcoding of Fish Communities
5.8 Exercises

6 Linkage Disequilibrium and Haplotype Structure
6.1 Why Linkage Disequilibrium is Important?
6.2 Linkage Disequilibrium: Two Loci
6.2.1 Phased Genotypes
6.2.1.1 Theoretical Background
6.2.1.2 Implementation in pegas
6.2.2 Unphased Genotypes
6.3 More Than Two Loci
6.3.1 Haplotypes From Unphased Genotypes
6.3.1.1 The Expectation-Maximization Algorithm
6.3.1.2 Implementation in haplo.stats
6.3.2 Locus-Specific Imputation
6.3.3 Maps of Linkage Disequilibrium
6.3.3.1 Phased Genotypes With pegas
6.3.3.2 SNPRelate
6.3.3.3 snpStats
6.4 Case Studies
6.4.1 Complete Genomes of the Fruit Fly
6.4.2 Human Genomes
6.4.3 Jaguar Microsatellites
6.5 Exercises

7 Population Genetic Structure
7.1 Hardy-Weinberg Equilibrium
7.2 F-Statistics
7.2.1 Theoretical Background
7.2.2 Implementations in pegas and in mmod
7.2.3 Implementations in snpStats and in SNPRelate
7.3 Trees and Networks
7.3.1 Minimum Spanning Trees and Networks
7.3.2 Statistical Parsimony
7.3.3 Median Networks
7.3.4 Phylogenetic Trees
7.4 Multivariate Methods
7.4.1 Principles of Discriminant Analysis
7.4.2 Discriminant Analysis of Principal Components
7.4.3 Clustering
7.4.4 Maximum Likelihood Methods
7.4.5 Bayesian Clustering
7.5 Admixture
7.5.1 Likelihood Method
7.5.2 Principal Component Analysis of Coancestry
7.5.3 A Second Look at F-Statistics
7.6 Case Studies
7.6.1 Mitochondrial Genomes of the Asiatic Golden Cat
7.6.2 Complete Genomes of the Fruit Fly
7.6.3 Influenza H1N1 Virus Sequences
7.6.4 Jaguar Microsatellites
7.7 Exercises

8 Geographical Structure
8.1 Geographical Data in R
8.1.1 Packages and Classes
8.1.2 Calculating Geographical Distances
8.2 A Third Look at F-Statistics
8.2.1 Hierarchical Components of Genetic Diversity
8.2.2 Analysis of Molecular Variance
8.3 Moran I and Spatial Autocorrelation
8.4 Spatial Principal Component Analysis
8.5 Finding Boundaries Between Populations
8.5.1 Spatial Ancestry (tess3r)
8.5.2 Bayesian Methods (Geneland)
8.6 Case Studies
8.6.1 Complete Genomes of the Fruit Fly
8.6.2 Human Genomes
8.7 Exercises

9 Past Demographic Events
9.1 The Coalescent
9.1.1 The Standard Coalescent
9.1.2 The Sequential Markovian Coalescent
9.1.3 Simulation of Coalescent Data
9.2 Estimation of θ
9.2.1 Heterozygosity
9.2.2 Number of Alleles
9.2.3 Segregating Sites
9.2.4 Microsatellites
9.2.5 Trees
9.3 Coalescent-Based Inference
9.3.1 Maximum Likelihood Methods
9.3.2 Analysis of Markov Chain Monte Carlo Outputs
9.3.3 Skyline Plots
9.3.4 Bayesian Methods
9.4 Heterochronous Samples
9.5 Site Frequency Spectrum Methods
9.5.1 The Stairway Method
9.5.2 CubSFS
9.5.3 Popsicle
9.6 Whole-Genome Methods (psmcr)
9.7 Case Studies
9.7.1 Mitochondrial Genomes of the Asiatic Golden Cat
9.7.2 Complete Genomes of the Fruit Fly
9.7.3 Influenza H1N1 Virus Sequences
9.7.4 Bacterial Whole Genome Sequences
9.8 Exercises

10 Natural Selection
10.1 Testing Neutrality
10.1.1 Simple Tests
10.1.2 Selection in Protein-Coding Sequences
10.2 Selection Scans
10.2.1 A Fourth Look at F-Statistics
10.2.2 Association Studies (LEA)
10.2.3 Principal Component Analysis (pcadapt)
10.2.4 Scans for Selection With Extended Haplotypes
10.2.5 FST Outliers
10.3 Time-Series of Allele Frequencies
10.4 Case Studies
10.4.1 Mitochondrial Genomes of the Asiatic Golden Cat
10.4.2 Complete Genomes of the Fruit Fly
10.4.3 Influenza H1N1 Virus Sequences
10.5 Exercises

A Installing R Packages
B Compressing Large Sequence Files
C Sampling of Alleles in a Population
D Glossary
Bibliography
Index

Emmanuel Paradis is a senior researcher at the French Institute of Research for Development (IRD). His research focuses on evolutionary models and their applications. The development and publication of software associated with his research has been an important aspect of his activities for more than twenty years. He adopted R as his main software for data analysis in 2000 and has since published and maintained several packages, including ape since 2002 and pegas since 2009. He regularly gives workshops and training courses in several countries.

Permanent link: https://www.kriso.lv/db/97804298824322e.html
