E-book: Population Genomics with R

  • Format: 394 pages
  • Publication date: 05-May-2020
  • Publisher: CRC Press
  • ISBN-13: 9780429882432
  • Format - PDF+DRM
  • Price: 62,60 €*
  • * this is the final price, i.e., no additional discounts apply
  • This e-book is for personal use only. E-books cannot be returned, and payments for purchased e-books are not refunded.

DRM restrictions

  • Copying (copy/paste):

    not allowed

  • Printing:

    not allowed

  • Usage:

    Digital Rights Management (DRM)
    The publisher has supplied this book in encrypted form, which means that you need to install free software to unlock and read it. To read this e-book, you need to create an Adobe ID. The e-book can be read and downloaded on up to 6 devices (by a single user with the same Adobe ID).

    Required software
    To read this e-book on a mobile device (phone or tablet), you will need to install this free app: PocketBook Reader (iOS / Android)

    To download and read this e-book on a PC or Mac, you need Adobe Digital Editions (a free app developed specifically for e-books. It is not the same as Adobe Reader, which you probably already have on your computer.)

    You cannot read this e-book on an Amazon Kindle.

Population Genomics With R presents a multidisciplinary approach to the analysis of population genomics. The methods treated cover a large number of topics from traditional population genetics to large-scale genomics with high-throughput sequencing data. Several dozen R packages are examined and integrated to provide a coherent software environment with a wide range of computational, statistical, and graphical tools. Small examples are used to illustrate the basics and published data are used as case studies. Readers are expected to have a basic knowledge of biology, genetics, and statistical inference methods. Graduate students and postdoctoral researchers will find resources to analyze their population genetic and genomic data, as well as help in designing new studies.

The first four chapters review the basics of population genomics, data acquisition, and the use of R to store and manipulate genomic data. Chapter 5 treats the exploration of genomic data, an important issue when analysing large data sets. The other five chapters cover linkage disequilibrium, population genomic structure, geographical structure, past demographic events, and natural selection. These chapters include supervised and unsupervised methods, admixture analysis, an in-depth treatment of multivariate methods, and advice on how to handle GIS data. The analysis of natural selection, a traditional issue in evolutionary biology, has seen a revival with modern population genomic data. All chapters include exercises. Supplemental materials are available online (http://ape-package.ird.fr/PGR.html).

Reviews

"The author has taken good care of including several important as well as emerging topics (data acquisition, next generation sequencing) that would be extremely useful for the readers. I suggest that this book be targeted to graduate students and researchers who have some background in basic genetics or are taking a graduate level population genetics course. The data acquisition chapter, descriptions of DNA sample quality, and file formats are the strengths. Case studies are very valuable and would provide more "hands-on" training on working on specific population genetics problems." -Santhosh Girirajan, Pennsylvania State University

"The strength of those chapters is to provide a global coverage of the field of population genetics based on a broad spectrum of statistical methods. The author proposes to deal with population genetic analyses in a unified programming framework that uses specific classes of the R packages ape/pegas and adegenet, and I was impressed by the work done." -Oliver Francois, University Grenoble Alpes

"This book could serve as both a reference book and a textbook. Population genetics, applied bioinformatics, genomics, molecular ecology, and conservation genetic classes with a lab component at both undergraduate and graduate levels could teach from this text. Graduate students and possible postdocs in evolutionary biology and applied bioinformatics could use this as a reference. Additionally, government and non-profit organizations that process genetic samples for conservation and management purposes would find this instruction useful. What this text offers is unique in that it is focused on practical steps to analyze data using already available programs that users can install. Given the variety of subjects and types of analyses, I think it could be a valuable resource for many students." -Sarah Hendricks, San Diego Zoo Institute for Conservation Research

"The book is a pearl among the plethora of related books: it is easy to follow, with well-motivated sections, and competently written. Each chapter is rounded up with a set of tasks/exercises to deepen and check the understanding of the skills and knowledge transferred in the corresponding chapter and all the required and important literature is cited to help the reader to dive deeper into the field of personal preference. The book has the right ratio of theoretical background information with references to relevant articles and has exceptionally good to follow examples and coding sections that can be used also by beginners to start analysing their own datasets by just following the complete examples and case studies. Overall, this is a highly recommended book to everyone in the field, beginners and advanced readers alike." -Daniel Fischer, in International Statistical Review, March 2022

"This book provides a complete and detailed step-by-step analysis of genotype research. It presents a fairly complete list of packages designed to process data in this field of research. The authors consider mainly the following packages for working with genetic data: pegas, adegenet, snpStats, ape, Biostrings, and flashpcaR." -Igor Malyk, International Society for Clinical Biostatistics, 71, June 2021

Preface
Symbol Description
1 Introduction
1.1 Heredity, Genetics, and Genomics
1.2 Principles of Population Genomics
1.2.1 Units
1.2.2 Genome Structures
1.2.3 Mutations
1.2.4 Drift and Selection
1.3 R Packages and Conventions
1.4 Required Knowledge and Other Readings
2 Data Acquisition
2.1 Samples and Sampling Designs
2.1.1 How Much DNA in a Sample?
2.1.2 Degraded Samples
2.1.3 Sampling Designs
2.2 Low-Throughput Technologies
2.2.1 Genotypes From Phenotypes
2.2.2 DNA Cleavage Methods
2.2.3 Repeat Length Polymorphism
2.2.4 Sanger and Shotgun Sequencing
2.2.5 DNA Methylation and Bisulfite Sequencing
2.3 High-Throughput Technologies
2.3.1 DNA Microarrays
2.3.2 High-Throughput Sequencing
2.3.3 Restriction Site Associated DNA
2.3.4 RNA Sequencing
2.3.5 Exome Sequencing
2.3.6 Sequencing of Pooled Individuals
2.3.7 Designing a Study With HTS
2.3.8 The Future of DNA Sequencing
2.4 File Formats
2.4.1 Data Files
2.4.2 Archiving and Compression
2.5 Bioinformatics and Genomics
2.5.1 Processing Sanger Sequencing Data With sangerseqR
2.5.2 Read Mapping With Rsubread
2.5.3 Managing Read Alignments With Rsamtools
2.6 Simulation of High-Throughput Sequencing Data
2.7 Exercises
3 Genomic Data in R
3.1 What is an R Data Object?
3.2 Data Classes for Genomic Data
3.2.1 The Class "loci" (pegas)
3.2.2 The Class "genind" (adegenet)
3.2.3 The Classes "SNPbin" and "genlight" (adegenet)
3.2.4 The Class "SnpMatrix" (snpStats)
3.2.5 The Class "DNAbin" (ape)
3.2.6 The Classes "XString" and "XStringSet" (Biostrings)
3.2.7 The Package SNPRelate
3.3 Data Input and Output
3.3.1 Reading Text Files
3.3.2 Reading Spreadsheet Files
3.3.3 Reading VCF Files
3.3.4 Reading PED and BED Files
3.3.5 Reading Sequence Files
3.3.6 Reading Annotation Files
3.3.7 Writing Files
3.4 Internet Databases
3.5 Managing Files and Projects
3.6 Exercises
4 Data Manipulation
4.1 Basic Data Manipulation in R
4.1.1 Subsetting, Replacement, and Deletion
4.1.2 Commonly Used Functions
4.1.3 Recycling and Coercion
4.1.4 Logical Vectors
4.2 Memory Management
4.3 Conversions
4.4 Case Studies
4.4.1 Mitochondrial Genomes of the Asiatic Golden Cat
4.4.2 Complete Genomes of the Fruit Fly
4.4.3 Human Genomes
4.4.4 Influenza H1N1 Virus Sequences
4.4.5 Jaguar Microsatellites
4.4.6 Bacterial Whole Genome Sequences
4.4.7 Metabarcoding of Fish Communities
4.5 Exercises
5 Data Exploration and Summaries
5.1 Genotype and Allele Frequencies
5.1.1 Allelic Richness
5.1.2 Missing Data
5.2 Haplotype and Nucleotide Diversity
5.2.1 The Class "haplotype"
5.2.2 Haplotype and Nucleotide Diversity From DNA Sequences
5.3 Genetic and Genomic Distances
5.3.1 Theoretical Background
5.3.2 Hamming Distance
5.3.3 Distances From DNA Sequences
5.3.4 Distances From Allele Sharing
5.3.5 Distances From Microsatellites
5.4 Summary by Groups
5.5 Sliding Windows
5.5.1 DNA Sequences
5.5.2 Summaries With Genomic Positions
5.5.3 Package SNPRelate
5.6 Multivariate Methods
5.6.1 Matrix Decomposition
5.6.1.1 Eigendecomposition
5.6.1.2 Singular Value Decomposition
5.6.1.3 Power Method and Random Matrices
5.6.2 Principal Component Analysis
5.6.2.1 adegenet
5.6.2.2 SNPRelate
5.6.2.3 flashpcaR
5.6.3 Multidimensional Scaling
5.7 Case Studies
5.7.1 Mitochondrial Genomes of the Asiatic Golden Cat
5.7.2 Complete Genomes of the Fruit Fly
5.7.3 Human Genomes
5.7.4 Influenza H1N1 Virus Sequences
5.7.5 Jaguar Microsatellites
5.7.6 Bacterial Whole Genome Sequences
5.7.7 Metabarcoding of Fish Communities
5.8 Exercises
6 Linkage Disequilibrium and Haplotype Structure
6.1 Why Linkage Disequilibrium is Important?
6.2 Linkage Disequilibrium: Two Loci
6.2.1 Phased Genotypes
6.2.1.1 Theoretical Background
6.2.1.2 Implementation in pegas
6.2.2 Unphased Genotypes
6.3 More Than Two Loci
6.3.1 Haplotypes From Unphased Genotypes
6.3.1.1 The Expectation-Maximization Algorithm
6.3.1.2 Implementation in haplo.stats
6.3.2 Locus-Specific Imputation
6.3.3 Maps of Linkage Disequilibrium
6.3.3.1 Phased Genotypes With pegas
6.3.3.2 SNPRelate
6.3.3.3 snpStats
6.4 Case Studies
6.4.1 Complete Genomes of the Fruit Fly
6.4.2 Human Genomes
6.4.3 Jaguar Microsatellites
6.5 Exercises
7 Population Genetic Structure
7.1 Hardy-Weinberg Equilibrium
7.2 F-Statistics
7.2.1 Theoretical Background
7.2.2 Implementations in pegas and in mmod
7.2.3 Implementations in snpStats and in SNPRelate
7.3 Trees and Networks
7.3.1 Minimum Spanning Trees and Networks
7.3.2 Statistical Parsimony
7.3.3 Median Networks
7.3.4 Phylogenetic Trees
7.4 Multivariate Methods
7.4.1 Principles of Discriminant Analysis
7.4.2 Discriminant Analysis of Principal Components
7.4.3 Clustering
7.4.4 Maximum Likelihood Methods
7.4.5 Bayesian Clustering
7.5 Admixture
7.5.1 Likelihood Method
7.5.2 Principal Component Analysis of Coancestry
7.5.3 A Second Look at F-Statistics
7.6 Case Studies
7.6.1 Mitochondrial Genomes of the Asiatic Golden Cat
7.6.2 Complete Genomes of the Fruit Fly
7.6.3 Influenza H1N1 Virus Sequences
7.6.4 Jaguar Microsatellites
7.7 Exercises
8 Geographical Structure
8.1 Geographical Data in R
8.1.1 Packages and Classes
8.1.2 Calculating Geographical Distances
8.2 A Third Look at F-Statistics
8.2.1 Hierarchical Components of Genetic Diversity
8.2.2 Analysis of Molecular Variance
8.3 Moran's I and Spatial Autocorrelation
8.4 Spatial Principal Component Analysis
8.5 Finding Boundaries Between Populations
8.5.1 Spatial Ancestry (tess3r)
8.5.2 Bayesian Methods (Geneland)
8.6 Case Studies
8.6.1 Complete Genomes of the Fruit Fly
8.6.2 Human Genomes
8.7 Exercises
9 Past Demographic Events
9.1 The Coalescent
9.1.1 The Standard Coalescent
9.1.2 The Sequential Markovian Coalescent
9.1.3 Simulation of Coalescent Data
9.2 Estimation of θ
9.2.1 Heterozygosity
9.2.2 Number of Alleles
9.2.3 Segregating Sites
9.2.4 Microsatellites
9.2.5 Trees
9.3 Coalescent-Based Inference
9.3.1 Maximum Likelihood Methods
9.3.2 Analysis of Markov Chain Monte Carlo Outputs
9.3.3 Skyline Plots
9.3.4 Bayesian Methods
9.4 Heterochronous Samples
9.5 Site Frequency Spectrum Methods
9.5.1 The Stairway Method
9.5.2 CubSFS
9.5.3 Popsicle
9.6 Whole-Genome Methods (psmcr)
9.7 Case Studies
9.7.1 Mitochondrial Genomes of the Asiatic Golden Cat
9.7.2 Complete Genomes of the Fruit Fly
9.7.3 Influenza H1N1 Virus Sequences
9.7.4 Bacterial Whole Genome Sequences
9.8 Exercises
10 Natural Selection
10.1 Testing Neutrality
10.1.1 Simple Tests
10.1.2 Selection in Protein-Coding Sequences
10.2 Selection Scans
10.2.1 A Fourth Look at F-Statistics
10.2.2 Association Studies (LEA)
10.2.3 Principal Component Analysis (pcadapt)
10.2.4 Scans for Selection With Extended Haplotypes
10.2.5 FST Outliers
10.3 Time-Series of Allele Frequencies
10.4 Case Studies
10.4.1 Mitochondrial Genomes of the Asiatic Golden Cat
10.4.2 Complete Genomes of the Fruit Fly
10.4.3 Influenza H1N1 Virus Sequences
10.5 Exercises
A Installing R Packages
B Compressing Large Sequence Files
C Sampling of Alleles in a Population
D Glossary
Bibliography
Index

Emmanuel Paradis is a senior researcher at the French Institute of Research for Development (IRD). His research focuses on evolutionary models and their applications. The development and publication of software associated with his research has been an important aspect of his activities for more than twenty years. He adopted R as his main software for data analysis in 2000 and has since published and maintained several packages, including ape since 2002 and pegas since 2009. He gives regular workshops and training courses in several countries.