
E-book: Bioinformatics: High Performance Parallel Computer Architectures

Edited by Bertil Schmidt (Johannes Gutenberg University Mainz, Germany)
  • Format: 370 pages
  • Series: Embedded Multi-Core Systems
  • Publication date: 15-Jul-2010
  • Publisher: CRC Press Inc
  • Language: eng
  • ISBN-13: 9781439858363
  • Format: EPUB+DRM
  • Price: 219,15 €*
  • * this is the final price, i.e., no additional discounts apply
  • This e-book is for personal use only. E-books cannot be returned, and money paid for purchased e-books is not refunded.

DRM restrictions

  • Copying (copy/paste):

    not allowed

  • Printing:

    not allowed

  • Usage:

    Digital Rights Management (DRM)
    The publisher has supplied this book in encrypted form, which means that you need to install free software to unlock and read it. To read this e-book, you need to create an Adobe ID. The e-book can be read and downloaded on up to 6 devices (by a single user with the same Adobe ID).

    Required software
    To read this e-book on a mobile device (phone or tablet), you need to install this free app: PocketBook Reader (iOS / Android)

    To download and read this e-book on a PC or Mac, you need Adobe Digital Editions (a free app developed specifically for e-books. It is not the same as Adobe Reader, which is probably already on your computer.)

    You cannot read this e-book on an Amazon Kindle.

New sequencing technologies have broken many experimental barriers to genome-scale sequencing, leading to the extraction of huge quantities of sequence data. This expansion of biological databases established the need for new ways to harness and apply the astounding amount of available genomic information and convert it into substantive biological understanding.

A compilation of recent approaches from prominent researchers, Bioinformatics: High Performance Parallel Computer Architectures discusses how to take advantage of bioinformatics applications and algorithms on a variety of modern parallel architectures. Two factors continue to drive the increasing use of modern parallel computer architectures to address problems in computational biology and bioinformatics: high-throughput techniques for DNA sequencing and gene expression analysis, which have led to an exponential growth in the amount of digital biological data, and the multi- and many-core revolution within computer architecture.

Presenting key information about how to make optimal use of parallel architectures, this book:

  • Describes algorithms and tools including pairwise sequence alignment, multiple sequence alignment, BLAST, motif finding, pattern matching, sequence assembly, hidden Markov models, proteomics, and evolutionary tree reconstruction
  • Addresses GPGPU technology and the associated massively threaded CUDA programming model
  • Reviews FPGA architecture and programming
  • Presents several parallel algorithms for computing alignments on the Cell/BE architecture, including linear-space pairwise alignment, syntenic alignment, and spliced alignment
  • Assesses underlying concepts and advances in orchestrating the phylogenetic likelihood function on parallel computer architectures (ranging from FPGAs up to the IBM BlueGene/L supercomputer)
  • Covers several effective techniques to fully exploit the computing capability of many-core CUDA-enabled GPUs to accelerate protein sequence database searching, multiple sequence alignment, and motif finding
  • Explains a parallel CUDA-based method for correcting sequencing base-pair errors in HTSR data

Because the amount of publicly available sequence data is growing faster than single processor core performance speed, modern bioinformatics tools need to take advantage of parallel computer architectures. Now that the era of the many-core processor has begun, it is expected that future mainstream processors will be parallel systems. Beneficial to anyone actively involved in research and applications, this book helps you to get the most out of these tools and create optimal HPC solutions for bioinformatics.
Preface vii
Editor xi
Contributors xiii
1 Algorithms for Bioinformatics
1(28)
Bertil Schmidt
1.1 Introduction
1(1)
1.2 Pairwise Sequence Alignment
2(12)
1.2.1 Definitions and Notations
2(4)
1.2.2 DP for Optimal Pairwise Alignment with Linear Gap Penalty Function
6(4)
1.2.3 DP for Optimal Pairwise Alignment with Affine Gap Penalty Function
10(2)
1.2.4 Computing Alignments in Linear Space Using Divide and Conquer
12(2)
1.3 Multiple Sequence Alignment
14(8)
1.3.1 Background
14(4)
1.3.2 Progressive Alignment
18(4)
1.4 Database Search and Exact Matching
22(4)
1.4.1 Filtration
22(2)
1.4.2 Suffix Trees and Suffix Arrays
24(2)
1.5 References
26(3)
2 Introduction to GPGPUs and Massively Threaded Programming
29(20)
Robert M. Farber
2.1 Introduction
29(2)
2.2 Massive Multithreading Is the Key
31(4)
2.3 CUDA Simplifies the Creation of Massively Threaded Software
35(10)
2.3.1 Step 1: Getting (and Keeping) the Data on the GPU
38(1)
2.3.2 Step 2: Maximizing the Amount of Work Performed per Call to the GPU
39(2)
2.3.3 Step 3: Exploiting Internal Resources on the GPU
41(1)
2.3.3.1 Register and Shared Memory
42(1)
2.3.3.2 Constant Memory
43(1)
2.3.3.3 Texture Memory
43(1)
2.3.3.4 Global Memory
44(1)
2.3.3.5 Local Memory
45(1)
2.4 Visualization
45(1)
2.5 Conclusion
46(1)
2.6 References
47(2)
3 FPGA: Architecture and Programming
49(10)
Douglas Maskell
3.1 Introduction
49(1)
3.2 The Need for FPGA Computing
50(2)
3.3 FPGA Computing Architectures
52(2)
3.4 FPGA Development Tools
54(2)
3.5 Discussion
56(1)
3.6 References
56(3)
4 Parallel Algorithms for Alignments on the Cell BE
59(26)
Abhinav Sarje
Srinivas Aluru
4.1 Computing Alignments
61(2)
4.2 Sequence Alignments on the Cell Processor
63(1)
4.3 A Parallel Communication Scheme
63(5)
4.3.1 Tiling Scheme for Aligning Longer Sequences
65(1)
4.3.2 Computing the Optimal Alignment Score Using Tiling
66(1)
4.3.3 Computing an Optimal Alignment Using Tiling
67(1)
4.4 A Hybrid Parallel Algorithm
68(8)
4.4.1 Parallel Alignment Scheme Using Prefix Computations
68(2)
4.4.2 Problem Decomposition Using Wavefront Scheme
70(2)
4.4.3 Subproblem Alignment Phase Using Hirschberg's Technique
72(1)
4.4.4 Further Optimizations: Vectorization and Memory Management
73(1)
4.4.5 Space Usage
74(1)
4.4.6 Performance of the Hybrid Algorithm
74(2)
4.5 Algorithms for Specialized Alignments
76(6)
4.5.1 Spliced Alignments
76(2)
4.5.2 Performance of Parallel Spliced Alignment Algorithm
78(1)
4.5.3 Syntenic Alignments
79(1)
4.5.4 Performance of Parallel Syntenic Alignment Algorithm
80(2)
4.6 Ending Notes
82(1)
4.7 References
82(3)
5 Orchestrating the Phylogenetic Likelihood Function on Emerging Parallel Architectures
85(32)
Alexandros Stamatakis
5.1 Phylogenetic Inference
86(2)
5.2 The Phylogenetic Likelihood Function
88(9)
5.2.1 Avoiding Numerical Underflow
92(2)
5.2.2 Memory Requirements
94(1)
5.2.3 Single or Double Precision?
95(2)
5.3 Parallelization Strategies
97(10)
5.3.1 Parallel Programming Paradigms
98(1)
5.3.2 General Fine-Grain Parallelization
98(3)
5.3.2.1 A Library for the PLF
101(1)
5.3.2.2 Scalability Issues
102(1)
5.3.3 The Real World: Load Balance Issues
103(4)
5.4 Adaptations to Emerging Parallel Architectures
107(3)
5.5 Future Directions
110(2)
5.6 References
112(5)
6 Parallel Bioinformatics Algorithms for CUDA-Enabled GPUs
117(22)
Yongchao Liu
Bertil Schmidt
Douglas Maskell
6.1 Introduction
117(1)
6.2 Techniques for Many-Core GPUs
118(3)
6.2.1 Hybrid Computing Framework
118(1)
6.2.2 Intertask and Intratask Parallelization
119(1)
6.2.3 Coalesced Subject Sequence Arrangement
120(1)
6.2.4 Coalesced Global Memory Access
120(1)
6.2.5 Cell Block Division Method
121(1)
6.3 SW Database Search
121(2)
6.4 Multiple Sequence Alignment
123(7)
6.5 Motif Discovery
130(5)
6.6 Conclusion
135(1)
6.7 References
136(3)
7 CUDA Error Correction Method for High-Throughput Short-Read Sequencing Data
139(18)
Haixiang Shi
Weiguo Liu
Bertil Schmidt
7.1 Introduction
139(2)
7.2 Spectral Alignment Approach to Error Correction
141(2)
7.3 Parallel Error Correction with CUDA
143(6)
7.3.1 Bloom Filter Data Structure and Spectrum Computation
143(2)
7.3.2 Parallel Error Correction Using CUDA
145(2)
7.3.3 Execution Example
147(2)
7.4 Performance Evaluation
149(5)
7.5 Conclusion and Future Work
154(1)
7.6 References
155(2)
8 FPGA Acceleration of Seeded Similarity Searching
157(24)
Arpith C. Jacob
Joseph M. Lancaster
Jeremy D. Buhler
Roger D. Chamberlain
8.1 The BLAST Algorithm
161(4)
8.1.1 Seed Generation
162(1)
8.1.2 Ungapped Extension
163(1)
8.1.3 Gapped Extension
164(1)
8.1.4 Execution Profile of the BLAST Algorithm
164(1)
8.2 A Streaming Hardware Architecture for BLAST
165(9)
8.2.1 Seed Generation
166(1)
8.2.1.1 Nucleotide Seed Generation Architecture
166(2)
8.2.1.2 Protein Seed Generation
168(2)
8.2.2 Ungapped Extension
170(2)
8.2.3 Gapped Extension
172(2)
8.3 Results
174(3)
8.3.1 BLASTP
175(1)
8.3.2 BLASTN
176(1)
8.4 Conclusions
177(2)
8.5 References
179(2)
9 Seed-Based Parallel Protein Sequence Comparison Combining Multithreading, GPU, and FPGA Technologies
181(22)
Dominique Lavenier
Van-Hoa Nguyen
9.1 Introduction
181(3)
9.2 Principles of the Algorithm
184(6)
9.2.1 Overview
184(2)
9.2.2 Bank Indexing
186(2)
9.2.3 Ungap Extension
188(1)
9.2.4 Gap Extension
188(1)
9.2.5 Generic Hardware Implementation
189(1)
9.3 Parallelization
190(4)
9.3.1 UNGAP Parallelization on GPU
191(1)
9.3.2 UNGAP Parallelization on FPGA
191(2)
9.3.3 SMALL GAP Parallelization on GPU
193(1)
9.4 Comparison of the GPU/FPGA Technologies
194(3)
9.4.1 GPU Platform
194(1)
9.4.2 FPGA Platform
194(1)
9.4.3 Software and Dataset
195(1)
9.4.4 Comparison of the Execution Times
195(1)
9.4.5 GPU Implementation
196(1)
9.4.6 FPGA Implementation
197(1)
9.5 Conclusion
197(6)
9.6 References
200(3)
10 Database Searching with Profile-Hidden Markov Models on Reconfigurable and Many-Core Architectures
203(20)
John Paul Walters
Vipin Chaudhary
Bertil Schmidt
10.1 Introduction
203(1)
10.2 Background
204(5)
10.3 FPGA Parallelization and Results
209(6)
10.3.1 System Design
209(4)
10.3.2 Performance Evaluation
213(2)
10.4 GPU Parallelization and Results
215(5)
10.4.1 CUDA Hardware
216(1)
10.4.2 Results
216(1)
10.4.2.1 Database Sorting
217(1)
10.4.2.2 Memory Layout Optimizations
217(1)
10.4.2.3 Memory Hierarchy Optimizations
218(1)
10.4.2.4 Host Optimizations
219(1)
10.5 Discussion
220(1)
10.6 References
221(2)
11 Copacobana: A Massively Parallel FPGA-Based Computer Architecture
223(40)
Manfred Schimmler
Lars Wienbrandt
Tim Guneysu
Jost Bissel
11.1 Introduction
224(2)
11.1.1 History of Complexity
224(1)
11.1.2 Basic Idea of the Copacobana Series
225(1)
11.2 Copacobana 1000
226(5)
11.2.1 FPGA Module
227(1)
11.2.2 Backplane
228(1)
11.2.3 Interface Controller
229(1)
11.2.4 Application Development
230(1)
11.3 Cryptanalysis with Copacobana 1000
231(11)
11.3.1 Previous Work on DES Breaking
232(1)
11.3.2 Exhaustive Key Search on DES
233(2)
11.3.3 Breaking DES-Based Crypto Tokens
235(1)
11.3.3.1 Basics of Token-Based Data Authentication
235(2)
11.3.3.2 Cryptanalysis of the ANSI X9.9-Based Challenge-Response Authentication
237(1)
11.3.3.3 Possible Attack Scenarios on Banking Systems
238(1)
11.3.3.4 Implementing the Token Attack on Copacobana
239(3)
11.4 Copacobana 5000
242(6)
11.4.1 Direction toward New Applications
242(1)
11.4.2 Requirements
242(1)
11.4.3 Architecture of Copacobana 5000
243(1)
11.4.3.1 Bus Concept and Backplane
243(1)
11.4.3.2 FPGA Module
244(2)
11.4.3.3 Interface Controller
246(1)
11.4.3.4 Power Supply and Cooling Mechanism
246(1)
11.4.3.5 Application Development
247(1)
11.5 Applications in Bioinformatics
248(11)
11.5.1 Sequence Alignment
249(1)
11.5.1.1 Smith-Waterman Alignment
249(1)
11.5.1.2 Hardware Implementation
250(1)
11.5.1.3 Performance on Copacobana 5000
251(1)
11.5.2 Motif Finding
252(1)
11.5.2.1 The BMA Algorithm
253(1)
11.5.2.2 Implementation of BMA
253(2)
11.5.2.3 Parallelization of BMA in Hardware
255(2)
11.5.2.4 Performance Results of BMA
257(2)
11.5.3 Future Work
259(1)
11.6 References
259(4)
12 Accelerating String Set Matching for Bioinformatics Using FPGA Hardware
263(22)
Yoginder S. Dandass
12.1 Introduction
263(3)
12.1.1 String Matching Approaches
264(1)
12.1.2 Use of the ACA in Computational Biology
264(1)
12.1.3 Use of FPGAs in Computational Biology
265(1)
12.1.4 Use of String Set Matching in FPGAs in Other Domains
265(1)
12.2 Approach
266(3)
12.2.1 The Aho-Corasick Preprocessing Phase
266(3)
12.3 FPGA Implementation of the String Set Matching DFA
269(9)
12.3.1 Bit-Split DFA Architecture
270(3)
12.3.2 Implementing Bit-Split DFA Tables in FPGAs
273(3)
12.3.3 Analysis of DFA Storage Utilization Efficiency
276(1)
12.4 Case Study
277(1)
12.4.1 Storage Utilization
277(1)
12.4.2 Implementation Performance
278(2)
12.5 Conclusions
280(1)
12.6 References
281(4)
13 Reconfigurable Neural System and Its Application to Dimeric Protein Binding Site Identification
285(28)
Feng Lin
Maria Stepanova
13.1 Introduction
285(2)
13.2 Design of the Neural System
287(10)
13.2.1 Numerical Representation of DNA Sequences
287(1)
13.2.2 The FFNN
288(1)
13.2.3 The HNN
289(2)
13.2.4 Adaptation of the HNN
291(6)
13.3 Reconfigurable DP-HNN
297(5)
13.3.1 Representation of Numerical Values and Operations on FPGA
298(1)
13.3.2 Control and Matching Units
298(2)
13.3.3 Neuron and Memory Units
300(1)
13.3.4 Operation of DP-HNN
301(1)
13.4 Application to Dimeric Protein Binding Site Identification
302(5)
13.4.1 The Biological Problem
302(1)
13.4.2 Dimeric Structure of HREs
303(2)
13.4.3 Two-Phase Neural System for HRE Prediction
305(1)
13.4.4 Performance of the Hardware-Accelerated System
306(1)
13.5 Discussions
307(2)
13.6 References
309(4)
14 Parallel FPGA Search Engine for Protein Identification
313(24)
Daniel Coca
Istvan Bogdan
Robert J. Beynon
14.1 Introduction
313(2)
14.2 The Reconfigurable Computing Paradigm
315(2)
14.3 Protein Identification by Sequence Database Searching Using Mass Spectral Fingerprints
317(5)
14.3.1 Overview of the Approach
317(1)
14.3.2 Abstract Computational Model
318(1)
14.3.3 Cleavage Rules
318(3)
14.3.4 Protein Identification by Spectral Matching
321(1)
14.4 Reconfigurable Computing Platform
322(3)
14.5 Protein Sequence Database FPGA Search Engine
325(6)
14.5.1 Database Encoding
325(1)
14.5.2 Database Search Processor
326(1)
14.5.2.1 Digestion Unit
326(1)
14.5.2.2 Variable Modifications
327(1)
14.5.2.3 Scoring Unit
328(3)
14.6 Performance Evaluation
331(2)
14.7 References
333(4)
Index 337
Bertil Schmidt is Associate Professor at the School of Computer Engineering at Nanyang Technological University (NTU), Singapore. Prior to that, he was a faculty member at the University of New South Wales and Senior Researcher at the University of Melbourne, Australia. At NTU he also held appointments as Program Director of the M.Sc. in Bioinformatics and Deputy Director of BMERC. Before coming to Singapore, he held research appointments at the Karlsruhe Institute of Technology (KIT) and RWTH Aachen. Bertil has been involved in the design and implementation of parallel algorithms and architectures for over a decade. He has worked extensively with fine-grained (e.g., GPUs, FPGAs, Cell BE), coarse-grained (clusters, grids), and hybrid parallel architectures. He has successfully applied these technologies to various domains including bioinformatics, image processing, multimedia video compression, and cryptography. He has published more than 35 papers in leading journals such as Journal of VLSI Signal Processing, Microelectronic Engineering, IEEE Transactions on Circuits and Systems II, IEEE Transactions on Parallel and Distributed Systems, IEEE Transactions on IT in Biomedicine, Journal of Parallel and Distributed Computing, Parallel Computing, Concurrency and Computation: Practice and Experience, Future Generation Computer Systems, Bioinformatics, BMC Bioinformatics, Autoimmunity, and Computer Physics Communications.