Preface | v
Acknowledgement | vii
About the Author | viii
1 Introduction to Modern Molecular Biology | 1 | (20)
1.1 Cells store large amounts of information in DNA | 1 | (6)
1.2 Cells process complex information | 7 | (5)
1.3 Cellular life is chemically complex and somewhat stochastic | 12 | (7)
1.4 Challenges in analyzing complex biodata | 19 | (1)
| 19 | (2)
| 21 | (39)
2.1 Primary sequence and structure data | 22 | (9)
2.1.1 DNA sequence databases | 22 | (5)
2.1.2 Protein sequence databases | 27 | (1)
2.1.3 Molecular structure databases | 28 | (3)
2.2 Secondary annotation data | 31 | (7)
| 32 | (3)
2.2.2 Gene function annotations | 35 | (1)
2.2.3 Genomic annotations | 36 | (1)
2.2.4 Inter-species phylogeny and gene family annotations | 36 | (2)
2.3 Experimental and personalized data | 38 | (10)
2.3.1 DNA expression profiles | 38 | (2)
2.3.2 Proteomics data and degradomics | 40 | (1)
2.3.3 Protein expression profiles, 2D gel and protein interaction data | 41 | (1)
2.3.4 Metabolomics and metabolic pathway databases | 42 | (2)
| 44 | (4)
2.4 Semantic and processed text data | 48 | (4)
| 49 | (2)
2.4.2 Text-mined annotation data | 51 | (1)
2.5 Integrated and federated databases | 52 | (3)
| 55 | (5)
3 Local Pattern Discovery and Comparing Genes and Proteins | 60 | (37)
3.1 DNA/RNA motif discovery | 64 | (14)
3.1.1 Single motif models: MEME, AlignAce etc. | 64 | (6)
3.1.2 Multiple motif models: LOGOS and MotifRegressor | 70 | (3)
3.1.3 Informative k-mers approach | 73 | (5)
3.2 Protein motif discovery | 78 | (6)
3.2.1 InterProScan and other traditional methods | 79 | (3)
3.2.2 Protein k-mer and other string based methods | 82 | (2)
3.3 Genetic algorithms, particle swarms and ant colonies | 84 | (4)
| 84 | (2)
3.3.2 Particle swarm optimization | 86 | (1)
3.3.3 Ant colony optimization | 87 | (1)
3.4 Sequence visualization | 88 | (2)
| 90 | (7)
4 Global Pattern Discovery and Comparing Genomes | 97 | (48)
4.1 Alignment-based methods | 98 | (10)
4.1.1 Pairwise genome-wide search algorithms: LAGAN, AVID etc. | 98 | (1)
4.1.2 Multiple alignment methods: MLAGAN, MAVID, MULTIZ etc. | 98 | (5)
| 103 | (1)
4.1.4 Visualization of genome comparisons | 104 | (1)
| 105 | (3)
4.2 Alignmentless methods | 108 | (17)
4.2.1 K-mer based methods | 109 | (5)
4.2.2 Average common substring and compressibility based methods | 114 | (3)
4.2.3 2D portraits of genomes | 117 | (8)
4.3 Genome scale non-sequence data analysis | 125 | (12)
4.3.1 DNA physical structure based methods | 125 | (6)
4.3.2 Secondary structure based comparisons | 131 | (6)
| 137 | (8)
5 Molecule Structure Based Searching and Comparison | 145 | (31)
5.1 Molecule structures as graphs or strings | 148 | (9)
5.1.1 3D to 1D transformations | 148 | (3)
5.1.2 Graph matching methods | 151 | (4)
5.1.3 Graph visualization | 155 | (1)
| 156 | (1)
5.2 RNA structure comparison and prediction | 157 | (5)
5.3 Image comparison based methods | 162 | (7)
5.3.1 Gabor filter based methods | 165 | (1)
5.3.2 Image symmetry set based methods | 166 | (2)
5.3.3 Other graph topology based methods | 168 | (1)
| 169 | (7)
6 Function Annotation and Ontology Based Searching and Classification | 176 | (36)
6.1 Annotation ontologies | 176 | (3)
6.2 Gene Ontology based mining | 179 | (3)
6.3 Sequence similarity based function prediction | 182 | (2)
6.4 Cellular location prediction | 184 | (2)
6.5 New integrative methods: Utilizing networks | 186 | (6)
6.6 Text mining bioliterature for automated annotation | 192 | (13)
6.6.1 Natural language processing (NLP) | 193 | (4)
| 197 | (2)
6.6.3 Matrix factorization methods | 199 | (6)
| 205 | (7)
7 New Methods for Genomics Data: SVM and Others | 212 | (33)
| 212 | (7)
| 219 | (2)
7.3 Methods for microarray data | 221 | (6)
7.3.1 Gene selection algorithms | 223 | (2)
7.3.2 Gene selection by consistency methods | 225 | (2)
7.4 Genome as a time series and discrete wavelet transform | 227 | (4)
7.5 Parameterless clustering for gene expression | 231 | (1)
7.6 Transductive confidence machines, conformal predictors and ROC isometrics | 232 | (4)
7.7 Text compression methods for biodata analysis | 236 | (2)
| 238 | (7)
8 Integration of Multimodal Data: Toward Systems Biology | 245 | (21)
8.1 Comparative genome annotation systems | 246 | (3)
8.2 Phylogenetics methods | 249 | (4)
8.3 Network inference from interaction and coexpression data | 253 | (5)
8.4 Bayesian inference, association rule mining and Petri nets | 258 | (4)
| 262 | (4)
| 266 | (31)
9.1 Network analysis methods | 266 | (3)
9.2 Unsupervised and supervised clustering | 269 | (1)
9.3 Neural networks and evolutionary methods | 270 | (3)
9.4 Semantic web and ontologization of biology | 273 | (4)
9.5 Biological data fusion | 277 | (2)
9.6 Rise of the GPU machines | 279 | (11)
| 290 | (7)
Index | 297