E-Book, Englisch, Band 13, 613 Seiten
Elloumi / Küng / Linial Bioinformatics Research and Development
1. Auflage 2008
ISBN: 978-3-540-70600-7
Verlag: Springer Berlin Heidelberg
Format: PDF
Kopierschutz: 1 - PDF Watermark
Second International Conference, BIRD 2008, Vienna, Austria, July 7-9, 2008 Proceedings
E-Book, Englisch, Band 13, 613 Seiten
Reihe: Communications in Computer and Information Science
ISBN: 978-3-540-70600-7
Verlag: Springer Berlin Heidelberg
Format: PDF
Kopierschutz: 1 - PDF Watermark
This book constitutes the refereed proceedings of the Second International Bioinformatics Research and Development Conference, BIRD 2008, held in Vienna, Austria in July 2008. The 49 revised full papers presented were carefully reviewed and selected. 30 papers are organized in topical sections followed by 11 papers from the ALBIO workshop and 8 papers from the PETRIN workshop.
Autoren/Hrsg.
Weitere Infos & Material
1;Preface;5
2;Organization;6
3;Table of Contents;14
4;A Tree Index to Support Clustering Based Exploratory Data Analysis;19
4.1;Introduction;19
4.2;Methods;21
4.3;Results;24
4.3.1;Simulated Data;24
4.3.2;Real-World Data;25
4.4;Discussion;30
4.5;Availability;31
4.6;References;31
5;XMAS: An Experiential Approach for Visualization, Analysis, and Exploration of Time Series Microarray Data;34
5.1;Introduction;34
5.2;System Description;37
5.2.1;Data Sets;37
5.2.2;Data Preprocessing;38
5.2.3;Interoperable Data Operators, Visualization and HCI;38
5.2.4;Interestingness Evaluators;41
5.2.5;Interoperability, Interactivity and Extensibility of XMAS;42
5.3;Experimental Evaluation;43
5.3.1;Data Description;43
5.3.2;XMAS as a Visual Interactive Tool to Aid in Data Comprehension;43
5.3.3;Negative Expression Shift Approaching the End of Pregnancy;45
5.3.4;Interoperable Pathway Analysis;46
5.4;Conclusions;48
5.5;References;48
6;Clustering in a Fixed Manifold to Detect Groups of Genes with Similar Expression Patterns;50
6.1;Introduction;50
6.2;Statistical Model;51
6.3;Results;55
6.4;Conclusions and Future Works;59
6.5;References;59
7;Families of FPGA-Based Accelerators for BLAST Algorithm with Multi-seeds Detection and Parallel Extension;61
7.1;Introduction;61
7.2;BLAST Algorithm Overview;63
7.3;The Structure of Multi-seeds Detection and Parallel Extension Engine;64
7.3.1;Multi-seeds Detecting;65
7.3.2;Successive Seeds Merging;66
7.3.3;Multi-seeds Extension;66
7.4;FPGA Implementation and Optimization;66
7.4.1;Multi-seeds Detection Array;66
7.4.2;Decomposing the Detection Array;67
7.4.3;The Algorithm of Merging Successive Seeds;68
7.4.4;Multi-channel Parallel Extension Strategy;69
7.5;Experiments and Performance Comparison;69
7.5.1;Test 1: Comparing Synthesis Performance to Systolic Array Approaches;70
7.5.2;Test 2: Comparing Storage Requirement to Index-Based Approaches;70
7.5.3;Test 3: Comparing to Index-Based Hardware Accelerators;70
7.5.4;Test 4: Comparing Execution-Time to Software Version;71
7.6;Conclusion;74
7.7;References;74
8;Fast Structured Motif Search in DNA Sequences;76
8.1;Introduction;76
8.2;Background and Related Work;78
8.3;The STTD64 Index;81
8.4;The EMOS Algorithm;83
8.5;Experiments and Results;86
8.5.1;Comparison with SMOTIF1;86
8.5.2;Sources of EMOS Speedup;87
8.6;Conclusions and Future Work;89
8.7;References;90
9;A Discriminative Method for Protein Remote Homology Detection Based on N-nary Profiles;92
9.1;Introduction;92
9.2;Methods and Algorithms;94
9.2.1;Data Set;94
9.2.2;Generation of N-nary Profiles;94
9.2.3;Chi-Square Feature Selection;96
9.2.4;Construction of SVM Classifiers and Classification;96
9.2.5;Latent Semantic Analysis;97
9.2.6;Evaluation Methodology;98
9.3;Results and Discussion;98
9.3.1;Comparative Results of Various Methods;98
9.3.2;Computational Efficiency;102
9.4;Conclusion;102
9.5;References;103
10;Searching for Supermaximal Repeats in Large DNA Sequences;105
10.1;Introduction;105
10.2;Background and Related Work;106
10.2.1;STTD64 Index Representation;106
10.2.2;A Supermaximal Repeats Search Algorithm;108
10.2.3;Related Work;109
10.3;Our Technique for Supermaximal Repeats Search;110
10.3.1;POL Index Structure;110
10.3.2;POL Index Construction Algorithm;112
10.3.3;Our Supermaximal Repeats Search Algorithm;113
10.4;Experiments and Results;114
10.4.1;POL Index Construction;115
10.4.2;SMR Search Performance;116
10.4.3;Number of Supermaximal Repeats;117
10.5;Conclusions and Future Work;118
10.6;References;119
11;Motif Location Prediction by Divide and Conquer;120
11.1;Introduction;120
11.2;Related Work;123
11.3;The Proposed Methodology;124
11.3.1;Identifying Positions of Potential Candidate Motifs;124
11.4;Results;127
11.5;Discussion and Conclusions;128
11.6;References;129
12;Translational Control by RNA-RNA Interaction: Improved Computation of RNA-RNA Binding Thermodynamics;132
12.1;Introduction;132
12.2;Algorithm;135
12.2.1;Calculation of Accessibility;135
12.2.2;Free Energy of Interaction;137
12.3;Results;138
12.4;Conclusion;141
12.5;References;142
13;A Symmetry-Free Subspace for Ab initio Protein Folding Simulations;146
13.1;Introduction;146
13.2;Stochastic Search Algorithms for Protein Folding;147
13.2.1;Available Structure Representations;148
13.2.2;Move Sets for Protein Folding;149
13.2.3;Stochastic Algorithms;150
13.3;Proposed Subspace;150
13.4;Results;153
13.5;Discussion;156
13.6;References;156
14;Comparative Analysis of Disulfide Bond Determination Using Computational-Predictive Methods and Mass Spectrometry-Based Algorithmic Approach;158
14.1;Introduction;158
14.2;Previous Work;160
14.3;Disulfide Bond Determination Using Mass Spectrometry Data;161
14.3.1;Basic Definitions and Computational Formulation;161
14.4;Experimental Evaluation;164
14.4.1;Experimental Analysis of the Proposed Approach;165
14.4.2;Comparison with MS2Assign;166
14.4.3;Comparison with Predictive Methods;166
14.5;Conclusions;169
14.6;References;170
15;Exploration of Evolutionary Relations between Protein Structures;172
15.1;Introduction;172
15.2;Methods;174
15.2.1;Fold Mutations;174
15.2.2;CATH Fold Space;174
15.2.3;ESSM Algorithm for Recognition of Fold Mutations;175
15.2.4;Construction of Fold Space Graphs;176
15.3;Results;177
15.3.1;Distribution of Probabilities;177
15.3.2;Evolutionary Relationships between CATH Domains;178
15.3.3;Exploration of beta-Sheets;181
15.4;Summary and Conclusions;183
15.5;References;183
16;Two Local Search Methods for Protein Folding Simulation in the HP and the MJ Lattice Models;185
16.1;Introduction;185
16.2;HP and MJ Energy Functions;186
16.2.1;The HP Model;186
16.2.2;The Miyazawa-Jernigan Energy Function;187
16.3;Related Algorithmic Methods;188
16.4;Population-Based Local Search;188
16.5;Results on Benchmark Problems;192
16.5.1;The Benchmarks;192
16.5.2;HP Model Simulations;192
16.5.3;MJ Model Simulations;193
16.6;Concluding Remarks;195
16.7;References;195
17;A Robust Class of Stable Proteins in the 2D HPC Model;198
17.1;Introduction;198
17.2;Definitions;201
17.2.1;Hydrophobic-Polar-Cysteine (HPC) Model;201
17.2.2;Wave Structures;203
17.3;Stability of the Wave Structures;206
17.3.1;Saturated Folds;206
17.3.2;2DHPSolver: A Semi-automatic Prover;206
17.3.3;Proof;207
17.4;Conclusions;209
17.5;References;209
18;A Novel Adaptive Multiple Imputation Algorithm;211
18.1;Introduction;211
18.2;Methods;213
18.2.1;Adaptive Multiple Imputation Algorithm;213
18.3;Results and Discussion;215
18.4;Conclusion;221
18.5;References;222
18.6;Appendix;223
18.6.1;Dynamic Time Warping Algorithm;223
19;Topological Metrics in Blast Data Mining: Plasmid and Nitrogen-Fixing Proteins Case Studies;225
19.1;Introduction;225
19.2;Methods;227
19.3;Results and Discussion;229
19.3.1;Plasmids Network;229
19.3.2;Nitrogen Fixation Network;231
19.4;Conclusions;235
19.5;References;235
20;Gene Expression Mining for Cohesive Pattern Discovery;239
20.1;Introduction;239
20.2;Cohesion: A Concept;242
20.3;Association Discovery Using Cohesion;242
20.3.1;Identifying “100% Cohesive” Genesets;242
20.3.2;Cluster Formation;243
20.3.3;Intra-cluster Association Mining;244
20.3.4;Data Preparation and Illustration of dCARDIAC Algorithm;246
20.4;Experiment and Analysis;248
20.5;Conclusion and Future Plan;251
20.6;References;252
21;Identifying Subcellular Locations from Images of Unknown Resolution;253
21.1;Introduction;253
21.2;Methods;254
21.2.1;Processing Images of Unknown Resolution;254
21.2.2;Inferring Resolution;255
21.2.3;Evaluation of Resolution Prediction Schemes;256
21.2.4;Classification Pipeline;258
21.3;Discussion;259
21.4;References;259
22;E-BioFlow: Different Perspectives on Scientific Workflows;261
22.1;Introduction;261
22.2;Perspectives of a Workflow Model;263
22.3;E-BioFlow: A New Type of Workflow Design System;263
22.3.1;Three Perspectives to Design a Workflow;264
22.3.2;A Simple Life Science Case: Sequence Alignment;267
22.3.3;Dependencies among the Perspectives;267
22.3.4;An Architectural View and Some Implementation Details;269
22.4;Related Work;271
22.5;Discussion and Future Work;271
22.6;References;272
23;Knowledge Acquisition Focused Cooperative Development of Bio-ontologies – A Case Study with BIO2Me;276
23.1;Introduction;276
23.1.1;Bio-ontologies;277
23.2;Collaborative Ontology Engineering;282
23.2.1;Cooperation between Domain Experts and Ontology Designers;282
23.2.2;Knowledge Acquisition Focused Ontology Editing – A Case Study with BIO2Me;283
23.3;Conclusion and Future Works;288
23.4;References;289
24;Nested q-Partial Graphs for Genetic Network Inference from ”Small n, Large p” Microarray Data;291
24.1;Introduction;291
24.2;Graphical Models for Genetic Regulatory Network Inference;292
24.2.1;Independence Graphs;293
24.2.2;Concentration Graphs: Full-Order Partial Correlation Graphs;293
24.3;q-Partial Graphs: From Independence Graphs toGaussian Graphical Models;294
24.4;The q-Nested Procedure;297
24.4.1;Population Version;297
24.4.2;Sample Version;299
24.5;Experiments;300
24.5.1;Datasets;300
24.5.2;Methods;300
24.5.3;Validation;300
24.5.4;Results;301
24.6;Conclusion and Future Work;302
24.7;References;303
25;A Computational Method for Reconstructing Gapless Metabolic Networks;306
25.1;Introduction;306
25.2;Methods;309
25.2.1;Reaction Reachability;309
25.2.2;Metabolic Reconstruction Problem;310
25.2.3;Reaction and Network Scores;311
25.2.4;Divide-and-Conquer Approach;312
25.2.5;Coding of Reaction and Metabolite Evidence;312
25.3;Experiments;313
25.4;Discussion;317
25.5;References;318
26;Finding Frequent Subgraphs in Biological Networks Via Maximal Item Sets;321
26.1;Introduction;321
26.2;Graph Models of Metabolic Pathways;324
26.3;Maximal Frequent Subgraphs and Maximal Frequent Item Sets;325
26.4;An Algorithm for Finding Frequent Subgraphs Via Frequent Item Sets;327
26.5;Experimental Results and Discussion;328
26.5.1;Glutamate;329
26.5.2;Pyrimidine Pathway;332
26.5.3;Alanine-Aspartate Pathway;333
26.6;Conclusions;334
26.7;References;335
27;Multi-functional Protein Clustering in PPI Networks;336
27.1;Introduction;336
27.2;Approach Description;338
27.3;Related Work;340
27.4;Experimental Validation;341
27.4.1;Validation Metrics;342
27.4.2;Comparison of MF-PINCoC,MCODE, and RNSC;342
27.4.3;Multi-functional Proteins;344
27.5;Conclusions;346
27.6;References;347
28;Protein-Protein Interaction Network Querying by a “Focus and Zoom” Approach;349
28.1;Introduction;349
28.2;The Proposed Approach;350
28.2.1;Technical Details;354
28.3;Related Work;357
28.4;Experimental Results;358
28.4.1;Querying D. melanogaster and C. elegans by S. cerevisiae;359
28.4.2;Querying H. sapiens by S. cerevisiae;360
28.5;Concluding Remarks;363
28.6;References;363
29;Matching Spatial Regions with Combinations of Interacting Gene Expression Patterns;365
29.1;Introduction;365
29.2;Edinburgh Mouse Atlas Project;366
29.3;Combining Gene Expression Patterns;367
29.3.1;Defining the Interaction Patterns;367
29.3.2;A Grammar for Interactions;368
29.3.3;Matching Patterns;370
29.3.4;An Evolutionary Algorithm to Search for Sentences;370
29.4;Experiments and Results;371
29.4.1;Robustness of the Methodology;371
29.4.2;Matching Gene Expression Patterns from the Mouse Atlas;373
29.5;Discussion;378
29.6;References;379
30;Combining Molecular and Physiological Data of Complex Disorders;380
30.1;Introduction;380
30.2;State of the Art Clinical Bioinformatics;382
30.3;Methods and Algorithm;384
30.3.1;Data Used in This Study;385
30.4;Results;386
30.5;Discussion;390
30.5.1;Robust Disease Patterns in Serum of Schizophrenia and Affective Disorder;390
30.5.2;Towards Personalized Network Medicine;391
30.6;References;392
31;Combining Protein-Protein Interaction (PPI) Network and Sequence Attributes for Predicting Hypertension Related Proteins;395
31.1;Introduction;395
31.2;Computational Method;396
31.2.1;Dataset;396
31.2.2;Protein-Protein Interaction Network Properties;396
31.2.3;Hypertension Pathways and Protein Function;397
31.2.4;Classification;398
31.3;Results;399
31.3.1;Network Properties;400
31.3.2;Hypertension Pathways and Protein Function;402
31.3.3;Classification;405
31.4;Discussion;406
31.5;References;407
32;Genome Structure and Characterisation of an Endogenous Retrovirus from the Zebrafish Genome Project Database;410
32.1;Introduction;410
32.2;Materials and Methods;411
32.3;Results;412
32.4;Discussion and Conclusion;418
32.5;References;420
33;Bayesian Phylogeny on Grid;422
33.1;Introduction;422
33.2;Methodology;423
33.2.1;Generating Models of Evolution;423
33.2.2;Bayesian Phylogeny;425
33.2.3;State of Art, Potentialities of Grid Computing for Phylogeny;426
33.3;Results and Discussion;428
33.4;Conclusion;432
33.5;References;432
34;Comparison of Exact String Matching Algorithms for Biological Sequences;435
34.1;Introduction;435
34.2;Known Solutions for DNA Sequences;436
34.3;Variations of Our Algorithm;437
34.3.1;Our Earlier Algorithm;437
34.3.2;Variations for DNA;438
34.3.3;Variations for Amino Acids;439
34.4;Experimental Results;440
34.5;Conclusion;443
34.6;References;443
35;A New Approximation Algorithm for the Minimum Fragment Removal Problem;445
35.1;Introduction;445
35.2;Definitions and Notations;447
35.3;Fundamental Aspects;447
35.4;Algorithm;449
35.4.1;Construction of V1,V2,V3 and V4;449
35.4.2;Construction of a Bipartite Graph;449
35.5;Experimental Results;451
35.6;Conclusion;452
35.7;References;452
36;Indexing Factors in DNA/RNA Sequences;454
36.1;Introduction;454
36.2;Preliminaries;455
36.3;Truncated Generalized Factor Automaton;456
36.3.1;Generalized Suffix Automaton;457
36.3.2;On-Line Construction of Generalized Suffix Automaton;457
36.3.3;On-Line Construction of Truncated Generalized Suffix Automaton;460
36.4;Conclusion;463
36.5;References;463
37;Implementation of a Swap Matching Algorithm Using a Graph Theoretic Model;464
37.1;Introduction;464
37.1.1;Biological Motivation;465
37.2;Preliminaries;466
37.2.1;Basic Definitions;466
37.2.2;Review of the Model and Algorithm of [6];467
37.3;Implementation;469
37.4;Results;471
37.4.1;Main Results;471
37.4.2;Other Results;471
37.5;Conclusion;472
37.6;References;473
38;Suffix Tree Characterization of Maximal Motifs in Biological Sequences;474
38.1;Introduction;474
38.2;Preliminary Definitions;475
38.3;Characterization of Motif Maximality on Suffix Tree;477
38.4;Inferring Maximal Motifs with Suffix Tree;480
38.5;Conclusions;483
38.6;References;483
39;Efficient Seeding Techniques for Protein Similarity Search;484
39.1;Introduction;484
39.2;Preliminaries;486
39.3;Dominating Seed Letters;487
39.4;Transitive Seed Alphabets;488
39.4.1;Transitive Alphabets Based on a Pre-defined Clustering;489
39.4.2;Transitive Alphabets Using an ab initio Clustering Method;490
39.4.3;Non-hierarchical Alphabets;491
39.5;Experiments;492
39.5.1;Probability Assignment and Alphabet Generation;492
39.5.2;Seed Design;493
39.5.3;Blastp and the Vector Seed Family from [4];493
39.5.4;Results;493
39.6;Conclusion;495
39.7;References;496
40;An Algorithm for Multiple and Global Alignments;497
40.1;Introduction;497
40.2;Definitions and Notations;498
40.3;Construction of a MGA;499
40.4;Experimental Results;500
40.5;Conclusion;504
40.6;References;504
41;LASA: A Tool for Non-heuristic Alignment of Multiple Sequences;507
41.1;Introduction;507
41.2;Previous Work;509
41.3;The Extended Pairwise Alignment Problem;510
41.4;Improving the Lagrangian Relaxation Bound;512
41.5;Experiments;514
41.6;Conclusion;515
41.7;References;516
42;SVM-Based Local Search for Gene Selection and Classification of Microarray Data;517
42.1;Introduction;517
42.2;SVM Classification and Gene Selection;518
42.2.1;Support Vector Machines;518
42.2.2;Gene Ranking by SVM;519
42.3;SVM-LS for Gene Selection and Classification;520
42.3.1;Representation and Search Space;520
42.3.2;Evaluation Function;520
42.3.3;Move and Neighborhood;521
42.3.4;Local Search Algorithms;521
42.3.5;Initial Solution;522
42.3.6;The General SVM-LS Procedure;522
42.4;Experimental Results;523
42.4.1;Data Sets;523
42.4.2;Protocol for Experimentations and Comparison Criteria;523
42.4.3;Results and Comparisons;524
42.5;Conclusion;525
42.6;References;526
43;GENFOCS – A Comparative Tool on Gene Finding with Sensitivity and Specificity;527
43.1;Introduction;527
43.1.1;GENFOCS – Introduction;527
43.2;Methodology;528
43.2.1;GENFOCS – Parsing the Predicted Outputs;529
43.2.2;GENFOCS Visual Representations;529
43.2.3;GENFOCS – Sensitivity and Specificity;531
43.3;Discussion;532
43.4;Conclusion and Future Works;533
43.5;References;534
44;Gene Machine© – A Hardware/Software Platform for Analyzing Genome Data;535
44.1;Introduction;535
44.2;Left-to-Right Hidden Markov Models [LMM];536
44.3;Architectural Details;536
44.3.1;Vector Modules Description;537
44.3.2;Storage Structure and Register Organization;539
44.4;Instruction Set;539
44.5;Data Types;539
44.5.1;Nucleotide Data Input;540
44.5.2;Codon Sequences as Data Input;540
44.5.3;Amino Acids Data Input for Interpreting Secondary Structures;540
44.5.4;Amino Acids Data Input Interpreted as Hydrophobic/Hydrophilic Entities;540
44.5.5;Amino Acids Interpreted on the Basis of Side Chains;541
44.6;Logic Operators;541
44.6.1;Unary Operators;541
44.6.2;Binary Operators;542
44.7;Execution and Performance Considerations;543
44.8;Concluding Remarks;544
44.9;References;545
45;Complex Representation of DNA Sequences;546
45.1;Introduction;546
45.2;Complex Indicator Function;547
45.3;Wavelet Analysis of the DNA Random Walk;548
45.4;Short Haar Wavelet Transform;550
45.5;Cluster Analysis of the Wavelet Coefficients;552
45.6;Comparison with a Random Sequence;553
45.7;Conclusion;554
45.8;References;554
46;Wavelet Analysis of Impulses in Axon Physiology;556
46.1;Introduction;556
46.2;Preliminary Remarks on the Short Haar Wavelet Transform;557
46.3;Fitzhugh-Nagumo System;558
46.4;Critical Analysis;560
46.5;References;562
47;Acquisition and Algorithms for Fluorescence Applications;564
47.1;Introduction;564
47.2;General Considerations;564
47.3;Mathematical Model;567
47.4;Conclusions;573
47.5;References;573
48;Control of the Lasers Dynamics by Pumping Modulation for Biophotonics Applications;574
48.1;Introduction;574
48.2;The Single-Mode Laser Model;575
48.3;The Two-Mode Laser Model;576
48.4;Conclusion;580
48.5;References;580
49;Quantum Well Lasers for Medical Industry;581
49.1;Introduction;581
49.2;PhysicalModel;582
49.2.1;Rate Equations;582
49.2.2;The Stationary State;583
49.2.3;Transfer Function;584
49.3;The Modulated MQW Laser;586
49.4;Conclusion;587
49.5;References;587
50;Computational Model for the Study of the Fractal Parameters Characteristic for the Surface of Titanium Dental Implants;589
50.1;Introduction;589
50.2;Experimental Data for the Size Dependence of Fracture Parameters of Some Concrete and Rock Specimen and for Some Titanium Implant Coated with Hidroxiapatite;590
50.3;Obtained Results;590
50.4;Ultrasonic Waves, Generated by Heat Sources;592
50.5;Some Elements on the Physical Similitude and Fractals Theory;594
50.6;(Multi)Fractal Scaling and Frequency Power Laws;595
50.7;Conclusions;596
50.8;References;597
51;VEMS-TM – A Management Infrastructure;599
51.1;Introduction;599
51.2;VEMS-TM Architecture Overview;599
51.3;Software System Lifecycle Structure Covered by VEMS-TM;601
51.4;Software Project Management from a Training Perspective;603
51.5;References;604
52;PTF Model for Evaluating the Relationships between the VAC and Antimicrobial Resistance in Human Communities;605
52.1;Introduction;605
52.2;Asymmetrical Pulses Obtained as Derivatives of Symmetrical Functions;607
52.3;Conclusions;610
52.4;References;610
53;Author Index;612




