E-book, English, Volume 3, 246 pages
Chen: Exploitation of Linkage Learning in Evolutionary Algorithms
1st edition, 2010
Series: Adaptation, Learning, and Optimization
ISBN: 978-3-642-12834-9
Publisher: Springer
Format: PDF
Copy protection: 1 - PDF Watermark
One major branch of research on enhancing the performance of evolutionary algorithms is the exploitation of linkage learning. This monograph aims to capture the recent progress of linkage learning by compiling a series of focused technical chapters that keep abreast of developments and trends in the area. In evolutionary algorithms, linkage models the relations between decision variables, analogous to the genetic linkage observed in biological systems, and linkage learning connects computational optimization methodologies with natural evolution mechanisms. Exploiting linkage learning can enable us to design better evolutionary algorithms and potentially gain insight into biological systems. Linkage learning has the potential to become one of the dominant aspects of evolutionary algorithms; research in this area can yield promising results in addressing scalability issues.
Authors/Editors
Further Information & Material
Title Page (p. 2)
Preface (p. 6)
Contents (p. 8)
Part I: Linkage and Problem Structures (p. 10)
  Linkage Structure and Genetic Evolutionary Algorithms (p. 11)
    Introduction (p. 11)
    Test Problems (p. 12)
    Structure (p. 15)
      Modularity (p. 15)
      Degree Distribution (p. 15)
    Hill Climbing and Genetic Algorithm (p. 16)
    Compositional GEA (p. 19)
      Joins and Exchanges (p. 20)
      Mutation and Inter-level Conflict (p. 21)
      The J Algorithm (p. 23)
      Results (p. 25)
      Specificity (p. 27)
    Conclusion (p. 29)
    References (p. 29)
  Fragment as a Small Evidence of the Building Blocks Existence (p. 32)
    Introduction (p. 32)
    Fragment: A Simplified Definition of BBs (p. 34)
      Fragments and BBs (p. 36)
      Fragments and Linkage (p. 36)
    Operations on Fragments (p. 36)
      Fragment Identification (p. 36)
      Fragment Composition (p. 38)
    Experimental Settings and Results (p. 39)
      Test Problems (p. 40)
      Measurement (p. 41)
      Results (p. 42)
    Discussion and Conclusions (p. 47)
    References (p. 50)
  Structure Learning and Optimisation in a Markov Network Based Estimation of Distribution Algorithm (p. 52)
    Background (p. 52)
    EDAs and Approaches to Probabilistic Modelling (p. 55)
    Structure Learning in the DEUM Markov Network EDA (p. 57)
      How Good Is the Structure? (p. 58)
    Distribution Estimation Using Markov Networks (p. 60)
      General Model (p. 60)
      Fitness Prediction Correlation (p. 61)
    The Ising Problem and EDAs (p. 62)
    DEUM LDA (p. 62)
      Fitness Model (p. 63)
      Optimisation (p. 64)
    DEUM-X^2 (p. 65)
      The Algorithm (p. 65)
      Fitness Model (p. 66)
      Optimisation Results (p. 67)
    EVDEUM (p. 69)
      Fitness Model (p. 69)
      Optimisation Results (p. 71)
    Conclusion (p. 72)
    References (p. 73)
  DEUM – A Fully Multivariate EDA Based on Markov Networks (p. 77)
    Introduction (p. 78)
    Probabilistic Graphical Models in EDA (p. 79)
      Bayesian Networks (p. 79)
      Markov Networks (p. 80)
    Markov Network Based EDAs (p. 82)
      Global Markov Property Based EDAs (p. 82)
      Local Markov Property Based EDAs (p. 83)
    Fitness Modelling and DEUM Algorithms (p. 83)
      Fitness Modelling (p. 83)
      Univariate MFM in DEUM_pv and DEUM_d (p. 84)
      Multivariate MFM in Is-DEUM (p. 85)
      Estimating MRF Parameters (p. 86)
      Sampling Markov Networks (p. 86)
    A Fully Multivariate General DEUM Algorithm (p. 86)
      Estimation of Undirected Structure (p. 87)
      Finding Cliques and Assigning Potentials (p. 88)
      Sampling New Solutions (p. 89)
    Experimental Results (p. 91)
      Experimental Setup (p. 92)
      Results (p. 92)
      Analysis (p. 93)
    Conclusion (p. 95)
    References (p. 96)
Part II: Model Building and Exploiting (p. 100)
  Pairwise Interactions Induced Probabilistic Model Building (p. 101)
    Introduction (p. 101)
    Predicting Information Gain from Pairwise Interactions (p. 103)
      Information Gain on Binary Data (p. 104)
      General Measurement of Module-Wise Interactions (p. 107)
      Examples (p. 108)
    Case Study on eCGA (p. 109)
      Hybridization of eCGA (p. 111)
      Guided Linear Model Building (p. 112)
      Test Suite (p. 113)
      Performance of the Modified eCGA (p. 115)
    Extended Bayesian Model Building (p. 116)
      Multi-parent Search (p. 117)
      Test Samples (p. 118)
      Model Building Performance (p. 120)
    Conclusions (p. 124)
    References (p. 124)
  ClusterMI: Building Probabilistic Models Using Hierarchical Clustering and Mutual Information (p. 127)
    Introduction (p. 127)
    Background (p. 128)
    ClusterMI: A New Approach to Model Building in EDAs (p. 131)
    Results (p. 133)
    Future Work (p. 138)
    Conclusion (p. 139)
    References (p. 140)
  Estimation of Distribution Algorithm Based on Copula Theory (p. 142)
    Introduction (p. 142)
    A Brief Introduction to Copula Theory (p. 145)
      Definitions and Basic Properties (p. 145)
      Random Variable Generation (p. 146)
    Motivation (p. 147)
    Two-Dimensional Copula-EDAs (p. 148)
      Gaussian Copula-EDAs (p. 149)
      Archimedean Copula-EDAs (p. 154)
    High-Dimensional Copula-EDAs (p. 156)
      High-Dimensional Copula Constructions (p. 156)
      Copula-EDA Based on Empirical Copula (p. 159)
    Conclusion (p. 162)
    References (p. 163)
  Analyzing the k Most Probable Solutions in EDAs Based on Bayesian Networks (p. 166)
    Introduction (p. 167)
    Background (p. 168)
      Bayesian Networks (p. 168)
      Learning Bayesian Networks from Data (p. 168)
      Estimation of Distribution Algorithms Based on Bayesian Networks (p. 169)
      Abductive Inference and Most Probable Configurations (p. 170)
    Experimental Framework (p. 170)
      Problems (p. 171)
      Measurements (p. 171)
      Parameter Configuration (p. 172)
    Analyzing the k MPSs in Trap5 (p. 172)
      Trap5 Description (p. 172)
      Experimental Results (p. 173)
    Analyzing the k MPSs in Gaussian Ising (p. 177)
      2D Ising Spin Glass Description (p. 177)
      Experimental Results (p. 178)
    Analyzing k MPSs in ±J Ising (p. 180)
      ±J Ising Description (p. 180)
      Experimental Results (p. 181)
    Analyzing k MPSs in Max-SAT (p. 183)
      Max-SAT Description (p. 183)
      Experimental Results (p. 184)
    Related Works (p. 186)
    Conclusions (p. 187)
    References (p. 189)
Part III: Applications (p. 193)
  Protein Structure Prediction Based on HP Model Using an Improved Hybrid EDA (p. 194)
    Introduction (p. 194)
    Protein HP Model and EDAs (p. 196)
      Protein Folding and HP Model (p. 196)
      Estimation of Distribution Algorithms (p. 198)
    New Hybrid EDA for Protein Folding Based on HP Model (p. 200)
      Problem Representation for EDA (p. 200)
      The Probabilistic Model of EDA (p. 201)
      The Composite Fitness Function (p. 202)
      Local Search with Guided Operators (p. 204)
    Improved Backtracking-Based Repairing Method (p. 205)
      Backtracking Method (p. 205)
      Disadvantage of Traditional Backtracking-Based Method (p. 205)
      The Improved Method (p. 207)
    Experiments (p. 208)
      Problem Benchmark (p. 208)
      Results of the Hybrid EDA for HP Model (p. 209)
      Results of Comparing Computational Cost (p. 211)
    Conclusions and Further Work (p. 214)
    References (p. 214)
  Sensible Initialization of a Computational Evolution System Using Expert Knowledge for Epistasis Analysis in Human Genetics (p. 216)
    Introduction (p. 216)
    Computational Evolution System (p. 218)
      Solution Representation, Evaluation, and Selection (p. 218)
      Solution Operators (p. 220)
      Mutation Operators (p. 220)
    Population Initialization (p. 221)
    Data Simulation (p. 222)
    Experimental Design (p. 222)
    Results and Discussion (p. 222)
    Concluding Remarks (p. 225)
    References (p. 226)
  Estimating Optimal Stopping Rules in the Multiple Best Choice Problem with Minimal Summarized Rank via the Cross-Entropy Method (p. 228)
    Introduction (p. 228)
    The Multiple Best Choice Problem with Minimal Summarized Rank (p. 229)
    Cross-Entropy Method (p. 231)
    The Cross-Entropy Method for the Problem (p. 233)
    Numeric Results (p. 234)
    Genetic Algorithm (p. 238)
    Numeric Results of GA Process (p. 240)
    Conclusions (p. 241)
    References (p. 241)
Author Index (p. 243)
Index (p. 244)
"Protein Structure Prediction Based on HP Model Using an Improved Hybrid EDA (p. 193-194)
Benhui Chen and Jinglu Hu
Abstract. Protein structure prediction (PSP) is one of the most important problems in computational biology. This chapter introduces a novel hybrid Estimation of Distribution Algorithm (EDA) to solve the PSP problem on the HP model. Firstly, a composite fitness function containing information about the folding structure core (H-core) is introduced to replace the traditional fitness function of the HP model. The new fitness function is expected to select better individuals for the probabilistic model of the EDA. Secondly, local search with guided operators is utilized to refine the solutions found, improving the efficiency of the EDA. Thirdly, an improved backtracking-based repairing method is introduced to repair invalid individuals sampled from the probabilistic model of the EDA. It can significantly reduce the number of backtracking search operations and the computational cost for long-sequence proteins. Experimental results demonstrate that the new method outperforms the basic EDA method. At the same time, it is very competitive with other existing algorithms for the PSP problem on lattice HP models.
1 Introduction
Protein structure prediction (PSP) is one of the most important problems in computational biology. A protein is a chain of amino acids (also known as residues) that folds into a specific native tertiary structure under certain physiological conditions. Understanding protein structures is vital to determining the function of a protein and its interactions with DNA, RNA and enzymes. Knowledge of a protein's conformation can provide essential information for drug design and protein engineering.
While there are over a million known protein sequences, only a limited number of protein structures have been experimentally determined. Hence, prediction of protein structures from protein sequences using computer programs is an important step in unveiling proteins' three-dimensional conformations and functions. Because of the complexity of the PSP problem, simplified models like Dill's HP lattice model [17] have become the major tools for investigating general properties of protein folding. In the HP model, the 20-letter alphabet of residues is simplified to a two-letter alphabet, namely H (hydrophobic) and P (polar).
Experiments on small proteins suggest that the native state of a protein corresponds to a free energy minimum. This hypothesis is widely accepted, and forms the basis for computational prediction of a protein's conformation from its residue sequence. The problem of finding such a minimum-energy configuration has been proved to be NP-complete for the bi-dimensional (2-D) [8] and tri-dimensional (3-D) lattices [4]. Therefore, deterministic approaches are generally impractical for this problem."
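The 2-D lattice HP-model energy function described in the excerpt can be sketched as follows: a conformation is a self-avoiding walk on the square grid, and the energy decreases by one for every pair of H residues that are lattice neighbours but not adjacent along the chain. The sequence and fold below are toy examples chosen for illustration, not instances from the chapter's benchmark set.

```python
# Minimal sketch of the 2-D lattice HP model (toy example, not from the book).
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}

def fold(sequence, moves):
    """Place residues on the 2-D lattice; return None if the walk collides."""
    pos = (0, 0)
    coords = [pos]
    for m in moves:
        dx, dy = MOVES[m]
        pos = (pos[0] + dx, pos[1] + dy)
        if pos in coords:  # self-avoiding walk violated -> invalid individual
            return None
        coords.append(pos)
    return coords

def hp_energy(sequence, coords):
    """Energy = -1 per H-H pair on neighbouring lattice sites, excluding
    residues that are consecutive along the chain."""
    energy = 0
    index = {c: i for i, c in enumerate(coords)}
    for i, c in enumerate(coords):
        if sequence[i] != "H":
            continue
        for dx, dy in MOVES.values():
            j = index.get((c[0] + dx, c[1] + dy))
            # j > i + 1 counts each non-consecutive pair exactly once
            if j is not None and j > i + 1 and sequence[j] == "H":
                energy -= 1
    return energy

seq = "HPHPPHHPHH"              # toy 10-residue sequence
conf = fold(seq, "RRUULLULD")   # 9 moves place the 10 residues
if conf is not None:
    print(hp_energy(seq, conf))  # prints -1: one H-H topological contact
```

Finding the move string that minimizes this energy is the NP-complete search problem the chapter's hybrid EDA addresses; the repairing method mentioned in the abstract handles the `None` (collision) case that naive sampling frequently produces.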