

A review of the application of text mining techniques in biology and biomedicine can be found in Ananiadou and McNaught. In addition to all these applications, computational techniques are used to solve other problems, such as efficient primer design for PCR, biological image analysis and backtranslation of proteins (which, given the degeneracy of the genetic code, is a complex combinatorial problem).

Machine learning consists in programming computers to optimize a performance criterion by using example data or past experience. The optimized criterion can be the accuracy of a predictive model (in a modelling problem) or the value of a fitness or evaluation function (in an optimization problem). In a modelling problem, the term 'learning' refers to running a computer program to induce a model by using training data or past experience.
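The idea of inducing a model by optimizing a performance criterion on training data can be illustrated with a deliberately small sketch. The model family (a single threshold on one numeric feature), the function names `accuracy` and `fit_threshold`, and the toy data are all assumptions made for illustration; none of them come from the text.

```python
def accuracy(threshold, samples):
    """Fraction of samples whose label equals the prediction (x >= threshold)."""
    correct = sum((x >= threshold) == label for x, label in samples)
    return correct / len(samples)


def fit_threshold(samples):
    """'Learn' a model: pick the observed value that maximizes training accuracy."""
    candidates = sorted({x for x, _ in samples})
    return max(candidates, key=lambda t: accuracy(t, samples))


# Hypothetical training data: (feature value, class label).
training = [(0.2, False), (0.5, False), (1.1, True),
            (1.4, True), (0.9, False), (1.6, True)]

best = fit_threshold(training)
print(best, accuracy(best, training))
```

Here the performance criterion is training accuracy and the induced model is the chosen threshold. In an optimization problem, the same search would be driven by a fitness or evaluation function instead.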

Microarray assays are the best known (but not the only) domain where complex experimental data are collected. Such data raise two different problems. First, the data need to be pre-processed, i.e. modified so that machine learning algorithms can use them. Second, the data must be analysed, and the appropriate analysis depends on what we are looking for. In the case of microarray data, the most typical applications are expression pattern identification, classification and genetic network induction.

Systems biology is another domain where biology and machine learning work together. The life processes that take place inside the cell are very complex to model. Thus, computational techniques are extremely helpful when modelling biological networks, especially genetic networks, signal transduction networks and metabolic pathways.

Evolution and, especially, phylogenetic tree reconstruction also take advantage of machine learning techniques. Phylogenetic trees are schematic representations of organisms' evolution. Traditionally, they were constructed according to different features (morphological features, metabolic features, etc.) but nowadays, with the great number of genome sequences available, phylogenetic tree construction algorithms are based on the comparison between different genomes. This comparison is made by means of multiple sequence alignment, where optimization techniques are very useful.

A side effect of the application of computational techniques to the increasing amount of data is an increase in available publications. This provides a new source of valuable information, where text mining techniques are required for knowledge extraction. Thus, text mining is becoming more and more interesting in computational biology, and it is being applied in functional annotation, cellular location prediction and protein interaction analysis.
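To make the link between sequence alignment and optimization concrete, the sketch below computes the optimal score of a global alignment between two sequences by dynamic programming (the Needleman-Wunsch recurrence). This is only the pairwise building block, not a multiple sequence alignment method, and the scoring values (`match=1`, `mismatch=-1`, `gap=-1`) and the function name are assumptions chosen for illustration.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the best global alignment score of sequences a and b."""
    # prev[j] holds the best score for aligning the processed prefix of a
    # with the first j characters of b.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against an empty prefix of b
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(
                prev[j - 1] + substitution,  # align a[i-1] with b[j-1]
                prev[j] + gap,               # gap in b
                curr[j - 1] + gap,           # gap in a
            ))
        prev = curr
    return prev[-1]


print(needleman_wunsch_score("ACGT", "AGT"))
```

The score being maximized plays the role of the evaluation function. Aligning many sequences at once enlarges the search space considerably, which is why heuristic optimization techniques are commonly used there.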
In proteomics, as in the case of genomics, machine learning techniques are applied for protein function prediction. Another interesting application of computational methods in biology is the management of complex experimental data.

If the genes contain the information, proteins are the workers that transform this information into life. Proteins play a very important role in the life process, and their three-dimensional (3D) structure is a key feature in their functionality. In the proteomic domain, the main application of computational methods is protein structure prediction. Proteins are very complex macromolecules with thousands of atoms and bonds. Hence, the number of possible structures is huge. This makes protein structure prediction a very complicated combinatorial problem where optimization techniques are required.
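A rough feel for why the number of possible structures is huge can be obtained from a toy model. The sketch below counts the conformations of a chain placed on a two-dimensional square lattice, where each residue occupies one site and no site may be used twice (a self-avoiding walk). The lattice model and the function name are assumptions made purely for illustration; real proteins are not restricted to a lattice.

```python
def count_self_avoiding_walks(steps):
    """Count self-avoiding walks with the given number of steps on a square lattice."""
    moves = ((1, 0), (-1, 0), (0, 1), (0, -1))

    def extend(pos, visited, remaining):
        if remaining == 0:
            return 1
        total = 0
        for dx, dy in moves:
            nxt = (pos[0] + dx, pos[1] + dy)
            if nxt not in visited:  # a lattice site can hold only one residue
                visited.add(nxt)
                total += extend(nxt, visited, remaining - 1)
                visited.remove(nxt)
        return total

    return extend((0, 0), {(0, 0)}, steps)


# A chain of n residues has n - 1 bonds, i.e. n - 1 steps.
for bonds in range(1, 8):
    print(bonds, count_self_avoiding_walks(bonds))
```

Even in this heavily simplified setting the count grows roughly geometrically with chain length, so exhaustive enumeration quickly becomes impractical and a search guided by an evaluation function is needed instead.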
