|Author: Cèsar Ferri|
|Title: Multi-Paradigm Learning of Declarative Models|
|Language of presentation: Spanish|
|Promotor: Prof. José Hernández-Orallo and Prof. María José Ramírez|
|Date of defense: May 2003|
|Institution granting degree: Universidad Politécnica de Valencia, Spain|
In this thesis, we present contributions to the field of machine learning that partially address several of its open problems. Specifically, we introduce learning methods that increase the expressiveness of the representation language, improve the accuracy of the learned models, and reduce the costs inherent in the learning process and in the application of the learned models.
The first contribution is the definition of an induction framework based on multi-paradigm languages. These languages combine the best features of the two main families of declarative paradigms: functional programming and logic programming. Using languages of this kind for learning provides advantages such as the availability of types and functions. Our new learning framework has been implemented in a machine learning system, and we include several experiments and results obtained with it.
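One benefit of typed, function-based representations is that type information can prune the hypothesis space before any example is tested. The following sketch illustrates that general idea only; it is not the thesis's induction algorithm, and all names (the operation table, `induce`) are hypothetical:

```python
# Illustrative sketch: type signatures prune the hypothesis space, as
# typed multi-paradigm (functional-logic) learners can exploit.
# Background knowledge: candidate operations with their type signatures.
OPERATIONS = {
    "double": (("int",), "int", lambda x: 2 * x),
    "negate": (("int",), "int", lambda x: -x),
    "upper":  (("str",), "str", lambda s: s.upper()),
}

def candidates(arg_type, result_type):
    """Keep only the operations whose signature matches the target type."""
    return [name for name, (args, res, _) in OPERATIONS.items()
            if args == (arg_type,) and res == result_type]

def induce(examples, arg_type, result_type):
    """Return the first type-compatible operation consistent with all examples."""
    for name in candidates(arg_type, result_type):
        fn = OPERATIONS[name][2]
        if all(fn(x) == y for x, y in examples):
            return name
    return None

# A typed learner only ever tests 'double' and 'negate' for an int -> int
# target concept; 'upper' is discarded without evaluating any example.
print(induce([(1, 2), (3, 6)], "int", "int"))  # -> double
```

Here the type discipline cuts the search from three candidates to two; in a realistic functional-logic learner the same pruning applies at every step of a much larger search.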
Learning with very expressive representation languages makes it possible to deal with problems having a complex structure. However, this considerably increases the search space of the learning algorithms, thus reducing their efficiency. Therefore, problems with a huge amount of data or attributes, typical of data mining, are better addressed by approaches based on simple representation languages, such as propositional languages. It would thus be useful to design methods that obtain comprehensible models whose accuracy improves with the computational resources granted to the algorithm. In this context, this thesis introduces the definition of decision multi-trees and presents several applications of this structure. First, we define an anytime algorithm that uses the multi-tree structure to generate decision trees, gradually improving their accuracy and comprehensibility with respect to classifiers constructed by classical decision-tree learning algorithms. The multi-tree structure is also employed as an ensemble method by combining the hypotheses contained in the multi-tree. The major advantage of this method over other ensemble algorithms is the substantial saving in computational resources, because, by construction, the hypotheses in a multi-tree share common parts. In addition, we include a novel method for generating a comprehensible hypothesis that is semantically similar to the combined model, which requires even fewer resources.
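The saving comes from the shape of the structure itself: where an ordinary tree commits to one split per node, a multi-tree keeps several alternative splits whose subtrees are shared, so one structure compactly represents many trees. The sketch below illustrates only this structural idea, not the thesis's construction algorithm; the class and field names are invented for the example:

```python
# Illustrative sketch of a decision multi-tree (an AND/OR structure).
# Each internal node keeps SEVERAL alternative splits; their subtrees
# are shared, so one structure encodes many ordinary decision trees.
class MultiNode:
    def __init__(self, alternatives=None, label=None):
        self.alternatives = alternatives or []  # list of (test, {value: MultiNode})
        self.label = label                      # class label at leaves

def n_trees(node):
    """Count how many single decision trees the multi-tree represents."""
    if node.label is not None:
        return 1
    total = 0
    for _test, children in node.alternatives:
        combinations = 1
        for child in children.values():
            combinations *= n_trees(child)  # choices multiply along a split
        total += combinations               # alternatives add at the node
    return total

# Two alternative root splits that SHARE the same two leaf nodes:
leaf_yes, leaf_no = MultiNode(label="yes"), MultiNode(label="no")
root = MultiNode(alternatives=[
    ("outlook",  {"sunny": leaf_no, "rain":   leaf_yes}),
    ("humidity", {"high":  leaf_no, "normal": leaf_yes}),
])
print(n_trees(root))  # -> 2 trees stored in one shared structure
```

Because alternatives multiply down the tree, a multi-tree with a handful of suspended splits per node can represent an exponential number of trees while storing each shared subtree once; this is what makes both the anytime search and the ensemble combination cheap relative to building the trees independently.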
The real usefulness of a learning method depends not only on obtaining expressive, comprehensible and accurate models, but also on reducing the costs related to the context in which they are going to be applied. Examples of these costs are misclassification costs and the costs of applying the attribute tests needed to compute the predictions of a classifier (test costs). In this thesis, we introduce a technique based on the Area Under the Receiver Operating Characteristic (ROC) Curve (AUC) to efficiently evaluate the quality of a rule-based binary classifier independently of the cost context. We also use the efficiency of this measure to derive a new splitting criterion that constructs decision trees based on ROC analysis. We extend this criterion to more than two classes, using approximations of the AUC for multi-class problems. Finally, we investigate several techniques to reduce the costs associated with the tests of classifiers, and we study the integration of all the above techniques.
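The reason AUC evaluates a classifier independently of the cost context is that it measures ranking quality: it equals the probability that a randomly chosen positive example is scored above a randomly chosen negative one. A minimal sketch of that standard pairwise (Mann-Whitney) computation, not the thesis's efficient rule-based procedure, could look like this:

```python
# Sketch: AUC of a binary scorer via the pairwise Mann-Whitney statistic.
def auc(scores_pos, scores_neg):
    """Fraction of (positive, negative) pairs ranked correctly.

    A pair counts 1 if the positive example outscores the negative one,
    0.5 on a tie, and 0 otherwise. The result does not depend on any
    particular decision threshold or misclassification-cost setting.
    """
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in scores_pos for n in scores_neg)
    return wins / (len(scores_pos) * len(scores_neg))

# 8 of the 9 positive/negative pairs are ordered correctly:
print(auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2]))  # prints 0.8888888888888888 (8/9)
```

A classifier with AUC 1.0 ranks every positive above every negative, so a good operating threshold exists for any cost context; this threshold-free quality is what makes AUC a natural basis for a splitting criterion, and what the multi-class extensions mentioned above approximate when there are more than two classes.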
In summary, this thesis contributes new learning methods that generate more expressive, comprehensible and accurate models, and that are capable of adapting the use of computational resources, the quality of the classifiers, and the costs of application to variable contexts.