CONSTRUCTIVE REINFORCEMENT LEARNING
Authors
José Hernández-Orallo
Abstract
This paper presents an operative measure of reinforcement for constructive
learning methods, i.e., eager learning methods that use highly expressive
(or universal) representation languages. These evaluation tools allow
further insight into the study of the growth of knowledge, theory revision
and abduction. The final approach is based on an apportionment of credit
with respect to the 'course' that the evidence traces through the learnt
theory. Our measure of reinforcement is shown to be justified by
cross-validation and by its connection with other successful evaluation
criteria, such as the MDL principle. Finally, we study the relation with
the classical view of reinforcement, where the actions of an intelligent
system can be rewarded or penalised, and we discuss whether this should
affect our distribution of reinforcement. The most important result of
this paper is that the way we distribute reinforcement over knowledge
yields a rated ontology instead of a single prior distribution. This
detailed information can therefore be exploited to guide the search space
of inductive learning algorithms. Likewise, knowledge revision can be
restricted to the part of the theory that is not justified by the evidence.
Keywords
Reinforcement Learning; Theory Evaluation; Incremental Learning; Ontology;
Apportionment of Credit; Abduction; Induction; MDL principle; Knowledge
Acquisition and Revision; ILP; Philosophy of Science