Thursday, September 26, 12:30pm - 1:45pm
Heller Hall 302

CLP(BN) in School: Constraint Logic Programming for Probabilistic Knowledge

Vitor Santos Costa
Visiting Assistant Professor
Department of Biostatistics and Medical Informatics
University of Wisconsin-Madison

Abstract

We present CLP($\cal{BN}$), a novel approach that aims to integrate Bayesian variables into the logic programming framework. Our motivation in developing CLP($\cal{BN}$) was to reason with relations between probabilistic and non-probabilistic knowledge. Logic programming traditionally uses constraints to represent partial knowledge about variables. CLP($\cal{BN}$) is based on two key observations. First, random variables are bound variables; the only difference is that we do not know to which value they were bound. Instead, we know that the value is an (unknown) function of the universally quantified variables in the clause. We thus constrain a random variable to be a skolem function of the universally bound variables. Second, even if we do not know the actual value of a random variable, we often have a probability distribution over the possible values of the skolem function. A second constraint represents this probability distribution. Successful execution of a CLP($\cal{BN}$) program returns a store consisting of a network of skolem functions related through probability distributions. This store thus forms a Bayesian network, and parameters of interest can be obtained through standard Bayesian solving techniques. We have successfully experimented with CLP($\cal{BN}$) on examples such as an artificially generated school database, in the style of the Probabilistic Relational Models (PRM) example. In a second step, we have experimented with learning CLP($\cal{BN}$) programs in the presence of complete data. We use likelihood as our basic measure, and we extend Srinivasan's Inductive Logic Programming system Aleph to learn CLP($\cal{BN}$) programs.
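The store described in the abstract can be pictured as a small Bayesian network: each skolem term is a node, and the distribution constraints supply the conditional probability tables. The following Python sketch is only a rough illustration of that idea (not the CLP($\cal{BN}$) implementation, which lives inside a Prolog engine); the school-style variable names and all probabilities are invented, and the query is answered by brute-force enumeration.

```python
from itertools import product

# Hypothetical constraint store: each "skolem term" (a random variable keyed
# by its logical arguments) maps to its parents and a CPT. Illustrative only.
store = {
    "int(ann)":  {"parents": [], "cpt": {(): {"high": 0.5, "low": 0.5}}},
    "diff(cs1)": {"parents": [], "cpt": {(): {"hard": 0.4, "easy": 0.6}}},
    "grade(ann,cs1)": {
        "parents": ["int(ann)", "diff(cs1)"],
        "cpt": {
            ("high", "hard"): {"a": 0.5, "b": 0.5},
            ("high", "easy"): {"a": 0.9, "b": 0.1},
            ("low",  "hard"): {"a": 0.1, "b": 0.9},
            ("low",  "easy"): {"a": 0.4, "b": 0.6},
        },
    },
}

def joint(assignment):
    """Probability of one full assignment under the network."""
    p = 1.0
    for var, node in store.items():
        parent_vals = tuple(assignment[q] for q in node["parents"])
        p *= node["cpt"][parent_vals][assignment[var]]
    return p

def marginal(var, evidence=None):
    """P(var | evidence) by enumerating every full assignment."""
    evidence = evidence or {}
    domains = {v: list(next(iter(n["cpt"].values())).keys())
               for v, n in store.items()}
    totals = {}
    for values in product(*domains.values()):
        a = dict(zip(domains.keys(), values))
        if all(a[k] == v for k, v in evidence.items()):
            totals[a[var]] = totals.get(a[var], 0.0) + joint(a)
    z = sum(totals.values())
    return {k: v / z for k, v in totals.items()}
```

A query such as `marginal("int(ann)", {"grade(ann,cs1)": "a"})` then plays the role of "standard Bayesian solving" over the store: it returns the posterior over a student's intelligence given an observed grade. Real CLP($\cal{BN}$) engines use proper Bayesian-network inference rather than enumeration.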
Our initial experiment consists of learning the relations that define random variables in the school database. We use a greedy algorithm to perform multi-predicate learning. Initial results show that Aleph can indeed replicate most of the rules we used to generate the original database.
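The greedy, likelihood-guided search can be illustrated in miniature. The sketch below is not Aleph (which searches over clauses); it is a hedged Python analogue that greedily selects which attributes a random variable depends on, adding a parent only while the maximum-likelihood score of complete data improves. The column names (`int`, `campus`, `grade`) and the toy dataset are invented.

```python
import math

# Hypothetical complete data: "grade" depends on "int" but, by construction,
# "campus" carries no information about it (in this sample).
data = (
    [{"int": "high", "campus": c, "grade": g}
     for c in ("n", "s") for g in ["a"] * 3 + ["b"]] +
    [{"int": "low", "campus": c, "grade": g}
     for c in ("n", "s") for g in ["a"] + ["b"] * 3]
)

def log_likelihood(rows, target, parents):
    """Score: log-likelihood of rows under the ML-estimated CPT."""
    counts = {}
    for row in rows:
        key = (tuple(row[p] for p in parents), row[target])
        counts[key] = counts.get(key, 0) + 1
    ll = 0.0
    for (pv, _tv), n in counts.items():
        total = sum(c for (pv2, _), c in counts.items() if pv2 == pv)
        ll += n * math.log(n / total)
    return ll

def greedy_parents(rows, target, candidates):
    """Greedily add the parent that most improves the likelihood score."""
    parents = []
    best = log_likelihood(rows, target, parents)
    while True:
        scored = [(log_likelihood(rows, target, parents + [c]), c)
                  for c in candidates if c not in parents]
        if not scored:
            break
        top, choice = max(scored)
        if top <= best + 1e-9:  # no real improvement: stop
            break
        parents.append(choice)
        best = top
    return parents
```

On this toy sample the search keeps `int` and rejects `campus`. Likelihood alone tends to overfit on real data, which is one reason structure learners usually pair it with a penalty or held-out evaluation.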