Context-specific graphical models and their applications
April 25, 2014. PhD students.
We can almost never predict with certainty what will happen in the future, as uncertainty is an unavoidable aspect of most real-world phenomena. Uncertainty arises from limitations in our ability both to observe the world and to model it. Probability theory provides the basic foundation for modelling our beliefs about different possible states of the world and for updating those beliefs as new evidence is received. These beliefs may then act as an aid when making decisions or predicting the future.
The possibility of applying probability theory efficiently to large-scale problems with many inter-related variables is fairly recent. For this purpose, the framework of graphical models has attracted considerable interest over the last decades. The framework draws on ideas from discrete data structures in computer science to efficiently encode and manipulate probability distributions over hundreds or even thousands of variables. These methods have been used in an enormous range of applications, from genetic mapping of diseases to machine learning and probabilistic expert systems.
A graphical model combines probability theory and graph theory, using a graph-based representation to encode the dependence structure of a joint distribution over a multi-dimensional space. The nodes in the graph represent random variables, while the edges between them represent direct dependencies; the absence of an edge represents a statement of conditional independence. By following the dependence statements encoded in the graph, the corresponding joint probability distribution can be represented in a compact manner.
For example, a joint probability distribution over N binary variables requires 2^N - 1 parameters to be represented in full. This naive representation quickly becomes too resource-demanding as the number of variables grows and is therefore not useful in practice. The conditional independence statements reflected by a graphical model can be exploited to greatly reduce the number of parameters needed to represent the joint distribution.
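The parameter savings can be sketched with a quick count. The network below is a hypothetical example (a five-variable chain), not taken from the text; for binary variables, each node contributes one free parameter per configuration of its parents.

```python
# Sketch: how conditional independence shrinks the parameter count.

def full_joint_params(n):
    """Free parameters of an unrestricted joint over n binary variables."""
    return 2 ** n - 1

def network_params(parent_sets):
    """Free parameters of a graphical model over binary variables:
    one free parameter per parent configuration of each node."""
    return sum(2 ** len(parents) for parents in parent_sets.values())

# A hypothetical 5-variable chain A -> B -> C -> D -> E.
chain = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"], "E": ["D"]}

print(full_joint_params(5))   # 31 parameters for the full table
print(network_params(chain))  # 1 + 2 + 2 + 2 + 2 = 9 for the chain
```

The gap widens rapidly: the full table grows exponentially in N, while the chain grows only linearly.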
Despite their generic applicability and wide adoption, the constraints imposed by ordinary graphical models have been recognized as unnecessarily stringent under certain circumstances. This observation has led to several proposed generalizations that relax the constraints so that the models can express local or context-specific dependence structures. Context-specific independence (CSI) is a form of conditional independence that holds only in part of the outcome space. Context-specific graphical models form a generalization of ordinary graphical models and have been discussed by, e.g., Corander (2003) and Friedman et al. (1996).
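A minimal illustration of CSI, with made-up numbers: suppose Y depends on parents X1 and X2, but in the context X1 = 0 the value of X2 is irrelevant. The two identical rows of the full conditional table can then be collapsed, saving a parameter.

```python
# Sketch of context-specific independence (CSI): Y is independent of X2
# in the context X1 = 0. The probabilities are purely illustrative.

full_cpt = {          # P(Y=1 | X1, X2): one entry per parent configuration
    (0, 0): 0.2, (0, 1): 0.2,   # identical rows: CSI hides here
    (1, 0): 0.6, (1, 1): 0.9,
}

csi_cpt = {           # collapsed table exploiting the CSI
    (0, None): 0.2,             # context X1 = 0: X2 is irrelevant
    (1, 0): 0.6, (1, 1): 0.9,
}

def lookup(cpt, x1, x2):
    """Fall back to the context-specific entry when no exact row exists."""
    return cpt.get((x1, x2), cpt.get((x1, None)))

# Both representations encode the same conditional distribution.
assert all(lookup(csi_cpt, a, b) == full_cpt[(a, b)]
           for a in (0, 1) for b in (0, 1))
print(len(full_cpt), len(csi_cpt))  # 4 vs 3 parameters
```

On a single table the saving is small, but over many variables and larger contexts such collapsed rows compound into substantial reductions.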
One method for learning the structure of a graphical model is an optimization-based search. In the Bayesian approach, the scoring function is the posterior probability of the structure given the training data (Heckerman et al., 1995). An exhaustive evaluation of all graphs is, however, not feasible in practice, since the number of possible graph structures grows super-exponentially with the number of variables. To tackle this problem, non-reversible MCMC-based search algorithms have been used successfully to explore parts of the model space for ordinary graphical models (Corander et al., 2006, 2008). As the space of context-specific graph structures is considerably larger than that of ordinary graphs, some deterministic method must, however, be incorporated into the search process.
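The general shape of such a score-driven MCMC search can be sketched as follows. This is not the algorithm of the cited papers; it is a plain Metropolis sketch over undirected edge sets with a toy log-score standing in for the log posterior that a real implementation would compute from the training data.

```python
# Toy Metropolis search over graph structures. The score is a stand-in:
# a real Bayesian search would use the log posterior of the structure
# given the data (cf. Heckerman et al., 1995).
import itertools
import math
import random

random.seed(0)
n_nodes = 4
all_edges = list(itertools.combinations(range(n_nodes), 2))

def log_score(graph):
    """Toy log-score that simply prefers sparse graphs."""
    return -0.5 * len(graph)

graph = frozenset()             # start from the empty graph
best = graph
for _ in range(1000):
    e = random.choice(all_edges)            # propose flipping one edge
    proposal = graph - {e} if e in graph else graph | {e}
    # Metropolis acceptance; the single-edge-flip proposal is symmetric.
    if math.log(random.random()) < log_score(proposal) - log_score(graph):
        graph = proposal
    if log_score(graph) > log_score(best):  # track the best structure seen
        best = graph

print(len(best))  # the toy score favours the empty graph: 0
```

The chain visits many structures but evaluates only one neighbour per step, which is what makes the approach tractable when enumerating all graphs is not.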
The objectives for this research project include:
- Development and generalization of the theory of context-specific graphical models.
- Development of efficient Bayesian inference algorithms for context-specific graphical models.
- Testing of the new models and algorithms on several extensive real data sets.
Some references
Corander, J. (2003). Labelled graphical models. Scandinavian Journal of Statistics, 30, 493-508.
Corander, J., Gyllenberg, M. and Koski, T. (2006). Bayesian model learning based on a parallel MCMC strategy. Statistics and Computing, 16, 355-362.
Corander, J., Ekdahl, M. and Koski, T. (2008). Parallel interacting MCMC for learning of topologies of graphical models. Data Mining and Knowledge Discovery, 17, 431-456.
Boutilier, C., Friedman, N., Goldszmidt, M. and Koller, D. (1996). Context-specific independence in Bayesian networks. In Horvitz and Jensen (eds.), Proceedings of UAI '96, 115-123.
Friedman, N. and Goldszmidt, M. (1996). Learning Bayesian networks with local structure. In Horvitz and Jensen (eds.), Proceedings of UAI '96, 252-262.
Lauritzen, S. L. (1996). Graphical Models. Oxford: Oxford University Press.
Heckerman, D., Geiger, D. and Chickering, D. M. (1995). Learning Bayesian networks: The combination of knowledge and statistical data. Machine Learning, 20, 197-243.
Jordan, M. I. (2004). Graphical models. Statistical Science, 19, 140-155.