PhD defense Sebastian Tschiatschek

Chair:      Ao.Univ.-Prof. Dr. W. RENHART
Examiners:  Assoc.Prof. Dr. F. PERNKOPF
            Prof. J. BILMES (University of Washington, Seattle)
Maximum Margin Bayesian Networks: Asymptotic Consistency, Hybrid Learning, and
Reduced-Precision Analysis
We consider Bayesian networks (BNs) with discriminatively optimized parameters and structures,
i.e. BNs that are optimized to maximize a kind of probabilistic margin. These maximum margin
Bayesian networks (MM BNs) are inspired by support vector machines (SVMs) that aim to
separate samples from different classes by a large margin in some feature space. MM BNs
achieve classification performance on par with BNs optimized according to other discriminative
criteria, e.g. maximum conditional likelihood. Furthermore, in several applications, they achieve
classification performance comparable to that of both linear and kernelized SVMs. In the
literature, two definitions of MM BNs with respect to their parameters are available. We analyze
these definitions in terms of asymptotic consistency, extend these definitions by a generative
regularizer and analyze properties of MM BNs with respect to reduced-precision implementations.
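The probabilistic margin that MM BNs maximize can be sketched as follows. This is a minimal illustration for a naive Bayes model with invented parameters; the function names and numbers are ours, not the thesis's:

```python
import numpy as np

# Illustrative naive Bayes joint model: 2 classes, 2 binary features.
log_prior = np.log(np.array([0.6, 0.4]))      # log P(c)
log_cond = np.log(np.array([                  # log P(x_i = 1 | c)
    [0.8, 0.3],   # class 0
    [0.2, 0.7],   # class 1
]))

def log_joint(c, x):
    """log P(c, x) for binary features x under naive Bayes."""
    lp = log_prior[c]
    for i, xi in enumerate(x):
        p1 = log_cond[c, i]
        lp += p1 if xi == 1 else np.log1p(-np.exp(p1))
    return lp

def log_margin(c_true, x):
    """Probabilistic margin in the log domain:
    log P(c_true, x) - max_{c != c_true} log P(c, x).
    Positive iff the sample is classified correctly."""
    scores = [log_joint(c, x) for c in range(len(log_prior))]
    best_other = max(s for c, s in enumerate(scores) if c != c_true)
    return scores[c_true] - best_other

x = [1, 0]
print(log_margin(0, x) > 0)  # True: positive margin, x is classified correctly
```

Maximum margin learning then adjusts the parameters so that this quantity is large for all training samples, analogous to the geometric margin maximized by SVMs.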
We start by analyzing the asymptotic consistency of MM BNs. Our analysis reveals a deficiency of
MM BNs according to the definitions available in the literature. This deficiency renders MM BNs
inconsistent in the multiclass case under certain conditions. However, this deficiency is not eminent
in the binary-class case. In experiments, we demonstrate that MM BNs are nevertheless able to
compensate for model-mismatch, i.e. the true data distribution and the distributions representable
by the learned models do not match. By slightly altering the definitions existing in the literature, we
extend MM BNs to include a generative norm-regularizer. This regularizer consists of normalized
frequency counts and is balanced against a margin term. In this novel formulation, MM BNs can be
interpreted as linear SVMs with a special regularizer or, alternatively, as BNs that are optimized
generatively as well as discriminatively. This interpretation allows one to naturally deal with
missing-feature scenarios, simply by marginalizing over the missing features, and to perform semi-supervised
learning — the unlabeled data influence the generative regularizer. State-of-the-art
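The marginalization of missing features is particularly simple in a naive Bayes model, where summing out an unobserved feature just drops its factor from the joint. A minimal sketch with invented parameters (not the thesis's models):

```python
import numpy as np

# Illustrative naive Bayes parameters: 2 classes, 3 binary features.
log_prior = np.log(np.array([0.5, 0.5]))
log_cond1 = np.log(np.array([          # log P(x_i = 1 | c)
    [0.9, 0.2, 0.6],
    [0.1, 0.8, 0.4],
]))

def classify(x):
    """MAP class for a partially observed sample.
    x[i] is 0/1 if observed, None if missing; under naive Bayes,
    marginalizing a missing feature simply drops its factor."""
    scores = log_prior.copy()
    for i, xi in enumerate(x):
        if xi is None:
            continue  # sum over x_i of P(x_i | c) is 1, so the factor vanishes
        p1 = log_cond1[:, i]
        scores = scores + np.where(xi == 1, p1, np.log1p(-np.exp(p1)))
    return int(np.argmax(scores))

print(classify([1, None, None]))  # 0: only feature 0 observed, which favors class 0
```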
performance of the novel MM BN formulation is achieved in a large set of experiments and efficient
algorithms for parameter learning in large-scale scenarios are presented.

Furthermore, we consider reduced-precision implementations of BNs. We especially focus on BNs optimized for a large margin, either in terms of their structure or their parameters. In preliminary experiments, we investigate the classification performance of BNs with parameters rounded to some specified precision. These experiments extend results from the literature and reveal that BNs used for classification are well suited for reduced-precision implementations.

We continue by deriving several types of classification performance bounds for BNs. These bounds can be used to analyze worst-case classification performance upon parameter rounding, and we evaluate them in experiments. Furthermore, BNs optimized for a large margin (in terms of their parameters and their structure) are compared to generatively optimized BNs in terms of robustness to parameter quantization and in terms of absolute classification performance.

We extend our reduced-precision considerations by proposing an alternative to determining reduced-precision parameters for MM BNs by rounding. To this end, we slightly modify our formulation of MM BNs and propose algorithms for maximizing this modified criterion over the search space of reduced-precision parameters. In several experiments, we demonstrate that parameters learned in this way yield better classification performance than parameters obtained by rounding.

Finally, we end our investigations by considering parameter learning using reduced-precision computations only. To this end, we propose specialized algorithms for generative and discriminative parameter learning and demonstrate their effectiveness in experiments.
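Rounding parameters to a specified precision, as investigated in the preliminary experiments, can be sketched as quantizing log-probabilities to a fixed-point grid. The bit width and the quantization scheme below are illustrative assumptions, not the exact scheme analyzed in the thesis:

```python
import numpy as np

def quantize_log_params(log_p, frac_bits=4):
    """Round log-probabilities to a fixed-point grid with `frac_bits`
    fractional bits, i.e. to the nearest multiple of 2**-frac_bits.
    (Illustrative scheme; the thesis analyzes rounding effects in detail.)"""
    step = 2.0 ** -frac_bits
    return np.round(log_p / step) * step

log_p = np.log(np.array([0.6, 0.3, 0.1]))
log_p_q = quantize_log_params(log_p, frac_bits=4)
print(np.max(np.abs(log_p - log_p_q)) <= 2.0 ** -5)  # True: error at most half a step
```

With round-to-nearest, the per-parameter error is bounded by half the grid spacing, which is the kind of perturbation the worst-case classification bounds discussed above must absorb.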

Date and Time
22 September 2014, 10:00
Seminar room IDEG 134 (Inffeldgasse 16c)