
How can we automate and scale up the processes of learning accurate probabilistic models of complex data and obtaining principled solutions to probabilistic inference and analysis queries? This thesis presents efficient techniques for addressing these fundamental challenges, grounded in probabilistic programming, that is, in representing probabilistic models as computer programs in specialized programming languages. First, I introduce scalable methods for real-time synthesis of probabilistic programs in domain-specific data modeling languages, by performing Bayesian structure learning over hierarchies of symbolic program representations. These methods let us automatically discover accurate and interpretable models in a variety of settings, including cross-sectional data, relational data, and univariate and multivariate time series data, as well as models whose structures are generated by probabilistic context-free grammars. Second, I describe SPPL, a probabilistic programming language that integrates knowledge compilation and symbolic analysis to compute sound exact answers to many Bayesian inference queries about both hand-written and machine-synthesized probabilistic programs. Third, I present fast algorithms for analyzing statistical properties of probabilistic programs in cases where exact inference is intractable. These algorithms operate entirely through black-box computational interfaces to probabilistic programs and solve challenging problems such as estimating bounds on the information flow between arbitrary sets of program variables and testing the convergence of sampling-based algorithms for approximate posterior inference. A large collection of empirical evaluations establishes that, taken together, these techniques can outperform multiple state-of-the-art systems across diverse real-world data science problems, which include adapting to extreme novelty in streaming time series data; imputing and forecasting sparse multivariate flu rates; discovering commonsense clusters in relational and temporal macroeconomic data; generating synthetic satellite records with realistic orbital physics; finding information-theoretically optimal medical tests for liver disease and diabetes; and verifying the fairness of machine learning classifiers.
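To make the flavor of such exact queries concrete, the minimal sketch below answers a Bayesian inference query about a small discrete probabilistic program by exhaustively enumerating its execution paths. It illustrates the kind of question SPPL answers soundly and exactly, but it is not SPPL's actual API (which rests on knowledge compilation); the toy program and all names are hypothetical.

```python
from fractions import Fraction

# Toy discrete probabilistic program: a noisy test for a rare condition.
# Exact inference by path enumeration; systems such as SPPL automate and
# scale this up via knowledge compilation and symbolic analysis.

def paths():
    """Yield every execution path with its exact probability."""
    for disease in (True, False):
        p_d = Fraction(1, 100) if disease else Fraction(99, 100)
        for test_pos in (True, False):
            p_pos = Fraction(95, 100) if disease else Fraction(10, 100)
            p_t = p_pos if test_pos else 1 - p_pos
            yield disease, test_pos, p_d * p_t

# Query: Pr(disease | test_pos), computed as an exact rational number.
numerator = sum(p for d, t, p in paths() if d and t)
evidence = sum(p for d, t, p in paths() if t)
print(numerator / evidence)   # exact posterior: 19/217
```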
Probabilistic modeling and reasoning are central tasks in artificial intelligence and machine learning. A probabilistic model is a rough description of the world: the model-builder attempts to capture as much detail about the world's complexities as she can, and when no more detail can be given the rest is left as probabilistic uncertainty. Once a model is constructed, the goal is automated inference: computing the probability that some particular fact is true about the world. It is natural for the model-builder to want a flexible, expressive language (the world is a complex thing to describe), and over time this has led to a trend of increasingly powerful modeling languages. This trend is taken to its apex by probabilistic programming languages (PPLs), which enable modelers to specify probabilistic models using the facilities of a full programming language. However, this expressivity comes at a cost: the computational cost of inference is in direct tension with the flexibility of the modeling language, and so it becomes increasingly difficult to design automated inference algorithms that scale to the kinds of systems that model-builders want to create. This thesis focuses on the central question: how can we design effective probabilistic programming languages that profitably trade off expressivity and tractability for inference? The approach taken here is first to identify and exploit important structure that a probabilistic program may possess. The kinds of structure considered here are discrete program structure and symmetry. Programs are heterogeneous objects, so different parts of programs may exhibit different kinds of structure; in the second part of the thesis I show how to decompose heterogeneous probabilistic program inference using a notion of program abstraction. These contributions enable new applications of probabilistic programs in domains such as text analysis, verification of probabilistic systems, and classical simulation of quantum algorithms.
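As a toy sketch of how symmetry buys tractability (illustrative only, not this thesis's algorithms): a program that flips n exchangeable coins has 2^n execution paths, but the distribution of the number of heads depends only on that count, so it collapses to n + 1 equivalence classes weighted by binomial coefficients.

```python
from fractions import Fraction
from math import comb

# Naive enumeration of n exchangeable coin flips visits 2**n program paths.
# Exploiting exchangeability (a symmetry), the distribution of the number
# of heads needs only n + 1 terms, one per equivalence class of paths.

def heads_distribution(n, p=Fraction(1, 2)):
    """Exact distribution of the number of heads, via symmetry."""
    return {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

dist = heads_distribution(100)   # 101 terms instead of 2**100 paths
print(dist[50])                  # exact rational probability of 50 heads
```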
This book presents an exciting new synthesis of directed and undirected, discrete and continuous graphical models. Combining elements of Bayesian networks and Markov random fields, the newly introduced hybrid random fields are an interesting approach to getting the best of both worlds, with an added promise of modularity and scalability. The authors have written an enjoyable book: rigorous in the treatment of the mathematical background, but also enlivened by interesting and original historical and philosophical perspectives. -- Manfred Jaeger, Aalborg Universitet

The book not only marks an effective direction of investigation with significant experimental advances, but it is also, and perhaps primarily, a guide for the reader through an original trip in the space of probabilistic modeling. While digesting the book, one is enriched with a very open view of the field, full of stimulating connections. [...] Everyone specifically interested in Bayesian networks and Markov random fields should not miss it. -- Marco Gori, Università degli Studi di Siena

Graphical models are sometimes regarded, incorrectly, as an impractical approach to machine learning, on the assumption that they only work well for low-dimensional applications and discrete-valued domains. While guiding the reader through the major achievements of this research area in a technically detailed yet accessible way, the book is concerned with the presentation and thorough (mathematical and experimental) investigation of a novel paradigm for probabilistic graphical modeling, the hybrid random field. This model subsumes and extends both Bayesian networks and Markov random fields. Moreover, it comes with well-defined learning algorithms, both for discrete and continuous-valued domains, which fit the needs of real-world applications involving large-scale, high-dimensional data.
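To ground the directed half of this synthesis (a generic textbook illustration, not the book's hybrid random field model): a Bayesian network factorizes the joint distribution into conditional probability tables, and marginals follow by summing out unobserved variables.

```python
# A two-node Bayesian network Rain -> WetGrass, stored as CPTs.
# The directed factorization P(r, w) = P(r) * P(w | r) is the
# Bayesian-network half of what hybrid random fields combine with
# the undirected potentials of Markov random fields.

P_rain = {True: 0.2, False: 0.8}
P_wet_given_rain = {
    True:  {True: 0.9, False: 0.1},   # P(WetGrass | Rain=True)
    False: {True: 0.2, False: 0.8},   # P(WetGrass | Rain=False)
}

def joint(rain, wet):
    return P_rain[rain] * P_wet_given_rain[rain][wet]

# Marginal P(WetGrass=True): sum the joint over the unobserved parent.
p_wet = sum(joint(r, True) for r in (True, False))
print(p_wet)   # 0.2 * 0.9 + 0.8 * 0.2, approximately 0.34
```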
Probabilistic modeling, also known as probabilistic machine learning, provides a principled framework for learning from data, with the key advantage of offering rigorous solutions for uncertainty quantification. In the era of big and complex data, there is an urgent need for new inference methods in probabilistic modeling to extract information from data effectively and efficiently. This thesis shows how to do theoretically guaranteed, scalable, and reliable inference for modern machine learning. Considering both theory and practice, we provide a foundational understanding of scalable and reliable inference methods and practical algorithms for new inference methods, as well as extensive empirical evaluation on common machine learning and deep learning tasks. Classical inference algorithms, such as Markov chain Monte Carlo, have enabled probabilistic modeling to achieve gold-standard results on many machine learning tasks. However, these algorithms are rarely used in modern machine learning due to the difficulty of scaling up to large datasets. Existing work suggests that there is an inherent trade-off between scalability and reliability, forcing practitioners to choose between expensive exact methods and biased scalable ones. To overcome this trade-off, we introduce general and theoretically grounded frameworks that enable fast and asymptotically correct inference, with applications to Gibbs sampling, Metropolis-Hastings, and Langevin dynamics. Deep neural networks (DNNs) have achieved impressive success on a variety of learning problems in recent years. However, DNNs have been criticized for being unable to estimate uncertainty accurately. Probabilistic modeling provides a principled alternative that can mitigate this issue: it accounts for model uncertainty and achieves automatic complexity control. In this thesis, we analyze the key challenges of probabilistic inference in deep learning and present novel approaches for fast posterior inference of neural network weights.
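For reference, the sketch below is a minimal random-walk Metropolis-Hastings sampler of the classical kind discussed above: asymptotically exact, but evaluating the full (log-)density at every step, which is precisely what makes scaling to large datasets hard. It is a textbook baseline, not one of the thesis's new algorithms.

```python
import math
import random

# Minimal random-walk Metropolis-Hastings targeting a standard normal.

def log_target(x):
    return -0.5 * x * x            # log N(0, 1), up to an additive constant

def metropolis_hastings(n_samples, step=1.0, seed=0):
    rng = random.Random(seed)
    x, samples = 0.0, []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        # Accept with probability min(1, pi(proposal) / pi(x)).
        if math.log(rng.random()) < log_target(proposal) - log_target(x):
            x = proposal
        samples.append(x)
    return samples

draws = metropolis_hastings(10_000)
print(sum(draws) / len(draws))     # sample mean, close to 0
```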
There is a trade-off between expressiveness and tractability in generative modeling. On the one hand, while neural deep generative models are extremely expressive, the ways we can query them are limited; on the other hand, while tractable probabilistic models support efficient computation of various probabilistic queries, scaling them up is a major challenge. Probabilistic circuits are a tractable representation of probability distributions that allows exact and efficient computation of likelihoods and marginals. We study how to scale up the learning of probabilistic circuits and how to apply them in practice. On the learning front, we propose a new algorithm for learning the sparse structures of probabilistic circuits that can significantly improve their capacity. On the application front, we demonstrate the expressiveness and tractability of probabilistic circuits in two downstream applications: genetic sequence modeling and controllable language generation.
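The tractability claim has a concrete mechanical form: in a (smooth, decomposable) probabilistic circuit, any marginal is obtained by setting the leaves of the marginalized variables to 1 and evaluating the circuit once, so its cost is linear in circuit size. A hypothetical two-variable toy circuit illustrates this below.

```python
# A tiny probabilistic circuit over binary variables A and B: a sum node
# (mixture) over two product nodes with independent Bernoulli leaves.

def leaf(p, value):
    """Bernoulli leaf; value=None marginalizes the variable out (leaf -> 1)."""
    if value is None:
        return 1.0
    return p if value else 1.0 - p

def circuit(a, b):
    comp1 = leaf(0.9, a) * leaf(0.3, b)   # product node 1
    comp2 = leaf(0.2, a) * leaf(0.8, b)   # product node 2
    return 0.6 * comp1 + 0.4 * comp2      # sum node with mixture weights

print(circuit(True, True))   # full likelihood P(A=1, B=1)
print(circuit(True, None))   # marginal P(A=1) in a single pass: 0.62
```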
This dissertation focuses on Markov logic networks (MLNs), a knowledge representation tool that elegantly unifies first-order logic (FOL) and probabilistic graphical models (PGMs). FOL enables compact representation, while probability allows the user to model uncertainty in a principled manner. Unfortunately, although the representation is compact, inference in MLNs is quite challenging, as PGMs generated from MLNs typically have millions of random variables and features. As a result, even linear-time algorithms are computationally infeasible. Recently, there has been burgeoning interest in developing "lifted" algorithms to scale up inference in MLNs. These algorithms exploit symmetries in the PGM associated with an MLN, in many cases detecting them by analyzing the first-order structure without constructing the PGM, and thus have time and space requirements that are sub-linear when symmetries are present and can be detected. However, previous research has focused primarily on lifted marginal inference, while algorithms for optimization tasks such as maximum-a-posteriori (MAP) inference are far less advanced. This dissertation fills this void by developing next-generation algorithms for MAP inference. It presents several novel, scalable algorithms for MAP inference in MLNs. The new algorithms exploit both exact and approximate symmetries, and are experimentally orders of magnitude faster than existing algorithms on a wide variety of real-world MLNs. Specifically, this dissertation makes the following contributions. A key issue with existing lifted approaches is that one has to make substantial modifications to highly engineered, well-researched inference algorithms and software developed in the PGM community over the last few decades. We address this problem by developing the "lifting as pre-processing" paradigm, in which lifted inference is reduced to a series of pre-processing operations that compress a large PGM into a much smaller one. Another problem with current lifted algorithms is that they only exploit exact symmetries. In many real-world problems, very few exact symmetries are present while approximate symmetries are abundant. We address this limitation by developing a general framework for exploiting approximate symmetries that elegantly trades solution quality for time and space complexity. Inference and weight learning algorithms for MLNs need to solve complex combinatorial counting problems; we propose a novel approach for formulating and efficiently solving these problems, and use it to scale up two approximate inference algorithms, Gibbs sampling and MaxWalkSAT, and three weight learning algorithms: contrastive divergence, voted perceptron, and pseudo-log-likelihood learning. Finally, we propose novel approximate inference algorithms for accurate, scalable inference in PGMs having shared sub-structures but no shared parameters, and demonstrate both theoretically and experimentally that they outperform state-of-the-art approaches.
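To give the flavor of lifting (a toy illustration, not the dissertation's algorithms): consider the single weighted formula w : Smokes(x) => Cancer(x) over n interchangeable constants. Its groundings share no atoms, so the partition function factorizes per constant, Z = z1^n, a closed form where naive inference sums over 4^n possible worlds.

```python
import math
from itertools import product

# Toy MLN with one weighted formula  w : Smokes(x) => Cancer(x)
# over n interchangeable constants. A world's weight is
# exp(w * number_of_satisfied_groundings).

w = 1.5

def ground_z(n):
    """Naive partition function: sum over all 4**n possible worlds."""
    total = 0.0
    for world in product([False, True], repeat=2 * n):
        smokes, cancer = world[:n], world[n:]
        sat = sum((not s) or c for s, c in zip(smokes, cancer))
        total += math.exp(w * sat)
    return total

# Per-constant partition function: 3 worlds satisfy the grounding, 1 violates.
z1 = 3 * math.exp(w) + 1

print(ground_z(3), z1 ** 3)    # naive check on n = 3 agrees with the lifted form
print(1000 * math.log(z1))     # lifted log Z for n = 1000: constant-time arithmetic
```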
"A biological system is a complex network of heterogeneous molecular entities and their interactions contributing to various biological characteristics of the system. Although the biological networks not only provide an elegant theoretical framework but also offer a mathematical foundation to analyze, understand, and learn from complex biological systems, the reconstruction of biological networks is an important and unsolved problem. Current biological networks are noisy, sparse and incomplete, limiting the ability to create a holistic view of the biological reconstructions and thus fail to provide a system-level understanding of the biological phenomena. Experimental identification of missing interactions is both time-consuming and expensive. Recent advancements in high-throughput data generation and significant improvement in computational power have led to novel computational methods to predict missing interactions. However, these methods still suffer from several unresolved challenges. It is challenging to extract information about interactions and incorporate that information into the computational model. Furthermore, the biological data are not only heterogeneous but also high-dimensional and sparse presenting the difficulty of modeling from indirect measurements. The heterogeneous nature and sparsity of biological data pose significant challenges to the design of deep neural network structures which use essentially either empirical or heuristic model selection methods. These unscalable methods heavily rely on expertise and experimentation, which is a time-consuming and error-prone process and are prone to overfitting. Furthermore, the complex deep networks tend to be poorly calibrated with high confidence on incorrect predictions. In this dissertation, we describe novel algorithms that address these challenges. In Part I, we design novel neural network structures to learn representation for biological entities and further expand the model to integrate heterogeneous biological data for biological interaction prediction. In part II, we develop a novel Bayesian model selection method to infer the most plausible network structures warranted by data. We demonstrate that our methods achieve the state-of-the-art performance on the tasks across various domains including interaction prediction. Experimental studies on various interaction networks show that our method makes accurate and calibrated predictions. Our novel probabilistic model selection approach enables the network structures to dynamically evolve to accommodate incrementally available data. In conclusion, we discuss the limitations and future directions for proposed works."--Abstract.
This book constitutes the refereed proceedings of the 11th International Conference on Scalable Uncertainty Management, SUM 2017, which was held in Granada, Spain, in October 2017. The 24 full and 6 short papers presented in this volume were carefully reviewed and selected from 35 submissions. The book also contains 3 invited papers. Managing uncertainty and inconsistency has been extensively explored in artificial intelligence over a number of years. Now, with the advent of massive amounts of data and knowledge from distributed, heterogeneous, and potentially conflicting sources, there is interest in developing and applying formalisms for uncertainty and inconsistency in systems that need to better manage this data and knowledge. The International Conference on Scalable Uncertainty Management (SUM) aims to provide a forum for researchers working on uncertainty management, in different communities and with different uncertainty models, to meet and exchange ideas.
This book provides an overview of the theoretical underpinnings of modern probabilistic programming and presents applications in areas such as machine learning, security, and approximate computing. Comprehensive survey chapters make the material accessible to graduate students and non-experts. This title is also available as Open Access on Cambridge Core.