Conference Details

Workshop on Bayesian Inference for Latent Gaussian Models with Applications

02.02.2011-05.02.2011

Latent Gaussian models have numerous applications, for example in spatial and spatio-temporal epidemiology and climate modelling. This workshop brings together researchers who develop and apply Bayesian inference in this broad model class. One methodological focus is on model computation, using either classical MCMC techniques or more recent deterministic approaches such as integrated nested Laplace approximations (INLA). A second theme of the workshop is model uncertainty, ranging from model criticism to model selection and model averaging.

Confirmed invited speakers are:

Gonzalo García-Donato (UCLM, Spain)

Objective priors and search strategies in large variable selection problems

In estimation problems with a single entertained statistical model, Bayesian answers are naturally expressed as probabilistic statements, this being a major appeal of the Bayesian approach. In model selection problems, however, probabilistic arguments are the exception rather than the rule, even though posterior model probabilities are the obvious measure of evidence. The main reason is the difficulty of appropriate prior elicitation. Indeed, the dimension of most problems makes subjective assignments practically impossible, while the assignment of objective priors that are suitable for model selection is rather difficult and not yet entirely understood. To further complicate matters, the key ideas behind the choice of priors for objective Bayes estimation are useless for assigning priors for model selection. In this talk we study arguments (like invariance, robustness and predictive matching) that we consider especially relevant and useful to guide the construction of a model selection prior. We do so within the context of variable selection in normal regression models, using flexible heavy-tailed, model-specific priors. Our recommended proposal is a prior distribution which, quite remarkably, produces Bayes factors in closed form. This convenient property can substantially reduce the computational burden in problems with a large number of explanatory variables. So far, only the popular conjugate Zellner's g-prior had this convenient property.
Another important aspect of model selection is how to deal with huge model spaces, for which exhaustive enumeration of all models is infeasible and inferences have to be based on the very small proportion of visited models. We review some of the strategies proposed in the literature and argue that inferences based on empirical frequencies via MCMC sampling of the posterior distribution outperform recently proposed search methods. We offer a likely explanation for this effect and a number of illustrative examples.

(This talk is based on joint work with Susie Bayarri, Jim Berger, Anabel Forte and Miguel Martínez-Beneito)
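
As background to the closed-form property mentioned in the abstract above: under Zellner's g-prior with a fixed g, the Bayes factor of a normal regression model against the intercept-only model depends on the data only through the sample size n, the number of covariates p and the coefficient of determination R². The sketch below uses the standard g-prior formula, not the prior recommended in the talk; the function name `log_bf_g_prior`, the choice g = n and the simulated data are illustrative assumptions.

```python
import numpy as np

def log_bf_g_prior(y, X, g):
    """Log Bayes factor of 'intercept + X' against 'intercept only'
    under Zellner's g-prior with fixed g (hypothetical helper).

    Uses log BF = (n-1-p)/2 * log(1+g) - (n-1)/2 * log(1 + g*(1-R^2)).
    """
    n, p = X.shape
    Xc = X - X.mean(axis=0)          # centre covariates (intercept handled separately)
    yc = y - y.mean()
    beta, *_ = np.linalg.lstsq(Xc, yc, rcond=None)
    resid = yc - Xc @ beta
    r2 = 1.0 - (resid @ resid) / (yc @ yc)
    return 0.5 * (n - 1 - p) * np.log1p(g) - 0.5 * (n - 1) * np.log1p(g * (1.0 - r2))

# Simulated data (assumption): only the first covariate has an effect.
rng = np.random.default_rng(0)
n = 100
X = rng.normal(size=(n, 3))
y = 1.0 + 2.0 * X[:, 0] + rng.normal(size=n)

log_bf_signal = log_bf_g_prior(y, X[:, [0]], g=n)  # model with the relevant covariate
log_bf_noise = log_bf_g_prior(y, X[:, [1]], g=n)   # model with an irrelevant covariate
```

Because no numerical integration is needed, each model visited in a large variable selection search costs only one least-squares fit, which is the computational advantage the abstract refers to.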

Sylvia Frühwirth-Schnatter (JKU, Austria)

Bayesian variable selection and model identification through sparsity priors

The talk discusses Bayesian variable selection and model identification for latent variable models. It is shown how variable selection and model identification are achieved by combining the likelihood function of a general latent variable model with a prior which induces sparsity. While this approach is by now well known for variable selection in a standard regression model, only a few researchers have so far tried to extend it to latent variable models. Such an extension is illustrated for two special classes of latent variable models.
The first example is the random intercept model, which is widely applied in the econometric analysis of panel data. Choosing a sparsity prior for this model is closely related to the appropriate choice of the distribution of heterogeneity. If, for instance, a Laplace rather than the usual normal prior is considered as the prior distribution of the random effects, we obtain the Bayesian Lasso random effects model, which allows individual shrinkage of the random effects toward 0. In this way, the sparsity prior makes it possible to identify units in the panel with zero random effects. In addition, spike-and-slab random effects models with both an absolutely continuous and a Dirac spike are studied.
The second example is the basic structural time series model, which has a representation as a state space model. By choosing suitable sparsity priors on the variances appearing in this model, it is possible to separate components which are fixed from components which are random.
Finally, details of efficient MCMC estimation are discussed for all models.

(This talk is based on joint work with Helga Wagner)

Alan Gelfand (Duke, USA)

Point pattern modeling for degraded presence-only data over large regions

Explaining species distribution using local environmental features is a long-standing ecological problem. Often, available data are collected as a set of presence locations only, thus precluding the possibility of a presence-absence analysis. We propose that it is natural to view presence-only data for a region as a point pattern over that region and to use local environmental features to explain the intensity driving this point pattern. This suggests hierarchical modeling, treating the presence data as a realization of a spatial point process whose intensity is governed by environmental covariates. Spatial dependence in the intensity surface is modeled with random effects involving a zero-mean Gaussian process. Highly variable and typically sparse sampling effort, as well as land transformation, degrade the point pattern, so we augment the model to capture these effects. The Cape Floristic Region (CFR) in South Africa provides a rich class of such species data. The potential (i.e., nondegraded) presence surfaces over the entire area are of interest from a conservation and policy perspective.
Our model assumes grid cell homogeneity of the intensity process, where the region is divided into ~37,000 grid cells. To work with a Gaussian process over a very large number of cells we use a predictive process approximation. Bias correction by adding a heteroscedastic error component is implemented. The model was run for a number of different species. Model selection was investigated with regard to the choice of environmental covariates. A comparison is also made with the now popular Maxent approach, though the latter is much more limited with regard to inference. In fact, inference such as the investigation of species richness follows immediately from our modeling framework.

Chris Holmes (Oxford, UK)

Computational strategies for Bayesian logistic regression analysis in genetic association studies within related populations

We discuss computational strategies to assist in Bayesian logistic regression analysis of population-based case-control genome-wide association studies (GWAS) aimed at highlighting human genetic (or genomic) variation that associates with common disease risk. We explore Monte Carlo and asymptotic (Laplace) approximations and how they can be used to alleviate some of the computational challenges arising in high-dimensional logistic regression in the presence of predictor set uncertainty on highly structured genetic covariates.

Finn Lindgren (NTNU, Norway)

How to avoid covariance functions, kernels, and dense lattices

Markov random field models generated by high-rank Hilbert space approximations of stochastic partial differential equations are surprisingly practical, and allow easy construction of non-stationary non-separable space-time models. This avoids the need to design kernels or positive definite covariance functions, while also giving easy access to complex dependencies. Furthermore, the method is faster and more accurate than approximations based on covariance tapering or compactly supported kernels. The spatially continuous interpretation also allows spatially consistent Markov random field models to be constructed on irregular grids, regardless of the type of observation process, further reducing the computational costs typically associated with dense lattices. The approach is illustrated with global temperature data.

Christopher Paciorek (Harvard, USA)

A unified approach to spatial modeling using Markov random fields?

Conditional auto-regressive models are popular for areal data, with the Markov random field (MRF) precision matrix often based simply on whether areas share a boundary. An alternative, when the areas form a regular grid, is the Markov random field approximation to a thin plate spline (Rue and Held, 2005). I consider the use of these and other Markov random field specifications to represent latent spatial processes on a fine grid. One can then consider likelihoods that relate the latent process to point observations or areal observations, in the process avoiding the modifiable areal unit problem. I explore the properties of different MRF specifications in this context based on analytic calculations and simulations. Computational approaches include penalized quasi-likelihood and INLA. MCMC in the Gaussian likelihood case can be feasible, but it poses difficulties for generalized models.
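
To make the boundary-sharing construction in the abstract above concrete, here is a minimal sketch of an intrinsic conditional auto-regressive (ICAR) precision matrix, Q = tau (D - W), where W is the binary adjacency matrix and D is the diagonal matrix of neighbour counts. The regular grid, the helper names `grid_adjacency` and `icar_precision`, and the value of `tau` are illustrative assumptions, not details from the talk.

```python
import numpy as np

def grid_adjacency(nrow, ncol):
    """Binary adjacency matrix W for a regular grid (hypothetical helper).
    Two cells are neighbours if they share an edge."""
    n = nrow * ncol
    W = np.zeros((n, n))
    for r in range(nrow):
        for c in range(ncol):
            i = r * ncol + c
            if c + 1 < ncol:                      # neighbour to the right
                W[i, i + 1] = W[i + 1, i] = 1.0
            if r + 1 < nrow:                      # neighbour below
                W[i, i + ncol] = W[i + ncol, i] = 1.0
    return W

def icar_precision(W, tau=1.0):
    """Intrinsic CAR precision Q = tau * (D - W), D = diag(number of neighbours)."""
    D = np.diag(W.sum(axis=1))
    return tau * (D - W)

W = grid_adjacency(3, 4)
Q = icar_precision(W)
```

Each row of Q is nonzero only at a cell and its neighbours, which is the sparsity that makes MRF computations cheap. For a connected grid Q is rank deficient by one, so the intrinsic model is improper and is usually combined with a sum-to-zero constraint.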

Christian P. Robert (Paris, France)

ABC methods for Bayesian model choice

Approximate Bayesian computation (ABC), also known as likelihood-free methods, has become a standard tool for the analysis of complex models, primarily in population genetics but also for complex financial models. We examined in Grelaud et al. (Bayesian Analysis, 2009) the use of ABC for Bayesian model choice in the specific case of Gibbs random fields (GRF), relying on a sufficiency property only enjoyed by GRFs to show that the approach was legitimate. Despite having previously suggested the use of ABC for model choice in a wider range of models in the DIY ABC software (Cornuet et al., Bioinformatics, 24(23), 2713-19, 2008), we present theoretical evidence that the general use of ABC for model choice is fraught with danger, in the sense that no amount of computation, however large, can guarantee a proper approximation of the posterior probabilities of the models under comparison. This work shows as a corollary that GRFs are the exception to this lack of convergence.

(This talk is based on joint work with Jean-Michel Marin and Natesh Pillai)
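
For readers unfamiliar with the mechanics, the following is a minimal sketch of ABC rejection sampling for model choice: a model index is drawn from its prior, parameters and data are simulated, and the simulation is kept when a summary statistic falls close to the observed one. The two toy models (Poisson versus geometric), the priors, the summary statistic (the sample sum) and the function name `abc_model_choice` are all assumptions made for illustration and are not taken from the talk.

```python
import numpy as np

def abc_model_choice(y_obs, n_sims=20000, eps=0.0, seed=0):
    """ABC rejection estimate of posterior model probabilities (toy sketch).

    Model 0: y_i ~ Poisson(lam),  lam ~ Exponential(1)   (assumed prior)
    Model 1: y_i ~ Geometric(p) on {0, 1, ...}, p ~ Uniform(0, 1]  (assumed prior)
    Summary statistic: the sample sum.
    """
    rng = np.random.default_rng(seed)
    n = len(y_obs)
    s_obs = int(np.sum(y_obs))
    accepted = np.zeros(2, dtype=int)
    for _ in range(n_sims):
        m = rng.integers(2)                       # uniform prior over the two models
        if m == 0:
            lam = rng.exponential(1.0)
            sim = rng.poisson(lam, size=n)
        else:
            p = 1.0 - rng.uniform()               # in (0, 1], avoids p == 0
            sim = rng.geometric(p, size=n) - 1    # NumPy's geometric starts at 1
        if abs(int(sim.sum()) - s_obs) <= eps:    # keep simulations close to the data
            accepted[m] += 1
    total = accepted.sum()
    # Acceptance frequencies approximate P(model | summary), not P(model | data).
    return accepted / total if total > 0 else np.full(2, np.nan)

probs = abc_model_choice(np.array([0, 1, 2, 1, 0]), n_sims=5000, eps=1)
```

The sum is sufficient within each of these two models but not across them, so even with `eps=0` and unlimited simulations the frequencies target the posterior given the summary rather than the posterior given the full data, which is the kind of discrepancy the abstract warns about.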

Håvard Rue (NTNU, Norway)

INLA: Past, Present & Future

In this talk, I will discuss the INLA methodology and software, make some "historical" remarks, review the current status with its successes and limitations, and then present some open problems and a wishlist for the future.

Stephan Sain (NCAR, USA)

Statistical analysis of regional climate model ensembles: NARCCAP case studies

The North American Regional Climate Change Assessment Program (NARCCAP) is an ambitious multi-agency, multi-institution collaboration to produce regional projections of climate change for North America based on a multi-model ensemble of regional climate models. In this talk, I will present a statistical approach to analyze and combine the information in the ensemble based on a functional analysis of variance (ANOVA) embedded within a hierarchical Bayesian method that accounts for differences in the models as well as the spatial correlation in the model output. In particular, I will present preliminary results that seek to examine the various sources of uncertainty in the model output.

On Wednesday morning, Håvard Rue will present a tutorial about INLA.

Abstract submission for contributed talks and poster presentations is now closed.

Contact: bilgm11@math.uzh.ch

Institut für Mathematik
