**Date: 4 February 2020, CB 3.7, 13:15**

**Testing a secondary endpoint after a group sequential test**

**Christopher Jennison (University of Bath)**

**Abstract:** A key feature of reporting clinical trial results to regulators is the control of the type I error rate. The same considerations apply to multiple hypothesis testing when both primary and secondary outcomes for a treatment may be reported. If a trial design is to permit early stopping at an interim analysis, a group sequential testing boundary can be used to control type I error for a test of the primary endpoint. This talk addresses the question of how a secondary endpoint may be tested after such a test yields an early positive conclusion.

**Date: 11 February 2020, CB 3.7, 13:15**

**Bayesian Modelling Frameworks for Under-Reporting and Delayed Reporting in Count Data**

**Oliver Stoner (University of Exeter)**

**Abstract:** In practical applications, including disease surveillance and monitoring severe weather events, available count data are often an incomplete representation of phenomena we are interested in. This includes under-reporting, where observed counts are thought to be less than or equal to the truth, and delayed reporting, where total counts are not immediately available, instead arriving in parts over time.

In this seminar I will present two Bayesian hierarchical modelling frameworks which aim to deal with each of these issues, respectively. I will also present applications to UK tornado data, where under-reporting is thought to be a problem in areas of low population density, and to dengue fever data from Brazil, where notification delay means that the true size of outbreaks may not be known for weeks or even months after they've occurred.

**Date: 18 February 2020, CB 3.7, 13:15**

**Analysis of UK river flow data using Bayesian clustering and extremal principal components**

**Christian Rohrbeck (University of Bath)**

**Abstract:** The UK has experienced several severe and widespread flood events in recent years. For instance, the floods related to Storm Desmond, Storm Eva and Storm Frank in 2015/2016 caused an estimated economic damage of between £1.3 and £1.9 billion. Extreme value analysis provides us with a theoretically justified framework to model extreme river flow levels. This talk presents two approaches to studying extreme river flow across a set of 46 observation stations (gauges) in the UK. The first part of the talk introduces a Bayesian clustering framework which pools information spatially to estimate the marginal tail behaviour at each of the gauges. The second part considers the spatial dependence of extreme events. A recently proposed method investigates the extremal dependence structure using ideas from principal component analysis. At the end of the talk, we introduce an approximate procedure to generate realizations of extreme events based on estimated extremal principal components.

**Date: 17 March 2020, CB 3.7, 13:15**

**Flexible models for nonstationary dependence: methods and their place in risk analysis**

**Ben Youngman (University of Exeter)**

**Abstract:** When modelling a dependent process we often assume that its dependence structure is stationary; e.g. doesn't change with location in a spatial model or doesn't change over input space in an emulator. But if we're dealing with a large spatial domain or input space, we must at some point think our dependence structure is nonstationary. Sampson & Guttorp (JASA, 1992), and various subsequent works, incorporated nonstationarity in the spatial context by transforming locations' geographic space to a new space in which stationarity was a fair (or at least better) assumption. This talk will discuss a new approach that aims to synthesise and make more intuitive some past works while also allowing objective inference. An example related to risk analysis, based on loss estimation for US rainfall, will be used for illustration.

**Date: 1 October 2019, CB 5.8, 14:15**

**Integrated copula spectral densities and their applications**

**Tobias Kley (University of Bristol)**

**Abstract:** Copula spectral densities are defined in terms of the copulas associated with the pairs (X_{t+k}, X_{t}) of a process (X_{t})_{t ∈ ℤ}. They can thereby capture a wide range of dynamic features, such as changes in the conditional skewness or dependence of extremes, that traditional spectra cannot account for. A consistent estimator for copula spectra was suggested by Kley et al. [Bernoulli 22 (2016) 1770-1807], who proved a functional central limit theorem (fCLT) according to which the estimator, considered as a stochastic process indexed by the quantile levels, converges weakly to a Gaussian limit. As in the traditional case, no fCLT exists for this estimator when it is considered as a stochastic process indexed by the frequencies. In this talk, we consider estimation of integrated copula spectra and show that our estimator converges weakly as a stochastic process indexed by the quantile levels and frequencies. Interestingly, and in contrast to the estimator considered by Kley et al., estimation of the unknown marginal distribution has an effect on the asymptotic covariance. We apply subsampling to obtain confidence intervals for the integrated copula spectra. Further, our results allow copula spectra to be used to test a wide range of hypotheses. As an example, we suggest a test for the hypothesis that the underlying process is pairwise time-reversible.
(This is joint work with H. Dette, Y. Goto, M. Hallin, R. Van Hecke and S. Volgushev.)

**Date: 8 October 2019, CB 5.8, 13:15**

**A novel use of multivariate statistics to diagnose test-to-test variation in complex measurement systems**

**Richard Burke (University of Bath, Dept. Mechanical Engineering)**

**Abstract:**
Vehicle testing is critical to demonstrating the cost-benefits of new technologies that will reduce fuel consumption, CO_{2} and toxic emissions. However, vehicle testing is also costly and time consuming, so it is vital that tests are conducted efficiently and that the information they yield is maximised. Vehicles are complex systems, but it is straightforward to install intensive instrumentation to record many data channels. Due to costs, relatively few repeated test cycles are conducted. Identifying correlations within these datasets is challenging and requires input from experts, who ultimately focus on small subsets of the original data. In this paper, a novel application of Partial Least Squares (PLS) regression is used to explore the complete data set, without the need for data exclusion. Two approaches are used: the first collapses the data set and analyses all data channels without time variations, while the second unfolds the data set to avoid any information loss. The technique allows for the systematic analysis of large datasets in a very time-efficient way, meaning more information can be obtained from a testing campaign. The methodology is used successfully in four different case studies to identify sources of imprecision in vehicle testing on a chassis dynamometer. These findings will lead to significant improvements in vehicle testing, allowing both substantial savings in testing effort and increased confidence in demonstrating the cost-benefit of new products. The measurement analysis technique can also be applied to other fields where repeated testing or batch processes are conducted.
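As a rough illustration of the two data arrangements described above (the collapsed and unfolded data sets), here is a minimal one-component PLS fit in NumPy. This is a toy sketch, not the paper's implementation; all shapes, names and data are invented:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy repeated-test data: n_tests cycles x n_time samples x n_channels
# (all names and numbers are invented for illustration).
n_tests, n_time, n_chan = 8, 50, 6
data = rng.normal(size=(n_tests, n_time, n_chan))
y = data[:, :, 0].mean(axis=1) + 0.1 * rng.normal(size=n_tests)  # per-test response

# Approach 1: "collapse" -- average out the time dimension.
X_collapsed = data.mean(axis=1)            # shape (n_tests, n_chan)

# Approach 2: "unfold" -- concatenate all time points for each test.
X_unfolded = data.reshape(n_tests, -1)     # shape (n_tests, n_time * n_chan)

def pls1_fit(X, y):
    """One-component PLS1: first weight vector, latent scores and slope."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    w = Xc.T @ yc
    w /= np.linalg.norm(w)       # first PLS weight vector
    t = Xc @ w                   # latent scores
    b = (t @ yc) / (t @ t)       # regress y on the scores
    return w, t, b

for X in (X_collapsed, X_unfolded):
    w, t, b = pls1_fit(X, y)
    print(X.shape, round(float(b), 3))
```

Either arrangement feeds the same fitting routine; the unfolded version simply treats every channel-time pair as a separate predictor.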

**Date: 15 October 2019, CB 5.8, 13:15**

**Using Multilevel Regression Poststratification to Improve the Accuracy of Surveys**

**Julian Faraway (University of Bath)**

**Abstract:** Surveys and opinion polls will be biased when the sample does not represent the population. People respond to surveys in different ways. Some are much more likely to respond than others. Some are much easier to contact than others. If we do not take account of the varying response rates, the survey results will be unreliable.

Multilevel Regression Poststratification (MRP) is a state-of-the-art statistical method for improving surveys to account for varying levels of response. It is used worldwide for political opinion polls and in marketing and government surveys. The method not only produces more reliable overall estimates but also allows sub-area and sub-population estimates even when survey data are sparse. The method also produces assessments of the reliability of estimates so that decision makers can take considered actions.

We will demonstrate the methodology using data from the Paraguayan Ficha Social and Encuesta Permanente de Hogares showing how estimates of household income and poverty can be improved and provided for various sub-populations.
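As a minimal illustration of the two MRP steps (partial pooling of cell estimates, then poststratification by census counts), here is a toy sketch. It substitutes a simple shrinkage estimator for a full multilevel regression, and all cell names, counts and variance values are invented:

```python
import numpy as np

# Toy survey: mean response and sample size per poststratification cell
# (e.g. age-by-region cells); all numbers are invented for illustration.
cell_mean  = np.array([0.62, 0.55, 0.40, 0.35])   # observed cell means
cell_n     = np.array([200,   15,   80,    5])    # survey respondents per cell
census_pop = np.array([1000, 3000, 2000, 4000])   # population size per cell

# "Multilevel regression" stand-in: shrink sparse cells toward the grand mean
# (partial pooling, which a hierarchical model would do more formally).
grand_mean = np.average(cell_mean, weights=cell_n)
tau2, sigma2 = 0.01, 0.25          # assumed between/within-cell variances
shrink = tau2 / (tau2 + sigma2 / cell_n)
cell_est = shrink * cell_mean + (1 - shrink) * grand_mean

# Poststratification: reweight cell estimates by population shares.
mrp_estimate = np.average(cell_est, weights=census_pop)
raw_estimate = np.average(cell_mean, weights=cell_n)
print(round(raw_estimate, 3), round(mrp_estimate, 3))
```

The raw estimate is dominated by the well-sampled cells, while the poststratified estimate reweights toward the population composition, which is the correction MRP formalises.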

https://github.bath.ac.uk/jjf23/MRP

~~Date: 22 October 2019, CB 5.8, 13:15~~ CANCELLED

**Automatic Learning of Dependency Structures and Alignments with Bayesian Non-Parametrics**

**Neill Campbell (University of Bath, Computer Science)**

**Abstract:** We present (time allowing) two recent models, from collaborative work between the Universities of Bath and Bristol, looking at the use of Gaussian Processes (GPs) in the context of multi-view learning and alignment in an unsupervised setting.

The first is a latent variable model capable of learning dependency structures across dimensions in a multivariate setting. Our approach is based on GP priors for the generative mappings and interchangeable Dirichlet process priors to learn the structure. The introduction of the Dirichlet process as a specific structural prior allows our model to circumvent issues associated with previous GP latent variable models. Inference is performed by deriving an efficient variational bound on the marginal log-likelihood of the model.

If time, the second is a model that can automatically learn alignments between high-dimensional data in an unsupervised manner. Our proposed method casts alignment learning in a framework where both alignment and data are modelled simultaneously. Further, we automatically infer groupings of different types of sequence within the same dataset. We derive a probabilistic model built on non-parametric priors that allows for flexible warps while at the same time providing means to specify interpretable constraints. We demonstrate the efficacy of our approach via quantitative comparison to the state-of-the-art approaches and provide examples to illustrate the versatility of our model in automatic inference of sequence groupings, absent from previous approaches, as well as easy specification of high level priors for different modalities of data.

**Date: 29 October 2019, CB 5.8, 13:15**

**Optimal split-plot designs ensuring precise pure-error estimation of the variance components**

**Kalliopi Mylona (King's College London)**

**Abstract:** In this work, we present a novel approach to design split-plot experiments which ensures that the two variance components can be estimated from pure error and guarantees a precise estimation of the response surface model. Our novel approach involves a new Bayesian compound D-optimal design criterion which pays attention to both the variance components and the fixed treatment effects. One part of the compound criterion (the part concerned with the treatment effects) is based on the response surface model of interest, while the other part (which is concerned with pure-error estimates of the variance components) is based on the full treatment model. We demonstrate that our new criterion yields split-plot designs that outperform existing designs from the literature both in terms of the precision of the pure-error estimates and the precision of the estimates of the factor effects.

This is a joint work with Steven G. Gilmour (King’s College London) and Peter Goos (KU Leuven)

**Date: 5 November 2019, CB 5.8, 13:15**

**Optimising First in Human Trial Designs Through Dynamic Programming**

**Lizzi Pitt (University of Bath)**

**Abstract:** First In Human clinical trials typically aim to find the Maximum Tolerated Dose of a potential new treatment. These trials are conducted sequentially, dosing one cohort at a time. Thus, after observing results from one cohort, trial teams must decide which dose to give the next cohort. This talk will demonstrate that dynamic programming can be used to find a trial design with a dose escalation scheme that is optimal with respect to a given objective function. Since the objective function influences the results, it must be carefully constructed to incorporate any study aims and regulatory requirements. We analyse the properties of resulting designs through simulation and compare the results to established designs such as the Continual Reassessment Method of O'Quigley et al. (1990).

**Date: 12 November 2019, CB 5.8, 13:15**

**Predicting patient engagement in IAPT services: A statistical analysis of Electronic Health Records**

**Alice Davis (University of Bath)**

**Abstract:** The Mental Health Foundation found that two-thirds of people say that they have experienced a mental health problem and that collective mental health is deteriorating. Across England, 12% of Improving Access to Psychological Therapy (IAPT) appointments are missed. In order to intervene effectively so that more patients receive the care they need, it is important to identify the patients who are most likely to miss their appointments.

The University of Bath and Mayden, who provide cloud based technologies to support patient care, are embarking on a joint research project which aims to develop and test models to predict whether a patient will attend their therapy appointments, allowing for targeted intervention to be provided.

This project is just over half way through and this talk will discuss the highs and lows so far and look ahead to our plans for the future.

**Date: 19 November 2019, CB 5.8, 13:15**

**Measuring and understanding reproductive hormone signals: insights from quantitative models**

**Margaritis Voliotis (University of Exeter)**

**Abstract:** Pulsatile release of reproductive hormones (LH and FSH) from the pituitary gland is critical for normal reproductive health. These hormonal pulses are driven by signals stemming from the hypothalamic area of the brain, but the precise mechanisms generating those signals are largely unknown.

In this talk, I’ll present how statistical models can aid the profiling of reproductive hormone pulsatility in the clinic, and also generate novel testable hypotheses regarding the mechanisms underlying these hormonal rhythms. In particular, I will present a simple stochastic model of pulse generation and how we use it to extract information regarding clinically relevant quantities, such as pulse frequency, from hormone time-series data using a Bayesian inference framework. Furthermore, I will present a novel mechanistic model explaining the generation of pulsatile dynamics. The model describes the dynamics of a neuronal population that has been linked to LH pulsatility, and highlights the role of neuropeptide-mediated feedback in the generation of these pulses. Most importantly, the model makes a series of predictions regarding how the system responds to external perturbations, which we verify experimentally, opening new horizons for understanding fertility regulation.

~~Date: 26 November 2019, CB 5.8, 13:15~~ CANCELLED

~~TBA~~

~~ Silvia Liverani (Queen Mary University of London)~~

**Abstract:** TBA

~~Date: 3 December 2019, CB 5.8, 13:15~~ CANCELLED

~~TBA~~

~~ Sumeetpal Singh (University of Cambridge)~~

**Abstract:** TBA

**Date: 10 December 2019, CB 5.8, 13:15**

**High-dimensional mediation analysis with (tentative) implications for stratified medicine**

**Rhian Daniel (Cardiff University)**

**Abstract:** It is increasingly popular to perform traditional mediation analysis (e.g. using the “product of coefficients”) in multi-omic datasets with a very large number of potential mediators. For example, a linear regression model is fitted with LDL-cholesterol (Y) as the outcome and a particular genetic variant (X) together with one of c.4,000 proteins (M) as the two predictors, followed by a second linear regression model with the protein (M) as the outcome and the genetic variant (X) as the predictor. The product of the coefficient of X in the second model and the coefficient of M in the first model is taken to represent the part of the effect of X on Y mediated by that particular protein M. This is then repeated in turn for each of the c.4,000 proteins.
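The "product of coefficients" computation described above can be sketched on simulated data (all variable names and effect sizes below are invented; this illustrates the naive procedure, not the speaker's proposed alternatives):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Simulated data with known effects: X -> M with coefficient a,
# and (X, M) -> Y with M-coefficient b.
a, b, direct = 0.5, 0.8, 0.3
X = rng.binomial(2, 0.3, size=n).astype(float)   # genetic variant, coded 0/1/2
M = a * X + rng.normal(size=n)                   # protein level
Y = direct * X + b * M + rng.normal(size=n)      # LDL-cholesterol

def ols(design, y):
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

ones = np.ones(n)
# Model 1: Y ~ X + M  -> coefficient of M is b-hat
b_hat = ols(np.column_stack([ones, X, M]), Y)[2]
# Model 2: M ~ X      -> coefficient of X is a-hat
a_hat = ols(np.column_stack([ones, X]), M)[1]

indirect = a_hat * b_hat   # "product of coefficients" mediated effect
print(round(indirect, 3))  # close to the true a * b = 0.4
```

In the multi-omic setting this pair of regressions is simply repeated once per candidate protein M.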

In this seminar I will discuss why one might embark on such an analysis, what scientific question it might roughly be addressing, and whether an alternative formulation of this question exists that more closely aligns with the scientific objectives. I will then discuss the potential pitfalls of relying on a simple analysis strategy such as the above, and suggest alternatives, justifying their performance using a simulation study and illustrating their use with an analysis of multi-omic data from the UCLEB consortium. Our alternatives include first a relatively simple method, and then a more sophisticated (doubly robust) extension that lends itself to valid statistical inference even when data-adaptive (machine learning) techniques are employed for the estimation of nuisance functionals.

Along the way, I will give an overview of some of the conceptual difficulties with mediation analysis with a single mediator as most commonly defined in causal inference, e.g. cross-world counterfactuals and independence assumptions, and discuss more recently proposed alternatives such as randomized interventional analogue effects and exposure splitting.

**Date: 5 February 2019, CB 3.7, 14:15**

**How to simulate a max-stable process?**

**Kirstin Strokorb (Cardiff University)**

**Abstract:**
Max-stable processes form a fundamental class of stochastic processes in the analysis of spatio-temporal extreme events. Simulation is often a necessary part of inference for certain characteristics, in particular for future spatial risk assessment. In this talk I will give an overview of existing procedures for this task, put them into perspective with one another, and compare their properties, making use of some new theoretical results. Particular emphasis will be given to the popular class of Brown-Resnick processes, which are max-stable processes associated with Gaussian processes. The talk is based on joint work with Marco Oesting.

**Date: 12 February 2019, CB 3.7, 13:15**

**Application of Bayesian Networks for Quantitative Risk Assessment**

**Sebastian Stolze (University of Bath)**

**Abstract:** The application of quantitative risk analysis (QRA) is crucial in many industries, including the oil and gas industry. Current QRA methods include the Event Tree (ET) framework, in which developments of risk scenarios are analysed as paths of a tree structure. Like many tree structures, ETs are discrete, fairly static objects that can grow quickly, which can make it hard to understand their structure and the dependencies of subcomponents. Most of these drawbacks can be overcome by using Bayesian networks (BNs). We suggest a translation/simplification algorithm and exemplify how ETs can be translated into (potentially simplified) BNs. The algorithm tests for conditional independencies and quantifies conditional dependencies by employing an information measure. The same approach can be used to reduce the complexity of BNs with weak parent relationships. We also address an extension to include continuous time variables for the special case of piece-wise constant hazard functions.

**Date: 19 February 2019, CB 3.7, 13:15**

**Detecting causal effects on networks: a distribution regression approach**

**Nicolo Colombo (University College London)**

**Abstract:** We address the problem of predicting causal impacts of natural shocks in a complex network by examining a stochastic process defined over its nodes. The stochastic process is observed under its `natural' regime and under a series of `perturbed' conditions where non-experimental interventions are applied to some of the nodes. The task is to predict the status of the system under new unseen perturbations.

We cast the problem into a distribution regression framework where perturbed-regime distributions, p_y, are functions of the natural-regime (counterfactual) distributions, p_x, the network structure, G, and possibly some features associated with the interventions, z. By mapping all distributions into an infinite-dimensional Hilbert space, we reduce the problem to learning a simple regression model, y = f(x, G, z), with vectors x and y being respectively the Hilbert-space embedding of p_x and p_y. The model can then be trained in a supervised fashion from a set of observed perturbations and used to predict a new situation given the normal-regime distribution and the features of the new intervention.
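A toy sketch of the embedding-plus-regression mechanics (not the authors' method): for simplicity it inverts the roles and learns to recover an intervention feature z from the empirical RBF kernel mean embedding of a perturbed sample, with all landmarks, kernel choices and data invented:

```python
import numpy as np

rng = np.random.default_rng(2)

def mean_embedding(sample, landmarks, gamma=1.0):
    """Empirical RBF kernel mean embedding of a sample, evaluated at landmarks."""
    d2 = (sample[:, None] - landmarks[None, :]) ** 2
    return np.exp(-gamma * d2).mean(axis=0)   # one feature per landmark

landmarks = np.linspace(-3.0, 3.0, 25)

# Toy set-up: each "perturbed regime" shifts a N(0, 1) natural regime by z;
# we learn to recover the intervention feature z from the embedding alone.
shifts = rng.uniform(-1.0, 1.0, size=40)
Phi = np.stack([mean_embedding(rng.normal(s, 1.0, 500), landmarks)
                for s in shifts])

# Ridge regression from embedding features to the intervention feature.
lam = 0.1
w = np.linalg.solve(Phi.T @ Phi + lam * np.eye(Phi.shape[1]), Phi.T @ shifts)

# Predict the shift of a new, unseen perturbed sample.
new_sample = rng.normal(0.4, 1.0, 500)
pred = float(mean_embedding(new_sample, landmarks) @ w)
print(round(pred, 2))
```

Evaluating the embedding at a fixed grid of landmarks is a finite-dimensional stand-in for the infinite-dimensional Hilbert-space embedding in the abstract; the supervised step is then ordinary regression on those features.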

We motivate this approach through the study of effects of unplanned disruptions in the London Underground.

**Date: 26 February 2019, CB 3.7, 13:15**

**The clustering of earthquake magnitudes**

**Katerina Stavrianaki (University College London)**

**Abstract:**
Many widely used seismicity models assume that there are no correlations between the magnitudes of successive earthquakes in a region, which implies that earthquakes are essentially unpredictable. While some empirical studies in the past have suggested the existence of magnitude correlations, other studies argue that such correlations are a statistical artifact due to the incompleteness of earthquake catalogs. We introduce a novel methodology for detecting magnitude correlations which is more robust to potential incompleteness. Our approach uses the conditional intensity of the Epidemic Type Aftershock Sequence (ETAS) model as a proxy for the local level of seismic activity. We study several regions in California using our methodology, and demonstrate that the level of seismic activity and the earthquake magnitudes are indeed correlated.
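The ETAS conditional intensity used here as a proxy for local seismic activity has the standard temporal form λ(t) = μ + Σ_{t_i < t} K exp(α(m_i − m_0)) (t − t_i + c)^{−p}. A minimal sketch (parameter values are illustrative, not fitted to any catalogue):

```python
import numpy as np

def etas_intensity(t, times, mags, mu=0.02, K=0.05, alpha=1.0,
                   c=0.01, p=1.1, m0=3.0):
    """Conditional intensity of a temporal ETAS model at time t:
    mu plus a sum of power-law aftershock kernels over past events,
    each scaled exponentially by the triggering event's magnitude."""
    past = times < t
    trig = K * np.exp(alpha * (mags[past] - m0)) / (t - times[past] + c) ** p
    return mu + trig.sum()

# Toy catalogue: event times (days) and magnitudes, invented for illustration.
times = np.array([1.0, 1.2, 5.0])
mags = np.array([4.5, 3.2, 5.1])

print(etas_intensity(5.1, times, mags))   # just after the m=5.1 event: high
print(etas_intensity(20.0, times, mags))  # later: decayed toward background
```

The methodology in the talk compares observed magnitudes against this intensity evaluated at the event times, rather than simulating from the model.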

**Date: 5 March 2019, CB 3.7, 14:15**

**Wavelet lifting: what is it and what can you do with it?**

**Matt Nunes (University of Bath)**

**Abstract:**
Classical wavelet transforms have been popular since the nineties for helping with a variety of analysis tasks, in both time series and image processing. However, they are limited to working on regularly spaced data of certain lengths. In this talk I will discuss wavelet lifting: flexible wavelet-like transforms designed to work on different data sampling structures, such as where missingness occurs. I will talk about some new results on lifting schemes, and discuss their use in applications in statistics, namely problems in time series analysis. I will also introduce some new work extending lifting transforms to other settings such as complex-valued data and shape analysis.

**Date: 12 March 2019, CB 3.7, 13:15**

**Design and analysis of biosimilar clinical trials**

**Ryuji Uozumi (Kyoto University, Department of Biomedical Statistics and Bioinformatics)**

**Abstract:**
Recently, numerous pharmaceutical sponsors have expressed a great deal of interest in the development of biosimilars. A biosimilar is defined as a biological medicinal product that contains an active substance that is similar to that of an original previously authorized biological medicinal product. A reduction in healthcare costs for patients can be expected if a biosimilar is approved by regulators and placed on the market. Biosimilars differ from generic chemical products, for example, with respect to the complexity and heterogeneity of the molecular structure. Owing to this characteristic, a larger number of subjects would be required to investigate and clinically develop a biosimilar than is required for the development of a generic product. In this talk, I shall describe statistical methods for designing and analyzing data from biosimilar clinical trials. In particular, I shall discuss the potential benefits in the settings of adaptive seamless designs and multi-regional clinical trials. The proposed methods are motivated by the clinical development of a biosimilar to the innovator infliximab (Remicade).

**Date: 21 March 2019, CB 3.9, 14:15**

**On Spectral Graph Clustering**

**Carey E. Priebe (Department of Applied Mathematics and Statistics, Johns Hopkins University)**

**Abstract:** Clustering is a many-splendored thing. As the ill-defined cousin of classification, in which the observation to be classified X comes with a true but unobserved class label Y, clustering is concerned with coherently grouping observations without any explicit concept of true groupings. Spectral graph clustering – clustering the vertices of a graph based on their spectral embedding – is all the rage, and recent theoretical results provide new understanding of the problem and solutions. In particular, we reset the field of spectral graph clustering, demonstrating that spectral graph clustering should not be thought of as kmeans clustering composed with Laplacian spectral embedding, but rather Gaussian mixture model (GMM) clustering composed with either Laplacian or Adjacency spectral embedding (LSE or ASE); in the context of the stochastic blockmodel (SBM), we use eigenvector CLTs & Chernoff analysis to show that (1) GMM dominates kmeans and (2) neither LSE nor ASE dominates, and we present an LSE vs ASE characterization in terms of affinity vs core-periphery SBMs. Along the way, we describe our recent asymptotic efficiency results, as well as an interesting twist on the eigenvector CLT when the block connectivity probability matrix is not positive semidefinite. (And, time permitting, we will touch on essential results using the matrix two-to-infinity norm.) We conclude with a ‘Two Truths’ LSE vs ASE spectral graph clustering result – necessarily including model selection for both embedding dimension & number of clusters – convincingly illustrated via an exciting new diffusion MRI connectome data set: different embedding methods yield different clustering results, with one (ASE) capturing gray matter/white matter separation and the other (LSE) capturing left hemisphere/right hemisphere characterization.

https://www.pnas.org/content/pnas/early/2019/03/07/1814462116.full.pdf

**Date: 26 March 2019, CB 3.7, 14:15**

**Assessing the performance in predicting terrorism in space and time using extreme gradient boosting and geostatistical models**

**Andre Python (University of Oxford)**

**Abstract:** In 2017, about 18,700 people lost their lives due to terrorist attacks. Iraq, Syria, Pakistan, and Afghanistan accounted for more than half of the attacks and of the total number of deaths attributed to terrorism worldwide. Providing spatially and temporally accurate predictions of terrorist events can help policy-makers design and implement efficient counterterrorism measures. In this talk, I will assess the performance of two approaches used to predict, a week ahead, the locations of terrorist events at fine spatial scale in Iraq, Iran, Afghanistan, and Pakistan. I will describe and compare the results obtained from eXtreme Gradient Boosting (XGBoost)—a machine-learning algorithm—with INLA-SPDE, a Bayesian approach applied in modelling spatial and spatio-temporal phenomena.

**Date: 2 April 2019, CB 3.7, 13:15**

**Statistical Anomaly Detection Framework for Cyber-Security**

**Marina Evangelou (Imperial College London)**

**Abstract:** Cyber attacks have emerged as a modern era 'epidemic'. Large enterprises must protect their network infrastructures and masses of private customer data while being attacked multiple times on a daily basis. Considering the risks of cyber attacks, enterprises are investing enormous amounts of money in tools, typically using signature-based methods, to defend their networks. However, this is not enough. Cyber-security must strengthen its relationship with Statistics in order to exploit the vast amounts of data generated by the field. Statistical analysis of these data sources can lead to the development of statistical anomaly detection frameworks that complement existing enterprise network defence systems. NetFlow is one of the available data sources widely used for monitoring an enterprise network. We present an anomaly detection framework based on predicting individual device behaviour. Device behaviour, defined as the number of NetFlow events, is modelled to depend on the observed historic NetFlow events. Through a comprehensive analysis, the best predictive model is chosen and, based on its findings, an anomaly detection framework is built.
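A bare-bones sketch of the idea of flagging a device whose NetFlow event count departs from its own recent history (a simple moving-window threshold standing in for the predictive models compared in the talk; all data are invented):

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy per-device NetFlow event counts per hour (all numbers invented);
# one anomalous burst is injected at hour 150.
counts = rng.poisson(20, size=200).astype(float)
counts[150] = 80.0

# Predict each hour's count from its recent history; flag hours where the
# observed count greatly exceeds what the history suggests.
window = 24
flags = []
for t in range(window, len(counts)):
    hist = counts[t - window:t]
    mu, sd = hist.mean(), hist.std(ddof=1)
    if counts[t] > mu + 4.0 * sd:     # simple predictive threshold
        flags.append(t)

print(flags)
```

A deployed framework would replace the moving-window mean with a fitted predictive model per device, but the anomaly logic (observed count versus predicted behaviour) is the same.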

**Date: 9 April 2019, CB 3.7, 13:15**

**Longitudinal models to examine critical periods of exposure**

**Kate Tilling (University of Bristol)**

**Abstract:** Gestational weight gain (GWG) is associated with a range of perinatal outcomes and longer term cardiovascular and metabolic outcomes in mother and child. Relatively little is known about how pre-pregnancy BMI and weight gain during different periods of gestation may interact, or whether weight gain is more important during some antenatal periods than others. We used random effects models to examine GWG in a pregnancy cohort in which women had detailed repeat assessment of weight during pregnancy (median number of measures 10; IQR: 8-11). A linear spline model identified three distinct periods of GWG: 4-18 weeks, 18-28 weeks and 28+ weeks of gestation. Multivariate multilevel models were then used to relate GWG to birthweight, with trivariate models to also account for gestational age at delivery. Critical period, sensitive period and mobility models, with LASSO for model selection, were used to investigate the epidemiology of GWG with respect to birthweight of the offspring. Limitations of this approach include assumptions of linearity between the random effects in the trivariate model, and sensitivities to high-order interactions in the second-stage epidemiological models.
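The linear spline parameterisation with knots at 18 and 28 weeks can be sketched as follows; fitting one woman's trajectory by OLS stands in for the random effects model, and all data below are invented:

```python
import numpy as np

def spline_basis(weeks, knots=(18.0, 28.0)):
    """Design matrix for a linear spline with knots at 18 and 28 weeks:
    intercept plus one column per period, so each slope coefficient is the
    weight-gain rate in that gestational period."""
    k1, k2 = knots
    return np.column_stack([
        np.ones_like(weeks),
        np.minimum(weeks, k1),              # gain rate up to 18 weeks
        np.clip(weeks - k1, 0.0, k2 - k1),  # gain rate, 18-28 weeks
        np.maximum(weeks - k2, 0.0),        # gain rate after 28 weeks
    ])

# One woman's toy weight trajectory (kg), invented for illustration; the
# seminar's model pools all women via random effects rather than per-woman OLS.
weeks = np.array([6.0, 10, 16, 20, 24, 27, 30, 33, 36, 39])
true = np.array([60.0, 0.2, 0.5, 0.4])  # intercept + three period-specific rates
weight = spline_basis(weeks) @ true \
         + np.random.default_rng(3).normal(0.0, 0.1, weeks.size)

coef, *_ = np.linalg.lstsq(spline_basis(weeks), weight, rcond=None)
print(np.round(coef, 2))   # approximately [60, 0.2, 0.5, 0.4]
```

Because the basis columns are continuous in `weeks`, the fitted trajectory is piecewise linear but joins up at the knots, and each coefficient reads directly as a per-week gain rate for its period.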

**Date: 16 April 2019, CB 3.7, 13:15**

**Utility-based methods for Preferentially Sampled Spatial Data**

**Elizabeth Gray (University of Bath)**

**Abstract:** Spatial preferential sampling occurs when the sampling locations of a spatial process are stochastically dependent upon its values, e.g. those designing an air pollution monitoring network may seek to place monitors close to main roads, or other areas where they might expect violations of pollution guidelines. Should the data from these monitors be used to construct a prediction map, this will lead to upwardly biased estimates in low pollution areas if this preference is ignored. Many existing methods for accounting for the sampling process within the prediction model, e.g. Diggle, Menezes and Su (2010), consider the sampling process as a point process, for which the location of each site is independent of all others, given the underlying process: a potentially unrealistic assumption.

In this talk we will discuss the use of 'whole design utility' functions to encapsulate the intentions of the experimenter, and proportional design distributions. We discuss the advantages and challenges of this approach, including proposals for utility functions which are both practically useful and have mathematically helpful and exciting properties. Inference for these models is hampered by the need to compute an intractable normalising constant of the design distribution. We propose a Metropolis-Hastings algorithm replacing the ratio of these constants by its estimate using Geyer's (1994) reverse logistic regression method, requiring draws from the 'design distribution'. We show how this can be done efficiently by permuting the utility arguments. We use the proposed methodology to create a prediction map of lead concentrations in Galicia, Spain.

**Date: 7 May 2019, CB 3.7, 13:15**

**Are hazard ratios estimated from trials valid causal effect measures?**

**Jonathan Bartlett (University of Bath)**

**Abstract:** Virtually every randomised clinical trial with a time to event endpoint is in practice analysed using Cox's proportional hazards model. Recently Aalen et al (doi: 10.1007/s10985-015-9335-y) argued that the resulting hazard ratio is not a valid measure of the causal effect of treatment, even in a randomised trial and when the proportional hazards assumption holds. This claim has serious implications for the interpretation of a vast number of randomised trials and indeed non-randomised studies. In this talk I shall review the potential outcome framework for causal inference and the interpretations of causal effects like risk ratios for binary outcomes. I will then review Cox's proportional hazards model and the critique given by Aalen et al. I conclude that (under proportional hazards) the hazard ratio is a perfectly valid population level causal effect measure.

**Date: 2 October 2018, CB 5.8, 13:15**

**Multiple Change-point Estimation in High-Dimensional Gaussian Graphical Models**

**Sandipan Roy (University of Bath)**

**Abstract:**
We consider the consistency properties of a regularized estimator for the simultaneous identification of both changepoints and graphical dependency structure in multivariate time series. Traditionally, estimation of Gaussian Graphical Models (GGM) is performed in an i.i.d. setting. More recently, such models have been extended to allow for changes in the distribution, but only where changepoints are known a priori. In this work, we study the Group-Fused Graphical Lasso (GFGL), which penalizes partial correlations with an L1 penalty while simultaneously inducing block-wise smoothness over time to detect multiple changepoints. We present a proof of consistency for the estimator, both in terms of changepoints and the structure of the graphical models in each segment. Several synthetic experiments and two real data applications validate the performance of the proposed methodology.

**Date: 9 October 2018, CB 5.8, 13:15**

**Designing an adaptive trial with treatment selection and a survival endpoint**

**Chris Jennison (University of Bath)**

**Abstract:**
In some Phase III clinical trials, more than one new treatment is compared to the control treatment. Such a trial requires a larger sample size than a two-arm trial. However, this sample size can be reduced by choosing to focus on one of the new treatments part way through the trial. We consider a clinical trial in which two versions of a new treatment are compared against control, with the primary endpoint of overall survival. At an interim analysis, mid-way through the trial, one of the two treatments is selected, based on the short-term response of progression-free survival. In the remainder of the trial, new patients are randomised between the selected treatment and the control.

For such an adaptive design, the familywise type I error rate can be protected by use of a closed testing procedure to deal with the two null hypotheses, and combination tests to combine data from before and after the interim analysis. However, with the primary endpoint of overall survival, there is still a danger of inflating the type I error rate: we present a way of applying the combination test that solves this problem simply and effectively. With the methodology in place, we then assess the potential benefits of treatment selection in this adaptive trial design.

**Date: 16 October 2018, CB 5.8, 13:15**

**Scaling limits of sequential Monte Carlo algorithms**

**Jere Koskela (University of Warwick)**

**Abstract:**
Sequential Monte Carlo (SMC) methods constitute a broad class of numerical approximation schemes for non-linear smoothing and filtering, rare event simulation, and many other applications. In brief, an ensemble of weighted particles is used to approximate a sequence of target densities. The ensemble is evolved from one target to the next by first resampling a new generation of particles proportional to the weights of the current ensemble, and then moving and reweighting the new generation in a manner which preserves the target. The resampling step induces a genealogy between particles, which has played a role in estimating mixing rates of SMC and particle MCMC algorithms, variance of SMC estimates, as well as the memory cost of storing SMC output. I will briefly introduce SMC algorithms, and show that appropriately rescaled SMC genealogies converge to a tractable process known as the Kingman coalescent in the infinite particle limit, under strong but standard assumptions. This facilitates the a priori estimation of genealogical quantities, which I will demonstrate by showing that simulated genealogies match predicted results even for relatively few particles, and when the assumptions that are needed to prove convergence fail.
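
As a rough illustration of the resampling step and the genealogy it induces, here is a minimal sketch (not the algorithms analysed in the talk; the target, move kernel and all parameters are illustrative):

```python
import numpy as np

def resample(weights, rng):
    """Multinomial resampling: draw each offspring's parent index in
    proportion to the weights. The returned ancestor indices are exactly
    the genealogical information whose scaling limit the talk studies."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return rng.choice(len(w), size=len(w), p=w)

# Toy SMC loop targeting N(0, 1) with random-walk moves, recording the
# genealogy: genealogy[t][i] is the ancestor of particle i at step t.
rng = np.random.default_rng(0)
n, steps = 100, 20
particles = rng.normal(size=n)
genealogy = []
for _ in range(steps):
    logw = -0.5 * particles**2                  # unnormalised log-weights
    w = np.exp(logw - logw.max())
    parents = resample(w, rng)
    genealogy.append(parents)
    particles = particles[parents] + 0.1 * rng.normal(size=n)  # move step
```

Tracing `genealogy` backwards from the final generation gives the coalescing ancestral lineages which, suitably rescaled, converge to the Kingman coalescent.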

**Date: 23 October 2018, CB 5.8, 13:15**

**Parallelising particle filters with butterfly interactions**

**Kari Heine (University of Bath)**

**Abstract:**
In modern computing systems an increase in the computational power is primarily obtained by increasing the number of parallel processing elements (PE) rather than by increasing the speed of an individual PE. While in many cases such parallel computing systems have enabled the completion of increasingly complex computational tasks, they can only do so if the task in question admits parallel computations. This talk focuses on an important class of algorithms lacking such inherent parallelism, namely the sequential Monte Carlo (SMC) methods, or particle filters. We consider some new parallel particle filter algorithms whose interaction diagram coincides with the butterfly diagram best known from the context of fast Fourier transform. We present some new convergence results and consider the potential of these algorithms based on numerical experiments.

Joint work with Nick Whiteley (University of Bristol) and Ali Taylan Cemgil (Bogazici University)

**Date: 30 October 2018, CB 5.8, 13:15**

**Optimal Change Point Detection and Localization in Sparse Dynamic Networks**

**Yi Yu (University of Bristol)**

**Abstract:**
We study the problem of change point detection and localization in dynamic networks. We assume that we observe a sequence of independent adjacency matrices of given size, each corresponding to one realization from an unknown inhomogeneous Bernoulli model. The underlying distribution of the adjacency matrices may change over a subset of the time points, called change points. Our task is to recover with high accuracy the unknown number and positions of the change points. Our generic model setting allows for all the model parameters to change with the total number of time points, including the network size, the minimal spacing between consecutive change points, the magnitude of the smallest change and the degree of sparsity of the networks.

We first identify an impossible region in the space of the model parameters such that no change point estimator is provably consistent if the data are generated according to parameters falling in that region. We propose a computationally simple novel algorithm for network change point localization, called Network Binary Segmentation, which relies on weighted averages of the adjacency matrices. We show that Network Binary Segmentation is consistent over a range of the model parameters that nearly cover the complement of the impossibility region, thus demonstrating the existence of a phase transition for the problem at hand. Next, we devise a more sophisticated algorithm based on singular value thresholding, called Local Refinement, that delivers more accurate estimates of the change point locations. We show that, under appropriate conditions, Local Refinement guarantees a minimax optimal rate for network change point localization while remaining computationally feasible.

https://arxiv.org/abs/1809.09602
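
As a hedged sketch of the core ingredient, a CUSUM statistic built from averages of adjacency matrices can localise a single change point; the paper's Network Binary Segmentation applies weighted versions of this idea recursively, and the simulation parameters below are purely illustrative:

```python
import numpy as np

def cusum_split(adj_seq):
    """Return the split point maximising the Frobenius norm of the CUSUM
    of a sequence of adjacency matrices (single-change-point illustration
    of the statistic that binary segmentation applies recursively)."""
    A = np.asarray(adj_seq, dtype=float)
    T = A.shape[0]
    best_t, best_stat = None, -np.inf
    for t in range(1, T):
        left = A[:t].mean(axis=0)
        right = A[t:].mean(axis=0)
        stat = np.sqrt(t * (T - t) / T) * np.linalg.norm(left - right)
        if stat > best_stat:
            best_t, best_stat = t, stat
    return best_t, best_stat

# Simulated example: edge probability jumps from 0.1 to 0.4 at time 30.
rng = np.random.default_rng(1)
n, T, change = 40, 60, 30
seq = [(rng.random((n, n)) < (0.1 if t < change else 0.4)).astype(float)
       for t in range(T)]
t_hat, _ = cusum_split(seq)
```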

**Date: 6 November 2018, CB 5.8, 13:15**

**Error control for sequential trials**

**David Robertson (MRC Biostatistics Unit, University of Cambridge)**

**Abstract:**
In many areas of biomedical research, trials accumulate data gradually over time. Hence there is often the possibility of monitoring the results, and making decisions in a sequential manner based on the interim data. These decisions can involve adapting the trial design in some way or performing an interim analysis, and can yield substantial ethical and economic advantages. However, a barrier to using sequential trials, particularly from a regulatory viewpoint, is ensuring control of a suitable error rate when formally testing the hypotheses of interest. In this talk, I discuss recent methodological advances in error control for two sequential trial settings.

Firstly, I present work on familywise error rate (FWER) control for response-adaptive clinical trials. In such trials, the randomisation probabilities to the different treatment arms are sequentially updated using the accumulating response data. I propose adaptive testing procedures that ensure strong familywise error control, for both fully-sequential and block-randomised trials. I show that there can be a high price to pay in terms of power to achieve FWER control for randomisation schemes with extreme allocation probabilities. However, for Bayesian adaptive randomisation schemes proposed in the literature, the adaptive tests maintain the power of the trial.

Secondly, I compare and contrast recently proposed procedures for false discovery rate (FDR) control in trials with online hypothesis testing. In this setting, a sequence of hypotheses is tested and the investigator has to decide whether to reject the current null hypothesis without having access to the future p-values or even the number of hypotheses to be tested. A key example is the perpetual platform trial design, which allows multiple treatment arms to be added during the course of the trial. Using comprehensive simulation scenarios and case studies, I provide recommendations for which procedures to use in practice for online FDR control.

**Date: 13 November 2018, CB 5.8, 13:15**

**Markov chain Monte Carlo methods for studying phase transitions**

**Tom Underwood (University of Bath)**

**Abstract:**
There is considerable interest in being able to accurately calculate the properties of a given substance (e.g. water) at a given temperature using computational methods. This problem is essentially a statistical one. The substance is comprised of N particles, each of which has a (3-dimensional) position. Hence the space of possible states of the substance is 3N-dimensional, and each possible state has a corresponding probability of being the `real' state of the substance at any particular moment. The probability of a given state depends on the considered temperature, as well as the nature of the interactions between the particles. For instance, while at low temperatures states in which the particles' positions form a liquid may have the highest probabilities, gas-like states become increasingly likely as the temperature is increased. This is why a liquid eventually boils as its temperature is increased.

One method to calculate the properties of the substance is to generate a Markov chain of states which reflects the underlying probability distribution; the properties of the substance can be determined by taking the average over all states in the chain. This is what is done in Markov chain Monte Carlo (MCMC). However, the success of this approach hinges on the generated Markov chain containing a large number of uncorrelated states - it is uncorrelated states which will be used in the averaging. Unfortunately it is often difficult to achieve this. For instance at the boiling temperature there are two high-probability regions of state space, one corresponding to the liquid and the other to the gas. To move smoothly between these regions one must traverse a region of extremely low probability. This `geography' of state space makes it difficult to generate Markov chains which efficiently sample from both the liquid and gas regions of state space, which in turn makes it difficult to calculate accurate properties of substances near their boiling points.

In this seminar I will discuss MCMC methods used to calculate the properties of solids and liquids, with a focus on advanced methods used to study phase transitions (e.g. melting, boiling). Advanced methods employ a number of tricks to significantly reduce the degree of correlation between samples in the Markov chain. These include importance sampling to `guide' the system from one high-probability region of phase space to another; and MCMC schemes which take direct leaps between high-probability regions.
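
A minimal random-walk Metropolis sketch on a double-well density illustrates the two-basin geography described above; the barrier height and step size are illustrative, and a sufficiently large proposal step here plays the role of the direct leaps between high-probability regions:

```python
import numpy as np

def metropolis(logp, x0, step, n_steps, rng):
    """Random-walk Metropolis: propose x' = x + step * noise and accept
    with probability min(1, p(x') / p(x))."""
    x = x0
    chain = np.empty(n_steps)
    for i in range(n_steps):
        prop = x + step * rng.normal()
        if np.log(rng.random()) < logp(prop) - logp(x):
            x = prop
        chain[i] = x
    return chain

# 'Boiling point' analogue: two high-probability regions near x = -1 and
# x = +1, separated by a low-probability barrier at x = 0.
logp = lambda x: -8.0 * (x**2 - 1.0)**2
rng = np.random.default_rng(2)
chain = metropolis(logp, x0=1.0, step=1.5, n_steps=20000, rng=rng)
frac_right = (chain > 0).mean()   # with large steps, both wells are visited
```

With a small `step`, the chain would remain stuck in the well where it started, which is precisely the correlation problem the advanced methods address.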

**Date: 20 November 2018, CB 5.8, 13:15**

**Modelling Similarity Matrices as Mixtures for structural comparison**

**Dan Lawson (University of Bristol)**

**Abstract:**
Similarity matrices arise in many circumstances, including examining covariance structures. It is often desirable to compare the structure of such matrices whilst remaining robust to differences in scaling and noise, and making as few modelling assumptions as possible about how the matrices were generated. Here we describe a nonparametric decomposition of a similarity matrix into two components: a cluster membership (defined as a mixture) and an underlying relationship between those clusters. This allows a natural comparison between two such matrices, by identifying where the mixtures learned from the first matrix fail to predict the second matrix. This has ubiquitous application in problems where it is useful to identify, in an unsupervised way, how two matrices differ at a broad scale. This work has three main contributions. Firstly, we highlight the importance of being able to perform structural rather than numerical comparison between matrices. Secondly, we introduce a new algorithm, suited for this purpose, for fitting a matrix as a mixture of components. Finally, we demonstrate the value of this approach with diverse examples: identifying structure present in genetics but absent in language, and identifying structure present in economics but absent in cultural values.

**Date: 27 November 2018, CB 5.8, 13:15**

**Confusion: Developing an information-theoretic secure approach for multiple parties to pool and unify statistical data, distributions and inferences**

**Murray Pollock (University of Warwick)**

**Abstract:**
“Monte Carlo Fusion” (Dai, Pollock, Roberts, 2018, JAP) is a new theory providing a framework for the unification of distributed statistical analyses and inferences, into a single coherent inference. This problem arises in many settings (for instance, expert elicitation, multi-view learning, distributed ‘big data’ problems etc.). Monte Carlo Fusion is the first general statistical approach which avoids any form of approximation error in obtaining the unified inference, and so has broad applicability across a number of statistical applications.

A direction of particular interest for broad societal impact is in Statistical Cryptography. Considering the setting in which multiple (potentially untrusted) parties wish to securely share distributional information (for instance in insurance, banking and social media settings), Fusion methodology offers the possibility that distributional sharing can be conducted in such a manner that the information which must be exchanged between the parties can itself be secretly shared. As a consequence, a gold-standard information-theoretic security of the raw data can be achieved. The resulting so-called “Confusion”, a confidential fusion approach to statistical secret sharing, has the property that even a party with unbounded computational power could not determine the secret information of any other party.

Joint work with Louis Aslett, Hongsheng Dai, Gareth Roberts.

**Date: 4 December 2018, CB 5.8, 13:15**

**A statistical journey: From Networks to Organs-on-Chips**

**Beate Ehrhardt (IMI, University of Bath)**

**Abstract:**
In this seminar, I will take you on a statistical journey from hypothesis testing on networks to a Bayesian analysis to establish a standard of best practice for a novel experiment.

For networks, we deliver a tool to quantify the strength of the relation between observed community structure and the interactions in a network (i.e., modularity). We do so by characterizing the large-sample properties of network modularity in the presence of covariates, under a natural and flexible null model. This provides an objective measure of whether or not an observed community structure is meaningful. Our technical contribution is to provide limit theorems for modularity when a community assignment is given by nodal features or covariates. This allows us to assign p-values to observed community structure, which we apply to investigate a multi-edge network of corporate email interactions.

For organs-on-chips, we derive a standard of best practice for these novel experiments. Organs-on-chips emulate physiology at a small scale by engineering appropriate cellular microenvironments. Our novel automated imaging workflow enables us to robustly capture multi-cellular phenotypes at high throughput from these high-content models. However, due to the novelty of the technology, there are no best practice standards to analyse the data, determine the sources of variability, or to perform sample size calculations. We have established an analysis pipeline for organs-on-chips to reduce bias and variability that utilizes Bayesian multi-level models. This provides, for the first time, a framework of statistical best practice for organ-on-chip experiments.

Keywords: network community structure, statistical network analysis, central limit theorems, Bayesian multi-level models, and optimal experimental design

**Date: 11 December 2018, CB 5.8, 13:15**

**Multi-task relative attribute ranking by maximizing statistical dependence**

**Kwang In Kim (University of Bath)**

**Abstract:**
In this talk, I will present a new multi-task learning approach that can be applied to multiple heterogeneous task estimators. Our motivation is that the best task predictor could change depending on the task itself. For example, we may have a deep neural network for the first task and a Gaussian process predictor for the second task. Classical multi-task learning approaches cannot handle this case, as they require the same model or even the same parameter types for all tasks. We tackle this by considering task-specific estimators as random variables. Then, the task relationships are discovered by measuring the statistical dependence between each pair of random variables. By doing so, our model is independent of the parametric nature of each task, and is even agnostic to the existence of such parametric formulation. If time allows, I will also present our recent work on image-to-image translation.

**Date: 3 October 2017, CB 5.8, 13:15**

**Inference in generative models using the Wasserstein distance**

**Mathieu Gerber (University of Bristol)**

**Abstract:**
A growing range of generative statistical models are such that the numerical evaluation of their likelihood functions is intractable. Approximate Bayesian computation and indirect inference have become popular approaches to overcome this issue, simulating synthetic data given parameters and comparing summaries of these simulations with the corresponding observed values. We propose to avoid these summaries and the ensuing loss of information through the use of Wasserstein distances between empirical distributions of observed and synthetic data. We describe how the approach can be used in the setting of dependent data such as time series, and how approximations of the Wasserstein distance allow the method to scale to large data sets. In particular, we propose a new approximation to the optimal assignment problem using the Hilbert space-filling curve. We provide an in-depth theoretical study, including consistency in the number of simulated data sets for a fixed number of observations and posterior concentration rates. The approach is illustrated with various examples, including a multivariate g-and-k distribution, a toggle switch model from systems biology, a queueing model, and a Lévy-driven stochastic volatility model. (Joint work with E. Bernton, P. E. Jacob and C.P. Robert.)
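
In one dimension the optimal assignment underlying the Wasserstein distance is solved simply by sorting, which is the ordering idea the Hilbert space-filling curve generalises to multivariate data. A minimal univariate sketch (illustrative only, not the paper's implementation):

```python
import numpy as np

def wasserstein_1d(x, y, p=1):
    """p-Wasserstein distance between two empirical distributions with the
    same number of atoms: in one dimension the optimal assignment matches
    order statistics, so sorting solves the transport problem exactly."""
    xs, ys = np.sort(x), np.sort(y)
    return (np.mean(np.abs(xs - ys) ** p)) ** (1.0 / p)

# Two samples from shifted normals: the distance recovers roughly the shift.
rng = np.random.default_rng(3)
x = rng.normal(0.0, 1.0, size=5000)
y = rng.normal(2.0, 1.0, size=5000)
d = wasserstein_1d(x, y)   # close to 2, the distance between the means
```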

**Date: 10 October 2017, CB 5.8, 13:15**

**Tensor Train algorithms for stochastic PDE problems**

**Sergey Dolgov (University of Bath)**

**Abstract:**
Surrogate modelling is becoming a popular technique to reduce the computational burden of forward and inverse uncertainty quantification problems. In this talk we use the Tensor Train (TT) decomposition for approximating the forward solution map of the stochastic diffusion equation, as well as the posterior density function in the Bayesian inverse problem. The TT decomposition is based on the separation of variables, hence the multivariate integration factorises into a set of one-dimensional quadratures. For sufficiently smooth functions, the storage cost of the TT decomposition grows much more slowly with the accuracy than the Monte Carlo rate. The TT decomposition of a multivariate function can be constructed from adaptively chosen fibres of samples along each variable (the so-called TT cross interpolation), with the number of function evaluations proportional to the (small) number of unknowns in the TT representation. In turn, the TT approximation of the probability density function allows an efficient computation of the Rosenblatt transform, and hence a fast method for proposing almost uncorrelated MCMC samples. We show that for smooth PDE coefficients the TT approach can be faster than quasi-Monte Carlo and adaptive Metropolis techniques.
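
The factorisation of multivariate integration into one-dimensional quadratures is easiest to see for a rank-1 (fully separable) integrand, the simplest case of the structure TT exploits; the sketch below, with illustrative choices, computes a 3-D integral from three 10-point quadratures instead of a 1000-point tensor grid:

```python
import numpy as np

# For a separable function f(x, y, z) = f1(x) f2(y) f3(z), the 3-D
# integral factorises into a product of 1-D quadratures -- the mechanism
# the TT decomposition generalises to low-rank multivariate functions.
nodes, weights = np.polynomial.legendre.leggauss(10)
x = 0.5 * (nodes + 1.0)            # map Gauss-Legendre nodes [-1,1] -> [0,1]
w = 0.5 * weights

one_dim = np.sum(w * np.exp(-x))   # each factor: int_0^1 exp(-t) dt
factorised = one_dim ** 3          # only 3 * 10 function evaluations

# The equivalent full tensor-grid quadrature needs 10^3 evaluations.
X, Y, Z = np.meshgrid(x, x, x, indexing="ij")
W = w[:, None, None] * w[None, :, None] * w[None, None, :]
full_grid = np.sum(W * np.exp(-X) * np.exp(-Y) * np.exp(-Z))

exact = (1.0 - np.exp(-1.0)) ** 3  # analytic value of the triple integral
```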

**Date: 17 October 2017, CB 5.8, 13:15**

**Stein variational Quasi-Newton algorithm: sampling by sequential transport of particles**

**Gianluca Detommaso (University of Bath)**

**Abstract:**
In many statistical applications and real-world situations, it is of fundamental importance to be able to quantify the uncertainty in estimates of interest. However, whenever the underlying probability distribution is intractable or unknown, this cannot be done directly and sampling algorithms are typically needed. A recently introduced sampling algorithm is the *Stein variational gradient descent* [Q. Liu, D. Wang, 2016], in which a cloud of particles is sequentially transported towards the target distribution. This is accomplished by a functional gradient descent which minimises the Kullback–Leibler divergence between the current distribution of the particles and the target one.

In collaboration with Dr. A. Spantini (MIT) and T. Cui (Monash, AU), we are currently working on accelerating this algorithm. From a transport-maps perspective, we work out second-order information to replace gradient descent with quasi-Newton algorithms, with potentially huge convergence accelerations. Furthermore, we substitute the simple kernel used in the original algorithm with more sophisticated ones, better representing the interaction between the particles and accelerating their spread across the support of the target distribution.
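
A minimal sketch of the original (first-order) Stein variational gradient descent update for one-dimensional particles with an RBF kernel may make the attraction/repulsion structure concrete; all parameters are illustrative, and this baseline is what the quasi-Newton variant above accelerates:

```python
import numpy as np

def svgd_step(x, grad_logp, eps=0.05, h=0.5):
    """One SVGD update (Liu & Wang, 2016) for 1-D particles with an RBF
    kernel: each particle is pushed by a kernel-weighted average of
    log-density gradients (attraction towards the target) plus kernel
    gradients (repulsion keeping the particles spread out)."""
    diff = x[:, None] - x[None, :]            # diff[i, j] = x_i - x_j
    k = np.exp(-diff**2 / (2.0 * h**2))       # k[i, j] = k(x_i, x_j)
    attract = k @ grad_logp(x)                # sum_j k(x_j, x_i) grad log p(x_j)
    repulse = (diff * k).sum(axis=1) / h**2   # sum_j d/dx_j k(x_j, x_i)
    return x + eps * (attract + repulse) / len(x)

# Transport a cloud of particles towards a standard normal target.
rng = np.random.default_rng(4)
x = rng.uniform(3.0, 5.0, size=200)           # start far from the target
for _ in range(2000):
    x = svgd_step(x, grad_logp=lambda t: -t)  # grad log N(0,1) at t is -t
```

After the loop the particle cloud approximates N(0, 1): the attraction term pulls particles towards the mode while the repulsion term prevents them from collapsing onto it.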

**Date: 24 October 2017, CB 5.8, 13:15**

**Fast computation for latent Gaussian models with a multivariate link function**

**Birgir Hrafnkelsson (University of Iceland)**

**Árni Víðir Jóhannesson (University of Iceland)**

**Abstract:**
Latent Gaussian models (LGMs) form a frequently used class within Bayesian hierarchical models. This class is such that the density of the observed data conditioned on the latent parameters can be any parametric density, and the prior density of the latent parameters is Gaussian. Typically, the link function is univariate, i.e., it is only a function of the location parameter. Here the focus is on LGMs with a multivariate link function, e.g., LGMs structured such that the location parameter, the scale parameter and the shape parameter of an observation are transformed into three latent parameters. These three latent parameters are modeled with a linear model at the latent level. The parameters within the linear model are also defined as latent parameters and thus assigned a Gaussian prior density. To facilitate fast posterior computation, a Gaussian approximation is proposed for the likelihood function of the parameters. This approximation, along with the a priori assumption of Gaussian latent parameters, allows for straightforward sampling from the posterior density. One benefit of this approach is, for example, that it allows subset selection at the latent level. The computational approach is applied to annual maximum peak flow series from the UK.

**Date: 31 October 2017, CB 5.8, 13:15**

**Automated formative and summative assessment: the R solution with the "exams" package**

**James Foadi (University of Bath)**

**Abstract:**
Increased student numbers in the Department of Mathematical Sciences at the University of Bath mean that the workload involved in providing feedback and marking for homework, coursework and exams is becoming unsustainable. The obvious way to relieve pressure on academic staff is the implementation of one or more automated formative and summative types of assessment. In this seminar I will discuss the approach offered by the R package "exams", initially tested at the Wirtschaftsuniversität Wien (WU Wien) in 2007. With "exams" it is possible to generate, for each type of problem/question, hundreds of versions with, for instance, different numerical values. These different versions can be imported into the Moodle question bank and used to create random quizzes for students. Once a student completes and submits a quiz, the final mark is returned, together with the detailed and correct solution, without instructor intervention. The approach to automated assessment described here is particularly, but not exclusively, suited to statistics subjects. Fields in which questions involve extensive algebraic manipulation may be better handled with alternative systems, such as STACK (C. J. Sangwin and M. J. Grove, 2006), which is currently being investigated by the Department.

**Date: 7 November 2017, CB 5.8, 13:15**

**Time-dependent feature allocation models via Poisson Random Fields**

**Paul Jenkins (University of Warwick)**

**Abstract:**
In a feature allocation model, each data point is described by a collection of latent features, possibly unobserved. For example, we might classify a corpus of texts by describing each document via a set of topics; the topics then determine a distribution over words for that document. In a Bayesian nonparametric setting, the Indian Buffet Process (IBP) is a popular prior model in which the number of topics is unknown a priori. However, the IBP is static in that it does not account for the change in popularity of topics over time. I will introduce the Wright-Fisher Indian Buffet Process (WF-IBP), a probabilistic model for collections of time-stamped documents. By adapting the Wright-Fisher diffusion from population genetics, we derive a stochastic process with appealing properties including that (i) each feature popularity evolves independently as a diffusion and (ii) marginal observations at a fixed timepoint are given by the original IBP. We describe a Markov Chain Monte Carlo algorithm for exact posterior simulation and illustrate our construction by analysing the topics of NIPS conference papers over 12 years. This is joint work with Valerio Perrone (Warwick), Dario Spano (Warwick), and Yee Whye Teh (Oxford).

**Date: 14 November 2017, CB 5.13, 13:15 (Computer Group Work Room)**

**Introduction to Python for R Users**

**Julian Faraway (University of Bath)**

**Abstract:**
Python is a popular programming language that is widely used in Machine Learning and Data Science. While it can be used for Statistics, its real value to Statisticians lies in its extensive range of other capabilities. It can form a valuable complement to the statistical strengths of R. This hands-on introduction in a computer lab will help you get started in Python and will focus on the ways it differs from R.

**Date: 21 November 2017, CB 5.8, 13:15**

**Kernel methods for spatiotemporal learning in criminology (or, the methods behind our winning entry in the US National Institute of Justice's crime forecasting challenge)**

**Seth Flaxman (Imperial College London)**

**Abstract:**
In this talk I will highlight the statistical machine learning methods that I am developing to address public policy questions in criminology. We develop a scalable inference method for the log-Gaussian Cox process, and show that an expressive kernel parameterisation can learn space/time structure in a large point pattern dataset [Flaxman et al, ICML 2015]. Our approach has nearly linear scaling, allowing us to efficiently fit a point pattern dataset of n = 233,088 crime events over a decade in Chicago and discover spatially varying multiscale seasonal trends and produce highly accurate long-range local area forecasts. Building on this work, we use scalable approximate kernel methods to provide a winning solution to the US National Institute of Justice "Real-Time Crime Forecasting Challenge," providing forecasts of four types of crime at a very local level (less than 1 square mile) 1 week, 1 month, and 3 months into the future.

In another line of work, we use a Hawkes process model to quantify the spatial and temporal scales over which shooting events diffuse in Washington, DC, using data collected by an acoustic gunshot locator system, in order to assess the hypothesis that crime is an infectious process. While we find robust evidence for spatiotemporal diffusion, the spatial and temporal scales are extremely short (126 meters and 10 minutes), and thus more likely to be consistent with a discrete gun fight, lasting for a matter of minutes, than with a diffusing, infectious process linking violent events across hours, days, or weeks [Loeffler and Flaxman, Journal of Quantitative Criminology 2017].

Papers and replication code available at www.sethrf.com
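
The conditional intensity of a self-exciting (Hawkes) process, the model used above to test the diffusion hypothesis, is a background rate plus exponentially decaying excitation from past events. A minimal sketch with illustrative parameters (not the fitted Washington, DC values):

```python
import numpy as np

def hawkes_intensity(t, events, mu, alpha, beta):
    """Conditional intensity of a self-exciting (Hawkes) process:
    background rate mu plus an exponentially decaying bump of total
    mass alpha triggered by each past event (decay rate beta)."""
    past = events[events < t]
    return mu + alpha * beta * np.sum(np.exp(-beta * (t - past)))

# Illustrative event times (e.g. minutes): recent events raise the rate.
events = np.array([0.0, 1.0, 1.5])
lam0 = hawkes_intensity(0.5, events, mu=0.2, alpha=0.5, beta=2.0)
```

Far from any event the intensity reverts to the background rate `mu`, which is why very short estimated excitation scales point to isolated gun fights rather than a long-lived infectious process.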

**Date: 28 November 2017, CB 5.8, 13:15**

**Using forest eco-system monitoring data to model tree survival for investigating climate change effects**

**Nicole Augustin (University of Bath)**

**Alice Davis (University of Bath)**

**Abstract:**
Forests are economically, recreationally and ecologically important, providing timber and wildlife habitat and acting as a carbon sink, among many ecosystem services. They are therefore extremely valuable to society, and it is crucial to ensure that they remain healthy. Forest health is monitored in Europe by The International Co-operative Programme on Assessment and Monitoring of Air Pollution Effects (ICP Forests) in cooperation with the European Union. More recently climate change has contributed to the decline in forest health and these data are increasingly being used to investigate the effects of climate change on forests in order to decide on forest management strategies for mitigation.
Here we model extensive yearly data on tree mortality and crown defoliation, an indicator of tree health, from a monitoring survey carried out in Baden-Württemberg, Germany, since 1983, which includes part of the ICP transnational grid. Defoliation, mortality and other tree- and site-specific variables are recorded on a changing irregular grid. In some cases grid locations are no longer observed, which leads to censored data; recruitment of trees also happens throughout the survey as new grid points are added. We model tree survival as a function of predictor variables on climate, soil characteristics and deposition. We are interested in the process leading to tree mortality rather than prediction, and this requires the inclusion of all potential drivers of tree mortality in the model. We use a semiparametric shared frailty model fitted using Cox regression, which allows for random effects (frailties) accounting for dependence between neighbouring trees, and for non-linear smooth functions of time-varying predictors and functional predictors. At each of 2385 locations 24 trees were observed between 1983 and 2016, with not all locations observed every year. Altogether a total of 80,000 trees are observed, making the analysis computationally challenging.

**Date: 5 December 2017, CB 5.8, 13:15**

**Personalised dynamic prediction of survival using patient registry data: An anatomy of a landmarking analysis**

**Ruth Keogh (London School of Hygiene & Tropical Medicine)**

**Abstract:**
In ‘dynamic’ prediction of survival we make updated predictions of individuals’ survival as new longitudinal measures of health status become available. Landmarking is an attractive and flexible method for dynamic prediction. In this talk I will take the audience through a dynamic prediction analysis using data from the UK Cystic Fibrosis Registry. Challenges arise due to a large number of potential predictors, use of age as the timescale, and occurrence of intermediate events. Various modelling options are possible, and choices have to be made concerning time-varying effects and landmark-specific effects. I will outline how different model selection procedures were investigated; how models were assessed and compared using bootstrapping; and how predictions and their uncertainties can be obtained for a new individual.

**Date: 6 February 2018, CB 3.7, 14:15**

**Supermassive black hole growth**

**Carolin Villforth (University of Bath)**

**Abstract:**
Supermassive black holes are found in the centers of all massive galaxies. While they are usually quiescent, all supermassive black holes go through phases of accretion during which they can outshine the galaxy they reside in. Black holes gain the majority of their mass during accretion events. It was also discovered that supermassive black hole masses correlate well with the properties of their host galaxies. This has raised an important question: how do supermassive black holes grow and how is this growth connected to the evolution of galaxies? Addressing this question requires disentangling emission from the galaxy and accreting black hole as well as isolating signatures of different coevolution models in large populations.

**Date: 13 February 2018, CB 3.7, 14:15**

**On local orthogonality and parameter meaning preservation**

**Karim Anaya-Izquierdo (University of Bath)**

**Abstract:**
This is an informal talk about work very much in progress. The work is motivated by the following two issues that appear when enlarging parametric models:
(1) a working parametric model is enlarged to account for relevant (but not of direct interest) structures in the data but the meaning of some of the parameters is lost in that process. A simple example is when the interpretation of a parameter as a treatment effect (marginal mean difference, mean ratio or hazard ratio) is lost when including correlated random effects to account for spatial dependence.
(2) More frivolously, we might want the estimation of the enlarged and uninteresting parts of the model to interfere as little as possible with the estimation of the important bits. A well-known example is the partial likelihood in the Cox model, which allows estimation of the target regression parameters without having to worry about any unknown baseline distribution. Local orthogonality techniques go in this direction.
Despite the interesting-looking examples above, the talk will be, disappointingly, about trivial examples, concepts, modelling ideas and with very little (if any) data. Inevitably, being me, I will introduce some useful geometric tools in this context.

**Date: 20 February 2018, CB 3.7, 14:15**

**Multivariate Output Analysis for Markov Chain Monte Carlo**

**Dootika Vats (University of Warwick)**

**Abstract:**
Markov chain Monte Carlo (MCMC) produces a correlated sample for estimating expectations with respect to a target distribution. A fundamental question is: when should sampling stop so that we have good estimates of the desired quantities? The key to answering this question lies in assessing the Monte Carlo error through a multivariate Markov chain central limit theorem (CLT). We present a multivariate framework for terminating simulation in MCMC. We define a multivariate effective sample size, estimating which requires strongly consistent estimators of the covariance matrix in the Markov chain CLT, a property we show for the multivariate batch means estimator. We then provide a lower bound on the minimum number of effective samples required for a desired level of precision. This lower bound depends on the problem only through the dimension of the expectation being estimated, and not on the underlying stochastic process. This result is obtained by drawing a connection between terminating simulation via effective sample size and terminating simulation using a relative standard deviation fixed-volume sequential stopping rule, which we demonstrate is an asymptotically valid procedure. The finite sample properties of the proposed method are then demonstrated through a simple motivating example. This work is joint with Galin Jones (U of Minnesota) and James Flegal (UC Riverside).
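
To make the quantities concrete, here is a minimal numpy sketch (not the authors' implementation) of a multivariate effective sample size built from the batch means estimator of the CLT covariance; the square-root batch size is an illustrative default choice:

```python
import numpy as np

def multivariate_ess(chain, batch_size=None):
    """Multivariate effective sample size, mESS = n * (det(Lambda)/det(Sigma))^(1/p),
    where Lambda is the sample covariance and Sigma is the batch means
    estimator of the asymptotic covariance in the Markov chain CLT."""
    n, p = chain.shape
    if batch_size is None:
        batch_size = int(np.floor(np.sqrt(n)))   # a common default
    a = n // batch_size                          # number of batches
    trimmed = chain[: a * batch_size]
    # Batch means estimator of the CLT covariance matrix
    batch_means = trimmed.reshape(a, batch_size, p).mean(axis=1)
    grand_mean = trimmed.mean(axis=0)
    centred = batch_means - grand_mean
    sigma = batch_size / (a - 1) * centred.T @ centred
    # Sample covariance of the chain
    lam = np.cov(trimmed, rowvar=False)
    return n * (np.linalg.det(lam) / np.linalg.det(sigma)) ** (1.0 / p)

# A bivariate AR(1) chain mixes more slowly as rho grows, so its
# effective sample size should sit well below the raw sample size n.
rng = np.random.default_rng(0)
n, rho = 20000, 0.7
x = np.zeros((n, 2))
for t in range(1, n):
    x[t] = rho * x[t - 1] + rng.standard_normal(2)
print(multivariate_ess(x))
```

For this chain the printed value is a small fraction of n, reflecting the autocorrelation at rho = 0.7.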

~~Date: 27 February 2018, CB 3.7, 14:15~~ CANCELLED

**Designing an adaptive trial with treatment selection and a survival endpoint**

**Chris Jennison (University of Bath)**

**Abstract:**
In some Phase III clinical trials, more than one new treatment is compared to the control treatment. Such a trial requires a larger sample size than a two-arm trial. However, this sample size can be reduced by choosing to focus on one of the new treatments part way through the trial.

We consider a clinical trial in which two versions of a new treatment are compared against control, with the primary endpoint of overall survival. At an interim analysis, mid-way through the trial, one of the two treatments is selected, based on the short-term response of progression-free survival. In the remainder of the trial, new patients are randomised between the selected treatment and the control.

For such an adaptive design, the familywise type I error rate can be protected by use of a closed testing procedure to deal with the two null hypotheses and combination tests to combine data from before and after the interim analysis. However, with the primary endpoint of overall survival, there is still a danger of inflating the type I error rate: we present a way of applying the combination test that solves this problem simply and effectively. With the methodology in place, we then assess the potential benefits of treatment selection in this adaptive trial design.

~~Date: 6 March 2018, CB 3.7, 14:15~~ CANCELLED

**TBA**

**David Robertson (MRC Biostatistics Unit, University of Cambridge)**

**Abstract:**
TBA

~~Date: 13 March 2018, CB 3.7, 14:15~~ CANCELLED

**TBA**

**Maria De Iorio (University College London)**

**Abstract:**
TBA

**Date: 20 March 2018, CB 3.7, 14:15**

**Progress on the connection between spectral embedding and network models used by the probability, statistics and machine-learning communities**

**Patrick Rubin-Delanchy (University of Bristol)**

**Abstract:**
In this talk, I give theoretical and methodological results, based on work spanning Johns Hopkins, the Heilbronn Institute for Mathematical Research, Imperial and Bristol, regarding the connection between various graph spectral methods and commonly used network models which are popular in the probability, statistics and machine-learning communities. An attractive feature of the results is that they lead to very simple take-home messages for network data analysis: a) when using spectral embedding, consider eigenvectors from both ends of the spectrum; b) when implementing spectral clustering, use Gaussian mixture models, not k-means; c) when interpreting spectral embedding, think of "mixtures of behaviour" rather than "distance". Results are illustrated with cyber-security applications.
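
As a toy illustration of take-home message (a), and not the speaker's code, the snippet below spectrally embeds a small two-block stochastic block model using the eigenvectors whose eigenvalues are largest in magnitude, drawn from both ends of the spectrum:

```python
import numpy as np

# Sample a two-block stochastic block model adjacency matrix
rng = np.random.default_rng(7)
n = 200
z = np.repeat([0, 1], n // 2)                  # community labels
B = np.array([[0.5, 0.1], [0.1, 0.4]])         # block connection probabilities
P = B[z][:, z]
upper = np.triu(rng.uniform(size=(n, n)) < P, 1)
A = (upper + upper.T).astype(float)            # symmetric, hollow adjacency

# Adjacency spectral embedding: keep the eigenvectors whose eigenvalues
# have largest magnitude (i.e. from both ends of the spectrum), scaled
# by the square root of |eigenvalue|.
vals, vecs = np.linalg.eigh(A)
idx = np.argsort(-np.abs(vals))[:2]
embedding = vecs[:, idx] * np.sqrt(np.abs(vals[idx]))

# The two communities concentrate around two distinct points, so a
# Gaussian mixture (rather than distances to a centroid) is the natural
# clustering model in this space.
print(embedding[z == 0].mean(axis=0), embedding[z == 1].mean(axis=0))
```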

**Date: 10 April 2018, CB 3.7, 14:15**

**A Dirichlet Process Tour**

**Tom Fincham Haines (University of Bath)**

**Abstract:**
I will introduce the Dirichlet process, as used in non-parametric Bayesian models when you want them to dynamically adjust how many discrete elements are used, depending on the data. This talk will demonstrate its use to solve a variety of machine learning and computer vision problems using both Gibbs sampling (MCMC) and mean field variational techniques.
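
To give a flavour of how a Dirichlet process lets the number of discrete elements grow with the data, here is a minimal sketch of its Chinese restaurant process representation (an illustrative sampler, not the models from the talk):

```python
import numpy as np

def chinese_restaurant_process(n, alpha, rng):
    """Sample cluster assignments for n points from a Dirichlet process
    with concentration alpha, via its Chinese restaurant process view."""
    assignments = [0]                  # first customer opens the first table
    counts = [1]
    for i in range(1, n):
        # Existing table k is chosen with probability counts[k]/(i + alpha);
        # a brand-new table with probability alpha/(i + alpha).
        probs = np.array(counts + [alpha]) / (i + alpha)
        k = rng.choice(len(probs), p=probs)
        if k == len(counts):
            counts.append(1)           # open a new table (new component)
        else:
            counts[k] += 1
        assignments.append(int(k))
    return assignments, counts

rng = np.random.default_rng(1)
z, counts = chinese_restaurant_process(500, alpha=2.0, rng=rng)
# The number of occupied tables grows roughly like alpha * log(n),
# so the model adapts its complexity to the data size.
print(len(counts))
```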

**Date: 17 April 2018, CB 3.7, 14:15**

**A regional random effects model for peaks-over-threshold flood events**

**Emma Eastoe (Lancaster University)**

**Abstract:**
Statistical models for the extreme events of environmental data sets must often account for temporal non-stationarity. In this talk we look at peaks-over-threshold river flow data, which consist of the times and sizes of the peak flows of flooding events. Our goal is to model the event sizes whilst accounting for non-stationarity. If event sizes are assumed to be stationary over time, then an appropriate statistical model is given by the generalised Pareto distribution. However, the assumption of stationarity is mostly invalid since the behaviour of event sizes varies across years under the influence of other climate-related processes, e.g. precipitation. If observations have been made on these underlying processes then regression methods can be used. However, such observations are rarely available and, even if they are, it is often not clear which combination of covariates to include in the model. We develop a regional random effects model which accounts for non-stationarity in event sizes without the need for any measurements on underlying processes. This model can be used to predict both unconditional extreme events, such as the m-year maximum, and extreme events that condition on the value of the random effect. The random effects also provide information on which underlying climate-related processes are likely candidates for causing variability in flood magnitudes. The model is applied to UK flood data.

**Date: 24 April 2018, CB 3.7, 14:15**

**Geometric MCMC for infinite-dimensional inverse problems**

**Alex Beskos (University College London)**

**Abstract:**
Bayesian inverse problems often involve sampling posterior distributions on infinite-dimensional function spaces. Traditional Markov chain Monte Carlo (MCMC) algorithms are characterized by deteriorating mixing times upon mesh-refinement, when the finite-dimensional approximations become more accurate. Such methods are typically forced to reduce step-sizes as the discretization gets finer, and thus are expensive as a function of dimension. Recently, a new class of MCMC methods with mesh-independent convergence times has emerged. However, few of them take into account the geometry of the posterior informed by the data. At the same time, recently developed geometric MCMC algorithms have been found to be powerful in exploring complicated distributions that deviate significantly from elliptic Gaussian laws, but are in general computationally intractable for models defined in infinite dimensions. In this work, we combine geometric methods on a finite-dimensional subspace with mesh-independent infinite-dimensional approaches. Our objective is to speed up MCMC mixing times, without significantly increasing the computational cost per step (for instance, in comparison with the vanilla preconditioned Crank–Nicolson (pCN) method). This is achieved by using ideas from geometric MCMC to probe the complex structure of an intrinsic finite-dimensional subspace where most data information concentrates, while retaining robust mixing times as the dimension grows by using pCN-like methods in the complementary subspace. The resulting algorithms are demonstrated in the context of three challenging inverse problems arising in subsurface flow, heat conduction and incompressible flow control. The algorithms exhibit up to two orders of magnitude improvement in sampling efficiency when compared with the pCN method.
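
For orientation, here is a minimal finite-dimensional sketch of the vanilla pCN step mentioned above, assuming a standard Gaussian prior; the toy target and step size are illustrative choices, not from the talk:

```python
import numpy as np

def pcn_step(x, log_lik, beta, rng):
    """One preconditioned Crank-Nicolson (pCN) step under a standard
    Gaussian prior: the proposal is prior-reversible, so the
    accept/reject ratio involves only the log-likelihood."""
    prop = np.sqrt(1.0 - beta**2) * x + beta * rng.standard_normal(x.shape)
    if np.log(rng.uniform()) < log_lik(prop) - log_lik(x):
        return prop, True
    return x, False

# Toy posterior: N(0, I) prior times a Gaussian likelihood centred at 1,
# giving a N(0.5, 0.5 I) posterior whose mean the chain should recover.
d = 10
log_lik = lambda x: -0.5 * np.sum((x - 1.0) ** 2)
rng = np.random.default_rng(4)
x, acc, samples = np.zeros(d), 0, []
for _ in range(5000):
    x, accepted = pcn_step(x, log_lik, beta=0.3, rng=rng)
    acc += accepted
    samples.append(x.copy())
print(acc / 5000, np.mean(samples[1000:]))
```

The key design point is that the proposal leaves the Gaussian prior invariant, which is what gives pCN its mesh-independent behaviour in the infinite-dimensional setting.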

**Date: 1 May 2018, CB 3.7, 14:15**

**Detection & attribution of large scale drivers for flood risk in the UK**

**Aoibheann Brady (University of Bath)**

**Abstract:**
In this talk, we investigate the attribution of trends in peak river flows to large-scale climate drivers such as the North Atlantic Oscillation (NAO) and the East Atlantic (EA) index. We focus on a set of near-natural "benchmark" catchments in the UK in order to detect trends not caused by anthropogenic changes, and aim to attribute trends in peak river flows to climate indices. To improve the power of our approach compared with at-site testing, we propose modelling all stations together in a Bayesian framework. This approach leads to the detection of a clear countrywide time trend. Additionally, the EA appears to have a considerable association with peak river flows, particularly in the south east of the UK, while the effect of the NAO appears to be minor across the UK. When a multivariate approach is taken to detect collinearity between climate indices and time, the association between the NAO and peak flows disappears, while the association with the EA remains strong.

**A Decision Theoretic Approach for Phase II/III Programmes**

**Robbie Peck (University of Bath)**

**Abstract:**
Drug development involves the problem of identifying potentially efficacious treatments or doses (phase II), and providing evidence of their efficacy (phase III). This may be done in a phase II/III programme. The design may be adaptive, meaning the design of the rest of the programme may be changed based on previously observed data.
One can apply a Bayesian decision theoretic framework to make Bayes decisions at each stage of the programme based upon the data observed so far. These decisions may include the choice of treatment/dose and the number of patients required. The framework requires a choice of gain function, the expected value of which each Bayes decision maximises.
I shall illustrate this approach through 3 case studies.

**Date: 8 May 2018, CB 3.7, 14:15**

**Renewable energy analytics: how to model uncertainties in the energy demand and supply**

**Jooyoung Jeon (University of Bath)**

**Abstract:**
The future electrical grid will have unprecedented complexity and uncertainty. The cost of low carbon technologies, such as PV, wind, electric vehicles and battery storage, is rapidly decreasing and they are increasingly being connected to the edge of the grid. The rise of prosumers, where the distinction between energy buyers and sellers becomes increasingly blurred, is far beyond the capability of the current market and system operation framework. In view of this, this research proposes an uncertainty quantification modelling framework for a prototype peer-to-peer energy trading/sharing (P2P-ETS) platform, which will lead to a unique scalable marketplace for mass prosumers to buy/sell/share energy themselves. In detail, the research explores how to model and evaluate the forecast uncertainties (1) in the energy supply from wind, using hierarchical density forecasting techniques, and (2) in the energy demand measured by smart meters, using stochastic optimisation based on density forecasts.

**Date: 14 February 2017, CB 3.7, 14:15**

**Kate Button & Michelle St Clair (Bath) (Joint seminar)**

**(Kate Button) Personalising psychological care**

**Abstract:** Depression and anxiety are leading causes of disability in the UK. Improving access to psychological therapies (IAPT) aims to reduce this disability by making ‘talking therapies’ available through the NHS. IAPT has been a success, providing therapy to those who would have otherwise not had access, and half of patients referred make a full recovery. However, we can still do better. The aim of this research is to use routinely collected IAPT data to identify optimal care regimes for a given patient. By providing evidence to tailor psychological care to the individual, we aim to further improve recovery rates.

**(Michelle St Clair) Statistics and Human Development: Characterising developmental trajectories and (causal) pathways through childhood, adolescence and adulthood**

**Abstract:** I will be giving a short overview of my research with regard to using large-scale longitudinal projects and/or longitudinal cohort databases to evaluate developmental trajectories using complex multivariate person-centred and variable-centred statistical techniques. I will also discuss some work looking at possible causal pathways or relationships between experiences in early life and outcomes in later life using longitudinal cohort data.

**Date: 17 Feb 2017, 4W 1.7, 15:15 (Landscapes seminar)**

**Adrian Bowman (Glasgow)**

**Surfaces, shapes and anatomy**

**Abstract:** Three-dimensional surface imaging, through laser-scanning or stereo-photogrammetry, provides high-resolution data defining the surface shape of objects. In an anatomical setting this can provide invaluable quantitative information, for example on the success of surgery. Two particular applications are in the success of facial surgery and in developmental issues with associated facial shapes. An initial challenge is to extract suitable information from these images, to characterise the surface shape in an informative manner. Landmarks are traditionally used to good effect but these clearly do not adequately represent the very much richer information present in each digitised image.

Curves with clear anatomical meaning provide a good compromise between informative representations of shape and simplicity of structure, as well as providing guiding information for full surface representations. Some of the issues involved in analysing data of this type will be discussed and illustrated. Modelling issues include the measurement of asymmetry and longitudinal patterns of growth.

**Date: 14 March 2017, CB 3.7, 13:15**

**Kari Heine (Bath)**

**TBA**

**Date: 28 March 2017, CB 3.7, 13:15**

**Paul Northrop (UCL)**

**Extreme value threshold selection**

**Abstract:** A common form of extreme value modelling involves modelling excesses of a threshold by a generalised Pareto (GP) distribution. The GP model arises by considering the possible limiting distributions of excesses as the threshold is increased. Selecting too low a threshold leads to bias from model mis-specification; raising the threshold increases the variance of estimators: a bias-variance trade-off. Many existing threshold selection methods do not address this trade-off directly, but rather aim to select the lowest threshold above which the GP model is judged to hold approximately. We use Bayesian cross-validation to address the trade-off by comparing thresholds based on predictive ability at extreme levels. Extremal inferences can be sensitive to the choice of a single threshold. We use Bayesian model averaging to combine inferences from many thresholds, thereby reducing sensitivity to the choice of a single threshold. The methodology is illustrated using significant wave height datasets from the North Sea and from the Gulf of Mexico.
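
For concreteness, here is a simple sketch of fitting a GP distribution to threshold excesses, using closed-form probability-weighted moments rather than the Bayesian cross-validation of the talk; the simulated exponential data (true shape \(\xi = 0\)) are an illustrative assumption:

```python
import numpy as np

def gpd_fit_pwm(data, threshold):
    """Fit a generalised Pareto distribution to threshold excesses by
    probability-weighted moments (Hosking & Wallis 1987), a simple
    closed-form alternative to maximum likelihood."""
    y = np.sort(data[data > threshold] - threshold)   # excesses, ascending
    n = len(y)
    b0 = y.mean()
    # Unbiased sample estimate of E[Y (1 - F(Y))]
    b1 = np.sum((n - 1 - np.arange(n)) / (n - 1) * y) / n
    k = b0 / (b0 - 2 * b1) - 2        # Hosking's shape parameter (xi = -k)
    sigma = 2 * b0 * b1 / (b0 - 2 * b1)
    return -k, sigma                  # (shape xi, scale sigma)

# Exponential tails correspond to xi = 0 with scale equal to the mean
# excess, so the estimates should land near (0, 1).
rng = np.random.default_rng(2)
data = rng.exponential(scale=1.0, size=50000)
xi, sigma = gpd_fit_pwm(data, threshold=1.0)
print(xi, sigma)
```

Raising `threshold` here shrinks the number of excesses, which is exactly the bias-variance trade-off the abstract describes.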

**Date: 25 April 2017, CB 3.7, 13:15**

**Heather Battey (Imperial)**

**Exploring and exploiting new structured classes of covariance and inverse covariance matrices**

**Abstract:** Estimation of covariance and inverse covariance (precision) matrices is an essential ingredient of virtually every modern statistical procedure. When the dimension, p, of the covariance matrix is large relative to the sample size, the sample covariance matrix is inconsistent in non-trivial matrix norms, and its non-invertibility renders many techniques in multivariate analysis impossible. Structural assumptions are necessary in order to restrain the estimation error, even if this comes at the expense of some approximation error if the structural assumptions fail to hold. I will introduce new structured model classes for estimation of large covariance and precision matrices. These model classes result from imposing sparsity in the domain of the matrix logarithm. After studying the structure induced in the original and inverse domains, I will then introduce estimators of both the covariance and precision matrix that exploit this structure. I derive the convergence rates of these estimators and show that they achieve a new minimax lower bound over classes of covariance and precision matrices whose matrix logarithm is sparse. The implication of this result is that the estimators are efficient and the minimax lower bound is sharp.

**Date: 2 May 2017, CB 3.7, 13:15**

**Anthony Lee (Warwick)**

**TBA**

**Date: 9 May 2017, CB 3.7, 13:15**

**Tiago de Paula Peixoto (Bath)**

**TBA**

**Date: 16 May 2017, CB 3.7, 13:15**

**Stats PhD students (Bath)**

**TBA**

**Date: 11 October 2016, CB 5.8, 13:15**

**Daniel Falush (Bath)**

**The painting palettes of human ancestry**

**Abstract:** Genomic technology is advancing at a remarkable pace and provides a great deal of information on our origins, but it requires new statistical technology to analyze. I will describe our chromosome painting approach to summarizing ancestry information. A Hidden Markov Model is used to fit each individual as a mosaic of the other individuals in the sample. A summary of this painting is used to subdivide the sample into populations with discrete ancestry profiles, using a merge-split sampler. I illustrate the application of this method by subdividing the British Isles into 17 regions with distinct ancestry profiles. Historical admixture events can be explored using mixture modelling. I show how non-linear least squares and curve fitting can be used to estimate global admixture events in the last 3,000 years.

**Date: 25 October 2016, CB 5.8, 13:15**

**Francisco Javier Rubio (LSHTM)**

**Tractable Bayesian variable selection: beyond normality**

**Abstract:** Bayesian variable selection for continuous outcomes often assumes normality, and so do its theoretical studies. There are sound reasons behind this assumption, particularly for large \(p\): ease of interpretation, analytical and computational convenience. More flexible frameworks exist, including semi- or non-parametric models, often at the cost of losing some computational or theoretical tractability. We propose a simple extension of the Normal model that allows for skewness and thicker-than-normal tails but preserves its tractability. We show that a classical strategy to induce asymmetric Normal and Laplace errors via two-piece distributions leads to easy interpretation and a log-concave likelihood that greatly facilitates optimization and integration. We also characterize asymptotically its maximum likelihood estimator and Bayes factor rates under model misspecification. Our work focuses on the likelihood and can thus be combined with any likelihood penalty or prior, but here we adopt non-local priors, a family that induces extra sparsity and which we characterize under misspecification for the first time. Under suitable conditions Bayes factor rates are of the same order as those that would be obtained under the correct model, but we point out a potential loss of sensitivity to detect truly active covariates. Our examples show how a novel approach to infer the error distribution leads to substantial gains in sensitivity, thus warranting the effort to go beyond normality, whereas for near-normal data one can get substantial speedups relative to assuming unnecessarily flexible models.

The methodology is available as part of R package mombf.

Joint work with David Rossell.

**Date: 1 November 2016, CB 5.8, 13:15**

**Keming Yu (Brunel)**

**Data Adaptive Tail-index Regression**

**Abstract:** The tail index is an important measure of the heavy-tailed behaviour of a distribution, and the problem of estimating it from various types of data has become rather important. Tail-index regression is introduced when covariate information is available. Inference for tail-index regression may face two challenges: small-sample bias in the analysis of small to moderate size data, and problems of storage and computational efficiency when dealing with massive data. In this paper we derive new statistical inference for tail-index regression based on Pareto-type and Burr-XII distributions.
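
As background, tail-index estimation without covariates is often illustrated with the Hill estimator; the sketch below (a standard estimator, not the paper's regression method) recovers the index of simulated Pareto data:

```python
import numpy as np

def hill_estimator(data, k):
    """Hill estimator of the tail index alpha from the k largest
    observations: alpha_hat = k / sum(log(X_(i) / X_(k+1)))."""
    x = np.sort(data)[::-1]                 # descending order statistics
    logs = np.log(x[:k]) - np.log(x[k])     # log-spacings above X_(k+1)
    return 1.0 / logs.mean()

# Classical Pareto(alpha = 2) data: the estimate should be close to 2.
# (numpy's rng.pareto draws Lomax variates; adding 1 gives the classical
# Pareto distribution on [1, infinity).)
rng = np.random.default_rng(5)
alpha = 2.0
data = rng.pareto(alpha, size=100000) + 1.0
print(hill_estimator(data, k=2000))
```

The choice of k trades bias against variance, much like threshold choice in peaks-over-threshold modelling.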

**Date: 15 November 2016, CB 5.8, 13:15**

**Theresa Smith (Bath)**

**Age-period-cohort models for cancer incidence**

**Abstract:** Age-period-cohort models have been used to examine and forecast cancer incidence and mortality for over three decades. However, the fitting and interpretation of these models requires great care because of the well-known identifiability problem: given any two of age, period, and cohort, the third is determined.

In this talk I introduce APC models and the identifiability problem. I examine proposed ‘solutions’ to this problem and approaches based on an identifiable parameterization. I conclude with an analysis of cancer incidence data from Washington State and a discussion of future research directions.
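
The identifiability problem can be seen directly in code: because cohort = period - age, a design matrix containing all three linear terms is rank-deficient. A small numpy demonstration (illustrative, not from the talk):

```python
import numpy as np

# Build a grid of ages and periods; cohort is determined by the other two.
ages = np.arange(20, 80)
periods = np.arange(1980, 2020)
age, period = np.meshgrid(ages, periods)
cohort = period - age          # the exact linear relationship

# Design matrix with intercept and the three linear effects
X = np.column_stack([
    np.ones(age.size),
    age.ravel(),
    period.ravel(),
    cohort.ravel(),
])
# One column is an exact linear combination of the others, so the
# rank is 3 rather than 4: the three linear trends are not separately
# estimable without an identifying constraint.
print(np.linalg.matrix_rank(X))
```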

**Date: 21 November 2016, CB 3.16, 13:15 (Note different time and venue)**

**Adam Johansen (Warwick)**

**The iterated auxiliary particle filter**

**Abstract:** We present an offline, iterated particle filter to facilitate statistical inference in general state space hidden Markov models. Given a model and a sequence of observations, the associated marginal likelihood L is central to likelihood-based inference for unknown statistical parameters. We define a class of “twisted” models: each member is specified by a sequence of positive functions psi and has an associated psi-auxiliary particle filter that provides unbiased estimates of L. We identify a sequence psi* that is optimal in the sense that the psi*-auxiliary particle filter’s estimate of L has zero variance. In practical applications, psi* is unknown so the psi*-auxiliary particle filter cannot straightforwardly be implemented. We use an iterative scheme to approximate psi*, and demonstrate empirically that the resulting iterated auxiliary particle filter significantly outperforms the bootstrap particle filter in challenging settings. Applications include parameter estimation using a particle Markov chain Monte Carlo algorithm and approximation of conditioned diffusion sample paths. [arxiv: 1511.06286]

Joint work with Pieralberto Guarniero and Anthony Lee

**Date: 22 November 2016, CB 5.8, 13:15**

**Chris Jennison (Bath)**

**Search and Jump Algorithm for Markov Chain Monte Carlo Sampling**

**Abstract:** MCMC sampling is now established as a fundamental tool in statistical inference but there are still problems to solve. MCMC samplers can mix slowly when the target distribution has multiple modes. A more insidious problem arises when sampling a distribution that is concentrated on a thin sub-region of a high-dimensional sample space. I shall present a new approach to mode-jumping and show how this can be used to sample from some challenging “thin” distributions.

Joint work with Adriana Ibrahim, University of Malaya.

**Date: 6 December 2016, CB 5.8, 13:15**

**Sam Livingstone (Bristol)**

**Some recent advances in dynamics-based Markov chain Monte Carlo**

**Abstract:** Markov chain Monte Carlo methods based on continuous-time dynamics such as Langevin diffusions and Hamiltonian flow are among the state of the art when performing inference for challenging models in many application areas. I will talk about some statistical models which Markov chains produced by these methods can explore well, and others for which they often struggle to do so. I’ll discuss some existing and new algorithms that use gradient information and/or exploit the geometry of the space through an appropriate Riemannian metric, and how these inputs can both positively and negatively affect exploration, using the notion of geometric ergodicity for Markov chains.
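
As a generic textbook sketch of one of the methods discussed (not the speaker's algorithms), a Metropolis-adjusted Langevin (MALA) step uses gradient information to steer proposals towards high-probability regions:

```python
import numpy as np

def mala_step(x, log_post, grad_log_post, step, rng):
    """One MALA step: propose from the Euler discretisation of the
    Langevin diffusion, then Metropolis-correct with the (asymmetric)
    proposal densities q(x'|x) = N(x + (step/2) grad, step I)."""
    mean_fwd = x + 0.5 * step * grad_log_post(x)
    prop = mean_fwd + np.sqrt(step) * rng.standard_normal(x.shape)
    mean_bwd = prop + 0.5 * step * grad_log_post(prop)
    log_q_fwd = -np.sum((prop - mean_fwd) ** 2) / (2 * step)
    log_q_bwd = -np.sum((x - mean_bwd) ** 2) / (2 * step)
    log_alpha = log_post(prop) - log_post(x) + log_q_bwd - log_q_fwd
    if np.log(rng.uniform()) < log_alpha:
        return prop, True
    return x, False

# Standard Gaussian target: the gradient pulls proposals toward the mode.
log_post = lambda x: -0.5 * np.sum(x**2)
grad_log_post = lambda x: -x
rng = np.random.default_rng(8)
x, n_acc, draws = np.zeros(5), 0, []
for _ in range(10000):
    x, accepted = mala_step(x, log_post, grad_log_post, step=0.5, rng=rng)
    n_acc += accepted
    draws.append(x.copy())
draws = np.array(draws)
print(n_acc / 10000, draws.mean(), draws.var())
```

On a well-conditioned Gaussian target this works nicely; the talk's point is precisely that on awkward geometries such gradient-based chains can lose geometric ergodicity.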

**Date: 13 December 2016, CB 5.8, 13:15**

**Ilaria Prosdocimi (Bath)**

**A statistician’s wander into flood hydrology**

**Abstract:** In the design and maintenance of structures such as dams or drainage networks, it is essential to be able to obtain reliable estimates of the magnitude and frequency of extreme events such as high river flow and rainfall totals. This talk will discuss methods to perform such estimation, focusing on similarities and differences of the different approaches developed by statisticians and civil engineers. Rather than presenting final results, the talk will focus on discussing the open challenges in the statistical methods for flood frequency estimation and will suggest possible future research avenues.

**Date: 19 April 2016, CB 5.1, 14:15**

**Nick Whiteley (Bristol)**

**Variance estimation in the particle filter**

**Abstract:** Particle filters provide sampling based approximations of marginal likelihoods and filtering expectations in hidden Markov models. However, estimating the Monte Carlo variance of these approximations, without generating multiple independent realizations of the approximations themselves, is not straightforward. We present an unbiased estimator of the variance of the marginal likelihood approximation, and consistent estimators of the asymptotic variance of the approximations of the marginal likelihood and filtering expectations. These estimators are byproducts of a single run of a particle filter and have no added computational complexity or storage requirements. With additional storage requirements, one can also consistently estimate higher-order terms in the non-asymptotic variance. This information can be used to approximate the variance-optimal allocation of particle numbers.

Joint work with Anthony Lee, University of Warwick
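
The variance estimators themselves are beyond a short sketch, but the object they assess, the particle filter's marginal likelihood approximation, can be illustrated with a basic bootstrap filter for a linear-Gaussian state space model (all model settings here are illustrative assumptions):

```python
import numpy as np

def bootstrap_pf(y, n_particles, rng, rho=0.9, sigma_x=1.0, sigma_y=1.0):
    """Bootstrap particle filter for X_t = rho X_{t-1} + N(0, sigma_x^2),
    Y_t = X_t + N(0, sigma_y^2); returns the standard unbiased
    log marginal likelihood estimate."""
    n = len(y)
    # Initialise particles from the stationary AR(1) distribution
    x = rng.normal(0.0, sigma_x / np.sqrt(1 - rho**2), n_particles)
    log_like = 0.0
    for t in range(n):
        # Weight by the observation density N(y_t; x_t, sigma_y^2)
        logw = -0.5 * ((y[t] - x) / sigma_y) ** 2 - 0.5 * np.log(2 * np.pi * sigma_y**2)
        m = logw.max()
        w = np.exp(logw - m)
        log_like += m + np.log(w.mean())    # log of the average weight
        # Multinomial resampling, then propagate through the transition
        idx = rng.choice(n_particles, size=n_particles, p=w / w.sum())
        x = rho * x[idx] + sigma_x * rng.standard_normal(n_particles)
    return log_like

# Simulate data from the model itself, then estimate its log-likelihood.
rng = np.random.default_rng(3)
T, rho = 50, 0.9
xs = np.zeros(T)
xs[0] = rng.normal(0.0, 1.0 / np.sqrt(1 - rho**2))
for t in range(1, T):
    xs[t] = rho * xs[t - 1] + rng.standard_normal()
y = xs + rng.standard_normal(T)
ll = bootstrap_pf(y, n_particles=1000, rng=rng)
print(ll)
```

Re-running this with fresh seeds gives slightly different estimates; quantifying that spread from a single run is exactly the problem the talk addresses.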

**Date: 26 April 2016, CB 5.1, 14:15**

**Statistical shape analysis in a Bayesian framework for shapes in two and three dimensions**

**Thomai Tsiftsi (Bath)**

**Abstract:** Shape analysis is an integral part of object classification and has been used as a tool by many branches of science such as computer vision, pattern recognition and shape classification. In this talk I will present a novel shape classification method which is embedded in the Bayesian paradigm and utilises the efficacy of geometrical statistics as well as differential geometry. I will focus on the statistical classification of planar shapes by using techniques which replace some previous approximate results by analytic calculations in a closed form. This gives rise to a new Bayesian shape classification algorithm whose efficiency was tested on available shape databases. Finally, I will conclude by demonstrating the extension of the proposed classification algorithm to shapes in three dimensions.

**Date: 3 May 2016, CB 5.1, 14:15**

**Hilbertian Fourth Order Blind Identification**

**Germain Van Bever (Open University)**

**Abstract:** In the classical Independent Component (IC) model, the observations \(X_1,\cdots,X_n\) are assumed to satisfy \(X_i=\Omega Z_i\), \(i=1,\dots,n\), where the \(Z_i\)’s are i.i.d. random vectors with independent marginals and \(\Omega\) is the mixing matrix. Independent component analysis (ICA) encompasses the set of all methods aiming at unmixing \(X=(X_1,\dots,X_n)\), that is, estimating a (non-unique) unmixing matrix \(\Gamma\) such that \(\Gamma X_i\), \(i=1,\dots,n\), has independent components. Cardoso (1989) introduced the celebrated Fourth Order Blind Identification (FOBI) procedure, in which an estimate of \(\Gamma\) is provided, based on the regular covariance matrix and a scatter matrix based on fourth moments. Building on robustness considerations and generalizing FOBI, Invariant Coordinate Selection (ICS, 2009) was originally introduced as an exploratory tool generating an affine invariant coordinate system. The obtained coordinates, however, are proved to be independent in most IC models.
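
A minimal numpy sketch of the classical finite-dimensional FOBI procedure just described (the sources and mixing matrix are illustrative; this is not the Hilbertian extension of the talk):

```python
import numpy as np

def fobi(X):
    """Fourth Order Blind Identification (Cardoso, 1989) for X_i = Omega Z_i.

    X : (n, p) data matrix. Returns an unmixing matrix Gamma such that
    X @ Gamma.T has approximately independent components, provided the
    independent sources have distinct kurtoses."""
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    # Whiten via the inverse symmetric square root of the covariance
    vals, vecs = np.linalg.eigh(cov)
    whitener = vecs @ np.diag(vals ** -0.5) @ vecs.T
    Z = Xc @ whitener.T
    # Fourth-moment scatter of the whitened data: E[||z||^2 z z^T]
    scatter = (Z * (Z**2).sum(axis=1, keepdims=True)).T @ Z / len(Z)
    _, U = np.linalg.eigh(scatter)
    return U.T @ whitener          # rows are the unmixing directions

# Mix a uniform and a Laplace source (distinct kurtoses), then unmix.
rng = np.random.default_rng(6)
S = np.column_stack([rng.uniform(-1, 1, 20000), rng.laplace(size=20000)])
X = S @ np.array([[2.0, 1.0], [1.0, 1.5]]).T
recovered = X @ fobi(X).T
# Each recovered component should be (up to sign and order) essentially
# proportional to one of the original sources.
corr = np.corrcoef(np.column_stack([recovered, S]), rowvar=False)[:2, 2:]
print(np.abs(corr).round(2))
```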

Nowadays, functional data (FD) are occurring more and more often in practice, and relatively few statistical techniques have been developed to analyze this type of data (see, for example, Ramsay and Silverman 2006). Functional PCA is one such technique, which aims only at dimension reduction, with little theoretical consideration. In this talk, we propose an extension of the FOBI methodology to the case of Hilbertian data, FD being the go-to example used throughout. When dealing with distributions on Hilbert spaces, two major problems arise: (i) the scatter operator is, in general, non-invertible and (ii) there may not exist two different affine equivariant scatter functionals. Projections on finite dimensional subspaces and Karhunen–Loève expansions are used to overcome these issues and provide an alternative to FPCA. More importantly, we show that the proposed construction is Fisher consistent for the independent components of an appropriate Hilbertian IC model.

Affine invariance properties of the resulting FOBI components will be discussed and potential extension to a FICS procedure will be sketched. Simulated and real data are analyzed throughout the presentation to illustrate the properties and the potential benefits of the new tools.

This work is supported by the EPSRC grant EP/L010429/1.

**References**

J.F. Cardoso (1989), Source Separation Using Higher Order Moments. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2109–2112.

D. Tyler, F. Critchley, L. Dümbgen and H. Oja (2009), Invariant Co-ordinate Selection. J. R. Statist. Soc. B, 71, 549–592.

J. Ramsay and B.W. Silverman (2006), Functional Data Analysis, 2nd edn. Springer, New York.

**Date: 13 October 2015, 8W 2.13, 13:15**

**Evangelos Evangelou (Bath)**

**Writing and publishing your own R package: Some techniques and useful tools.**

**Abstract:** Publishing an R package requires quality code but also adherence to CRAN policies. I will present some techniques for automating the process of creating and maintaining an R package, and some good practices from my experience as a package author.

**Date: 20 October 2015, 8W 2.13, 13:15**

**Causal Models and How to Refute Them**

**Robin Evans (Oxford)**

**Abstract:** Directed acyclic graph models (DAG models, also called Bayesian networks) are widely used in the context of causal inference, and they can be manipulated to represent the consequences of intervention in a causal system. However, DAGs cannot fully represent causal models with confounding; other classes of graphs, such as ancestral graphs and ADMGs, have been introduced to deal with this using additional kinds of edge, but we show that these are not sufficiently rich to capture the range of possible models. In fact, no mixed graph over the observed variables is rich enough, regardless of how many edges are used. Instead we introduce mDAGs, a class of hyper-graphs appropriate for representing causal models when some of the variables are unobserved. Results on the Markov equivalence of these models show that when interpreted causally, mDAGs are the minimal class of graphs which can be sensibly used. Understanding such equivalences is critical for the use of automatic causal structure learning methods, a topic in which there is considerable interest. We elucidate the state of the art as well as some open problems.

**Date: 27 October 2015, 8W 2.13, 13:15**

**Jonty Rougier (Bristol)**

**Predicting large explosive eruptions for individual volcanoes**

**Abstract:** Large explosive volcanic eruptions can be devastating, given that many volcanoes capable of such eruptions are close to cities. But the data on which predictions could be based are very limited. Globally, such eruptions happen about once every two years, but the record is rapidly thinned going backwards in time, where the rate of under-recording depends not just on time, but also on location and magnitude. I describe our approach to assessing the under-recording rate, and to making predictions for sets of volcanoes with similar recorded histories, based on an exchangeable model of eruption rates. This is part of our larger project to provide a return period curve for each volcano. This is joint work with volcanologists Profs Steve Sparks and Kathy Cashman.
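
To see why under-recording matters for rate estimation, here is a toy sketch (the recording-probability curve and rates are invented for illustration, not the speaker's model): eruptions occur as a Poisson process and are recorded with a probability that decays going back in time. Dividing recorded counts by elapsed time underestimates the rate, while dividing by the summed recording probabilities, the maximum likelihood estimate for a thinned Poisson process, recovers it.

```python
import numpy as np

rng = np.random.default_rng(1)
rate = 0.5                          # true eruptions per year (hypothetical)
years = np.arange(-2000, 0)         # 2000 years before present
# Hypothetical recording probability: rises from near 0 to ~1 at present
p_record = 1.0 / (1.0 + np.exp(-(years + 300) / 100.0))

n_events = rng.poisson(rate, size=years.size)     # true eruption counts
n_recorded = rng.binomial(n_events, p_record)     # thinned (recorded) counts

naive = n_recorded.sum() / years.size             # biased downward
corrected = n_recorded.sum() / p_record.sum()     # thinned-Poisson MLE
```

The correction works because each year's recorded count is Poisson with mean rate × p(t), so the total recorded count divided by the summed probabilities is unbiased for the rate when p(t) is known; in practice p(t) must itself be estimated, which is part of the modelling challenge described above.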

**Date: 10 November 2015, 8W 2.13, 13:15**

**Haakon Bakka (NTNU)**

**A spatial random effect with one range for each region (the Difficult Terrain model component)**

**Abstract:** Classical models in spatial statistics assume that the correlation between two points depends only on the distance between them (i.e. the models are stationary). In practice, however, the shortest distance may not be appropriate. Real life is not stationary! For example, when modelling fish near the shore, correlation should not take the shortest path going across land, but should travel along the shoreline. In ecology, animal movement depends on the terrain or the existence of animal corridors. We will show how this kind of information can be included in a spatial non-stationary model, by defining a different spatial range (distance) in each region.

We will answer the following questions:

- How to make a model with one range in each region?
- Is the algorithm fast enough for real data? (Hint: Yes!)
- How to avoid overfitting with flexible random effects?
- How to interpret the inference when you have flexible random effects?
- How do we model a point process with different cluster sizes in different regions, without changing the average number of points?
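
One way to picture "one range in each region" is to measure distance in range units: accumulate 1/range(s) along the path between two points, so that correlation decays quickly where the range is short. The 1-D sketch below is an illustrative assumption (the regions, range values and exponential correlogram are invented), not the SPDE-based construction used in the talk.

```python
import numpy as np

# Hypothetical 1-D domain split at s = 1 into two regions
ranges = {0: 2.0, 1: 0.2}           # region id -> spatial range

def region(s):
    return 0 if s < 1.0 else 1

def scaled_distance(x, y, n_steps=400):
    """Distance in 'range units': integrate 1/range(s) along the path."""
    s = np.linspace(min(x, y), max(x, y), n_steps)
    inv_range = np.array([1.0 / ranges[region(si)] for si in s])
    return inv_range.mean() * (max(x, y) - min(x, y))

def correlation(x, y):
    return np.exp(-scaled_distance(x, y))   # exponential correlogram

# Same physical separation (0.5), very different correlations:
c_long = correlation(0.2, 0.7)    # inside the long-range region
c_short = correlation(1.2, 1.7)   # inside the short-range region
```

With these numbers the long-range pair stays highly correlated while the short-range pair is nearly independent, which is the qualitative behaviour a region-wise range is designed to produce.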

**Date: 17 November 2015, 8W 2.13, 13:15**

**Daniel Williamson (Exeter)**

**Earth system models and probabilistic Bayesian calibration: a screw meets a hammer?**

**Abstract:** The design and analysis of computer experiments, now called “Uncertainty Quantification” or “UQ”, has been an active area of statistical research for 25 years. One of the most high-profile methodologies, that of calibrating a complex computer code using the Bayesian solution to the inverse problem as described by Kennedy and O'Hagan's seminal paper in 2001, has become something of a default approach to tackling applications in UQ and has over 1200 citations. However, is this always wise? Though the method is well tested, and arguably appropriate for many types of model, particularly those for which large amounts of data are readily available and in which the limitations of the underlying mathematical expressions and solvers are well understood, many models, such as those found in climate simulation, go far beyond those successfully studied in terms of non-linearity, run time, output size and complexity of the underlying mathematics.

Have we really solved the calibration problem? To what extent is our “off the shelf” approach appropriate for the problems faced in fields such as Earth system modelling? In this talk we will discuss some of the known limitations of the Bayesian calibration framework (and some perhaps unknown), and we will explore the extent to which the conditions under which calibration is known to fail are met in climate model problems. We will then present and argue for an alternative approach to the problem and apply it to an ocean GCM known as NEMO.
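
A toy illustration of why model discrepancy matters in the Kennedy-O'Hagan framework: in the sketch below (entirely hypothetical; the simulator, data and noise level are invented) field data are generated from a one-parameter simulator plus a smooth discrepancy, and a grid-based posterior that ignores the discrepancy concentrates away from the true parameter value.

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(0, 1, 30)
theta_true = 2.0
sigma = 0.05
# Field data: the simulator is misspecified by a smooth discrepancy term
z = theta_true * x + 0.3 * np.sin(3 * x) + rng.normal(0, sigma, x.size)

def simulator(x, theta):
    """Toy stand-in for an expensive computer code."""
    return theta * x

# Grid-based posterior for theta under a "no discrepancy" Gaussian likelihood
thetas = np.linspace(1.0, 3.0, 401)
loglik = np.array([-0.5 * np.sum((z - simulator(x, t)) ** 2) / sigma ** 2
                   for t in thetas])
post = np.exp(loglik - loglik.max())
post /= post.sum()
theta_hat = thetas[np.argmax(post)]   # concentrates away from theta_true
```

The posterior is sharply peaked, but at a biased value: without a discrepancy term, the calibration parameter is forced to absorb structural model error, one of the failure modes the talk discusses.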

**Date: 24 November 2015, 8W 2.13, 14:15** (Note the kick-off time of 2:15)

**Georg Lindgren (Lund)**

**Stochastic models for ocean waves - Gaussian fields made more realistic by Rice formula and some physics**

**Abstract:** Gaussian fields were introduced in the early fifties as models for irregular ocean waves and they have been used in ocean engineering ever since. A simple modification leads to the more realistic stochastic Lagrange models, which account for horizontal and vertical movements of individual water particles, leading to realistic asymmetry of the generated waves.

Rice's formula for the expected number of level crossings, and its modern generalization to “level sets”, makes it possible to derive exact statistical distributions for many important wave characteristics, like steepness and asymmetry. In the talk I will describe the stochastic Lagrange model and some of its statistical properties.
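
Rice's formula itself is easy to state and check numerically: for a stationary Gaussian process with spectral moments \(\lambda_0=\mathrm{Var}\,X(t)\) and \(\lambda_2=\mathrm{Var}\,X'(t)\), the expected number of upcrossings of level \(u\) per unit time is \(\frac{1}{2\pi}\sqrt{\lambda_2/\lambda_0}\,e^{-u^2/2\lambda_0}\). The sketch below, with an invented toy spectrum, compares this to a Monte Carlo count using the random-phase representation of a Gaussian sea surface.

```python
import numpy as np

rng = np.random.default_rng(3)
# Toy discrete wave spectrum: amplitudes a_j at angular frequencies w_j
w = np.linspace(0.5, 2.0, 20)
a = np.full(w.size, 0.1)
lam0 = np.sum(a ** 2) / 2            # spectral moment 0: Var X(t)
lam2 = np.sum(w ** 2 * a ** 2) / 2   # spectral moment 2: Var X'(t)

def rice_rate(u):
    """Rice's formula: expected upcrossings of level u per unit time."""
    return np.sqrt(lam2 / lam0) / (2 * np.pi) * np.exp(-u ** 2 / (2 * lam0))

# Monte Carlo check via the random-phase Gaussian wave model
T, dt, u, n_rep = 100.0, 0.05, 0.1, 200
t = np.arange(0, T, dt)
count = 0
for _ in range(n_rep):
    phases = rng.uniform(0, 2 * np.pi, w.size)
    X = np.cos(t[:, None] * w + phases) @ a
    count += np.sum((X[:-1] < u) & (X[1:] >= u))
empirical = count / (n_rep * T)      # should be close to rice_rate(u)
```

Averaging over independent phase draws matters here: a single realization of a finite cosine sum is nearly periodic, so the crossing count per unit time only approaches the Rice rate in expectation.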

**Date: 9 December 2015, CB 4.1, 13:15** (Note the different day (Wednesday) and venue)

**John Copas (Warwick)**

**Model choice and goodness of fit**

**Abstract:** How do we know whether a statistical model is sensible? The usual answer is to check that the model gives a reasonable fit to the data. The seminar will look at the variation between inferences based on different models, and show that this can be extremely large, even amongst models which seem to fit the data equally well. What does this tell us about most applications of statistics which completely ignore the problem of model uncertainty? What does it tell us about formal methods of model selection and model averaging which all, directly or indirectly, depend on model fit?