Wednesday 11 September 2024, Shuangzhe Liu, University of Canberra, Australia
Title: On heavy-tailed matrix variate regression models and statistical diagnostics
Abstract: Matrix variate distributions and matrix regression models are powerful tools for analysing multivariate data with inherent matrix structure. These methods extend traditional univariate and multivariate techniques to handle more complex data structures, such as those found in genomics, neuroscience, and image analysis. In this talk, we introduce a framework for regression models under heavy-tailed matrix variate distributions. We begin by discussing several matrix variate distributions and then explore the general linear model under the matrix variate normal distribution, along with its relevant variations based on heavy-tailed distributions. Additionally, we cover important sensitivity analysis and statistical diagnostics for these models, highlighting potential future research problems and applications. Our goal is to provide insights into the theoretical and practical aspects of heavy-tailed matrix variate regression models, paving the way for further advancements in this area.
Tuesday 2 April 2024, Magnus Herberthson, Matematiska institutionen, Linköpings universitet
Title: The central limit theorem for random walks on bounded domains
Abstract: The standard central limit theorem typically assumes independent identically distributed (i.i.d.) random variables X_i (with E[X_i]=0 and finite variance V[X_i]). A scaled average of N terms will then converge to a normal distribution as N → ∞, and the variance of this normal distribution is directly related to V[X_i].
We consider a random walk in a bounded domain Omega (for simplicity, in the plane), which implies that the random walk variables X_i are no longer independent. A central limit theorem still ensures a limiting distribution for the rescaled averages as the number of terms increases, but now the covariance matrix of the limiting distribution is not so obvious. We will discuss how this covariance matrix is given by the geometry of the bounding domain Omega.
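For reference, the standard result mentioned at the start of this abstract can be written as follows. This is only a sketch for scalar i.i.d. variables; the symbol $\sigma^2$ is introduced here for illustration and does not appear in the abstract.

$$\frac{1}{\sqrt{N}}\sum_{i=1}^{N} X_i \;\xrightarrow{d}\; \mathcal{N}(0,\sigma^2), \qquad \sigma^2 = V[X_i], \quad N\to\infty.$$

In the bounded-domain setting of the talk the X_i are dependent, so the limiting covariance cannot in general be read off from the single-step variance in this way.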
Tuesday 13 June 2023, Nestor Parolya, Delft University of Technology
Title: Logarithmic law of large random correlation matrices
Abstract: Consider a random vector $\mathbf{y}=\bm{\Sigma}^{1/2}\mathbf{x}$, where the $p$ elements of the vector $\mathbf{x}$ are i.i.d. real-valued random variables with zero mean and finite fourth moment, and $\bm{\Sigma}^{1/2}$ is a deterministic $p\times p$ matrix such that the eigenvalues of the population correlation matrix $\mathbf{R}$ of $\mathbf{y}$ are uniformly bounded away from zero and infinity. In this paper, we find that the log determinant of the sample correlation matrix $\hat{\mathbf{R}}$ based on a sample of size $n$ from the distribution of $\mathbf{y}$ satisfies a CLT (central limit theorem) for $p/n\to \gamma\in (0, 1]$ and $p\leq n$. Explicit formulas for the asymptotic mean and variance are provided.
If the mean of $\mathbf{y}$ is unknown, we show that after re-centering by the empirical mean the obtained CLT holds with a shift in the asymptotic mean. This result is of independent interest both in large-dimensional random matrix theory and in the high-dimensional statistical literature on large sample correlation matrices for non-normal data. Finally, the obtained findings are applied to testing the uncorrelatedness of $p$ random variables. Surprisingly, in the null case $\mathbf{R} =\mathbf{I}$, the test statistic becomes distribution-free, and we show analytically that the obtained CLT also holds if the moments of order four do not exist at all, which suggests a promising and robust test statistic for heavy-tailed high-dimensional data.
Tuesday 7 February 2023, Annika Lang, Chalmers University of Technology
Title: Simulation of random fields on Riemannian manifolds
Abstract: Random fields are important building blocks in spatial models disturbed by randomness such as solutions to stochastic partial differential equations. The fast simulation of random fields is therefore crucial for efficient algorithms in uncertainty quantification. In this talk I present numerical methods for Gaussian random fields on Riemannian manifolds and discuss their convergence. Simulations illustrate the theoretical findings. This talk is based on joint work with Erik Jansson, Mihály Kovács, and Mike Pereira.
Location: Online via Zoom.
Wednesday 7 December 2022, Daniel Klein, Pavol Jozef Šafárik University, Slovakia (joint work with Ivan Žežula)
Title: Multiple testing of mean values in multivariate data with BCS variance structure
Abstract: A common problem in multivariate data is that the number of unknown parameters in a model can be close to or even higher than the sample size. This may cause problems in statistical inference. Therefore, many authors study models with a special variance structure, where the number of covariance parameters is reduced by restrictions on the parameter space. One possible structure is the so-called block compound symmetry (BCS) structure. In this talk we study estimation and testing of such a structure, as well as tests of mean values assuming a BCS structure.
Tuesday 11 October 2022, Solomon W. Harrar, University of Kentucky, USA
Title: Nonparametric Finite Mixture: Applications in Contaminated Trials
Abstract: Investigating the differential effect of treatments in groups defined by patient characteristics is of paramount importance in personalized medicine. Group membership is typically determined by diagnostic devices or biomarkers, but such tools are not perfectly accurate. The impact of diagnostic misclassification or contamination on statistical inference has received little attention in the literature. This work addresses the problem in a fully nonparametric setting. A nonparametric finite mixture is proposed for estimating and testing meaningful yet nonparametric treatment effects. Consistent estimators and asymptotic distributions are provided for the misclassification error rates as well as treatment effects. Numerical examples show significant advantages of the proposed method in terms of bias reduction, coverage probability and power. The application of the proposed method is illustrated with data from asthma and sleep deprivation studies.
Tuesday 4 February 2020, Raazesh Sainudiin, Uppsala universitet
Title: Minimum distance histograms with universal performance guarantees
Abstract: We present a data-adaptive multivariate histogram estimator of an unknown density f based on n independent samples from it. Such histograms are based on binary trees called regular pavings (RPs). RPs represent a computationally convenient class of simple functions that remain closed under addition and scalar multiplication. Unlike other density estimation methods, including various regularization and Bayesian methods based on the likelihood, the minimum distance estimate (MDE) is guaranteed to be within an L1 distance bound from f for a given n, no matter what the underlying f happens to be, and is thus said to have universal performance guarantees (Devroye and Lugosi, Combinatorial methods in density estimation. Springer, New York, 2001). Using a form of tree matrix arithmetic with RPs, we obtain the first generic constructions of an MDE, prove that it has universal performance guarantees and demonstrate its performance with simulated and real-world data. Our main contribution is a constructive implementation of an MDE histogram that can handle large multivariate data bursts using a tree-based partition that is computationally conducive to subsequent statistical operations.
Tuesday 15 October 2019, Kristoffer Lindensjö, Uppsala universitet
Title: Moment constrained optimal dividends: precommitment & consistent planning
Abstract: A moment constraint that limits the number of dividends in the optimal dividend problem is suggested. This leads to a new type of time-inconsistent stochastic impulse control problem. First, the optimal solution in the precommitment sense is derived. Second, the problem is formulated as an intrapersonal sequential dynamic game in line with Strotz' consistent planning. In particular, the notions of pure dividend strategies and a (strong) subgame perfect Nash equilibrium are adapted. An equilibrium is derived using a smooth fit condition. The equilibrium is shown to be strong. The uncontrolled state process is a fairly general diffusion.
Tuesday 17 September 2019, Jimmy Olsson, KTH
Title: Bayesian learning of weakly structural Markov graph laws using sequential Monte Carlo methods
Abstract: We shall discuss a sequential Monte Carlo-based approach to approximation of weakly structural Markov graph laws on spaces of decomposable graphs, or, more generally, spaces of junction (clique) trees associated with such graphs. In particular, we apply a particle Gibbs version of the algorithm to Bayesian structure learning in decomposable graphical models, where the target distribution is a junction tree posterior distribution. Moreover, we use the proposed algorithm for exploring certain fundamental combinatorial properties of decomposable graphs, e.g. clique size distributions. Our approach requires the design of a family of proposal kernels, so-called junction tree expanders, which expand junction trees by randomly connecting new nodes to the underlying graphs. The performance of the estimators is illustrated through a collection of numerical examples demonstrating the feasibility of the suggested approach in high-dimensional domains.
Tuesday 7 May 2019, Chun-Biu Li, Stockholms universitet
Title: Statistical Learning as a Compression Problem from the Information Theory Perspective
Abstract: Although it was introduced in the context of communication theory, modern information theory provides us with a nonparametric probabilistic framework for statistical learning free from a priori assumptions on the underlying statistical model. In this talk, I will discuss some of the information-theory-based methods for unsupervised and supervised learning. In particular, the soft (fuzzy) clustering problem in unsupervised learning can be viewed as a tradeoff between data compression and minimizing the distortion of the data. Similarly, modeling in supervised learning can be treated as a tradeoff between compression of the predictor variables and retaining the relevant information about the response variable. To illustrate the usage of these methods, some applications in biophysical problems and time series analysis will be addressed in the talk.
Tuesday 5 March 2019, Peter Olofsson, Högskolan i Jönköping
Title: Muller's Ratchet in Populations Doomed to Extinction
Abstract: Muller's ratchet is the process by which asexual populations accumulate deleterious mutations in an irreversible manner. Most mathematical models have been of the Wright-Fisher type with fixed population size and relative fitness. In contrast, we use a branching process model with absolute fitness, leading to unavoidable extinction. Individuals are divided into classes depending on how many mutations they have accumulated, and we give results for the rate of the ratchet and the size of the fittest class.