Modern statistical applications often involve extremely high-dimensional data, where the number of parameters is comparable to or larger than the number of samples. Because high dimensionality poses great challenges to statistical inference, a rich literature has explored the ``intrinsic dimension'' of the parameters to ease this burden. By imposing structure on the parameters, such as sparsity, clusters, low rank, or tree structure, one can substantially reduce the effective dimensionality. However, this process inevitably introduces discrete variables, making it difficult to build probabilistic models within Bayesian frameworks.
This work develops Bayesian methods, together with efficient computational tools, that reduce the modeling burden in problems of unknown dimension. The main contribution is to incorporate optimization methods for quantifying uncertainty in varying-dimensional problems: although the intrinsic dimension of the parameter of interest is often unknown or varying, we can transform a parameter of fixed dimension into one of varying dimension via the proximal mapping. This leads to a large class of new Bayesian models that directly exploit popular frequentist regularizers and their algorithms, such as the nuclear norm penalty and the alternating direction method of multipliers, while providing principled, probabilistic uncertainty estimation. A special case of the proximal mapping, the $\ell_1$-ball projection, is studied in detail; through it one can define a flexible sparse prior that places positive probability at exact zeros. A nearly minimax-optimal posterior contraction theory for sparse linear regression is established. Several data applications, including image segmentation and traffic network analysis, are presented and discussed.
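To illustrate how the $\ell_1$-ball projection produces exact zeros, the following is a minimal sketch (not the authors' implementation) of the standard Euclidean projection onto an $\ell_1$-ball; the function name, the input vector, and the radius are illustrative choices.

```python
import numpy as np

def l1_ball_projection(v, r):
    """Euclidean projection of v onto {x : ||x||_1 <= r}.

    Components whose magnitude falls below the soft-threshold level
    become exactly zero, which is how a prior pushed through this
    map can place positive probability at exact zeros.
    """
    if np.sum(np.abs(v)) <= r:
        return v.copy()                       # already inside the ball
    u = np.sort(np.abs(v))[::-1]              # magnitudes, descending
    css = np.cumsum(u)
    j = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - (css - r) / j > 0)[0][-1]
    theta = (css[rho] - r) / (rho + 1)        # soft-threshold level
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

# A dense 4-vector maps to a vector with two exact zeros:
beta = l1_ball_projection(np.array([0.9, -0.1, 0.05, 2.0]), r=2.0)
# beta == [0.45, 0.0, 0.0, 1.55], with ||beta||_1 = 2.0
```

The projection is a soft-thresholding step whose level is chosen so the result lands exactly on the ball's boundary; small coordinates are thresholded to zero, giving the sparsity described above.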