"Fast Bayesian Factor Analysis via Automatic Rotations to Sparsity" (Next Week at the Statistics Seminar)
Attention conservation notice: Only of interest if you (1) care about factor analysis and Bayesian nonparametrics, and (2) will be in Pittsburgh on Monday.
Constant readers, knowing of my love-hate relationship with both
factor analysis and Bayesian methods, will appreciate that the
only way I could possibly be more ambivalent about our next seminar would be
if it also involved power-law distributions.
- Veronika Ročková, "Fast Bayesian Factor Analysis via Automatic Rotations to Sparsity" [preprint, preprint supplement]
- Abstract: Rotational post-hoc transformations have traditionally
played a key role in enhancing the interpretability of factor analysis.
Regularization methods also serve to achieve this goal by prioritizing sparse
loading matrices. In this work, we bridge these two paradigms with a unifying
Bayesian framework. Our approach deploys intermediate factor
rotations throughout the learning process, greatly enhancing the effectiveness
of sparsity inducing priors. These automatic rotations to sparsity are
embedded within a PXL-EM algorithm, a Bayesian variant of parameter-expanded EM
for posterior mode detection. By iterating between soft-thresholding of small
factor loadings and transformations of the factor basis, we obtain (a) dramatic
accelerations, (b) robustness against poor initializations and (c) better
oriented sparse solutions. To avoid the pre-specification of the factor
cardinality, we extend the loading matrix to have infinitely many columns with
the Indian Buffet Process (IBP) prior. The factor dimensionality is learned from the
posterior, which is shown to concentrate on sparse matrices. Our deployment of
PXL-EM performs a dynamic posterior exploration, outputting a solution path
indexed by a sequence of spike-and-slab priors. For accurate recovery of the
factor loadings, we deploy the Spike-and-Slab LASSO prior, a two-component
refinement of the Laplace prior (Ročková 2015). A companion criterion,
motivated as an integral lower bound, is
provided to effectively select the best recovery. The potential of the
proposed procedure is demonstrated on both simulated and real high-dimensional
gene expression data, which would render posterior simulation impractical.
- Time and place: 4 pm on Monday, 8 February 2016, in 125 Scaife Hall
As always, the talk is free and open to the public.
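For readers who have not met the "soft-thresholding of small factor loadings" the abstract mentions, here is a minimal sketch of the generic soft-thresholding operator (the proximal map of an L1 penalty) applied to a toy loading matrix. This is not the PXL-EM algorithm from the talk: the function name `soft_threshold`, the threshold `lam`, and the numbers in `loadings` are all made up for illustration, and the paper's actual thresholding rule under the Spike-and-Slab LASSO prior may well differ.

```python
import numpy as np

def soft_threshold(x, lam):
    """Shrink every entry of x toward zero by lam.

    Entries with absolute value at most lam are set to zero, which is
    what produces sparsity. (Hypothetical helper, not from the paper.)
    """
    x = np.asarray(x, dtype=float)
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

# Toy 3-variable, 2-factor loading matrix (illustrative values only).
loadings = np.array([[0.9, -0.05],
                     [0.1, 1.2],
                     [-0.6, 0.02]])

# Small loadings are zeroed out; large ones are shrunk by lam.
sparse_loadings = soft_threshold(loadings, lam=0.15)
print(sparse_loadings)
```

The point is only that shrinking by a fixed amount sets small loadings exactly to zero, so the result depends on which basis the factors are expressed in; that dependence is presumably why interleaving the thresholding with rotations, as the abstract describes, can matter.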
Enigmas of Chance;
Bayes, anti-Bayes
Posted at February 03, 2016 22:32