
## The Thomson Ability-Sampling Model

16 Apr 2015 17:44

An alternative to the factor model in psychometrics (and potentially other applications of factor analysis). I have written about it at great length here, and in my data analysis notes, which I hereby incorporate by reference. I'll just use this to record some ideas for possible work; if anyone wants to take them up before I get around to them, drop me a line, or at least put me in the acknowledgments!

Thomson vs. Erdős-Rényi. Thomson's original model sampled "bonds" or "abilities" (i.e., latent variables) without replacement. It's much easier to analyze, however, if you use simple Bernoulli sampling, and naturally the two come to much the same thing in the large-size limit. This reminds me of the two versions of the Erdős-Rényi random graph model, where you fix either the number of edges (so sampling without replacement) or the probability of an edge (Bernoulli sampling). Is there something to this connection: say, the appearance of a single general factor corresponding to the emergence of a giant component?
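To make the "much the same thing in the large-size limit" claim concrete, here is a minimal simulation sketch. Everything specific in it is an assumption chosen for illustration, not part of Thomson's own setup: each test score is taken to be the plain sum of the abilities it samples, abilities are uncorrelated with unit variance, there is no noise, and the names `sample_bonds` and `implied_correlations` and the values of `p`, `q`, and `rate` are hypothetical.

```python
import numpy as np

def sample_bonds(p, q, rate, rng, fixed_size=False):
    """Return a p x q 0/1 matrix; row i marks the abilities that test i draws on."""
    if fixed_size:
        # Thomson-style: each test gets exactly k abilities, drawn without replacement.
        k = int(round(rate * q))
        bonds = np.zeros((p, q))
        for i in range(p):
            bonds[i, rng.choice(q, size=k, replace=False)] = 1.0
        return bonds
    # Bernoulli: each ability enters each test independently with probability `rate`.
    return (rng.random((p, q)) < rate).astype(float)

def implied_correlations(bonds):
    """Test-score correlations, assuming uncorrelated unit-variance abilities and no noise.

    Assumes every test samples at least one ability (otherwise its variance is zero).
    """
    cov = bonds @ bonds.T  # entry (i, j) counts abilities shared by tests i and j
    sd = np.sqrt(np.diag(cov))
    return cov / np.outer(sd, sd)

rng = np.random.default_rng(0)
p, q, rate = 6, 5000, 0.3  # illustrative sizes only
off_diag = ~np.eye(p, dtype=bool)
for fixed_size in (True, False):
    corr = implied_correlations(sample_bonds(p, q, rate, rng, fixed_size))
    label = "without replacement" if fixed_size else "Bernoulli"
    print(f"{label}: off-diagonal correlations in "
          f"[{corr[off_diag].min():.3f}, {corr[off_diag].max():.3f}]")
```

Under these assumptions the correlation between two tests is the number of shared abilities divided by the geometric mean of the two sample sizes, so with many abilities both sampling schemes should give off-diagonal correlations close to `rate`. A correlation matrix with (nearly) equal off-diagonal entries is also what a one-factor model with equal loadings produces, which is the sense in which a general factor "appears".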

Geometry vs. covariance. Thomson's model produces the same patterns of correlations as factor models (more exactly, it can be made to come arbitrarily close with arbitrarily high probability). This naturally raises the question of how one might distinguish between the two simply from the data, as opposed to actual scientific knowledge of causal mechanisms. Correlations, clearly, won't do the job. But: if we have $p$ observables and $q < p$ factors, then the expected values of the observables must always lie on a $q$-dimensional linear subspace of the full $p$-dimensional space. Unless I am missing something, however, if I have $q > p$ abilities in the Thomson model, there is no geometric constraint on the expected values of observable vectors. (Maybe there's something subtle I'm missing from the sampling process?) Might this provide a test? In both models our data equal expected vectors plus noise, so the factor model doesn't predict that observations will fall exactly on that subspace, but perhaps something could be done with this.
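The geometric contrast can be checked numerically in the noise-free case. The sketch below is only an illustration under stated assumptions: "expected values" are read as the per-individual expectations given the latent variables, latent variables are standard Gaussians, Thomson loadings are 0/1 with Bernoulli sampling, and the helper `expected_vectors` and all sizes are hypothetical choices.

```python
import numpy as np

def expected_vectors(loadings, rng, n):
    """Noise-free expected observables, one row per individual.

    `loadings` is p x q; each individual gets q standard-Gaussian latent values
    (an assumption for illustration), mapped linearly into p observables.
    """
    latents = rng.standard_normal((n, loadings.shape[1]))
    return latents @ loadings.T

rng = np.random.default_rng(0)
n, p = 500, 8                   # individuals and observables (illustrative)
q_factors, q_abilities = 2, 200  # q < p for the factor model, q > p for Thomson

# Factor model: arbitrary real-valued loadings on a few factors.
factor_loadings = rng.standard_normal((p, q_factors))
# Thomson model: each observable samples each ability with probability 0.3.
thomson_loadings = (rng.random((p, q_abilities)) < 0.3).astype(float)

rank_factor = np.linalg.matrix_rank(expected_vectors(factor_loadings, rng, n))
rank_thomson = np.linalg.matrix_rank(expected_vectors(thomson_loadings, rng, n))
print("dimension spanned by expected vectors, factor model:", rank_factor)
print("dimension spanned by expected vectors, Thomson model:", rank_thomson)
```

In this setup the factor-model expected vectors span only `q_factors` dimensions, while the Thomson ones fill all `p`. With noise added, both data matrices would have full rank, so an actual test would have to look at something like how quickly the singular values fall off rather than at exact rank; the sketch does not attempt that.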