Attention conservation notice: I have no taste, and no qualifications to opine on the fountainheads of the western philosophical tradition, the history of 17th century science, political philosophy, cognitive psychology, the transmission of inequality, or even social-scientific measurement.
Then according to what we are saying now, Theaetetus, it seems that if we take expertise in appropriation, in hunting, in animal-hunting, in land-animal-hunting, in the hunting of humans, by persuasion, in private, involving selling for hard cash, offering a seeming education, the part of it that hunts rich and reputable young men is --- to go by what we are saying now --- what we should call the expertise of the sophist.
while another (268) is
The expert in imitation, then, belonging to the contradiction-producing half of the dissembling part of belief-based expertise, the word-conjuring part of the apparition-making kind from image-making, a human sort of production marked off from its divine counterpart --- if someone says that the one who is 'of this family kind, of this blood' is the real sophist, it seems his account will be the truest.
Scientifiction and Fantastica; Enigmas of Chance; Writing for Antiquity; Philosophy; Commit a Social Science; The Dismal Science; The Beloved Republic; Teaching: Statistics of Inequality and Discrimination; The Progressive Forces; Minds, Brains, and Neurons; The Collective Use and Evolution of Concepts; The Continuing Crises; The Great Transformation
Posted at December 31, 2021 23:59 | permanent link
Attention conservation notice: I have no taste, and no qualifications to opine about how to conduct either social science, or the German Social Democratic Party at the end of the 19th century.
Books to Read While the Algae Grow in Your Fur;
Pleasures of Detection, Portraits of Crime;
Commit a Social Science;
Constant Conjunction Necessary Connexion;
The Progressive Forces
Posted at November 30, 2021 23:59 | permanent link
Attention conservation notice: An academic job ad.
We are looking to hire this year, both on the teaching track and the tenure track. It's a great department and you should apply if you're at all interested in professing statistics, even or indeed especially if your background isn't traditional stats. (I say this despite the fact that every application we get now means more work for me later.) If any reader has questions I might be able to answer, please don't hesitate to get in touch.
Posted at November 23, 2021 11:00 | permanent link
Attention conservation notice: 1400 words on the development economics of space colonization from someone who is neither an economist nor even a rocket scientist. Yet another semi-crank notion, quietly nursed for many years, drafted in this form in 2011, posted a decade later because of Very Important Reasons I am not at liberty to reveal at this time (and not at all because I can't stand to do any more grading and want to procrastinate).
So, what with the end of space shuttle flights and all, my feed-reader has been filled with people bemoaning the state of human space flight. While I share the sheer romantic longing for it (expressed with greater or lesser sophistication), if we want to consider other rationales for sending people into space, it's hard to come up with anything which can't be done better by robots. The only one I can think of is providing, as it were, a distributed back-up system for humanity --- places which could carry on the species should the Earth become uninhabitable. If this is the point, it imposes some constraints which are not, I think, sufficiently appreciated.
Colonies which could help in this way have to be at least potentially self-sufficient, without dependence on the Earth --- no spare parts, no processed intermediate inputs, nothing. Since there are no natural environments off Earth in which people can live, they will have to create artificial environments, which means that extra-terrestrial human societies must be industrial civilizations. Self-sufficiency means creating, in miniature, a whole industrial ecology.
Go read Brian Hayes's Infrastructure if you haven't already; I'll wait. We're talking about replicating all of those functions, and more. Now, remember that all the technologies whose complexities Hayes documents so lovingly have been developed to assume, and to make use of: gravity of 9.8 m s^-2, ambient temperatures between ~230 and ~320 K, an unlimited supply of atmosphere which is about 20% oxygen at a pressure of about 10^5 N m^-2, abundant and cheap liquid water, etc. Moreover, our technologies assume that their environment is big, so they can dump waste products, starting with heat and mechanical vibrations, into the environment. Simply sticking terrestrial machinery inside a small, fragile, carefully-controlled artificial environment is not going to work well. (You want to try running a smelter inside your space habitat?) So duplicating these capacities for a space colony will mean re-designing everything to fit local conditions profoundly different from anything we've faced before.
This will take a lot of design work and trial-and-error, hence it will be expensive: the workers and designers could have been doing other things, the gear and machine parts and material resources could have been put to other uses. How are these development costs to be recovered? The extra-terrestrial market, we will have to assume, will begin and long remain very much smaller than Earth's, so sharing those fixed development costs over a small population implies high average costs. (Colonies in different parts of the solar system will face different local conditions, and need to develop largely different technologies, so we can treat this colony by colony.) What about expanding the market by exporting? Suppose, momentarily, a complete subsidy for the fixed costs, and so think in terms of marginal-cost pricing. For exportable items, their cost at Earth will equal the marginal cost of production in space plus the marginal cost of interplanetary transport. Unless making comparable items on Earth is (almost literally) astronomically more expensive, there will be no export market for the colonies. And this assumes, again, that investors were willing to write off all development costs.
(At this point, readers may be tempted to invoke comparative advantage, and say that even if Space is less efficient at producing everything than Earth is, both Space and Earth will be better off if Space makes what it is relatively better at. Carefully examined, however, what the classic Ricardian argument proves is that there is an opportunity cost to not using the less-efficient country's factors of production, viz., the stuff which it could have, inefficiently, produced. To minimize the opportunity cost of letting those factors go idle, they should be employed in their least-inefficient use. So even if making widgets costs 1000 times as much in Space as on Earth, if widgets are the least-inefficient of Space's factors of production, it should make widgets, and trade them for other things. But this presumes that Space and its factors would exist without the trade. Since, for us, the whole question is whether there should be any workers, capital, etc., in Space, this line of argument just doesn't apply.)
Unless people come up with something valuable which can be made in space but cannot, or almost cannot, be made on Earth, it's hard to think of any manufactured goods which it would be sensible to export from space. What might make sense would be for space colonies to find comparatively cheap natural resources, requiring minimal on-site processing, and export them to Earth, in exchange for, well, everything else. Ideally the exports from the colonies would also be very stable physically and chemically, so they could be sent by slow, low-energy, automated (and therefore cheap) orbits to Earth. When you figure out what those resources are, especially ones that Earth doesn't already have in abundance, let the worlds know; please don't say "helium 3". Alternatively, one thing which can be produced on (say) Titan vastly more cheaply than on Earth is the experience of being on Titan: encapsulated in the form of science or entertainment, that experience could be shipped very cheaply to Earth, which might be willing to pay for it. Of course, neither an economy based on resource-extraction nor one based on scientific papers and reality TV would be self-sufficient. The logic of endogenous comparative advantage would, in fact, lock in place the mother of all core-periphery divisions, with the space colonies as the eternally dependent periphery.
A colony could, I suppose, decide to impose on itself the costs of developing its own industrial infrastructure, so as to replace imports from Earth. Those costs, to repeat, would be very high. Moreover, there's really no substitute for experience and experiment in improving technologies, so the initial quality and reliability will be low. Since, again, the local market will be small, it will not be able to support many producers, perhaps just one in each sector. There will be little scope for a diversity of local approaches to the problems of the industry, slowing innovation. There will also be little or no competition, with all that entails.
The picture of space colonies which might actually become self-sufficient, then, looks something like this. The population is forced by its leaders to endure endless privations to build monopolistic industries which produce inferior goods to those already available on the universal market, grimly tending towards autarky while exporting primary goods for the time being, on the promise that one day all of these sacrifices will be redeemed when they become the future of humanity. Somehow, I doubt there are many who find the idea of building socialism in one habitat compelling; Ken MacLeod may know them all by name.
(I have assumed everything stays within the solar system, because, pace Krugman, interstellar trade makes no sense at all. A civilization which could command enough energy to accelerate a large object to a significant fraction of the speed of light, so that trips between nearby stars take only decades, has no economic problem. At perhaps-attainable velocities, with thousands or tens of thousands of years of travel time, exchange is economically irrelevant, though it might still be attempted for cultural reasons. The obstacles in the way of human interstellar travel are of course immense. I have long thought it vastly more plausible to send robots which could then build suitable environments in which to grow human beings [also recently proposed by Charlie Stross], and that involves bio-engineering hand-waving of epic proportions.)
Comment, Nov. 2021: On re-reading, my treatment of the Ricardian argument is a little cavalier, but I don't feel energetic enough to write out and solve a New Economic Geography model where population and comparative advantage are both endogenous. If anyone is inspired to do this properly, though, I'd be genuinely fascinated to read it, and promise to link here.
Update, 16 January 2022: Tweaked the phrasing about opportunity costs in the 4th paragraph a little (and I hope removed more typos than I added).
The Eternal Science of These Infinite Spaces; The Dismal Science; Modest Proposals
Posted at November 23, 2021 10:45 | permanent link
\[ \newcommand{\ModelDim}{d} \]
Attention conservation notice: Academic self-promotion.
So I have a new preprint:
I've been interested for a long time in methods for simulation-based inference. It's increasingly common to have generative models which are easy (or at least straightforward) to simulate, but where it's completely intractable to optimize the likelihood --- often it's intractable even to calculate it. Sometimes this is because there are lots of latent variables to be integrated over, sometimes due to nonlinearities in the dynamics. The fact that it's easy to simulate suggests that we should be able to estimate the model parameters somehow, but how?
An example: My first Ph.D. student, Linqiao Zhao, wrote her dissertation on a rather complicated model of one aspect of how financial markets work (limit-order book dynamics), and while the likelihood function existed, in some sense, the idea that it could actually be calculated was kind of absurd. What she used to fit the model instead was a very ingenious method which came out of econometrics called "indirect inference". (I learned about it by hearing Stephen Ellner present an ecological application.) I've expounded on this technique in detail elsewhere, but the basic idea is to find a second model, the "auxiliary model", which is mis-specified but easy to estimate. You then adjust the parameters in your simulation until estimates of the auxiliary from the simulation match estimates of the auxiliary from the data. Under some conditions, this actually gives us consistent estimates of the parameters in the simulation model. (Incidentally, the best version of those regularity conditions known to me is still the one Linqiao found for her thesis.)
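To make the recipe concrete, here is a toy sketch in Python (my illustration with a made-up generative model; nothing to do with Linqiao's limit-order-book model). The generative model is easy to simulate but awkward to write a likelihood for; the auxiliary model is an AR(1) fitted by least squares; and we tune the generative parameter until the two auxiliary estimates agree:

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate(theta, n, rng):
    # Toy generative model, easy to simulate but with an awkward
    # likelihood: a noisy nonlinear autoregression.
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = np.tanh(theta * x[t - 1]) + 0.5 * rng.standard_normal()
    return x

def auxiliary(x):
    # Auxiliary model: AR(1) slope by least squares --- mis-specified
    # for the generative model above, but trivial to estimate.
    return np.sum(x[1:] * x[:-1]) / np.sum(x[:-1] ** 2)

theta_true = 0.8
beta_data = auxiliary(simulate(theta_true, 5000, rng))

# Indirect inference by (crude) grid search: adjust theta until the
# auxiliary estimate from simulations matches the one from the data.
grid = np.linspace(0.0, 1.5, 151)
gaps = [abs(auxiliary(simulate(th, 5000, rng)) - beta_data) for th in grid]
theta_hat = grid[int(np.argmin(gaps))]
```

In practice one would average over many simulations per parameter value and use a real optimizer rather than a grid, but the matching logic is the same.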
Now the drawback of indirect inference is that you need to pick the auxiliary model, and the quality of the model affects the quality of the estimates. The auxiliary needs to have at least as many parameters as the generative model, the parameters of the auxiliary need to shift with the generative parameters, and the more sensitive the auxiliary parameters are to the generative parameters, the better the estimates. There are lots of other techniques for simulation-based inference, but basically all of them turn on this same issue of needing to find some "features", some functions of the data, and tuning the generative model until those features agree between the simulations and the data. This is where people spend a lot of human time, ingenuity and frustration, as well as relying on a lot of tradition, trial-and-error, and insight into the generative model.
What occurred to me in the first week of March 2020 (i.e., just before things got really interesting) is that there might be a short-cut which avoided the need for human insight and understanding. That week I was teaching kernel methods and random features in data mining, and starting to think about how I wanted to revise the material on simulation-based inference for my "data over space and time" course in the fall. The two ideas collided in my head, and I realized that there was a lot of potential for estimating parameters in simulation models by matching random features, i.e., random functions of the data. After all, if we think of an estimator as a function from the data to the parameter space, results in Rahimi and Recht (2008) imply that a linear combination of \( k \) random features will, with high probability, give an \( O(1/\sqrt{k}) \) approximation to the optimal function.
Having had that brainstorm, I then realized that there was a good reason to think a fairly small number of random features would be enough. As we vary the parameters in the generative model, we get different distributions over the observables. Actually working out that distribution is intractable (that's why we're doing simulation-based inference in the first place), but it'll usually be the case that the distribution changes smoothly with the generative parameters. That means that if there are \( \ModelDim \) parameters, the space of possible distributions is also just \( \ModelDim \)-dimensional --- the distributions form a \( \ModelDim \)-dimensional manifold.
And, as someone who was raised in the nonlinear dynamics sub-tribe of physicists, \( \ModelDim \)-dimensional manifolds remind me of state-space reconstruction, of geometry from a time series, of embedology. Specifically, back behind the Takens embedding theorem used for state-space reconstruction, there lies the Whitney embedding theorem. Suppose we have a \( \ModelDim \)-dimensional manifold \( \mathcal{M} \), and we consider a mapping \( \phi: \mathcal{M} \to \mathbb{R}^k \). Suppose that each coordinate of \( \phi \) is \( C^1 \), i.e., continuously differentiable. Then once \( k=2\ModelDim \), there exists at least one \( \phi \) which is a diffeomorphism, a differentiable, 1-1 mapping of \( \mathcal{M} \) to \( \mathbb{R}^k \) with a differentiable inverse (on the image of \( \mathcal{M} \)). Once \( k \geq 2\ModelDim+1 \), diffeomorphisms are "generic" or "typical", meaning that they're the most common sort of mapping, in a certain topological sense, and dense in the set of all mappings. They're hard to avoid.
In time-series analysis, we use this to convince ourselves that taking \( 2\ModelDim+1 \) lags of some generic observable of a dynamical system will give us a "time-delay embedding", a manifold of vectors which is equivalent, up to a smooth change of coordinates, to the original, underlying state-space. What I realized here is that we should be able to do something else: if we've got \( \ModelDim \) parameters, and distributions change smoothly with parameters, then the map between the parameters and the expectations of \( 2\ModelDim+1 \) functions of observables should, typically or generically, be smooth, invertible, and have a smooth inverse. That is, the parameters should be identifiable from those expectations, and small errors in the expectations should track back to small errors in the parameters.
Put all this together: if you've got a \( \ModelDim \)-dimensional generative model, and I can pick \( 2\ModelDim+1 \) random functions of the observables which converge on their expectation values, I can get consistent estimates of the parameters by adjusting the \( \ModelDim \) generative parameters until simulation averages of those features match the empirical values.
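Putting that recipe in symbols (my condensation, not notation from the paper): with random features \( f_1, \ldots, f_{2\ModelDim+1} \), the estimator is

```latex
% Method-of-(random)-moments estimator; my condensed notation.
\[
F_i = \frac{1}{n}\sum_{t=1}^{n}{f_i(X_t)},
\qquad
\hat{\theta} = \operatorname*{argmin}_{\theta}
  \sum_{i=1}^{2\ModelDim+1}{\left( \widetilde{F}_i(\theta) - F_i \right)^2},
\]
% where the F_i are feature averages over the data, and
% \widetilde{F}_i(\theta) is the corresponding average over simulations
% run at parameter value theta, standing in for the intractable
% expectation E_theta[f_i(X)].
```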
Such was the idea I had in March 2020. Since things got very busy after that (as you might recall), I didn't do much about this except for reading and re-reading papers until the fall, when I wrote it up as a grant proposal. I won't say where I sent it, but I will say that I've had plenty of proposals rejected (those are the breaks), but never before have I had feedback from reviewers which made me go "Fools! I'll show them all!". Suitably motivated, I have been working on it furiously all summer and fall, i.e., wrestling with my own limits as a programmer.
But now I can say that it works. Take the simplest thing I could possibly want to do, estimating the location \( \theta \) of a univariate, IID Gaussian, \( \mathcal{N}(\theta,1) \). I make up three random Fourier features, i.e., I calculate \[ F_i = \frac{1}{n}\sum_{t=1}^{n}{\cos{(\Omega_i X_t + \alpha_i)}} \] where I draw \( \Omega_i \sim \mathcal{N}(0,1) \) independently of the data, and \( \alpha_i \sim \mathrm{Unif}(-\pi, \pi) \). I calculate \( F_1, F_2, F_3 \) on the data, and then use simulations to approximate their expectations as a function of \( \theta \) for different \( \theta \). I return as my estimate of \( \theta \) whatever value minimizes the squared distance from the data in these three features. And this is what I get for the MSE:
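(The whole experiment fits in a few lines; this is my Python paraphrase, crude grid-search minimizer and all, not the implementation from the paper:

```python
import numpy as np

rng = np.random.default_rng(27)
n, k = 1000, 3

# Random Fourier features: frequencies and phases drawn once,
# independently of the data, then frozen.
Omega = rng.standard_normal(k)
alpha = rng.uniform(-np.pi, np.pi, size=k)

def features(x):
    # F_i = (1/n) sum_t cos(Omega_i * x_t + alpha_i)
    return np.cos(np.outer(x, Omega) + alpha).mean(axis=0)

# "Data" from N(theta, 1), with theta unknown to the estimator
theta_true = 0.7
F_data = features(theta_true + rng.standard_normal(n))

def gap(theta, n_sims=20):
    # Squared distance between simulated and empirical feature averages
    sims = theta + rng.standard_normal((n_sims, n))
    F_sim = np.mean([features(s) for s in sims], axis=0)
    return np.sum((F_sim - F_data) ** 2)

grid = np.linspace(-2, 2, 401)
theta_hat = grid[int(np.argmin([gap(th) for th in grid]))]
```

For this Gaussian case the feature expectations are actually available in closed form, \( \mathbb{E}_{\theta}[F_i] = \cos(\Omega_i \theta + \alpha_i) e^{-\Omega_i^2/2} \), which is part of what makes it a nice test case: the simulation-based answer can be checked against the exact one.)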

OK, it doesn't fail on the simplest possible problem --- in fact it's actually pretty close to the performance of the MLE. Let's try something a bit less well-behaved, say having \( X_t \sim \theta + T_5 \), where \( T_5 \) is a \( t \)-distributed random variable with 5 degrees of freedom. Again, it's a one-parameter location family, and the same 3 features I used for the Gaussian family work very nicely again:

OK, it can do location families. Since I was raised in nonlinear dynamics, let's try a deterministic dynamical system, specifically the logistic map: \[ S_{t+1} = 4 r S_t(1-S_t) \] Here the state variable \( S_t \in [0,1] \), and the parameter \( r \in [0,1] \) as well. Depending on the value of \( r \), we get different invariant distributions over the state-space. If I sampled \( S_1 \) from that invariant distribution, this'd be a stationary and ergodic stochastic process; if I just make it \( S_1 \sim \mathrm{Unif}(0,1) \), it's still ergodic but only asymptotically stationary. If I use the same 3 random Fourier features, well, this is the distribution of estimates from time series of length 100, when the true \( r=0.9 \), so the dynamics are chaotic:

I get very similar results if I use random Fourier features that involve two time points, i.e., time-averages of \( \cos{(\Omega_{i1} X_{t} + \Omega_{i2} X_{t-1} + \alpha_i)} \), but I'll let you look at those in the paper, and also at how the estimates improve when I increase the sample size.
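A single run of the logistic-map experiment goes the same way as the Gaussian one; here is my Python sketch of it (again a paraphrase, and with a longer time series than the \( n=100 \) used for the histogram above, so that one run is reasonably reliable):

```python
import numpy as np

rng = np.random.default_rng(7)
n, k = 500, 3

Omega = rng.standard_normal(k)
alpha = rng.uniform(-np.pi, np.pi, size=k)

def features(x):
    return np.cos(np.outer(x, Omega) + alpha).mean(axis=0)

def logistic_series(r, n, rng):
    # S_{t+1} = 4 r S_t (1 - S_t), with S_1 ~ Unif(0,1)
    s = np.empty(n)
    s[0] = rng.uniform()
    for t in range(n - 1):
        s[t + 1] = 4 * r * s[t] * (1 - s[t])
    return s

r_true = 0.9
F_data = features(logistic_series(r_true, n, rng))

def gap(r, n_sims=20):
    # Average the simulated features over many runs at parameter r
    F_sim = np.mean(
        [features(logistic_series(r, n, rng)) for _ in range(n_sims)],
        axis=0)
    return np.sum((F_sim - F_data) ** 2)

grid = np.linspace(0.7, 1.0, 121)
r_hat = grid[int(np.argmin([gap(r) for r in grid]))]
```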
Now I try estimating the logistic map, only instead of observing \( S_t \) I observe \( Y_t = S_t + \mathcal{N}(0, \sigma^2) \). The likelihood function is no longer totally pathological, but it's also completely intractable to calculate or optimize. But matching 5 (\( =2\times 2 + 1 \)) random Fourier features works just fine:

At this point I think I have enough results to have something worth sharing, though there are of course about a bazillion follow-up questions to deal with. (Other nonlinear features besides cosines! Non-stationarity! Spatio-temporal processes! Networks! Goodness-of-fit testing!) I will be honest that I partly make this public now because I'm anxious about being scooped. (I have had literal nightmares about this.) But I also think this is one of the better ideas I've had in years, and I've been bursting to share.
Update, 21 June 2022: a talk on this, in two days' time.
Update, 12 September 2023: a funded grant.
Posted at November 17, 2021 20:30 | permanent link
Attention conservation notice: I have no taste, and no qualifications to opine on the history of monsters in 18th century France, medieval political philosophy, the history and archaeology of images of monsters, trends in mortality and inequality in early 21st century America, or the comparative sociology of slavery. (Monsters, monsters everywhere.)
With the expansion of urban settlements throughout Mesopotamia during the fourth millennium BC, the trajectory toward standardization and modularity in material culture intensified markedly. Systems of modular construction, based on the assembly of standardized and interchangeable components, are evident not just in imagery at this time, but also across such diverse technological domains as mud-brick architecture and ceramic commodity packaging... These wider developments in material culture underpinned the invention, around 3300 BC, of the protocuneiform script. This new system of information storage was initially designed for bookkeeping purposes in large urban institutions, which acted as the religious and economic hubs of the earliest cities. It was based on a principle of differentiation whereby materials, animals, plants, and labor were divided into fixed subclasses and units of measurement, organized according to abstract criteria of number, order, and rank. Many of the earliest known administrative tablets thus functioned in a manner comparable to modern punch cards and balance sheets. In order for such a recording system to function, every named commodity---each beer or oil jar, each dairy vessel, and their contents, and each animal of the herd---had to be interchangeable with, and thus equivalent to, every other of the same administrative class. A smaller number of early inscriptions, known as lexical lists, appear to have had no direct administrative function, and may reflect the intellectual milieu of the earliest scribes, who engaged, as part of their training, in "fanciful paradigmatic name-generating exercises" for a wide range of subjects.
The invention of a novel repertory of composite figures can be seen to "fit" very logically into this urban and bureaucratic milieu. In pictorial art, new standards of anatomical precision and uniformity, evident in both miniature and monumental formats, echoed wider developments in material culture. Through the medium of sealing practices, miniature depiction remained closely tied to the practice of administration, which required the multiplication of standardized and clearly distinguishable signs for the official marking of commodities and documents. Variability among seal designs was generated through often-tiny adjustments in the appearance or arrangement of figures and motifs. These did not alter the overall visual statement, but allowed each design to fulfill its designated role as a discrete identifier within the larger administrative system to which it belonged.
In its search for new subject matter, it is hardly surprising that the "bureaucratic eye" was increasingly drawn to the possibilities of composite figuration... Not only did a composite approach to the rendering of organic forms greatly multiply the range of possible subjects for depiction. As Barbara Stafford points out, the counterfactual images that it produced also serve to emphasize details of anatomy that would normally "slip by our attention or be absorbed unthinkingly," becoming noticeable only when disaggregated from their ordinary contexts. Composites thus encapsulated, in striking visual forms, the bureaucratic imperative to confront the world, not as we ordinarily encounter it---made up of unique and sentient totalities---but as an imaginary realm made up of divisible subjects, each comprising a multitude of fissionable, commensurable, and recombinable parts. [pp. 69--73, omitting footnotes and references to figures]
Books to Read While the Algae Grow in Your Fur; Commit a Social Science; Statistics of Inequality and Discrimination; The Dismal Science; Philosophy; Islam and Islamic Civilization; Scientifiction and Fantastica; Writing for Antiquity; The Collective Use and Evolution of Concepts; Psychoceramics; Pleasures of Detection, Portraits of Crime
Posted at October 31, 2021 23:59 | permanent link
Attention conservation notice: I have no taste, and no qualifications to opine on threats to modern democracies, the history of China and Europe c. 1600, or the sociology of the French Revolution of 1789.
Books to Read While the Algae Grow in Your Fur; Pleasures of Detection, Portraits of Crime; Writing for Antiquity; Scientifiction and Fantastica; The Beloved Republic; The Continuing Crises; Tales of Our Ancestors
Posted at September 30, 2021 23:59 | permanent link
Attention conservation notice: I have no taste, and no qualifications to opine on criticism of cultural criticism, the sociology and demography of race in America, the political philosophy of doing something about climate change, or Afrocentric historiography.
Books to Read While the Algae Grow in Your Fur; Commit a Social Science; The Beloved Republic; Philosophy; Pleasures of Detection, Portraits of Crime; Writing for Antiquity; Enigmas of Chance; Physics
Posted at August 31, 2021 23:59 | permanent link
Attention conservation notice: Sniping at someone else's constructive attempt to get the philosophy of mathematics to pay more attention to how mathematicians actually discover stuff, because it uses an idea that pushes my buttons. Assumes you know measure-theoretic probability without trying to explain it. Written by someone with absolutely no qualifications in philosophy, and precious few in mathematics for that matter. Largely drafted back in 2013, then laid aside. Posted now in lieu of new content.
Wolfgang points to an interesting post [archived] at "A Mind for Madness" on using Bayesianism in the philosophy of mathematics, specifically to give a posterior probability for conjectures (e.g., the Riemann conjecture) given the "evidence" of known results. Wolfgang uses this as a jumping-off point for looking at whether a Bayesian might slide around the halting problem and Gödel's theorem, or more exactly whether a Bayesian with \( N \) internal states can usefully calculate any posterior probabilities of halting for another Turing machine with \( n < N \) states. (I suspect that would fail for the same reasons my idea of using learning theory to do so fails; it's also related to work by Aryeh "Absolutely Regular" Kontorovich on finite-state estimation, and even older ideas by the late great Thomas Cover and Martin Hellman.)
My own take is different. Knowing how I feel about the idea of using Bayesianism to give probabilities to theories about the world, you can imagine that I look on the idea of giving probabilities to theorems with complete disfavor. And indeed I think it would run into insuperable trouble for purely internal, mathematical reasons.
Start with what mathematical probability is. The basics of a probability space are a carrier space \( \Omega \), a \( \sigma \)-field \( \mathcal{F} \) on \( \Omega \), and a probability measure \( P \) on \( \mathcal{F} \). The mythology is that God, or Nature, picks a point \( \omega \in \Omega \), and then what we can resolve or perceive about it is whether \( \omega \in F \), for each set \( F \in \mathcal{F} \). The probability measure \( P \) tells us, for each observable event \( F \), what fraction of draws of \( \omega \) are in \( F \). Let me emphasize that there is nothing about the Bayes/frequentist dispute involved here; this is just the structure of measure-theoretic probability, as agreed to by (almost) all parties ever since Kolmogorov laid it down in 1933 ("Andrei Nikolaevitch said it, I believe it, and that's that").
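For concreteness, the smallest non-trivial illustration I can think of (mine, not the post's): two fair coin flips, of which only the first is resolvable.

```latex
% Probability space for two fair coin flips where only the first
% flip is observable.
\[
\Omega = \{HH,\, HT,\, TH,\, TT\}, \qquad
\mathcal{F} = \{\, \emptyset,\ \{HH, HT\},\ \{TH, TT\},\ \Omega \,\},
\]
\[
P(\emptyset) = 0, \qquad
P(\{HH, HT\}) = P(\{TH, TT\}) = \tfrac{1}{2}, \qquad
P(\Omega) = 1.
\]
% The sigma-field encodes what can be resolved: "first flip heads"
% = {HH, HT} is an event in F, but "second flip heads" = {HH, TH}
% is not, so it gets no probability at all.
```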
To assign probabilities to propositions like the Riemann conjecture, the points \( \omega \) in the base space \( \Omega \) would seem to have to be something like "mathematical worlds", say mathematical models of some axiomatic theory. That is, selecting an \( \omega \in \Omega \) should determine the truth or falsity of any given proposition like the fundamental theorem of algebra, the Riemann conjecture, Fermat's last theorem, etc. There would then seem to be three cases:
There are a lot of interesting thoughts in the post about how mathematicians think, especially how they use analogies to get a sense of which conjectures are worth exploring, or feel like they are near to provable theorems. (There is also no mention of Polya: but sic transit gloria mundi.) It would be very nice to have some formalization of this, especially if the formalism was both tractable and could improve practice. But I completely fail to see how Bayesianism could do the job.
That post is based on Corfield's Towards a Philosophy of Real Mathematics, which I have not laid hands on, but which seems, judging from this review, to show more awareness of the difficulties than the post does.
Addendum, August 2021: I have since tracked down an electronic copy of Corfield's book. While he has sensible things to say about the role of conjecture, analogy and "feel" in mathematical discovery, drawing on Polya, he also straightforwardly disclaims the "logical omniscience" of the standard Bayesian agent. But he does not explain what formalism he thinks we should use to replace standard probability theory. (The terms "countably additive" and "finitely additive" do not appear in the text of the book, and I'm pretty sure "\( \sigma \)-field" doesn't either, though that's harder to search for. I might add that Corfield also does nothing to explicate the carrier space \( \Omega \).) I don't think this is because Corfield isn't sure about what the right formalism would be; I think he just doesn't appreciate how much of the usual Bayesian machinery he's proposing to discard.
Posted at August 07, 2021 19:00 | permanent link
Attention conservation notice: An invitation to put a lot of effort into writing about a recondite academic topic, only to have it misunderstood by anonymous strangers.
Having agreed to be an area chair (area TBD), I ought to publicize the call for papers for the first Conference on Causal Learning and Reasoning (CLeaR 2022):
Causality is a fundamental notion in science and engineering. In the past few decades, some of the most influential developments in the study of causal discovery, causal inference, and the causal treatment of machine learning have resulted from cross-disciplinary efforts. In particular, a number of machine learning and statistical analysis techniques have been developed to tackle classical causal discovery and inference problems. On the other hand, the causal view has been shown to facilitate formulating, understanding, and tackling a broad range of problems, including domain generalization, robustness, trustworthiness, and fairness across machine learning, reinforcement learning, and statistics.

We invite papers that describe new theory, methodology and/or applications relevant to any aspect of causal learning and reasoning in the fields of artificial intelligence and statistics. Submitted papers will be evaluated based on their novelty, technical quality, and potential impact. Experimental methods and results are expected to be reproducible, and authors are strongly encouraged to make code and data available. We also encourage submissions of proof-of-concept research that puts forward novel ideas and demonstrates potential for addressing problems at the intersection of causality and machine learning.
The proceedings track is the standard CLeaR paper submission track. Papers will be selected via a rigorous double-blind peer-review process. All accepted papers will be presented at the Conference as contributed talks or as posters and will be published in the Proceedings.
Topics of submission may include, but are not limited to:
- Machine learning building on causal principles
- Causal discovery in complex environments
- Efficient causal discovery in large-scale datasets
- Causal effect identification and estimation
- Causal generative models for machine learning
- Unsupervised and semi-supervised deep learning connected to causality
- Machine learning with heterogeneous data sources
- Benchmarks for causal discovery and causal reasoning
- Reinforcement learning
- Fairness, accountability, transparency, explainability, trustworthiness, and recourse
- Applications of any of the above to real-world problems
The deadline is 22 October 2021; further details are available at the conference website.
(I should write up my "Apology for Causal Discovery" as a proper paper or at least essay, rather than a pair of slide decks and a video which [like all recordings of me] I can't stand to watch, but that's so far back in the queue I could cry.)
Posted at August 07, 2021 15:45 | permanent link
Attention conservation notice: I have no taste, and no qualifications to opine on culture-bound syndromes and contagious hysterias, the history and economics of socialist planning, economic inequality, or Islamic theology.
Update, 28 August 2021: Fixed an editing fragment that turned a sentence about Alacevich and Soci into mush.
Books to Read While the Algae Grow in Your Fur; Islam and Islamic Civilization; Writing for Antiquity; Pleasures of Detection, Portraits of Crime; The Dismal Science; The Progressive Forces; Psychoceramica; Minds, Brains, and Neurons; Actually, "Dr. Internet" Is the Name of the Monster's Creator
Posted at July 31, 2021 23:59 | permanent link
Attention conservation notice: I have no taste, and no qualifications to opine on cryptozoology, folklore, economics, or humanistic geography.
Books to Read While the Algae Grow in Your Fur; Scientifiction and Fantastica; Mathematics; Pleasures of Detection, Portraits of Crime; Tales of Our Ancestors; The Dismal Science; Commit a Social Science; Philosophy; Psychoceramics
Posted at June 30, 2021 23:59 | permanent link
Attention conservation notice: Advertisement for a course you won't take, at a university you don't attend. Even if the subject is of some tangential interest, why not check back in a few months to see if the teacher has managed to get himself canceled, and/or produced anything worthwhile?
In the fall I will, again, be teaching something new:
36-313, Statistics of Inequality and Discrimination
9 units
Time and place: Tuesdays and Thursdays, 1:25 -- 2:45 pm, location TBA
Description: Many social questions about inequality, injustice and unfairness are, in part, questions about evidence, data, and statistics. This class lays out the statistical methods which let us answer questions like Does this employer discriminate against members of that group?, Is this standardized test biased against that group?, Is this decision-making algorithm biased, and what does that even mean? and Did this policy which was supposed to reduce this inequality actually help? We will also look at inequality within groups, and at different ideas about how to explain inequalities between groups. The class will interweave discussion of concrete social issues with the relevant statistical concepts.
Prerequisites: 36-202 ("Methods for Statistics and Data Science") (and so also 36-200, "Reasoning with Data")
This is a class I've been wanting to teach for some years now, and I'm very happy to finally get the chance to ~~feel my well-intentioned but laughably inadequate efforts crushed beneath massive and justified opprobrium evoked from all sides~~ ~~bore and perplex some undergrads who thought they were going to learn something interesting in stats. class for a change~~ try it out.
There will not be any exams.
My usual policy is to drop a certain number of homeworks, and a certain number of lecture/reading questions, no questions asked. The number of automatic drops isn't something I'll commit to here and now (similarly, I won't make any promises here about the relative weight of homework vs. lecture-related questions).
Posted at June 03, 2021 23:59 | permanent link
Attention conservation notice: I have no taste.
On a different note, over the semester I re-read a lot of textbooks and monographs for the undergrad statistical learning class, so I provide some links here for the ones I ~~mined for examples and problem sets~~ found especially useful:
Books to Read While the Algae Grow in Your Fur; Scientifiction and Fantastica; Pleasures of Detection, Portraits of Crime; Enigmas of Chance;
Posted at May 31, 2021 23:59 | permanent link
Attention conservation notice: I have no taste, and no qualifications to opine on ethics of any sort.
Books to Read While the Algae Grow in Your Fur; Scientifiction and Fantastica; Pleasures of Detection, Portraits of Crime; Enigmas of Chance; Automata and Calculating Machines
Posted at April 30, 2021 23:59 | permanent link
Attention conservation notice: I have no taste, and no qualifications to opine on the sociology of radio and the music industry, or on movies.
(I didn't finish a lot of books this month, since I'm not counting re-reading bits and pieces of arcane tomes on golem-making as needed for my own shambling creation.)
Books to Read While the Algae Grow in Your Fur; The Continuing Crises; Commit a Social Science; Networks; Pleasures of Detection, Portraits of Crime; Heard About Pittsburgh PA
Posted at March 31, 2021 23:59 | permanent link
Attention conservation notice: 1000-word grudging concession that a bete noire might have a point, followed immediately and at much greater length by un-constructive hole-poking; about social media, by someone who's given up on using social media; also about the economics of recommendation engines, by someone who is neither an economist nor a recommendation engineer.
Because he hates me and wants to make sure that I never get back to any (other) friend or collaborator, Simon made me read Jack Dorsey endorsing an idea of Stephen Wolfram's. Much as it pains me to say, Wolfram has the germ of an interesting idea here, which is to start separating out different aspects of the business of running a social network, as that's currently understood. I am going to ignore the stuff about computational contracts (nonsense on stilts, IMHO), and focus just on the idea that users could have a choice about the ranking / content recommendation algorithms which determine what they see in their feeds. (For short I'll call them "recommendation engines" or "recommenders".) There are still difficulties, though.
— Back in the dreamtime, before the present was widely distributed, Vannevar Bush imagined the emergence of people who'd make their livings by pointing out what, in the vast store of the Memex, would be worth others' time: "there is a new profession of trail blazers, those who find delight in the task of establishing useful trails through the enormous mass of the common record." Or, again, there's Paul Ginsparg's vision of new journals erecting themselves as front ends to arxiv. Appealing though such visions are, it's just not happened in any sustained, substantial way. (All respect to Maria Popova for Brain Pickings, but how many like her are there, who can do it as a job and keep doing it?) Maybe the obstacles here are ones of scale, and making content-recommendation a separate, algorithmic business could help fulfill the vision. Maybe.
"Presumably", Wolfram says, "the content platform would give a commission to the final ranking provider". So the recommender is still in the selling-ads business, just as Facebook, Twitter, etc. are now. I don't see how this improves the incentives at all. Indeed, it'd presumably mean the recommender is a "publisher" in the digital-advertizing sense, and Facebook's and Twitter's core business situation is preserved. (Perhaps this is why Dorsey endorses it?) But the concerns about the bad and/or perverse effects of those incentives (e.g.) are not in the least alleviated by having many smaller entities channeled in the same direction.
On the other hand, I imagine it's possible that people would pay for recommendations, which would at least give the recommenders a direct financial incentive to please the users. This might still not be good for the users, but at least it would align them more with users' desires, and diversity of those desires could push towards a diversity of recommendations. Of course, there would be the usual difficulty of fee-based services competing against free-to-user-ad-supported services.
Further: as Wolfram proposes it, the features used to represent content are already calculated by the operator. This can of course impose all sorts of biases and "editorial" decisions centrally, ones which the recommenders would have difficulty over-riding, if they could do so at all.
Normally I'd say there'd also be switching costs to lock users in to the first recommender they seriously use, but I could imagine the network operators imposing data formats and input-output requirements to make it easy to switch from one recommender to another without losing history.
— Not quite so long ago as "As We May Think", but still well before the present was widely distributed, Carl Shapiro and Hal Varian wrote a quietly brilliant book on the strategies firms in information businesses should follow to actually make money. The four keys were economies of scale, network externalities, lock-in of users, and control of standards. The point of all of these is to reduce competition. These principles work — it is no accident that Varian is now the chief economist of Google — and they will apply here.
Someone else must have proposed this already. This conclusion is an example of induction by simple enumeration, which is always hazardous, but compelling with this subject. I would be interested to read about those earlier proposals, since I suspect they'll have thought about how it actually could work.
*: Back of the envelope, say the prediction error is \( O(n^{-1/2}) \), as it often is. The question is then how utility to the user scales with error. If it were simply inversely proportional, we'd get utility scaling like \( O(n^{1/2}) \), which is a lot less than the \( O(n) \) claimed for classic network externalities by the Metcalfe's-law rule-of-thumb. On the other hand it feels more sensible to say that going from an error of \( \pm 1 \) on a 5-point scale to \( \pm 0.1 \) is a lot more valuable to users than going from \( \pm 0.1 \) to \( \pm 0.01 \), not much less valuable. Indeed we might expect that even perfect prediction would have only finite utility to users, so the utility would be something like \( c - O(n^{-1/2}) \). This suggests that we could have multiple very large services, especially if there is a cost to switch between recommenders. But it also suggests that there'd be a minimum viable size for a service, since if it's too small a customer would be paying the switching cost to get worse recommendations. ^
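A quick numeric illustration of the two scalings in the footnote (my sketch, with made-up constants \( c \) and \( k \), not anything from the original argument):

```python
import numpy as np

c, k = 10.0, 1.0                      # made-up saturation level and scale constant
ns = np.array([1e2, 1e4, 1e6, 1e8])   # user-base sizes

# If utility were inversely proportional to an O(n^{-1/2}) error,
# it would grow without bound like O(n^{1/2}):
u_inverse = np.sqrt(ns)

# If even perfect prediction has only finite utility c,
# utility saturates like c - O(n^{-1/2}):
u_saturating = c - k / np.sqrt(ns)

print(u_inverse)      # grows by 10x with every 100x more users
print(u_saturating)   # nearly flat past modest n: almost all of c is reached early
```

Under the saturating form, the gap between a service with ten thousand users and one with a hundred million is tiny, which is what lets several very large recommenders coexist; a sufficiently small one, though, still lags by enough to matter once there's a switching cost.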
The Dismal Science; Actually, "Dr. Internet" Is the Name of the Monster's Creator
Posted at March 26, 2021 14:03 | permanent link
I can't remember if Henry Farrell came up with this phrase, or I did, as the title for a possible joint project. I also forget whether we meant "Monster's", singular, or "Monsters'", plural; as time passes I lean towards the latter.
See also Linkage
Posted at March 26, 2021 14:01 | permanent link
Attention conservation notice: Asking for help finding something that you don't know about, that you don't care about, and that a bad memory might have just confabulated.
I have a vivid memory of reading, in the 1990s, an online discussion (maybe just two people, maybe as many as four) about what online fora, search engines, the Web, "agents", etc., were doing to the way people acquire and use knowledge, and indeed to what we mean by "knowledge". My very strong impression is that one of the participants was linked somehow with the MIT Media Lab, and taking a very strong social-constructionist line (unsurprisingly, given that affiliation). At some point the discussion turned to her experiences with an online forum related to a hobby of hers (tropical fish? terraria?). The person I'm thinking of said something like, the consensus of that forum just were knowledge about \$HOBBY. One of her interlocutors made an objection on the order of, why do you trust those random people on the Internet to have any idea what they're talking about? To which the reply was, basically, come on, who'd just make stuff up about \$HOBBY?
I have (genuinely!) thought of this exchange often in the 20-plus years since I read it. But when I recently tried to find it again, to check my memory and to cite it in a work-in-glacial-progress, I've been unable to locate it. (The fact that I don't recall any names of the participants, or the venue, doesn't help.) I am prepared to learn that, because this is something I've thought of often, my mind has re-shaped it into a memorable anecdote, but I'd still like to see what this started from. Any leads readers could provide would be appreciated.
The hive mind Lucy Keer (with an assist from Mike Traven) delivers:
Definitely me! :) I think you're referring to my story about a guy on USENET who was a legendary flamer/troll, EXCEPT when he talked about tropical fish he was incredibly knowledgeable and helpful.
Incidentally, a lot of my book "Should You Believe Wikipedia? Online Community Design and the Social Construction of Knowledge" (coming out in a few months, Cambridge University Press) is about this general topic.
— Amy Bruckman (@asbruckman) March 26, 2021
Specifically, the seed around which this story nucleated in my memory may have been a January 1996 piece by Prof. Bruckman in Technology Review — it has the right content (sci.aquaria!), the right date, my father subscribed to TR and I'd even have been visiting my parents when that issue was current. Only it's not a conversation between multiple people but a solo-author essay, it's not primarily about the social aspects of knowledge but about how to find congenial on-line communities and make (or re-make) ones that don't suck (the lost wisdom of the Internet's early Bronze Age), and contains nothing like "who'd just make stuff up about \$HOBBY?" (In short: Bartlett (1932) meets Radio Yerevan.)
More positively, I very much look forward to reading Bruckman's book (there's an excerpt/precis available on her website).
Actually, "Dr. Internet" Is the Name of the Monster's Creator; The Collective Use and Evolution of Concepts
Posted at March 26, 2021 12:32 | permanent link
\[ \newcommand{\Expect}[1]{\mathbb{E}\left[ #1 \right]} \newcommand{\Prob}[1]{\mathbb{P}\left( #1 \right)} \newcommand{\Cov}[1]{\mathrm{Cov}\left[ #1 \right]} \newcommand{\Var}[1]{\mathrm{Var}\left[ #1 \right]} \]
Attention conservation notice: An 800-word, literally academic exercise about an issue in causal inference. Its point is familiar to those in the field, and deservedly obscure to everyone else. Also, too cutesy and pleased with itself by at least half.
I wrote the first version of this for the class where we do causal inference long enough ago that I actually don't remember when --- 2011? 2013? (In retrospect I had probably read Milton Friedman's thermostat analogy but didn't consciously remember it at the time.) Posted now because I've gone over the point with two different people in the last month.
The temperature outside \( (X) \) is a direct cause of the temperature inside my house \( (Y) \). But every morning I measure the temperature, and adjust my heating/cooling system \( (C) \) to try to maintain a constant temperature \( y_0 \). For simplicity, we'll say that all the relations are linear, so \[ \begin{eqnarray} X & \sim & \mathrm{whatever}\\ C|X & \leftarrow & a+bX + \epsilon_1\\ Y|X,C & \leftarrow & X-C + \epsilon_2 \end{eqnarray} \] where \( \epsilon_1 \) and \( \epsilon_2 \) are exogenous, independent, mean-zero noise terms. We can think of \( \epsilon_1 \) as a combination of my sloppiness in measuring the temperature and in tuning the heating/cooling system; \( \epsilon_2 \) is sheer fluctuations.
Exercise: Draw the DAG.
To ensure that the expectation of \( Y \) remains at \( y_0 \), no matter the external temperature, we need \[ \begin{eqnarray} y_0 & = & \Expect{Y|X=x}\\ & = & \Expect{X - (a + bX + \epsilon_1) + \epsilon_2|X=x}\\ & = & (1-b)x - a \end{eqnarray} \] Since this must hold for all \( x \), we need \( b=1, a=-y_0 \).
What follows from this?
Exercise: Build your character by doing the algebra.
So, as long as control isn't perfect, the naive statistician (or experienced econometrician...) who just does a kitchen-sink regression will actually get the relationship between \( Y \), \( X \) and \( C \) right, concluding that external temperature and the climate control have equal and opposite effects on internal temperature. Sure, there will be sampling noise, but with enough data they'll approach the truth.
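A small simulation bears this out (my sketch of the model above, with arbitrary choices of \( y_0 = 20 \) and unit-variance noises): the kitchen-sink regression of \( Y \) on both \( X \) and \( C \) recovers the structural coefficients \( +1 \) and \( -1 \), while the marginal regression of \( Y \) on \( X \) alone finds essentially no relationship at all, because the control exactly cancels it.

```python
import numpy as np

rng = np.random.default_rng(42)
n, y0 = 100_000, 20.0

x = rng.normal(0.0, 10.0, n)      # external temperature, X
eps1 = rng.normal(0.0, 1.0, n)    # sloppiness in measurement and control
eps2 = rng.normal(0.0, 1.0, n)    # sheer fluctuations
c = -y0 + 1.0 * x + eps1          # control signal with a = -y0, b = 1
y = x - c + eps2                  # internal temperature, Y

# Kitchen-sink regression of Y on X and C (with an intercept):
design = np.column_stack([np.ones(n), x, c])
beta, *_ = np.linalg.lstsq(design, y, rcond=None)
print(beta)    # close to [0, 1, -1]: the true structural coefficients

# Marginal regression of Y on X alone:
design_x = np.column_stack([np.ones(n), x])
gamma, *_ = np.linalg.lstsq(design_x, y, rcond=None)
print(gamma)   # close to [y0, 0]: X looks causally irrelevant to Y
```

Note that the marginal independence of \( Y \) from \( X \) here is exactly the faithfulness violation the later exercise asks about: substituting the control equation gives \( Y = y_0 - \epsilon_1 + \epsilon_2 \), free of \( X \) entirely.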
Exercise: What do you get if you regress \( C \) on \( X \) and \( Y \)?
I have implicitly assumed that I know the exact linear relationship between \( X \) and \( Y \), since I used that in deriving how the control signal should respond to \( X \). If I mis-calibrate the control signal, say if \( C = -y_0 +0.999X + \epsilon_1 \), then there is not an exact cancellation and everything works as usual.
Exercise: Suppose that instead of measuring the external temperature \( X \) directly, I can only measure yesterday's temperature \( U \), again with noise. Supposing there is a linear relationship between \( U \) and \( X \), replicate this analysis. Does it matter if \( U \) is the parent of \( X \) or vice versa?
Exercise: "Feedback is a mechanism for persistently violating faithfulness"; discuss.
Exercise: "The greatest skill seems like clumsiness" (Laozi); discuss.
Posted at March 26, 2021 09:08 | permanent link
Attention conservation notice: I have no taste, and no qualifications to opine about geology, policing, law, or the history of Islamic science.
Books to Read While the Algae Grow in Your Fur; Islam and Islamic Civilization; The Beloved Republic; Writing for Antiquity; Scientifiction and Fantastica
Posted at February 28, 2021 23:59 | permanent link
Posted at February 12, 2021 14:45 | permanent link
Attention conservation notice: I have no taste, and no qualifications to opine on the history of science.
Books to Read While the Algae Grow in Your Fur; Writing for Antiquity; Enigmas of Chance; Mathematics; Pleasures of Detection, Portraits of Crime
Posted at January 31, 2021 23:59 | permanent link