By Graham Peterson
Ethnography is good for a lot. As Shamus Khan and Colin Jerolmack have recently argued, ethnography, like the measurement of relative prices, is a great way to study revealed values and motivations (sociology speak) and revealed preferences (economics speak).
People have a pretty poor self-conscious understanding of the distal, structural, social-aggregate-level mechanisms that drive their behaviors. There isn’t a social science that doesn’t try to catch people unawares and make bird’s-eye inferences about those behaviors. So every social science needs methods that draw inferences about things that don’t come directly out of people’s mouths, pens, or keyboards.
Ethnography is good for that. And yet, people will complain about ethnography — or rather, bad ethnography — invoking the ideals of randomness and representativeness taught in statistics courses. But bad ethnography is bad for a lot of the same reasons bad statistics are.
Bad ethnography comes from convenience samples of people’s personal networks, and it samples on the dependent variable without comparison groups. It replicates derivative, routine, and already established theories. It pretends that the author didn’t know what he was going to find before he showed up, then spends the write-up feigning objectivity.
People who do this drop a lot of “lived experience” and “in process” and “embodied practice” bombs that are supposed to end the conversation with their sheer authority.
Bad statistics does the same things. It comes from convenience samples drawn from few-clicks-away government data, and it samples on the dependent variable without comparison or counterfactual groups. It replicates derivative, routine, and already established theories. It pretends the author didn’t know what she was going to find before she showed up, and the write-up feigns objectivity.
People who do this drop a lot of “three-asterisk” and “testable” and “control vector” bombs that are supposed to end the conversation with their sheer authority.
Now I want to argue that we need both ethnography and statistics, but not for the reasons I’ve heard some people run to. Some people will claim that we need purely descriptive studies; they abdicate causation and tell us ethnography gives us thick descriptions. I have only heard this argument in the context of methodological debate, though. Any interesting ethnography I’ve ever read has made a host of causal claims, and suggested their robustness with plausible interpretations of data.
Others have argued that ethnography helps us get on the ground and witness the emergence of causal mechanisms as they unfold. You don’t have to step back a million miles, cover your eyes and write down a null, and then make causal claims ex post. You can actually witness and take note of an antecedent, and its consequent, as they happen.
That argument is all well and good for ethnographers and statisticians to both keep their jobs, and do their own thing at their own conferences. But I want to argue that these people need to talk to one another, too, and for a principal reason that I don’t know how to phrase in any grammar other than statistical grammar, but I bet can be translated.
Ethnography samples on the tails of distributions (imagine without loss of generality a normal population distribution of some trait or phenomenon), and statistical studies sample on measures of center. Both measures can answer causal questions, because both have their own way of filtering out confounding noise in empirical observation, and illuminating causal mechanisms.
Ethnographers go out into the world and turn up the volume on their variable of interest, increasing their signal-to-noise ratio by sampling on the extremities of its distribution. Note that this is the same motivation behind large-N inferential statistics. The idea there is to turn the N up until you can successfully differentiate signal from noise.
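To make the large-N half of that concrete, here is a toy simulation (all the numbers are hypothetical, chosen purely for illustration): a small true effect is invisible in small samples because sampling noise swamps it, and the noise shrinks roughly as one over the square root of N.

```python
import random
import statistics

random.seed(1)

# A small, hypothetical true effect, buried under unit-variance noise.
EFFECT = 0.1

def sample_mean(n):
    # Draw n noisy observations centered on the true effect.
    return statistics.mean(EFFECT + random.gauss(0, 1) for _ in range(n))

# Repeat each study 200 times to see how much the estimate bounces around.
small = [sample_mean(25) for _ in range(200)]      # small-N studies
big = [sample_mean(10_000) for _ in range(200)]    # large-N studies

# The small-N estimates scatter widely around 0.1; the large-N estimates
# cluster tightly on it. Noise falls roughly as 1 / sqrt(N).
print(statistics.stdev(small), statistics.stdev(big))
```

With N = 25 the estimate’s spread (about 0.2) is twice the effect itself; with N = 10,000 the spread is about 0.01, and the signal stands clear of the noise.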
So if one wants to study the mechanisms driving social mobility, one goes to a homeless shelter to study downward mobility, not a college campus. That’s not cherry picking; it’s calibration of the measurement instrument. And it turns out that one can turn up signal and turn down noise both by turning the N up and by turning it down, depending on which portion of a population distribution one is sampling on.
Tail sampling makes statistical thinkers nervous. All of the nice results of the central limit theorem (which is built on successive estimates of center – not estimates of tails) fall apart. Estimators lose efficiency and become biased, on purpose. But turning up the volume and sampling on tails is extremely effective for the same reason a caricature works — it exaggerates what is distinctive and different about a particular variable in contrast to the confounding weeds around it.*
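The tail-sampling logic can be sketched the same way. In this toy simulation (again, purely hypothetical numbers), a weak mechanism x nudges an outcome y. A simple random sample of modest size shows almost nothing, while a sample drawn from the extreme tail of y is visibly enriched for the mechanism: biased as an estimate of the population, but loud as a signal.

```python
import random
import statistics

random.seed(0)

# Hypothetical population: a weak "mechanism" x nudges an outcome y.
N_POP = 100_000
EFFECT = 0.3
pop = []
for _ in range(N_POP):
    x = random.gauss(0, 1)
    y = EFFECT * x + random.gauss(0, 1)
    pop.append((x, y))

def mean_x(sample):
    return statistics.mean(x for x, _ in sample)

n = 200

# Center sample: a simple random sample of n cases.
center = random.sample(pop, n)

# Tail sample: the n cases with the most extreme outcomes (top tail of y),
# the ethnographer's homeless shelter rather than the college campus.
tail = sorted(pop, key=lambda p: p[1])[-n:]

# In the random sample, x sits near the population mean of zero. In the
# tail sample, x is strongly elevated: the mechanism stands out from the
# noise, at the price of a deliberately biased picture of the population.
print(mean_x(center), mean_x(tail))
```

Both samples have the same N; only where they sit on the distribution differs. The tail sample exaggerates the mechanism like a caricature, which is exactly the point.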
Both methods turn up signal and turn down noise. Both methods observe primarily behaviors — stark and nonsensical on their own — and require textual, deductive, rhetorical, analogical, and narrative inference to make any sense of them.
So neither is superior, and neither can give us a whole picture of the population distribution of a social phenomenon, because each excludes, truncates, and draws discussion away from the other’s target on that distribution. Statistical estimation of central tendency by definition and de facto obfuscates what we know about tails, and ethnography by definition and de facto obfuscates what we know about central tendency.
I’m sure archivists, humanists, interviewers, surveyors, and other observers of self-conscious narratives fit in here somehow. I’m just not sure how yet.
*Here I have just argued that cartoons are, literally, useful scientific representations of reality. Keep that in mind the next time you call someone’s argument a cartoon. Cartoons are funny because they make explicit, with innuendo and misdirection, tacit common knowledge.
Image credit: behaviorgap.com