By Adam Elkus
It’s been a very interesting time for discussions about modeling and the philosophy of science.
First, my GMU colleague David Masad has a very intriguing post on computational social science (CSS), machine learning, and models.
Just as a data science approach may be insufficient on its own for finding the qualitative and emergent characteristics of a system, agent-based models may benefit from more engagement with data. One common criticism of ABMs is that they lack rigorous foundations. While I think that this is often unfair (particularly when the foundations are rigorous qualitative theory), it is the case that ABMs are often compared with real data only once they are built, either for validation or calibration. As far as I know, using machine learning to fit agent behavior (as I do here) is still uncommon. Ultimately, I think computational social science will need to combine both approaches. Going forward, I’m hoping to extend the type of work I’ve shown here, using data science techniques to understand agent-level behavior and combining it with qualitative theory to situate that behavior within a larger interactive system.
David was responding to pieces by Duncan Watts and Sean J. Taylor that looked at CSS from the perspective of knowledge discovery and automated content extraction. In contrast, David and I go to a program that is more focused on causal mechanisms and models that use qualitative theory as their lodestar. David (rightly) argues that CSS-ers shouldn’t have to choose — the laboratory quality of agent-based models can be combined with data science techniques to make more realistic and useful models. This is an approach already taken by the “cultural algorithms” method.
Elsewhere, fellow GMU’er Russell Thomas has been debating Cliodynamics theorist Peter Turchin about agent-based modeling of human social evolutionary change. Russell’s argument centers around the need for robustness checks, counterfactuals, and sensitivity analysis concerning models:
Validation and verification are also crucial for simulations since they are situated in a broader ontological and epistemological context. The two diagrams below show some of this context. The first diagram comes from a conference paper called “On the meaning of data” and it focuses only on the bare bones of empirical research, which has some similarity to simulation-based research. It’s simplistic, of course, but it gets across the main point: many factors besides the “model” and the “data” are involved in shaping the final results, especially the crucial role of framing and interpretation. … To say that a simulated model accurately predicts the explanandum, as Dr. Turchin has done, only covers the three boxes and relations on the far left [referring to a diagram] — (from bottom to top) “Simulation Model”, “Simulation Model Data/Results”, and “System Data/Results”. It leaves out all the other elements and relations, which you can see are highly relevant to validation and verification. The paper by Sargent goes into these issues in detail.
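To make the sensitivity-analysis point concrete, here is a minimal sketch in Python. It is not Turchin's model or anything Thomas proposes: `run_toy_model` is a hypothetical Granovetter-style threshold model invented for illustration, and every parameter name is an assumption. The idea is only that a model's headline result should be reported across a sweep of parameter values and random seeds rather than from a single run.

```python
import random
import statistics


def run_toy_model(threshold_mean, seed, n_agents=200, threshold_sd=0.1):
    """Hypothetical threshold model: an agent adopts once the share of
    adopters reaches its personal threshold. Returns the final share."""
    rng = random.Random(seed)
    thresholds = [rng.gauss(threshold_mean, threshold_sd) for _ in range(n_agents)]
    # Agents with a non-positive threshold adopt unconditionally.
    adopted = [t <= 0 for t in thresholds]
    while True:
        share = sum(adopted) / n_agents
        updated = [a or t <= share for a, t in zip(adopted, thresholds)]
        if updated == adopted:  # fixed point reached, nothing changes anymore
            return share
        adopted = updated


def sweep(threshold_means, seeds):
    """Run the model for every (parameter, seed) pair and summarize the
    outcomes per parameter value as (mean, min, max)."""
    results = {}
    for mean in threshold_means:
        outcomes = [run_toy_model(mean, s) for s in seeds]
        results[mean] = (statistics.mean(outcomes), min(outcomes), max(outcomes))
    return results


if __name__ == "__main__":
    for mean, (avg, lo, hi) in sweep([0.1, 0.2, 0.3, 0.4], range(20)).items():
        print(f"threshold_mean={mean:.1f}  mean={avg:.2f}  min={lo:.2f}  max={hi:.2f}")
```

If the minimum and maximum diverge sharply at some parameter value, a single run that happens to match the data says little about whether the model explains it.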
What do both have in common? Masad and Thomas are both grappling with several dimensions of the “curse of computing.” In the linked post, Artem Kaznatcheev looks at the problem of computer simulations, using automated theorem-proving in mathematics as an example:
For me, the issue is not general surveyability, but internalization. No mathematician fully understands the computational part of the proof, at least no more than a pointy-haired boss understands the task his engineers completed. Although some AI enthusiasts might argue that the computer understands its part of the proof, most mathematicians (or people in general) would be reluctant to admit computers as full participants in the “social, informal, intuitive, organic, human process” (De Millo, Lipton, & Perlis, 1979; pg. 269) of mathematics. For De Millo, Lipton, & Perlis (1979), PYTHIAGORA’s verification or the computer assisted part of a proof is simply meaningless; it does not contribute to the community’s understanding of mathematics. This is easiest to see in the odd Goldbach conjecture: what understanding do we gain from the odd numbers that Helfgott’s computer program checked? It teaches us no new mathematics, no new methods, and brings no extra understanding beyond a verification.
In an alternative world without computer proofs, this verification would be absent. On the one hand, this means that alternative Helfgott would only tighten but not resolve the conjecture. On the other hand, the problem would remain open and continue to keep researchers motivated to find completely analytic ways to resolve it. Of course, even in our real world, a few mathematicians will continue looking for a non-computer assisted proof of the weak Goldbach conjecture, just as they continue to do with the four color theorem. However, the social motivation will be lower and progress slower. This is the curse of computing: giving up understanding for an easy verification.
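A small illustration of what "verification without understanding" looks like in practice. The sketch below is not Helfgott's program; it is a brute-force check, written for this post under the assumption that a toy range is enough to make the point, that every odd number from 7 up to a limit can be written as a sum of three primes. The function names are hypothetical.

```python
def primes_up_to(n):
    """Sieve of Eratosthenes: return all primes <= n in increasing order."""
    if n < 2:
        return []
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            for multiple in range(p * p, n + 1, p):
                sieve[multiple] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]


def three_prime_witness(n, primes, prime_set):
    """Return primes (a, b, c) with a <= b <= c and a + b + c == n, or None."""
    for i, a in enumerate(primes):
        if 3 * a > n:  # a is the smallest term, so it cannot exceed n / 3
            break
        for b in primes[i:]:
            c = n - a - b
            if c < b:  # ordering a <= b <= c can no longer be satisfied
                break
            if c in prime_set:
                return (a, b, c)
    return None


def weak_goldbach_failures(limit):
    """List the odd numbers in [7, limit] with no three-prime decomposition."""
    primes = primes_up_to(limit)
    prime_set = set(primes)
    return [n for n in range(7, limit + 1, 2)
            if three_prime_witness(n, primes, prime_set) is None]


if __name__ == "__main__":
    print(weak_goldbach_failures(10_000))  # an empty list means no counterexample found
```

The script can report that it found no counterexample in the range, and that is all it reports. Nothing in its output suggests why the conjecture should hold, which is exactly the trade the quoted passage describes.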
This is part of a theme that the EGT blog crew has explored in the past — the fact that computers (data science or simulations based on qualitative theory) can help us verify without understanding. This is particularly pernicious when we are dealing with systems with many moving parts, systems that are difficult to understand or derive causality from. Kaznatcheev argues that we should foreground constructive analytical representations first before we begin putting them into computers, attempting to first gain a purchase over the object we are attempting to theorize about.
Now that I’ve completed this mini lit review, I’ll give you my take on this difficult problem.
My own view is that, for a discipline that uses “computational” in its title, we seem to be very uninterested in the idea of computation itself and what it means for our research. And I say this both from the perspective of the formal ideas of computation as well as how we use computer programs and technology for our models. Computers are, to us, an instrument that helps us do our research — whether we are discovering patterns about social life with the data-centric social science Watts and Taylor talk about or the modeling that Masad and Thomas engage in. I’m going to focus more on the latter, since it is something I know more about than data science per se.
The Santa Fe-inspired school of CSS uses computer code and programs as a representational language to encompass models of social process. Object-oriented programming, for example, is used because it is thought to be isomorphic with Herbert Simon’s idea of hierarchical complexity. Simon wrote of a “sciences of the artificial” rooted in humanity’s tendency to produce synthetic objects with both inner and outer environments that mimic the adaptation and design of organic life forms. In a classic essay towards the end of his book of the same name, Simon described a structure common to many real-world systems: a nested and ranked ordering of interacting objects that could be treated as near black boxes. These objects, Simon argued, interacted together to become more than the sum of their parts. Modeling in general is about producing a simplified “map” of some real-world referent system, not the system itself. Few modelers believe that their models *are* the territory. Hence CSS as Masad, Thomas, and I know it is about building hierarchically complex systems composed of these near black boxes as computer programs and using the programs as a representational language.
The problem, though, is that it is difficult to understand the distorting effect of the representational language. Phil Arena once tweeted, in response to a story about plants that supposedly perform mathematical calculations, that while math is a useful way of representing things it’s bonkers to say that plants are literally doing math. Unfortunately this isn’t really a new problem. The history of Simon’s “sciences of the artificial” is one long and sometimes creepy story of humans imputing intentionality and anthropomorphic qualities to non-human entities, and of humans imputing mechanistic and computational qualities from non-human entities or symbolic systems to humans. In particular, we’ve always been fascinated with automata, from antiquarian curiosities to modern science fiction’s HAL and WOPR.
A large part of computational social science revolves around artificial agents that we instrument for the purpose of science. The idea of “generative social science” is about constructing societies of computational agents that simulate some real-world thing of interest. In essence, we’ve taken the 18th-century chess-playing automata and their cousins and slaved them to act out our ideas in the hope we’ll learn something about real human beings and the social aggregates they create. Simon’s extended analogy between naturally produced objects and synthetic ones in terms of things governed by “inner” and “outer” environments is fun but also problematic.
There is a reasonable question embedded here about what this really tells us about the real world, particularly since the goal of computation itself (and artificial intelligence in particular) has always been to migrate the aspects of cognition least representative of human behavior and cognition to machines. As J.C.R. Licklider wrote in the early 60s, computers are meant to tackle what is most difficult and frustrating for us so we can free ourselves up for creative thought and problem formulation. And computers struggle to capture the aspect of human cognition that we barely think about, the “frame problem”:
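As a rough sketch of why object-oriented code is thought to map onto this picture, consider the Python below. It is an illustration only, not code from any of the models discussed here: the class names, the update rule, and the "household" and "society" labels are all assumptions. Each object hides an inner state and exposes a single `step` method to its outer environment, and a composite object is built from parts that expose the same interface.

```python
class Component:
    """A near black box: private inner state, one narrow interface (step)."""

    def __init__(self, name, inner_state=0.0):
        self.name = name
        self._inner_state = inner_state  # not meant to be read from outside

    def step(self, outer_signal):
        # Illustrative adaptation rule: move halfway toward the outer signal.
        self._inner_state += 0.5 * (outer_signal - self._inner_state)
        return self._inner_state


class Composite(Component):
    """A higher level of the hierarchy, itself usable as a black box."""

    def __init__(self, name, parts):
        super().__init__(name)
        self.parts = list(parts)
        if not self.parts:
            raise ValueError("a composite needs at least one part")

    def step(self, outer_signal):
        # The composite's output is an aggregate of its parts' outputs;
        # callers never need to know how many levels sit underneath.
        outputs = [part.step(outer_signal) for part in self.parts]
        self._inner_state = sum(outputs) / len(outputs)
        return self._inner_state


if __name__ == "__main__":
    household = Composite("household", [Component("a", 0.0), Component("b", 1.0)])
    society = Composite("society", [household, Component("c", 2.0)])
    print(society.step(1.0))
```

The convenience is also the risk: the nesting and the averaging rule are choices of the programming language and the programmer, and nothing in the code says whether the referent system is organized that way.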
To most AI researchers, the frame problem is the challenge of representing the effects of action in logic without having to represent explicitly a large number of intuitively obvious non-effects. But to many philosophers, the AI researchers’ frame problem is suggestive of wider epistemological issues. Is it possible, in principle, to limit the scope of the reasoning required to derive the consequences of an action? And, more generally, how do we account for our apparent ability to make decisions on the basis only of what is relevant to an ongoing situation without having explicitly to consider all that is not relevant?
There are three responses to this, all of which have pros and cons.
First, we can double down and argue that programs and code are a suitable language to model (in a stylized manner) what individuals, institutions, and societies engage in every day. The agents in our petri dish are enough like the real thing that we can use them. This is a persuasive argument, but the problem is that we’re still limited in our theory development to what we can represent with machines. We have to account for the frame problem and the curse of computing. Granted, that isn’t exactly a bad thing: the cognitive science community has gotten along fine with simulation engines like SOAR and ACT-R that represent cognition in a way that fits with the demands of computer programming and computation. But we have to always keep this in the back of our heads.
The second perspective, which I’ve toyed with, is to accept that simulated agents, no matter how cognitively realistic or data-primed we make them, are never going to tell us more than what we can do with computer algorithms, and that our agents are simply more sophisticated versions of the 18th- and 19th-century automata crowd attractions. This would put a premium on the idea that the theoretical elements of computation itself, not necessarily what we can represent with the models, are the real prize. Like the Platonist view of mathematics as something that exists independently of human agreement, we could say that computation itself is a neutral and objective language to deductively examine formal properties of society. This is something that Artem Kaznatcheev has done quite a bit with his idea of evolution and scientific progress as machine learning. Computation, like formal proofs in game theory, can deduce qualities about society that stand on the basis of mathematical logic. To quote Kaznatcheev again:
For over twenty-three hundred years, at least since the publication of Euclid’s Elements, the conjecture and proof of new theorems has been the sine qua non of mathematics. The method of proof is at “the heart of mathematics, the royal road to creating analytical tools and catalyzing growth” (Rav, 1999; pg 6). Proofs are not mere justifications for theorems; they are the foundations and vessels of mathematical knowledge. Contrary to popular conception, proofs are used for more than supporting new results. Proofs are the results, they are the carriers of mathematical methods, technical tricks, and cross-disciplinary connections.
Of course, at its most basic level, a proof convinces us of the validity of a given theorem. The dramatic decisiveness of proofs with respect to theorems is one of the key characteristics that set mathematics apart from other disciplines. A mathematical proof is unique in its ability to reveal invalid conclusions as faulty even to the author of that conclusion. Contrast this with Max Planck’s conception of progress in science:
A new scientific truth does not triumph by convincing its opponents and making them see the light, but rather because its opponents eventually die, and a new generation grows up that is familiar with it.
Further, unlike in science, a mathematical conclusion is shown to be faulty by the proofs and derivations of other mathematicians, not by external observations.
I’m personally sympathetic to this idea, with one important caveat. Social aggregates are emergent, probabilistic, and interactive. Mathematics as a language has some important limitations in the way it can represent those qualities, particularly in the difficulty of creating scientific tools that can be used as environments for tinkering and creative thought. Science is often cast as a process of either hypothesis-testing or deduction from first principles. Science in practice, however, is often messy, creative, and improvisational, and computing, though sometimes a “curse,” has the potential to serve as an aid to science. Finally, how research is represented is also important. Models and theorists are connectors and communicators, and a model that produces an appealing and interactive narrative can serve as a means of connection. This is something that the data science community understands rather intuitively, even if it can unfortunately produce “the one map that explains everything about ___” hackery. Hence no matter what we do, computers and simulation ought to be the instrument of our science.
The third response is to simply say “so what?” and argue that all of the philosophical concerns I’ve just laid out are beside the point. Can the model predict? Does it fit with real-world data? Is it robust? And so on. I would admit that it’s hard to argue against this. At the end of the day, that is what journal reviewers and grant-givers care about. But, perhaps due to my pre-CSS background in mostly qualitative research and theory, I think that the logic of representation and the philosophical assumptions we make about the language of science can’t easily be dismissed. Ultimately, though we are in the business of accounting for variation, issues of understanding and explanation sit at the core of what we do. If models are maps, accuracy isn’t necessarily the sole criterion: a poorly designed map will also mislead those who use it.
I agree with Masad that models must engage data more, with Kaznatcheev that formal properties and analytical correctness are crucial, and with Thomas that a model does not speak for itself. But I’d also submit that how we deal with the subject of computation, from both a technological perspective (the “curse of computing”) and a theoretical one (how we engage with the formal logic of computation), is the defining question of the discipline.