“The Economics of Scale-Up” by Jonathan Guryan
Most randomized controlled trials (RCTs) of social programs test interventions at modest scale. While the hope is that promising programs will be scaled up, we have few successful examples of this scale-up process in practice. Ideally we would like to know which programs will work at large scale before we invest the resources to take them to scale. But it would seem that the only way to tell whether a program works at scale is to test it at scale. Our goal in this paper is to propose a way out of this Catch-22. We first develop a simple model that helps clarify the type of scale-up challenge for which our method is most relevant. Most social programs rely on labor as a key input (teachers, nurses, social workers, etc.). We know people vary greatly in their skill at these jobs. So social programs, like firms, confront a search problem in the labor market that can lead to inelastically supplied human capital. The result is that as programs scale, either average costs must increase if program quality is to be held constant, or else program quality will decline if average costs are held fixed. Our proposed method for reducing the costs of estimating program impacts at large scale combines the fact that hiring inherently involves ranking inputs with the most powerful element of the social science toolkit: randomization. We show that it is possible to operate a program at modest scale n but learn about the input supply curves facing the firm at much larger scale (S × n) by randomly sampling the inputs the provider would have hired if they operated at scale (S × n). We build a simple two-period model of social-program decision making and use a model of Bayesian learning to develop heuristics for when scale-up experiments of the sort we propose are likely to be particularly valuable.
We also present a series of results to illustrate the method, including one application to a real-world tutoring program that highlights an interesting observation: The noisier the program provider’s prediction of input quality, the less pronounced is the scale-up problem.
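
The sampling idea and the observation about noisy predictions can be illustrated with a small simulation. The sketch below is not the paper's implementation: it assumes true worker quality is standard normal, that the provider's prediction equals true quality plus independent normal noise, and that the provider hires strictly in order of predicted quality. The function names (`hires_at_scale`, `scale_up_sample`, `quality_gap`) and all parameter values are hypothetical and chosen only for illustration.

```python
import random


def hires_at_scale(predicted, k):
    """Indices of the k applicants the provider would hire, best predicted first."""
    order = sorted(range(len(predicted)), key=lambda i: predicted[i], reverse=True)
    return order[:k]


def scale_up_sample(predicted, n, scale, rng):
    """Randomly pick n workers from those who would be hired at scale * n.

    The program still runs with only n workers, but those workers span the
    full ranking that a program of size scale * n would have to draw on.
    """
    pool = hires_at_scale(predicted, scale * n)
    return rng.sample(pool, n)


def mean(values):
    return sum(values) / len(values)


def quality_gap(noise_sd, n=200, scale=10, applicants=20000, seed=0):
    """Drop in mean true quality when hiring scale * n workers instead of n.

    Assumption: prediction = true quality + normal noise with sd noise_sd.
    """
    rng = random.Random(seed)
    true_q = [rng.gauss(0, 1) for _ in range(applicants)]
    predicted = [q + rng.gauss(0, noise_sd) for q in true_q]
    small = hires_at_scale(predicted, n)
    large = hires_at_scale(predicted, scale * n)
    return mean([true_q[i] for i in small]) - mean([true_q[i] for i in large])


if __name__ == "__main__":
    rng = random.Random(1)
    predicted = [rng.gauss(0, 1) for _ in range(1000)]
    chosen = scale_up_sample(predicted, n=20, scale=5, rng=rng)
    print("sampled workers:", sorted(chosen))
    for sd in (0.0, 1.0, 3.0):
        print("noise sd", sd, "quality gap", round(quality_gap(sd), 3))
```

Under these assumptions, the quality gap between the small-scale and large-scale workforce shrinks as the prediction noise grows: when the provider cannot tell strong applicants from weak ones, the first n hires are not much better than the next (S − 1) × n, so there is less quality to lose by scaling.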