Railroads of the Raj: Estimating the Impact of Transportation Infrastructure


American Economic Review 2018, 108(4-5): 899–934
https://doi.org/10.1257/aer.20101199

By Dave Donaldson*

How large are the benefits of transportation infrastructure projects, and what explains these benefits? This paper uses archival data from colonial India to investigate the impact of India’s vast railroad network. Guided by four results from a general equilibrium trade model, I find that railroads: (1) decreased trade costs and interregional price gaps; (2) increased interregional and international trade; (3) increased real income levels; and (4) that a sufficient statistic for the effect of railroads on welfare in the model accounts well for the observed reduced-form impact of railroads on real income in the data. (JEL H54, L92, N75, O22, R12, R42)

In 2007, almost 20 percent of World Bank lending was allocated to transportation infrastructure projects, a larger share than that of education, health, and social services combined (World Bank 2007). These projects aim to reduce the costs of trading. In prominent models of international and interregional trade, reductions in trade costs will increase the level of real income in trading regions. Unfortunately, despite an emphasis on reducing trade costs in both economic theory and contemporary aid efforts, we lack a rigorous empirical understanding of the extent to which transportation infrastructure projects actually reduce the costs of trading, and how the resulting trade cost reductions affect welfare.

In this paper I exploit one of history’s great transportation infrastructure projects, the vast network of railroads built in colonial India (India, Pakistan, and Bangladesh; henceforth, simply “India”), to make three contributions to our understanding of transportation infrastructure improvements.
In doing so I draw on a comprehensive new dataset on the colonial Indian economy that I have constructed. First, I estimate the extent to which railroads improved India’s trading environment (i.e., reduced trade costs, reduced interregional price gaps, and increased trade flows). Second, I estimate the reduced-form welfare gains (higher real income levels) that the railroads brought about. Finally, I assess, in the context of a general equilibrium trade model, how much of these reduced-form welfare gains could be plausibly interpreted as newly exploited gains from trade.

* MIT Department of Economics, 50 Memorial Drive, Cambridge, MA 02142, and NBER email: [email protected] (mit.edu). This paper was accepted to the AER under the guidance of Penny Goldberg, Coeditor. I am extremely grateful to Timothy Besley, Robin Burgess, and Stephen Redding for their encouragement and support throughout this project. Richard Blundell, Chang-Tai Hsieh, Costas Arkolakis, and Samuel Kortum provided detailed and valuable advice, and seminar audiences at Berkeley, BU, Brown, the CEPR (Development Economics Conference), Chicago, CIFAR, the Econometric Society European Winter Meetings, Harvard, the Harvard-Hitotsubashi-Warwick Economic History Conference, IMF, LSE, MIT, Minneapolis Federal Reserve (Applied Micro Workshop), NBER (Summer Institute ITI), Northwestern, Nottingham, NYU, Oxford, Penn, Penn State, Philadelphia Federal Reserve, Princeton, Stanford, Toronto, Toulouse, UCL, UCLA, Warwick, Wharton, World Bank, and Yale, as well as two anonymous referees, made thoughtful comments that improved this work. I am also grateful to Erasmus Ermgassen, Rashmi Harimohan, Erin Hunt, Sritha Reddy, and Oliver Salzmann for research assistance; and to the Bagri Fellowship, the British Academy, CIFAR, the DFID Program on Improving Institutions for Pro-Poor Growth, the Economic and Social Research Council [RES-167-25-0214], the Nuffield Foundation, the Royal Economic Society, and STICERD for financial support. The author declares that he has no relevant or material financial interests that relate to the research described in this paper.
† Go to https://doi.org/10.1257/aer.20101199 to visit the article page for additional materials and author disclosure statement.

The railroad network designed and built by the British government in India (then known to many as “the Raj”) brought dramatic change to the technology of trading on the subcontinent. Prior to the railroad age, bullocks carried most of India’s commodity trade on their backs, traveling no more than 30 km per day along India’s sparse network of dirt roads (Deloche 1994). By contrast, railroads could transport these same commodities 600 km in a day, and at much lower per unit distance freight rates. As the 67,247 km long railroad network expanded from 1853 to 1930, it penetrated inland districts (local administrative regions), bringing them out of near-autarky and connecting them with the rest of India and the world. I use the arrival of the railroad network in each district to investigate the economic impact of this striking improvement in transportation infrastructure.

This setting is unique because the British government collected detailed records of economic activity throughout India in this time period. Remarkably, however, these records have never been systematically digitized and organized by researchers.
I use these records to construct a new, district-level dataset on prices, output, daily rainfall, and interregional and international trade in India, as well as a digital map of India’s railroad network in which each 20 km segment is coded with its year of opening. This dataset allows me to track the evolution of India’s district economies before, during, and after the expansion of the railroad network. The availability of records on interregional trade is particularly unique and important here. Information on trade flows within a country is rarely available to researchers, yet the response of these trade flows to a transportation infrastructure improvement says a great deal about the potential for gains from trade (as I describe explicitly below).

To guide my empirical analysis I develop a Ricardian trade model with many regions and many commodities, in which trade occurs at a cost. Because of geographical heterogeneity, regions have differing productivity levels across commodities, which creates incentives to trade in order to exploit comparative advantage. A new railroad link between two districts lowers their bilateral trade cost, allowing consumers to buy goods from the cheapest district, and producers to sell more of what they are best at producing. There are thousands of interacting product and factor markets in the model. But the analysis of this complex general equilibrium problem is tractable if production heterogeneity takes a convenient but plausible functional form, as shown by Eaton and Kortum (2002).

I use this model to assess empirically the importance of one particular mechanism linking railroads to welfare improvements: that railroads reduced trade costs and thereby allowed regions to gain from trade. Four results in the model drive a natural four-step empirical analysis, as follows.

Step 1: Inter-district price differences are equal to trade costs (in special cases).
That is, if a commodity can be made in only one district (the “origin”) but is consumed in other districts (“destinations”), then that commodity’s origin-destination price difference is equal to its origin-destination trade cost. Empirically, I use this result to measure trade costs (which, like all researchers, I cannot observe directly) by exploiting widely traded commodities that could only be made in one district. Using inter-district price differentials, along with a graph theory algorithm embedded in a nonlinear least squares (NLS) routine, I estimate the trade cost parameters governing traders’ endogenous route decisions on a network of roads, rivers, coasts, and railroads. This is a novel method for inferring trade costs in networked settings. My resulting parameter estimates reveal that railroads significantly reduced the cost of trading in India.

Step 2: Bilateral trade flows take the “gravity equation” form. That is, holding constant exporter- and importer-specific effects, bilateral trade costs reduce bilateral trade flows. Empirically, I use the estimates from a gravity equation, in conjunction with the trade cost parameters estimated in Step 1, to identify all of the relevant unknown parameters of the model.

Step 3: Railroads increase real income levels. That is, when a district is connected to the railroad network, its real income rises. Empirically, I find that railroad access raises real income by 16 percent. This reduced-form estimate could arise through a number of economic mechanisms. A key goal of Step 4 is to assess how much of the reduced-form impact of railroads on real income can be attributed to gains from trade due to the trade cost reductions found in Step 1.

Step 4: There exists a sufficient statistic for the welfare gains from railroads.
That is, despite the complexity of the model’s general equilibrium relationships, the impact of the railroad network on welfare in a district is captured by its impact on one endogenous variable: the share of that district’s expenditure that it sources from itself. A result similar to this appears in a wide range of trade models but has not, to my knowledge, been explored empirically before.¹ Empirically, I regress real income on this sufficient statistic (as calculated using the model’s parameter estimates obtained in Steps 1 and 2) alongside the regressors from Step 3 (which capture the reduced-form impact of railroads). When I do this, the estimated reduced-form coefficients on railroad access (from Step 3) fall by more than one-half and the sufficient statistic variable is itself highly predictive. This finding provides support for Result 4 of the model and implies that decreased trade costs account for about one-half of the real income impacts of the Indian railroad network.

These four results demonstrate that India’s railroad network improved the trading environment (Steps 1 and 2) and generated welfare gains (Step 3), and suggest that these welfare gains arose in large part because railroads allowed regions to exploit gains from trade (Step 4).

¹ Arkolakis, Costinot, and Rodríguez-Clare (2012) show that this prediction applies to the Krugman (1980), Eaton and Kortum (2002), and Chaney (2008) models of trade, but these authors do not test this prediction empirically.
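
As a rough illustration of the sufficient-statistic idea in Step 4, the sketch below uses the single-sector form associated with the models covered by Arkolakis, Costinot, and Rodríguez-Clare (2012), in which the proportional change in welfare equals the change in the own-expenditure share raised to the power −1/θ. This is an assumption made for illustration: the multi-commodity statistic estimated in this paper may take a different form, the function name `welfare_change` is hypothetical, and the numbers are invented rather than taken from the paper.

```python
import math


def welfare_change(pi_own_before: float, pi_own_after: float, theta: float) -> float:
    """Ratio of welfare after to welfare before, given own-expenditure shares.

    Assumed single-sector form (illustrative, not the paper's estimator):
        W_after / W_before = (pi_after / pi_before) ** (-1 / theta)
    where pi is the share of a region's expenditure sourced from itself and
    theta > 0 governs how strongly trade responds to trade costs.
    """
    if not (0.0 < pi_own_before <= 1.0 and 0.0 < pi_own_after <= 1.0):
        raise ValueError("own-expenditure shares must lie in (0, 1]")
    if theta <= 0.0:
        raise ValueError("theta must be positive")
    return (pi_own_after / pi_own_before) ** (-1.0 / theta)


# Hypothetical numbers for illustration only (not estimates from the paper):
# a district that sourced 90 percent of its expenditure from itself before
# rail access and 60 percent afterwards, with theta = 5.
gain = welfare_change(pi_own_before=0.90, pi_own_after=0.60, theta=5.0)
print(f"Implied welfare ratio: {gain:.4f}")
```

Under this assumed form, a fall in the own-expenditure share (a district buying more from elsewhere) implies a welfare ratio above one, and the implied gain is larger when θ is smaller.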

A natural concern when estimating the impact of infrastructure projects is that of bias due to a potential correlation between project placement and unobserved changes in the local economic environment. These concerns are likely to be less important in my setting (as described in Section II) because military motives for railroad placement usually trumped economic arguments, the networked nature of railroad technology inhibited the ability of planners to target specific locations precisely, and planning documents reveal just how hard it was for technocrats to agree on the efficacy of railroad plans. Nevertheless, to mitigate concerns of selection bias, I estimate the “effects” of over 40,000 km of railroad lines that reached advanced stages of costly surveying but, for three separate reasons that I document in Section VI, were never actually built. Reassuringly, these “placebo” lines never display spurious effects.

This paper contributes to a growing literature on estimating the economic effects of large infrastructure projects,² as well as to a literature on estimating the “social savings” of railroad projects.³ A distinguishing feature of my approach is that, in addition to estimating reduced-form relationships between infrastructure and welfare, as in the existing literature, I fully specify and estimate a general equilibrium model of how railroads affect welfare.⁴ The model makes auxiliary predictions and suggests a sufficient statistic for the role played by railroads in raising welfare, all of which shed light on the economic mechanisms that could explain my reduced-form estimates. Using a model also improves the external validity of my estimates because the primitive in my model (the cost of trading) is specified explicitly and is portable to a range of settings (such as tariff liberalization or road construction) in which the welfare benefits of trade-cost-reducing policies might be sought.
By contrast, my reduced-form estimates are more likely to be specific to the context of railroads in colonial India.

This paper also contributes to a rich literature concerned with estimating the welfare effects of openness to trade, because the reduction in trade costs brought about by India’s railroad network rapidly increased each district’s opportunities to trade.⁵ Again, the fact that my empirical approach connects explicitly to an estimable, general equilibrium model of trade offers advantages over the existing literature. The model suggests a theoretically consistent way to measure “openness,” sheds light on why trade openness raises welfare, and provides a natural way to study changes in openness to both internal and external trade at the same time.

² For example, Dinkelman (2011) estimates the effect of electrification on labor force participation in South Africa; Duflo and Pande (2007) estimate the effect of dam construction in India on agriculture; Jensen (2007) evaluates how the construction of cellular phone towers in South India improved efficiency in fish markets; and Michaels (2008) estimates the effect of the US interstate highway system on the skilled wage premium. An earlier literature, beginning with Aschauer (1989), pioneered the use of econometric methods in estimating the benefits of infrastructure projects.
³ Fogel (1964) first applied the social savings methodology to railroads in the United States, and Hurd (1983) performed a similar exercise for India. In Section VE, I compare my estimates to those from using a social savings approach.
⁴ The use of general equilibrium modeling, on its own, to evaluate transportation projects here is not novel. For example, both Williamson (1974) and Herrendorf, Schmitz, and Teixeira (2012) use calibrated general equilibrium models to study the impact of railroads on the antebellum US economy.
⁵ Frankel and Romer (1999), Alcalá and Ciccone (2004), Feyrer (2009), and others use cross-country regressions of real GDP levels on “openness” (defined in various ways) to estimate the effect of openness on welfare. Pavcnik (2002), Trefler (2004), and Topalova (2010), among others, instead analyze trade liberalizations within one country by exploiting cross-sectional variation in the extent of liberalization across either industries or regions.

The next section describes the historical setting in which the Indian railroad network was constructed and the new data that I have collected from that setting. In Section II, I outline a model of trade in colonial India and the model’s four results. Sections III through VI present a four-step empirical analysis that follows these four theoretical results. Section VII concludes.

I. Historical Background and Data

In this section I discuss some essential features of the colonial Indian economy and the data that I have collected in order to analyze how this economy changed with the advent of railroad transport. I go on to describe the transportation system in India before and after the railroad era, and the institutional details that determined when and where railroads were built.

A. New Data on the Indian Economy, 1870–1930

In order to evaluate the impact of the railroad network on economic welfare in colonial India, I have constructed a new panel dataset on 235 Indian districts. The dataset tracks these districts annually from 1870 to 1930, a period during which 98 percent of British India’s current railroad lines were opened. Table 1 contains descriptive statistics for the variables that I use in this paper and describe throughout this section. Online Appendix A contains more detail on the construction of these variables.

During the colonial period, India’s economy was predominantly agricultural, with agriculture constituting an estimated 66 percent of GDP in 1900 (Heston 1983).⁶ For this reason, district-level output and area data were only collected systematically in the agricultural sector. Data on agricultural output were recorded for each of 17 principal crops (which accounted for the vast majority of the cropped area of India in 1900): bajra, barley, cotton, gram, indigo, jowar, jute, linseed, maize, opium, ragi, rice, sesamum, sugarcane, tea, tobacco, and wheat.
Retail prices for these 17 crops were also recorded at the district level. I use these price, quantity, and area figures to construct a measure of real agricultural income per acre that provides the best available measure of district-level economic welfare in this time period.

Real incomes were low during my sample period, but there was 35 percent growth between the beginning and end of the sample (approximately 1870 to 1930), according to my estimates.⁷ Real incomes were low because crop yields were low, both by contemporaneous international standards and by Indian standards today.⁸ One explanation for low yields that featured heavily in Indian agricultural textbooks of the day (such as Wallace 1892) was inadequate water supply. Only 12 percent of cultivated land was irrigated in 1885 and while this figure had risen to 19 percent in 1930, the vast majority of agriculture maintained its dependence on rainfall.⁹

⁶ Factory-based industry, which Atack, Haines, and Margo (2011) argue benefited from access to railroads in the United States, amounted to only 1.6 percent of India’s GDP in 1900.
⁷ For comparison, Heston (1983) estimates that in 1869, on the basis of purchasing power exchange rates, per capita income in the United States was four times that in India. This income disparity rises to ten if market exchange rates are used instead of purchasing power parity (PPP) rates.
⁸ For example, the yield of wheat in India’s “breadbasket,” the province of Punjab, was 748 lbs./acre in 1896. By contrast, for similar types of wheat, yields in Nevada (the highest state yields in the United States) in 1900 were almost twice as high (see plate 15 of United States Census Office 1902) and yields in (Indian) Punjab by 2010 were an order of magnitude greater than those in 1896 (https://data.gov.in/catalog/district-wise-season-wise-crop-production-statistics).

Table 1. Descriptive Statistics

Variable                                                               Number of       Beginning of      End of
                                                                       observations    available data    available data
Real agricultural income per acre (base year rupees)                   7,086           29.96 (16.35)     40.41 (50.56)
Price of salt, all sources (current rupees per maund)                  7,336           5.17 (1.49)       3.21 (0.54)
Crop-specific rainfall shock (meters)                                  120,462         1.29 (0.67)       0.75 (1.38)
Total agricultural exports per trade block (millions of 1870 rupees)   1,193           19.07 (45.68)     44.63 (58.76)

Notes: Values are sample means over all observations for the year and variable in question, with standard deviations in parentheses. Earliest beginning and latest end of available data are: 1870 and 1930 for agricultural output and real agricultural income; 1861 and 1930 for salt prices; 1870 and 1930 for rainfall; and 1882 and 1920 for trade data. Land area used to calculate income per acre is total cultivated area in first year of sample. A “maund” is equal to 37.3 kg and was the standardized unit of weight in colonial India. Total agricultural exports per trade block (aggregating across commodities and destination blocks) converted from quantities to values (in 1870 rupees) using all-India average prices in 1870. Data sources and construction described further in online Appendix A.

Because rainfall was important for agricultural production, 3,614 meteorological stations were built throughout the country to record the amount of rainfall at each station on every day of the year. Daily rainfall data were recorded and published because the distribution of rainfall throughout the year was far more important to farmers and traders than total annual or monthly amounts.
In particular, the intra-annual distribution of rainfall governed how different crops (which were grown in distinct stretches of the year) were affected by a given year’s rainfall. In Sections IV and VI, I use daily rainfall data collected from India’s meteorological stations to construct crop-specific measures of rainfall and use these as a source of exogenous variation in crop-specific productivity.

Commensurate with the increase in real agricultural income levels in India was a significant rise in interregional and international trade. The final component of the dataset that I have constructed on colonial India consists of data on these internal and external trades whenever they occurred via railroad, river, or sea (data on road trade were only very rarely collected). The role that these data play in my analysis is explained in Section IV.

B. Transportation in Colonial India

Prior to the railroad era, goods transport within India took place on roads, rivers, and coastal shipping routes.¹⁰ The bulk of inland travel was carried by bullocks, along the road network. On the best road surfaces and during optimal weather conditions, bullocks could pull a cart of goods and cover 20–30 km per day. However, high-quality roads were extremely sparse and the roads that did exist were virtually impassable in the monsoon season. For this reason most trade was carried by “pack” bullocks (which carried goods strapped to their backs and usually traveled directly over pasture land), which were considerably slower and riskier than cart bullocks.

Water transport was far superior to road transport, but it was only feasible on the Brahmaputra, Ganges, and Indus river systems.¹¹ In optimal conditions, downstream river traffic (with additional oar power)¹² could cover 65 km per day; upstream traffic needed to be towed from the banks and struggled to cover 15 km per day. Extensive river travel was impossible in the rainy monsoon months or the dry summer months and piracy was a serious hazard. Coastal shipping, however, was perennially available along India’s long coastline. This form of shipping was increasingly steam-powered after 1840. Steamships were fast and could cover over 100 km per day but could only service major ports (Naidu 1936).

⁹ These figures encompass a wide definition of irrigation, including the use of tanks, cisterns, and reservoirs as well as canals. See the Agricultural Statistics of India, described in online Appendix A. 1885 is the first year in which comprehensive irrigation statistics were collected.
¹⁰ The description of pre-rail transportation in this section draws heavily on the comprehensive treatments of Deloche (1994, 1995) and Derbyshire (1985).

Against this backdrop of costly and slow internal transportation, the appealing prospect of railroad transportation in India was discussed as early as 1832 (Sanyal 1930), though it was not until 1853 that the first track was actually laid. From the outset, railroad transport proved to be far superior to road, river, or coastal transport (Banerjee 1966). Trains were capable of traveling up to 600 km per day and they offered this superior speed on predictable timetables, throughout all months of the year, and without any serious threat of piracy or damage (Johnson 1963). Railroad freight rates were also considerably lower: 4–5, 2–4, and 1.5–3 times lower than those of road, river, and coastal transport, respectively.
A principal goal of Section III is to estimate how much railroad technology reduced total trade costs, costs which combine all of these attractions of railroads over other modes.

¹¹ Navigable canals either ran parallel to sections of these three rivers or were extremely localized in a small number of coastal deltas (Stone 1984).
¹² Steamboats had periods of success in the colonial era, but were severely limited in scope by India’s seasonal and shifting rivers.

C. Railroad Line Placement Decisions

Throughout the history of India’s railroads, all railroad line placement decisions were made by the Government of India. It is widely accepted that the Government had three motives for building railroads: military, commercial, and humanitarian, in that order of priority (Thorner 1950; Macpherson 1955; Headrick 1988). In 1853, Lord Dalhousie (head of the Government of India) wrote an internal document to the East India Company’s Court of Directors that made the case for a vast railroad network in India and military motives for railroad-building appeared on virtually every page of this document.¹³ These arguments gathered new momentum when the 1857 “mutiny” highlighted the importance of military communications (Headrick 1988). Dalhousie’s 1853 minutes described five “trunk lines” that would connect India’s five major provincial capitals along direct routes and maximize the “political advantages” of a railroad network.

Between 1853 and 1869, all of Dalhousie’s trunk lines were built, but not without significant debate over how best to connect the provincial capitals. Dalhousie and Major Kennedy, India’s Chief Engineer, spent over a decade discussing and surveying their competing, and very different, proposals for a pan-Indian network (Davidson 1868; Settar 1999). This debate indicates the vicissitudes of railroad planning in India and it was repeated many times by different actors in Indian railroad history. I have collected planning documents from a number of railroad expansion proposals that, along with Kennedy’s proposal, were debated and surveyed at length, but were never actually built. As discussed in Section VD, I use these plans in a “placebo” strategy to check that unbuilt lines display no spurious “impact” on the district economies in which they were nearly built.

As is clear from Figure 1, the railroad network in place in 1930 (by and large, the same network that is open today) had completely transformed the transportation system in India. Track open for traffic reached 67,247 km, constituting the fourth-largest network in the world. From their inception in 1853 to their zenith in 1930, railroads were the dominant form of public investment in British India. But influential observers were highly critical of this public investment priority: the Nationalist historian, Romesh Dutt, argued that they did little to promote agricultural development,¹⁴ and Mahatma Gandhi argued simply that “it is beyond dispute that [railroads] promote evil” (Gandhi 1938, p. 36). In the remainder of this paper, I use new data to assess quantitatively the effect of railroads on India’s trading environment and agricultural economy.

¹³ For example, from the introduction: “A single glance ... will suffice to show how immeasurable are the political advantages to be derived from the system of internal communication, which would admit of full intelligence of every event being transmitted to the Government ... and would enable the Government to bring the main bulk of its military strength to bear upon any given point in as many days as it would now require months, and to an extent which at present is physically impossible.” (House of Commons Papers, 1853).
¹⁴ For example, from his landmark textbook on Indian economic history: “Railways ... did not add to the produce of the land” (Dutt 1904, p. 174).

II. A Model of Railroads and Trade in Colonial India

In this section I develop a general equilibrium model of trade among many regions in the presence of trade costs. The model is based on Eaton and Kortum (2002), but with more than one commodity, and serves two purposes. First, it delivers four results concerning the response of observables to trade cost reductions. Second, I estimate the unknown parameters of the model and use the estimated model to assess whether the observed reduction in trade costs due to the railroads can account, via the mechanism stressed in this model, for the observed increase in welfare due to railroads. Both of these features inform our understanding of how transportation infrastructure projects can raise welfare.

A. Model Environment

The economy consists of D regions (indexed by either o or d depending on whether the region in question is the origin, o, or the destination, d, of a trade). There are K commodities (indexed by k), each available in a continuum (with mass normalized to 1) of horizontally differentiated varieties (indexed by j). In my empirical application I work with data on prices, output, and trade flows that refer to commodities, not individual varieties. While my empirical setting will consider 70 years of annual observations, for simplicity the model is static; I therefore suppress time subscripts until they are necessary.

Figure 1. The Evolution of India’s Railroad Network, 1860–1930
[Panel A. 1860; Panel B. 1870; Panel C. 1880; Panel D. 1890; Panel E. 1900; Panel F. 1910; Panel G. 1920; Panel H. 1930]
Notes: These figures display the decadal evolution of the railroad network (railroads depicted with thick lines) in colonial India (the outline of which is depicted with thin lines). The first railroad lines were laid in 1853. The figure is based on a GIS database in which each (approximately) 20 km long railroad segment is coded with a year of opening variable.
Source: Author’s calculations based on official publications. See online Appendix A for details.

Consumer Preferences. Each region o is home to a mass (normalized to 1) of identical agents, each of whom owns $L_o$ units of land. Land is geographically immobile and supplied inelastically. Agents have Cobb-Douglas preferences over
commodities ($k$) and constant elasticity of substitution (CES) preferences over varieties ($j$) within each commodity; that is, their utility function is

$$U_o = \sum_{k=1}^{K} \frac{\mu_k}{\varepsilon_k} \ln \int_0^1 \left( C_o^k(j) \right)^{\varepsilon_k} dj, \tag{1}$$

where $C_o^k(j)$ is consumption, $\varepsilon_k \doteq \frac{\sigma_k - 1}{\sigma_k}$ (where $\sigma_k$ is the constant elasticity of substitution), and $\sum_k \mu_k = 1$. Agents rent out their land at the rate of $r_o$ per unit and use their income $r_o L_o$ to maximize utility from consumption.

Production and Market Structure. Each variety $j$ of the commodity $k$ can be produced using a constant returns to scale production technology in which land is the only factor of production.$^{15}$ Importantly, land is homogeneous and can be allocated to the production of any variety of any commodity without adjustment costs, consistent with a long-run interpretation that informs the empirical analysis below. Let $z_o^k(j)$ denote the amount of variety $j$ of commodity $k$ that can be produced with one unit of land in region $o$. I follow Eaton and Kortum (2002) in modeling $z_o^k(j)$ as the realization of a stochastic variable $Z_o^k$ drawn from a Type-II extreme value distribution whose parameters vary across regions and commodities in the following manner:

$$F_o^k(z) \doteq \Pr(Z_o^k \le z) = \exp\left(-A_o^k z^{-\theta_k}\right), \tag{2}$$

where $A_o^k \ge 0$ and $\theta_k > 0$. These random variables are drawn independently for each variety, commodity, and region. The exogenous parameter $A_o^k$ increases the probability of high productivity draws and the exogenous parameter $\theta_k$ captures (inversely) how variable the (log) productivity of commodity $k$ in any region is around its (log) average.

There are many competitive firms in region $o$ with access to the technology above; consequently, firms make zero profits.$^{16}$ These firms therefore charge a pre-trade costs (i.e., “free on board”) price of $p_{oo}^k(j) = r_o / z_o^k(j)$, where $r_o$ is the land rental rate in region $o$.

Opportunities to Trade. Without opportunities to trade, consumers in region $d$ must consume even their region’s worst draws from the productivity distribution in equation (2). The ability to trade breaks this production-consumption link. This allows consumers to import varieties from other regions in order to take advantage

$^{15}$ This is clearly an extreme assumption, made here for parsimony (though all results would be unaffected if agricultural production were a Cobb-Douglas aggregator of land and other inputs as long as those inputs are immobile). However, if crops differ in their factor intensities (as in Heckscher-Ohlin models of trade), factor intensities are endogenous to factor prices, or factors are mobile, then while the four results in Section IIB would be unaffected, the procedure used to compute $\pi_{oot}^k$ in equation (18), based on the factor market-clearing equilibrium of the model, would need to be altered. I return to the discussion of labor mobility in Section VA.

$^{16}$ My empirical application is to the agricultural sector. This sector was characterized by millions of small-holding farmers who were likely to be price-taking producers of undifferentiated products (varieties $j$ in the model). For example, in the 1901 census in the province of Madras, workers in the agricultural sector (67.9 percent of the almost 20 million strong workforce) were separately enumerated by their ownership status, and 35.7 percent of these workers were owner-cultivators, or proprietors of extremely small-scale farms (Risley and Gait 1903).

of the favorable productivity draws available there, and allows producers to produce more of the varieties for which they received the best productivity draws. These two mechanisms constitute the gains from trade in this model.

However, there is a limit to trade because the movement of goods is subject to trade costs (which include transport costs and other barriers to trade). These trade costs take the convenient and commonly used “iceberg” form. That is, in order for one unit of commodity $k$ to arrive in region $d$, $T_{od}^k \ge 1$ units of the commodity must be produced and shipped in region $o$; trade is free when $T_{od}^k = 1$. (Throughout this paper I refer to trade flows between an origin region $o$ and a destination region $d$; all bilateral variables, such as $T_{od}^k$, refer to quantities from $o$ to $d$.) Trade costs are assumed to satisfy the property that it is always (weakly) cheaper to ship directly from region $o$ to region $d$, rather than via some third region $m$: that is, $T_{od}^k \le T_{om}^k T_{md}^k$. Finally, I normalize $T_{oo}^k = 1$. In my empirical setting I proxy for $T_{od}^k$ with measures calculated from the observed transportation network, which incorporates all possible modes of transport between region $o$ and region $d$. Railroads enter this transportation network gradually over time, reducing $T_{od}^k$ and creating more gains from trade.

Trade costs drive a wedge between the price of an identical variety in two different regions. Let $p_{od}^k(j)$ denote the price of variety $j$ of commodity $k$ produced in region $o$, but shipped to region $d$ for consumption there. The iceberg formulation of trade costs implies that, under perfect competition, any variety in region $d$ will cost $T_{od}^k$ times more than it does in region $o$; that is, $p_{od}^k(j) = T_{od}^k p_{oo}^k(j) = r_o T_{od}^k / z_o^k(j)$.

Equilibrium Prices and Allocations. Consumers have preferences for all varieties $j$ along the continuum of varieties of commodity $k$. But they are indifferent about where a given variety is made: they simply buy from the region that can provide the variety at the lowest cost (after accounting for trade costs). I therefore solve for the equilibrium prices that consumers in a region $d$ actually pay, given that they will only buy any particular given variety from the cheapest source region (including their own).

The price of a variety sent from region $o$ to region $d$, denoted by $p_{od}^k(j)$, is stochastic because it depends on the stochastic variable $z_o^k(j)$. Since $z_o^k(j)$ is drawn from the cumulative distribution function (CDF) in equation (2), $p_{od}^k(j)$ is the realization of a random variable $P_{od}^k$ drawn from the CDF

$$G_{od}^k(p) \doteq \Pr(P_{od}^k \le p) = 1 - \exp\left[-A_o^k (r_o T_{od}^k)^{-\theta_k} p^{\theta_k}\right]. \tag{3}$$

This is the price distribution for varieties (of commodity $k$) made in region $o$ that could potentially be bought in region $d$. The price distribution for the varieties that consumers in $d$ will actually consume (whose CDF is denoted by $G_d^k(p)$) is the distribution of prices that are the lowest among all $D$ regions of the world:

$$G_d^k(p) = 1 - \prod_{o=1}^{D} \left[1 - G_{od}^k(p)\right] = 1 - \exp\left[-\left(\sum_{o=1}^{D} A_o^k (r_o T_{od}^k)^{-\theta_k}\right) p^{\theta_k}\right].$$
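The lowest-price distribution above can be checked by simulation. The following is a minimal sketch, not the paper's code: it draws productivities from the distribution in equation (2) by inverting the CDF, prices each draw at $r_o T_{od} / z$, keeps the cheapest source, and compares the simulated share of prices below a threshold with the closed-form $G_d(p)$. The function names and the three-region parameter values are hypothetical, chosen only for illustration, and a single commodity is assumed.

```python
import math
import random


def frechet_draw(rng, A, theta):
    """Draw z with CDF F(z) = exp(-A * z**(-theta)) by inverting the CDF."""
    u = rng.random()
    while u == 0.0:  # guard against log(0)
        u = rng.random()
    return (A / -math.log(u)) ** (1.0 / theta)


def lowest_price_cdf(p, A, r, T_d, theta):
    """Closed form: G_d(p) = 1 - exp(-sum_o A_o (r_o T_od)^(-theta) * p^theta)."""
    phi = sum(a * (ro * t) ** (-theta) for a, ro, t in zip(A, r, T_d))
    return 1.0 - math.exp(-phi * p ** theta)


def simulate_lowest_prices(n, A, r, T_d, theta, seed=0):
    """For n varieties, return the cheapest delivered price across origins."""
    rng = random.Random(seed)
    prices = []
    for _ in range(n):
        offers = [ro * t / frechet_draw(rng, a, theta)
                  for a, ro, t in zip(A, r, T_d)]
        prices.append(min(offers))
    return prices


# Hypothetical three-region example; the destination is region 0, so T_dd = 1.
A = [1.0, 2.0, 0.5]        # productivity location parameters (assumed)
r = [1.0, 1.2, 0.8]        # land rental rates, taken as given here (assumed)
T_d = [1.0, 1.5, 2.0]      # iceberg trade costs into the destination (assumed)
theta = 4.0

prices = simulate_lowest_prices(20000, A, r, T_d, theta)
p0 = 0.8
empirical = sum(p <= p0 for p in prices) / len(prices)
theoretical = lowest_price_cdf(p0, A, r, T_d, theta)
print(f"simulated share below {p0}: {empirical:.3f}; closed form: {theoretical:.3f}")
```

Rental rates are held fixed in this sketch; in the model they are determined in equilibrium by trade balance.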

Given this distribution of the actual prices paid by consumers in region $d$, it is straightforward to calculate any moment of the prices of interest. The price moment that is relevant for my empirical analysis is the expected value of the equilibrium price of any variety $j$ of commodity $k$ found in region $d$, which is given by

$$E[p_d^k(j)] \doteq p_d^k = \lambda_1^k \left[\sum_{o=1}^{D} A_o^k (r_o T_{od}^k)^{-\theta_k}\right]^{-1/\theta_k}, \tag{4}$$

where $\lambda_1^k \doteq \Gamma\left(1 + \frac{1}{\theta_k}\right)$.$^{17}$ In my empirical application below I treat these expected prices as equal to the observed prices collected by statistical agencies.$^{18}$

Given the price distribution in equation (3), Eaton and Kortum (2002) derive two important properties of the trading equilibrium that carry over to the model here. First, the price distribution of the varieties that any given origin actually sends to destination $d$ (i.e., the distribution of prices for which this origin is region $d$’s cheapest supplier) is the same for all origin regions. This implies that the share of expenditure that consumers in region $d$ allocate to varieties from region $o$ must be equal to the probability that region $o$ supplies a variety to region $d$ (because the price per variety, conditional on the variety being supplied to $d$, does not depend on the origin). That is, $X_{od}^k / X_d^k = \pi_{od}^k$, where $X_{od}^k$ is total expenditure in region $d$ on commodities of type $k$ from region $o$, $X_d^k \doteq \sum_o X_{od}^k$ is total expenditure in region $d$ on commodities of type $k$, and $\pi_{od}^k$ is the probability that region $d$ sources any variety of commodity $k$ from region $o$. Second, this probability $\pi_{od}^k$ is given by

$$\pi_{od}^k = \frac{X_{od}^k}{X_d^k} = \lambda_3^k A_o^k (r_o T_{od}^k)^{-\theta_k} (p_d^k)^{\theta_k}, \tag{5}$$

where $\lambda_3^k = (\lambda_1^k)^{-\theta_k}$, and this equation makes use of the definition of the expected value of prices (i.e., $p_d^k$) from equation (4).

Equation (5) characterizes trade flows conditional on the endogenous land rental rate, $r_o$ (and all other regions’ land rental rates, which appear in $p_d^k$). It remains to solve for these land rents in equilibrium, by imposing the condition that each region’s trade is balanced. Region $o$’s trade balance equation requires that the total income received by land owners in region $o$ ($r_o L_o$) must equal the total value of all

$^{17}$ $\Gamma(\cdot)$ is the Gamma function defined by $\Gamma(z) = \int_0^{\infty} t^{z-1} e^{-t} \, dt$.

$^{18}$ A second price moment that is of interest for welfare analysis is the exact price index over all varieties of commodity $k$ for consumers in region $d$. Given CES preferences, this is $\tilde{p}_d^k \doteq \left[\int_0^1 (p_d^k(j))^{1-\sigma_k} dj\right]^{1/(1-\sigma_k)}$, which is only well defined here for $\sigma_k < 1 + \theta_k$ (a condition I assume throughout). The exact price index is given by $\tilde{p}_d^k = \lambda_2^k p_d^k$, where $\lambda_2^k \doteq \gamma^k / \lambda_1^k$ and $\gamma^k \doteq \left[\Gamma\left(\frac{\theta_k + 1 - \sigma_k}{\theta_k}\right)\right]^{1/(1-\sigma_k)}$. That is, if statistical agencies sampled varieties in proportion to their weights in the exact price index, as opposed to randomly as in the expected price formulation of equation (4), then this would not jeopardize my empirical procedure because the exact price index is proportional to expected prices.

commodities made in region $o$ and sent to every other region (including region $o$ itself).$^{19}$ That is,

$$r_o L_o = \sum_d \sum_k X_{od}^k = \sum_d \sum_k \pi_{od}^k \mu_k r_d L_d, \tag{6}$$

where the last equality uses the fact that (with Cobb-Douglas preferences) expenditure in region $d$ on commodity $k$ ($X_d^k$) will be a fixed share $\mu_k$ of the total income in region $d$ (i.e., of $r_d L_d$). Each of the $D$ regions has its own trade balance equation of this form. I take the rental rate in the first region ($r_1$) as the numéraire good, so the equilibrium of the model is the set of $D - 1$ unknown rental rates $r_d$ that solves this system of $D - 1$ (nonlinear) independent equations.

B. Four Results

In this section I state explicitly four important results that emerge from the model outlined above, in the order in which they drive my empirical analysis (i.e., Steps 1–4).

RESULT 1: Price differences measure trade costs (in special cases). In the presence of trade costs, the price of identical commodities will differ across regions. In general, the cost of trading a commodity between two regions places only an upper bound on their price differential. However, in the special case of a homogeneous commodity that can only be produced in one origin region, equation (4) predicts that the (log) price differential between the origin $o$ of this commodity and any other region $d$ will be equal to the (log) cost of trading the commodity between them. That is,

$$\ln p_d^o - \ln p_o^o = \ln T_{od}^o, \tag{7}$$

where the commodity label $k$ is replaced by $o$ to indicate that this equation is only true for commodities that can only be made in region $o$. This result is important for my empirical work below because it allows trade costs ($T_{od}^o$), which are never completely observed, to be inferred. But it is important to note that this result, essentially just the assumption of free arbitrage over space, net of trade costs, is not a testable prediction in the absence of direct data on $T_{od}^o$.

RESULT 2: Bilateral trade flows take the “gravity equation” form. Equation (5) describes bilateral trade flows explicitly, but I restate it here in logarithms for reference: (log) bilateral trade of any commodity $k$ from any region $o$ to any other region $d$ is given by

$$\ln X_{od}^k = \ln \lambda_3^k + \ln A_o^k - \theta_k \ln r_o - \theta_k \ln T_{od}^k + \theta_k \ln p_d^k + \ln X_d^k. \tag{8}$$

$^{19}$ The essential assumption here is that the trade balance is fixed and exogenous, not that it is fixed to zero. The assumption of fixed district-level trade balance is not innocuous but I am unaware of any direct evidence on this point.
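The gravity structure in equations (5) and (8) can be illustrated with a short numerical sketch. It is an illustration under stated assumptions rather than the paper's procedure: it treats one commodity, takes rental rates as given instead of solving the trade balance system in equation (6), and writes the trade share in its normalized form, $\pi_{od} = A_o (r_o T_{od})^{-\theta} / \sum_{o'} A_{o'} (r_{o'} T_{o'd})^{-\theta}$, which is equation (5) after substituting equation (4). The function name and all parameter values are hypothetical.

```python
import math


def trade_shares(A, r, T, theta):
    """Return pi[o][d], the share of d's spending on varieties from origin o."""
    D = len(A)
    weight = [[A[o] * (r[o] * T[o][d]) ** (-theta) for d in range(D)]
              for o in range(D)]
    col_sum = [sum(weight[o][d] for o in range(D)) for d in range(D)]
    return [[weight[o][d] / col_sum[d] for d in range(D)] for o in range(D)]


# Hypothetical three-region example (values assumed for illustration only).
A = [1.0, 2.0, 0.5]
r = [1.0, 1.2, 0.8]
T = [[1.0, 1.5, 2.0],
     [1.5, 1.0, 1.8],
     [2.0, 1.8, 1.0]]      # T[o][d], with T[o][o] = 1
theta = 4.0

pi = trade_shares(A, r, T, theta)

# Within one destination d, the destination terms (p_d and X_d) cancel, so the
# log difference between two origins depends only on origin terms and on
# bilateral trade costs, with elasticity -theta.
d = 2
lhs = math.log(pi[0][d]) - math.log(pi[1][d])
rhs = (math.log(A[0]) - math.log(A[1])
       - theta * (math.log(r[0]) - math.log(r[1]))
       - theta * (math.log(T[0][d]) - math.log(T[1][d])))
print(f"log share gap: {lhs:.6f}; gravity prediction: {rhs:.6f}")
```

Comparing two origins within a destination is the same differencing that importer-specific terms accomplish in a gravity regression.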

This is the gravity equation form for bilateral trade flows, which is common to many widely used trade models: bilateral trade costs reduce bilateral trade flows, conditional on importer- and exporter-specific terms.

RESULT 3: Railroads increase real income levels. In this model, welfare in district $o$ is equal to its real income (per unit land area), $W_o$,$^{20}$ which is given by real land rents:

$$W_o \doteq \frac{r_o}{\tilde{P}_o} = \frac{r_o}{\prod_{k=1}^{K} (\tilde{p}_o^k)^{\mu_k}}. \tag{9}$$

Unfortunately, the multiple general equilibrium interactions in the model are too complex to admit a closed-form solution for the effect of reduced trade costs on welfare.$^{21}$ To make progress in generating qualitative predictions (to guide my empirical analysis) I therefore assume a much simpler environment for the purpose of obtaining Result 3 only. I assume: there are only three regions (called $X$, $Y$, and $Z$); there is only one commodity (so I will dispense with the $k$ superscripts on all variables); the regions are symmetric in their exogenous characteristics (i.e., $L_o$ and $A_o$); and the three regions have symmetric trade costs with respect to each other. I consider the comparative statics from a local change around this symmetric equilibrium that reduces the bilateral trade cost symmetrically between two regions (say $X$ and $Y$). It is straightforward to show (as is done in online Appendix B) that

$$\frac{dW_X}{dT_{YX}} < 0. \tag{10}$$

That is, real income in a region (say, $X$) rises when the bilateral cost of trading between that region and any other region (say, $Y$) falls.

RESULT 4: There exists a sufficient statistic for the welfare gains from railroads. Using the bilateral trade equation (5) evaluated at $d = o$, (log) real income per unit of land can be rewritten as

$$\ln W_o = \Omega + \sum_k \frac{\mu_k}{\theta_k} \ln A_o^k - \sum_k \frac{\mu_k}{\theta_k} \ln \pi_{oo}^k, \tag{11}$$

where $\Omega \doteq -\sum_k \mu_k \ln \gamma^k$. This result states that welfare is (up to the constant, $\Omega$) a function of only two terms, one involving (exogenous) local productivity levels ($A_o^k$), and a second term that I will refer to as “the trade share” (i.e., the fraction of region $o$’s expenditure that region $o$ buys from itself, $\pi_{oo}^k$, which equals 1 in autarky). Because of the complex general equilibrium relationships in the model,

$^{20}$ Recall that $\tilde{p}_o^k$ is the CES price index for commodity $k$ in region $o$, defined in footnote 18.

$^{21}$ Eaton and Kortum (2002, p. 1758) derive analytical expressions for the case of one sector and multiple regions but only under the extreme cases in which trade costs are either zero ($T_{od} = 1$) or prohibitive ($T_{od} \to \infty$ for all $o \ne d$).

the full matrix of trade costs (between every bilateral pair of regions), the full vector of productivity terms in all regions, and the sizes of all regions all influence welfare in region $o$. But these terms (that is, every exogenous variable in the model other than local productivity) affect welfare only through their effect on the trade share. Put another way, the trade share (the appropriately weighted sum of $\pi_{oo}^k$ terms over goods $k$) is a sufficient statistic for welfare in region $o$, once local productivity is controlled for. If railroads affected welfare in India through the mechanism in the model (by reducing trade costs, giving rise to gains from trade), then Result 4 states that one should see no additional effects of railroads on welfare once the trade share ($\pi_{oo}^k$) is controlled for.

C. From Theory to Empirics

To relate the static model in Section II to my dynamic empirical setting (with 70 years of annual data), I take the simplest possible approach and assume that all of the goods in the model cannot be stored, and that interregional lending is not possible. Furthermore, I assume that the stochastic production process described in Section IIA is drawn independently in each period. These assumptions imply that the static model simply repeats every period, with independence of all decision making across time periods. Throughout the remainder of the paper I therefore add the subscript $t$ to all of the variables (both exogenous and endogenous) in the model, but I assume that all of the model parameters $\theta_k$, $\sigma_k$, and $\mu_k$ are fixed over time.

The four theoretical results outlined in Section IIB take a naturally recursive order, both for estimating the model’s parameters, and for tracing through the impact of railroads on welfare in India. I follow this order in the four empirical sections that follow (i.e., Steps 1–4).
In Step 1, I evaluate the extent to which railroads reduced trade costs within India using Result 1 to relate the unobserved trade costs term in the model ($T_{odt}^k$) to observed features of the transportation network. In Step 2, I use Result 2 to measure how much the reduced trade costs found in Step 1 increased trade in India. This relationship allows me to estimate the unobserved model parameter $\theta_k$ (the elasticity of trade flows with respect to trade costs), and to relate the unobserved productivity terms ($A_{ot}^k$)$^{22}$ to rainfall, which is an exogenous and observed determinant of agricultural productivity. Steps 1 and 2 therefore deliver estimates of all of the model’s parameters.

In Step 3, following Result 3 I estimate how the level of a district’s real income is affected by the arrival of railroad access to the district. However, the empirical finding in Step 3 is reduced-form in nature and could arise through a number of possible mechanisms (such as enhanced mobility of labor, capital, or technology). Therefore, in Step 4 I use the sufficient statistic suggested by Result 4 to compare the reduced-form effects of railroads on the level of real income (found in Step 3) with the effects predicted by the model (as estimated in Steps 1 and 2).

$^{22}$ The productivity terms $A_{ot}^k$ are unobserved because they represent the location parameter on region $o$’s potential productivity distribution of commodity $k$, in equation (2). The productivities actually used for production in region $o$ will be a subset of this potential distribution, where the scope for trade endogenously determines how the potential distribution differs from the distribution actually used to produce.
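The sufficient-statistic logic behind Step 4 can be verified numerically in a stripped-down case. The sketch below is an illustration under simplifying assumptions, not the paper's estimation code: it uses one commodity, deflates rents by the expected price from equation (4) rather than the exact CES index (footnote 18 notes the two are proportional), and takes rental rates as given. Under those assumptions the constant is $-\ln \lambda_1$ rather than $\Omega$, and log real income equals $-\ln \lambda_1 + \frac{1}{\theta} \ln A_o - \frac{1}{\theta} \ln \pi_{oo}$ whatever the values of the other regions' productivities, rents, and trade costs. All function names and parameter values are hypothetical.

```python
import math


def price_index_sum(A, r, T, theta, d):
    """Phi_d = sum_o A_o (r_o T_od)^(-theta), the bracketed term in equation (4)."""
    return sum(A[o] * (r[o] * T[o][d]) ** (-theta) for o in range(len(A)))


def log_real_income_direct(A, r, T, theta, o):
    """ln(r_o / p_o), where p_o is the expected price from equation (4)."""
    lam1 = math.gamma(1.0 + 1.0 / theta)
    p_o = lam1 * price_index_sum(A, r, T, theta, o) ** (-1.0 / theta)
    return math.log(r[o] / p_o)


def own_trade_share(A, r, T, theta, o):
    """pi_oo: equation (5) evaluated at d = o (requires T[o][o] = 1)."""
    return A[o] * (r[o] * T[o][o]) ** (-theta) / price_index_sum(A, r, T, theta, o)


def log_real_income_from_trade_share(A_o, pi_oo, theta):
    """One-commodity analogue of equation (11): only A_o and pi_oo enter."""
    lam1 = math.gamma(1.0 + 1.0 / theta)
    return -math.log(lam1) + math.log(A_o) / theta - math.log(pi_oo) / theta


# Hypothetical three-region example (values assumed for illustration only).
A = [1.0, 2.0, 0.5]
r = [1.0, 1.2, 0.8]
T = [[1.0, 1.5, 2.0],
     [1.5, 1.0, 1.8],
     [2.0, 1.8, 1.0]]
theta = 4.0

for o in range(3):
    direct = log_real_income_direct(A, r, T, theta, o)
    via_share = log_real_income_from_trade_share(
        A[o], own_trade_share(A, r, T, theta, o), theta)
    print(f"region {o}: direct {direct:.6f}, via trade share {via_share:.6f}")
```

The two columns agree because every other region's characteristics enter real income only through $\pi_{oo}$, which is the content of Result 4.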

III. Empirical Step 1: Railroads and Trade Costs

In the first step of my empirical analysis, I estimate the extent to which railroads reduced the cost of trading within India. Because this paper explores a trade-based mechanism for the impact of railroads on welfare, it is important to assess whether railroads actually reduced trade costs. Further, the relationship between railroads and trade costs, which I estimate in this section, is an important input for Steps 2 and 4 that follow.

A. Empirical Strategy

Researchers never observe the full extent of trade costs.$^{23}$ But Result 1 suggests a situation in which trade costs can be inferred: if a homogeneous commodity can only be made in one region, then the difference in retail prices (of that commodity) between the origin region and any other consuming region is equal to the cost of trading between the two regions.$^{24}$

Throughout Northern India, several different types of salt were consumed, each of which was regarded as homogeneous and each of which was only capable of being made at one unique location. For example, traders and consumers would speak of “Kohat salt” (which could only be produced at the salt mine in the Kohat region) or of “Sambhar salt” (which could only be produced at the Sambhar Salt Lake).$^{25}$ And official price statistics would report a distinct price for each different type of salt. I have collected data on salt prices in Northern India, in which the prices of six regionally differentiated types of salt are reported annually from 1861–1930. Crucially, because salt is an essential commodity, it was consumed (and therefore sold at markets where its price could be easily recorded) throughout India both before and after the construction of railroads.

I use these salt price data, with the help of Result 1, to estimate how Indian railroads reduced trade costs. To do this I estimate equation (7) of Result 1 as follows:

$$\ln p_{dt}^o = \underbrace{\beta_{ot}^o}_{= \ln p_{ot}^o} + \underbrace{\beta_{od}^o + \delta \ln LCRED(R_t, \alpha)_{odt} + \varepsilon_{odt}^o}_{= \ln T_{odt}^o}. \tag{12}$$

In this equation, $p_{dt}^o$ is the price of type-$o$ salt (that is, salt that can only be made in region $o$) in destination district $d$ in year $t$. I estimate this equation with an

$^{23}$ Even when shipping receipts are observed, as in Hummels (2007), these may fail to capture other barriers to trade, such as the time goods spend in transit, or the risk of damage or loss in transit.

$^{24}$ In their survey of attempts to estimate trade costs, Anderson and van Wincoop (2004, p. 78) suggest the solution I pursue here: “A natural strategy would be to identify the source [region] for each product. We are not aware of any papers that have attempted to measure trade barriers this way.” Recent work by Keller and Shiue (2008) on nineteenth century Germany and Andrabi and Kuehlwein (2010) on colonial India documents that when two markets are connected by railroad lines, these markets’ prices (for similar commodities) converge. This approach demonstrates that railroads lowered trade costs, but does not aim to estimate the level of trade costs or the magnitude of the effect of railroads on trade costs.

$^{25}$ The leading (nine-volume) commercial dictionary in colonial India, Watt (1889), describes the market for salt in this manner, as do Aggarwal (1937) and the numerous provincial Salt Reports that were brought out each year. Based on the descriptions in Watt (1889), it is plausible that consumers (and price data collectors) could distinguish between the salt types that would typically sell in a given region. Kohat salt, for example, is a rock salt with a pink hue; Sambhar salt, by contrast, is powdery and often contained, at that time, small amounts of yellow or brown residue.

origin-year fixed effect ($\beta_{ot}^o$)$^{26}$ to control for the price of type-$o$ salt at its origin (i.e., $p_{ot}^o$) because I do not observe salt prices exactly at the point where they leave the source. My price data are at the district level and are based on records of the price of a commodity averaged over 10–15 retail markets in a district.

The remainder of equation (12) describes how I model the relationship between trade costs $T_{odt}^o$, which are unobservable, and the railroad network (denoted by $R_t$), which is observable. The core of this specification is the variable $LCRED(R_t, \alpha)_{odt}$, which measures the lowest-cost route effective distance between the origin $o$ and destination $d$ districts in any year $t$. I describe this variable in detail below. The parameter $\delta$ captures the elasticity of trade costs with respect to “effective distance.” This specification also includes an origin-destination fixed effect ($\beta_{od}^o$) which controls for all of the time-invariant determinants of the cost of trading salt between districts $o$ and $d$ (such as the distance from $o$ to $d$, or caste-based or ethnolinguistic differences between $o$ and $d$ that may hinder trade). Finally, $\varepsilon_{odt}^o$ is an error term that captures any remaining unobserved determinants of trade costs (or measurement error in $\ln p_{dt}^o$).$^{27}$

The variable $LCRED(R_t, \alpha)_{odt}$ models the cost of trading goods between any two locations under the assumption that agents take the lowest-cost route, using any modes of transportation, available to them. Two inputs are needed to calculate the effective length of the lowest-cost route between districts $o$ and $d$ in year $t$. The first input is the network of available transportation routes open in year $t$, which I denote by $R_t$. A network is a collection of nodes and arcs. In my application, nodes are finely spaced points in space, and arcs are available means of transportation between the nodes (hence an arc could be a rail, river, road, or coast connection). In modeling this network (detailed in online Appendix A) I allow agents to travel on navigable rivers, the coastline, the road network, and the railroad network open in year $t$.

The second input is the relative cost of traveling along each arc, which depends on which mode of transportation the arc represents. I model these costs as being proportional to distance, where the proportionality, the per unit distance cost, of using each mode is denoted by the vector of parameters $\alpha \doteq (\alpha^{rail}, \alpha^{road}, \alpha^{river}, \alpha^{coast})$. I normalize $\alpha^{rail} = 1$ so the other three elements of $\alpha$ represent costs relative to the cost of using railroads. Since only relative costs affect the identity of the lowest cost route, this normalization has no bearing on the actual route taken between any pair of districts. Because of this normalization, $LCRED(R_t, \alpha)_{odt}$ is measured as a railroad-equivalent distance; in this sense, a finding that all of the non-rail elements of $\alpha$ are greater than 1 would imply that India’s expanding railroad network shrunk “effective distance,” or distance measured in a railroad-equivalent sense.

The parameter $\alpha$ is unknown, so I treat it as a vector of parameters to be estimated. Conditional on a value of $\alpha$, it is possible to calculate $LCRED(R_t, \alpha)_{odt}$ quickly using Dijkstra’s shortest-path algorithm (Ahuja, Magnanti, and Orlin 1993). But since $\alpha$ is unknown, I estimate it using nonlinear least squares (NLS). That is, I

$^{26}$ That is, each salt origin $o$ has its own fixed effect in each year $t$. I use this notation when referring to fixed effects throughout this paper.

$^{27}$ In this specification and all others in this paper, I allow this error term to be heteroskedastic and serially correlated within districts (or trade blocks, in Section IV) in an unspecified manner.

search over values of $\alpha$, recomputing the lowest-cost routes at each step, to find the value that minimizes the sum of squared residuals in equation (12).$^{28}$

B. Data

I use data on retail prices of six types of salt, observed annually from 1861–1930 in an unbalanced panel of 133 districts of Northern India (in other regions, reported salt prices were not broken down by region of origin). Further details on the data I use in this and other sections of this paper are provided in online Appendix A.

C. Results

Table 2 presents ordinary least squares (OLS) estimates of equation (12). In column 1 I estimate the effect of the lowest-cost route effective distance on trade costs when the relative costs of each mode ($\alpha$) are set to observed historical relative freight rate estimates. I use the relative per unit distance freight rates described in Section IB at their midpoints: $\alpha^{road} = 4.5$, $\alpha^{river} = 3.0$, and $\alpha^{coast} = 2.25$ (all relative to the freight rate of railroad transport, normalized to 1). Column 1 demonstrates that the elasticity of trade costs with respect to the lowest-cost route effective distance, calculated at observed freight rates, is 0.088, and this is statistically significant at the 5 percent level.

However, as argued in Section IB, it is possible that these observed relative freight rates do not capture the full benefits (such as increased certainty or time savings) of railroad transport relative to alternative modes of transportation. For this reason the NLS specification in column 2 estimates the relative freight rates (i.e., the parameters $\alpha$) that minimize the sum of squared residuals in equation (12).

Column 2 is my preferred specification. When the mode-wise distance costs (i.e., $\alpha$) are not restricted to be equal to the observed freight rates, the estimated elasticity of trade costs with respect to effective distance (i.e., $\delta$) rises to 0.169. Even when controlling for all unobserved, time-constant determinants of trade costs between all salt sources and destinations, as well as unrestricted shocks to the source price of each salt type, reductions in trade costs along lowest-cost routes (estimated from railroad-driven time variation in these routes alone) have a large effect on reducing salt price gaps over space.

The nonlinear specification in column 2 also estimates the relative trade costs by mode that best explain observed salt price differentials. The estimated relative cost of each of the three alternative modes of transport is larger than 1 (and has an estimated bootstrapped 95 percent confidence interval that exceeds 1), implying that these alternative modes are more expensive (per unit distance) than rail travel. These non-rail mode estimates are, by and large, similar to the historically observed freight rate estimates used in column 1, with estimated confidence intervals that span the historical rates, except for the case of coastal shipping which evidently had a greater cost elasticity with respect to distance than one might conclude from freight rates alone.

$^{28}$ In practice, I use a grid search over values of $\alpha$ from 1 to 10 with grid sizes of 0.125. Standard errors are bootstrapped using a similar grid search (but with a coarser grid size of 0.5).
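The effective-distance calculation described in this section can be sketched on a toy network. This is a minimal illustration, assuming a hypothetical three-node network with made-up arc lengths; only the mode cost values (the historical freight-rate midpoints of 4.5, 3.0, and 2.25, with rail normalized to 1) come from the text. The function name `lcred` and the arc format are assumptions of the sketch, and the NLS grid search over $\alpha$ is not implemented here.

```python
import heapq


def lcred(arcs, alpha, origin, dest):
    """Lowest-cost route effective distance via Dijkstra's algorithm.

    arcs:  list of (node_a, node_b, mode, km), each usable in both directions
    alpha: dict mapping mode -> cost per km relative to rail (rail = 1)
    Returns the railroad-equivalent distance, or infinity if unreachable.
    """
    graph = {}
    for a, b, mode, km in arcs:
        cost = alpha[mode] * km
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))

    best = {origin: 0.0}
    queue = [(0.0, origin)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == dest:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for nxt, cost in graph.get(node, []):
            new_dist = dist + cost
            if new_dist < best.get(nxt, float("inf")):
                best[nxt] = new_dist
                heapq.heappush(queue, (new_dist, nxt))
    return float("inf")


# Mode costs at the historical freight-rate midpoints quoted in the text.
alpha = {"rail": 1.0, "road": 4.5, "river": 3.0, "coast": 2.25}

# Hypothetical network: node names and distances are invented for illustration.
arcs_before = [
    ("source", "junction", "road", 100.0),
    ("junction", "market", "river", 200.0),
    ("source", "market", "road", 250.0),
]
arcs_after = arcs_before + [("source", "market", "rail", 300.0)]

before = lcred(arcs_before, alpha, "source", "market")
after = lcred(arcs_after, alpha, "source", "market")
print(f"effective distance before rail: {before}, after rail: {after}")
```

In this toy case the rail line is physically longer than the road but much shorter in railroad-equivalent terms, so opening it lowers the effective distance. An estimation along the lines of footnote 28 would wrap a function like this in a loop over candidate values of $\alpha$, re-running the regression in equation (12) at each grid point.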

Table 2. Railroads and Trade Costs: Step 1

Dependent variable: log salt price at destination

                                                             (1)        (2)
log effective distance to source, along lowest-cost route    0.088
  (at historical freight rates)                              (0.028)
log effective distance to source, along lowest-cost route               0.169
  (at estimated mode costs)                                             [0.062, 0.296]
Estimated mode costs per unit distance:
  Railroad (normalized to 1)                                            1
                                                                        N/A
  Road                                                                  2.375
                                                                        [1.750, 10.000]
  River                                                                 2.250
                                                                        [1.500, 6.250]
  Coast                                                                 6.188
                                                                        [5.875, 10.000]
Observations                                                 7,345      7,345
R^2                                                          0.946      0.946

Notes: Regressions estimating equation (12) using data on 6 types of salt (listed in online Appendix A), from 133 districts in Northern India, annually from 1861 to 1930. Column 1 and column 2 estimated by OLS and NLS respectively; both include salt type × year and salt type × destination fixed effects. “Effective distance to source, along lowest-cost route” measures the railroad-equivalent kilometers (because railroad freight rate is normalized to 1) between the salt source and the destination district, along the lowest-cost route given relative mode costs per unit distance. “Historical freight rates” used are 4.5, 3.0, and 2.25 respectively for road, river, and coastal mode costs per unit distance, all relative to rail transport. Standard errors corrected for clustering at the destination district level are reported in parentheses of column 1, and bootstrapped 95 percent confidence intervals are reported in column 2.

An important caveat when interpreting the results in this section, and the results in Steps 2 and 4 that depend on the estimates here, is that railroads may have done more to reduce trading frictions, as estimated here, than simply to reduce the physical costs of transporting goods. For example, railroads may have made it easier for price information to spread, whether directly via the movement of traders or post (which traveled for free on the railroads) or indirectly via the telegraph lines that followed railroad lines in space (since telegraph lines were used for the railroads’ traffic signaling technology).$^{29}$ Because of the symbiotic relationship among railroads, telegraphs, and the postal service (Kerr 2007), the results here capture the composite effects of railroads on trade costs that combine a number of possible channels.

To summarize, the results in column 2 of Table 2 contain two important findings. First, the coefficient on the lowest-cost route effective distance ($\hat{\delta}$) is positive, which implies that trade costs increase with effective distance (in railroad-equivalent kilometers). And second, the estimated mode-specific per-unit distance costs ($\hat{\alpha}$) are all greater than 1 (and statistically significantly so), implying that railroads played a

$^{29}$ Describing the movement of traders, Kerr (2007, p. 109) quotes from an account (from 1878) in a Madras Chetty newspaper (emphasis and parentheses in original): “The Madras Chetty [= Chettiar, a Tamil trading caste] hears of something to be bought at Coimbatore, he no longer sends a note, he goes there, views the article he proposes to buy and buys them himself. Nothing suits him so well, no one need to be trusted, not even his own brother, he himself has the iron horse at his disposal, and can do the work himself.”

role in reducing effective distance when compared to alternative modes of transportation. I use the estimates in column 2 in Steps 2 and 4 to follow.

IV. Empirical Step 2: Railroads and Trade Flows

The first step of my empirical strategy demonstrated that India’s railroad network reduced trade costs. I now estimate the extent to which this reduction in trade costs affected trade flows within India. This step is important for two reasons. First, an expansion of trade volumes as a result of the railroad network is a necessary condition for the mechanism linking railroads to welfare gains in the model. Second, as I show below, estimating the model’s gravity equation allows all of the model’s parameters to be inferred. Equipped with these parameter estimates, I am able to explore empirically Result 4 in Section VI.

A. Empirical Strategy

Result 2 of the model suggests a particular relationship between bilateral trade flows and bilateral trade costs, a gravity equation describing trade between any two regions. Substituting the empirical specification for $T_{odt}^k$ introduced in equation (12) into equation (8) yields

$$\ln X_{odt}^k = \beta_{od}^k + \ln A_{ot}^k - \theta_k \ln r_{ot} - \theta_k \hat{\delta} \ln LCRED(R_t, \hat{\alpha})_{odt} + \theta_k \ln p_{dt}^k + \ln X_{dt}^k + \varepsilon_{odt}^k. \tag{13}$$

Here, $X_{odt}^k$ refers to the value of exports of commodity $k$ from region $o$ to region $d$ in year $t$ and the other variables were defined in Section II. Note that this substitution assumes that the empirical estimates of trade cost parameters ($\hat{\delta}$, $\hat{\alpha}$) obtained from Step 1, using data on salt, are valid for any commodity $k$. This assumption is made out of necessity (since trade cost estimates are not available for any commodity but salt), but I discuss below some tests that fail to reject it.

I estimate a version of equation (13) in two stages, with two goals in mind. My first goal is to estimate the unknown parameters $\theta_k$. As is typical in the empirical gravity equation literature, estimation of equation (13) is complicated by the presence of endogenous regressors ($r_{ot}$, $p_{dt}^k$, and $X_{dt}^k$). Fortunately, because my interest here lies in the coefficient $\theta_k$, that is, in how the trade cost reductions brought about by railroads translated into expansions in trade flows, I estimate this equation in the following manner:

$$\ln X_{odt}^k = \beta_{ot}^k + \beta_{dt}^k + \beta_{od}^k - \theta_k \hat{\delta} \ln LCRED(R_t, \hat{\alpha})_{odt} + \varepsilon_{odt}^k. \tag{14}$$

In this specification, the term $\beta_{ot}^k$ is an origin-year-commodity fixed effect and $\beta_{dt}^k$ is a destination-year-commodity fixed effect (the inclusion of these two fixed-effects absorbs the terms $\ln A_{ot}^k$, $\theta_k \ln r_{ot}$, $\theta_k \ln p_{dt}^k$, and $\ln X_{dt}^k$ in equation (13)), and $\beta_{od}^k$ is an origin-destination-commodity fixed effect (the inclusion of which was motivated in Section III by the concern that some costs of trading may be unobservable). I

estimate this equation separately for each of the agricultural commodities in my trade flows dataset, in order to estimate a value of $\theta_k$ for each commodity $k$.

My second goal in estimating equation (13) is to estimate the determinants of the underlying productivity terms, $A^k_{ot}$. Armed with estimates of $\hat{\theta}_k$, obtained from estimating equation (14) above, it is possible to estimate the determinants of $A^k_{ot}$ in a second stage as follows. I relate $A^k_{ot}$ to observables by assuming that $A^k_{ot}$ is a function of a crop-specific rainfall shock, denoted by $\mathrm{RAIN}^k_{ot}$. As argued in Section I, rainfall was an important determinant of agricultural productivity in India because most land was un-irrigated. However, a given distribution of annual rainfall would affect each crop differently because each crop has its own annual timetable for sowing, growing, and harvesting, and these timetables differ from district to district. To shed light on these crop- and district-specific agricultural timetables, I use the 1967 edition of the Indian Crop Calendar (Directorate of Economics and Statistics 1967), which lists sowing, growing, and harvesting windows for crops and districts in my sample. To construct the variable $\mathrm{RAIN}^k_{ot}$, I use daily rainfall data to calculate the amount of rainfall in year $t$ that fell between the first sowing date and the last harvest date listed for crop $k$ in district $o$.³⁰

³⁰ The results are largely insensitive to alternatively measuring $\mathrm{RAIN}^k_{ot}$ as the total rainfall between the first sowing date and the first harvest date since very little rain fell in the harvest window.

It is then possible to estimate the relationship between rainfall and productivity by noting that the exporter-commodity-year fixed effect ($\beta^k_{ot}$) in equation (14) can be interpreted in the model as $\beta^k_{ot} = \ln A^k_{ot} - \theta_k \ln r_{ot}$, by comparing equations (13) and (14). I model the relationship between productivity ($A^k_{ot}$) and rainfall ($\mathrm{RAIN}^k_{ot}$) in a parsimonious semi-log manner: $\ln A^k_{ot} = \kappa \, \mathrm{RAIN}^k_{ot}$. Guided by this relationship, I define the variable $\ln \tilde{X}^k_{odt} \doteq \ln X^k_{odt} + \hat{\theta}_k \hat{\delta} \ln \mathrm{LCRED}(R_t, \hat{\alpha})_{odt} + \hat{\theta}_k \ln r_{ot}$ and estimate the parameter $\kappa$ in the following estimating equation:

$$\ln \tilde{X}^k_{odt} = \beta^k_{od} + \beta^k_{dt} + \beta_{ot} + \kappa \, \mathrm{RAIN}^k_{ot} + \varepsilon^k_{odt}. \tag{15}$$

The terms $\beta^k_{o}$, $\beta^k_{t}$, and $\beta_{ot}$ represent exporter-commodity, commodity-year, and exporter-year fixed effects, respectively. I include these terms to control for unobserved determinants of exporting success that do not vary across regions, commodities and time. As a result, the coefficient $\kappa$ is estimated purely from the variation in rainfall over space, commodities and time.³¹ The final term in equation (15) is an error term ($\varepsilon^k_{odt}$) that includes any determinants of exporting success, other than rainfall, that vary across regions, commodities and time.

³¹ This within-block-year identification strategy therefore estimates the effect, $\kappa$, that is common to all crops. While in practice crops may differ in their rainfall sensitivities, some of this heterogeneity is likely to be captured by the use of crop-specific rainfall amounts, $\mathrm{RAIN}^k_{ot}$.

In summary, the two-stage method described above estimates the parameter $\theta_k$ for each of the agricultural goods $k$ for which I have trade data. This method also estimates the relationship between the unobserved productivity terms $A^k_{ot}$ and crop-specific rainfall $\mathrm{RAIN}^k_{ot}$ (governed by the parameter $\kappa$).
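As an illustration of the first stage, the following is a minimal sketch of how a fixed-effects gravity regression of the form of equation (14) recovers $\theta_k$ for one commodity. It is not the paper's code. It assumes a small, complete, balanced synthetic panel (every origin-destination-year cell present, own-pairs included), which lets the three sets of fixed effects be removed by closed-form demeaning; the actual trade data are not of this form and would need a dummy-variable or iterative estimator. The values `THETA_TRUE` and `DELTA_HAT`, and all variable names, are hypothetical and are not taken from the paper.

```python
import itertools
import random


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def three_way_demean(x, n_o, n_d, n_t):
    """Sweep origin-year, destination-year and origin-destination fixed
    effects out of a complete O x D x T array stored as a dict keyed by
    (o, d, t). Valid only for a complete, balanced array."""
    O, D, T = range(n_o), range(n_d), range(n_t)
    m_ot = {(o, t): mean(x[o, d, t] for d in D) for o in O for t in T}
    m_dt = {(d, t): mean(x[o, d, t] for o in O) for d in D for t in T}
    m_od = {(o, d): mean(x[o, d, t] for t in T) for o in O for d in D}
    m_o = {o: mean(m_ot[o, t] for t in T) for o in O}
    m_d = {d: mean(m_dt[d, t] for t in T) for d in D}
    m_t = {t: mean(m_ot[o, t] for o in O) for t in T}
    m = mean(m_o[o] for o in O)
    return {
        (o, d, t): x[o, d, t] - m_ot[o, t] - m_dt[d, t] - m_od[o, d]
        + m_o[o] + m_d[d] + m_t[t] - m
        for o, d, t in itertools.product(O, D, T)
    }


random.seed(1)
N_O, N_D, N_T = 5, 5, 6
THETA_TRUE = 8.0  # made-up "true" theta_k used only to simulate data
DELTA_HAT = 0.2   # hypothetical Step 1 estimate, NOT the paper's value

cells = list(itertools.product(range(N_O), range(N_D), range(N_T)))
fe_ot = {(o, t): random.gauss(0, 1) for o in range(N_O) for t in range(N_T)}
fe_dt = {(d, t): random.gauss(0, 1) for d in range(N_D) for t in range(N_T)}
fe_od = {(o, d): random.gauss(0, 1) for o in range(N_O) for d in range(N_D)}
ln_lcred = {c: random.uniform(0.0, 3.0) for c in cells}

# Simulated log exports following the structure of equation (14).
ln_x = {
    (o, d, t): fe_ot[o, t] + fe_dt[d, t] + fe_od[o, d]
    - THETA_TRUE * DELTA_HAT * ln_lcred[o, d, t] + random.gauss(0, 0.01)
    for (o, d, t) in cells
}

# Regress demeaned log exports on demeaned log effective distance.
x_res = three_way_demean(ln_lcred, N_O, N_D, N_T)
y_res = three_way_demean(ln_x, N_O, N_D, N_T)
slope = sum(x_res[c] * y_res[c] for c in cells) / sum(x_res[c] ** 2 for c in cells)

# The slope estimates -theta_k * delta_hat, so divide by -delta_hat.
theta_hat = -slope / DELTA_HAT
print(round(theta_hat, 2))
```

The design point the sketch is meant to show is that the regression identifies only the product $\theta_k \hat{\delta}$; a separate estimate of $\hat{\delta}$ from Step 1 is needed before $\theta_k$ itself can be backed out.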

B. Data

I estimate equations (14) and (15) using data on the physical quantities of internal trade (among 47 regions known as trade blocks), over rail and river transport routes, for 14 principal agricultural commodities plus salt, annually from 1882 to 1920.³² Because four of the trade blocks comprise major port cities and the internal trade data to/from each major port included trade to/from foreign countries via the major port city in question (as explained in detail in online Appendix C), these estimates also incorporate the bulk of international trade flows. When estimating equation (15), I use the crop-specific rainfall measure ($\mathrm{RAIN}^k_{ot}$, averaged over districts within trade block $o$) described briefly above (and in more detail in online Appendix A) and, lacking reliable data on land rental rates, I use nominal agricultural output per acre as a measure of $r_{ot}$ (since in the model these two measures are equivalent).

³² Data on many disaggregated manufacturing products were similarly collected but are not necessary for the estimates in this paper. The agricultural commodities available cover the 17 crops listed in Section IA, with the exception of barley, maize, and ragi which were not disaggregated separately in trade data publications. In addition, the crops of bajra and jowar were tabulated as one aggregate commodity. Because I estimate equation (14) at the trade block level, I construct the regressor $\ln \mathrm{LCRED}(R_t, \hat{\alpha})_{odt}$ from the average of all district pairs within the trade block pair $od$ (and take the location of external regions to be their largest commercial centers: Goalpara for the province of Assam, Hyderabad for the composite native states region, and Karachi for the province of Sindh). As in most international and intranational trade settings, I do not observe trade from region $o$ to itself so those trade flows do not enter my gravity equation estimates here.

C. Results

Table 3 presents OLS estimates of variants of equation (14). While the ultimate reason for estimating equation (14) is to estimate the unknown parameters $\theta_k$ for each commodity $k$, I begin by reporting estimates from a specification that pools estimates of equation (14) across commodities. I do this to explore the plausibility of my assumption that the parameter $\delta$, which relates the lowest-cost route effective distance variable ($\mathrm{LCRED}(R_t, \hat{\alpha})_{odt}$) to trade costs and was estimated using only one commodity (salt), is constant across all agricultural commodities.

Column 1 of Table 3 presents estimates of equation (14) pooled across commodities. The results in column 1 provide support for Result 2 of the model, as the lowest-cost route measure is estimated to reduce bilateral trade (conditional on the fixed effects used) with a statistically significant elasticity of (minus) 1.603. This pooled point estimate is in line with a large body of work on estimating gravity equations reported in Head and Disdier (2008).

In column 2 of Table 3 I investigate the possibility that the elasticity of trade flows with respect to lowest-cost route effective distance varies by commodity in a manner that would suggest that trade costs differ in an important way across commodities. I do this by including interaction terms between the $\mathrm{LCRED}(R_t, \hat{\alpha})_{odt}$ variable and two commodity-specific characteristics (each measured in the earliest cross-section for which data are available): weight per unit value (as observed in 1890 export data, averaged over all of India), and “freight class” (an indicator used by railroad companies in 1859 to distinguish between “high-value” and “low-value” goods). The results in column 2 are not supportive of the notion that commodities had elasticities of trade with respect to distance that depend on either weight or

freight class in a statistically significant manner.³³ This lends support to the maintained assumption throughout this paper that trade cost parameters for the shipment of salt (obtained in Step 1) can be applied to other commodities, as is necessary given the absence, for those commodities, of origin-specific product differentiation of the kind seen in the case of salt, without doing injustice to the data.

³³ As reported in Table 3, the change in the total $R^2$, that is, that for the full model inclusive of fixed effects, due to the addition of these interaction variables is similarly inconsequential.

Table 3—Railroads and Trade Flows: Step 2

Dependent variable: log value of exports                               (1)        (2)

log effective distance between origin and destination               −1.603     −1.701
  along lowest-cost route                                           (0.533)    (1.141)

(log effective distance between origin and destination                         −0.946
  along lowest-cost route) × (weight per unit value of                         (3.634)
  commodity in 1890)

(log effective distance between origin and destination                          1.286
  along lowest-cost route) × (high-value railroad freight                      (1.243)
  class of commodity in 1859)

Observations                                                        142,541    142,541
R²                                                                    0.901      0.901

Notes: Regressions estimating equation (14) using data on 15 commodities and 47 trade blocks annually from 1882 to 1920. Regressions include origin and destination fixed effects, separately for each commodity and year. “Effective distance between origin and destination along lowest-cost route” measures the railroad-equivalent kilometers (due to the normalization of railroad distance cost to 1) between the centroid of the origin and destination trade blocks in question, along the lowest-cost route given relative freight rates for each mode of transport (as estimated in Table 2). “Weight per unit value in 1890” is the weight (in maunds) per rupee, as measured by 1890 prices. “Railroad freight class in 1859” is an indicator variable for all commodities that were classified in the higher (more expensive) freight class in 1859; salt is in the omitted category (low-value commodities). Heteroskedasticity robust standard errors adjusted for clustering at the exporter-importer block level are reported in parentheses for columns 1 and 2 respectively.

Finally, I estimate equation (14) one commodity at a time (for each of the agricultural commodities in the trade flows data), in order to obtain estimates of the comparative advantage parameters $\theta_k$ for each commodity. The mean across all of these estimates is 7.80, with a range from −9.60 to 29.21. The estimates for two crops (opium and tea) are in the inadmissible (i.e., negative) range, but neither estimate is statistically significantly different from zero at the 5 percent level. This average estimate is close to the preferred estimate of 8.28 in Eaton and Kortum (2002) obtained from intra-OECD trade flows in 1995, treating all of the manufacturing sector as one commodity, though it is somewhat higher than other estimates in the literature such as those from Simonovska and Waugh (2014) or Costinot, Donaldson, and Komunjer (2012) (ranging from 4.5 to 6.5) for the OECD in the 1990s.

As described above, the second goal in estimating equation (14) in this section is to estimate $\kappa$, the parameter that relates crop-specific rainfall to (potential) productivity ($A^k_{ot}$) in the model. I do this by estimating equation (15) and obtain a value of $\hat{\kappa} = 0.496$ (with a standard error, clustered by exporter-importer pair, of 0.151), implying that a one standard deviation (0.921 across the entire sample) increase in crop-specific rainfall causes a 46 percent increase in agricultural productivity (as defined by $A^k_{ot}$ in the model). This suggests that rainfall has a positive and statistically significant effect on productivity, as expected given the importance of water

in crop production and the paucity of irrigated agriculture in colonial India (as discussed in Section I).

In summary, the results from this section demonstrate that railroads significantly expanded trade in India. This finding is in line with Result 2 and suggests that the expansion of trade brought about by the railroad network could have given rise to welfare gains due to increasingly exploited gains from trade. A second purpose of this section was to use the empirical relationship between trade costs (estimated in Step 1) and trade flows to estimate the remaining unknown model parameters, $\theta_k$ and $A^k_{ot}$. These parameters are important inputs for Step 4.

V. Empirical Step 3: Railroads and Real Income Levels

Steps 1 and 2 have established that Indian railroads significantly reduced trade costs and expanded trade flows, findings which suggest that railroads improved the trading environment in India. I now go on to investigate some of the welfare consequences of railroad expansion in India by estimating the effect of railroads on real income levels.

A. Empirical Strategy

Result 3 of the model states that a district’s real income will increase when it is connected to the railroad network. This result motivates an estimating equation of the form

$$\ln\left(\frac{r_{ot}}{\tilde{P}_{ot}}\right) = \beta_o + \beta_t + \gamma \, \mathrm{RAIL}_{ot} + \varepsilon_{ot}. \tag{16}$$

In this equation, $r_{ot}/\tilde{P}_{ot}$ represents real agricultural income per acre (the appropriate welfare metric in the model) in district $o$ and year $t$. There exist no systematic data on land rents or values in this time period, but in the model nominal land rents are equal to nominal output per unit area. As described in Section I, plentiful output data were collected in the agricultural sector (the dominant sector of India’s colonial economy), so I use these to measure $r_{ot}$.³⁴ Finally, I construct a consumer price index, over agricultural goods, to measure $\tilde{P}_{ot}$.³⁵

³⁴ Real income per acre is equal to welfare (for a representative agent) in the model, but may not be in my empirical setting because output per acre may diverge from output per capita if the population of each district is endogenous, and related to railroad expansion. Population could be endogenous for two reasons. First, fertility and mortality may have been endogenous to railroad expansion in colonial India: in a Malthusian limit, fertility and mortality would adjust to any agricultural productivity improvements (e.g., due to railroads) and hold output per capita constant. However, the potential for endogenous fertility and mortality responses is likely to vary from setting to setting so while knowledge of an effect of railroads on output per acre is transferable to alternative settings, an effect on output per capita is potentially less so. Second, migration could respond to differential productivity improvements over space. Migration, however, was extremely limited in colonial India when compared to other countries in the same time period (a feature that is still true today, and that Munshi and Rosenzweig 2016 argue is due to informal insurance provided by localized caste networks), and the little migration that occurred was vastly skewed toward women migrating to marry (Davis 1951; Rosenzweig and Stark 1989).

³⁵ In the model this price index is given in equation (9). However, it would be unsurprising if a price index calculated strictly as suggested by a theory fits that theory well. I therefore use a flexible price index (the Törnqvist price index, of which the price index in equation (9) is a special case) as is commonly done when constructing real GDP measures from national income accounts.
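To make the deflation step concrete, here is a minimal sketch of a textbook bilateral Törnqvist price index, the index named in the footnote above. How the paper implements it (choice of base, chaining, which quantities supply the expenditure weights) is not described in this section, so everything below is an assumption for illustration: the function name, the two-crop basket, and all prices, quantities, and income figures are hypothetical.

```python
import math


def tornqvist_index(p0, p1, q0, q1):
    """Tornqvist price index of period 1 relative to period 0.

    The log index is a weighted sum of log price relatives, where each
    good's weight is the average of its expenditure shares in the two
    periods. Arguments are dicts keyed by good.
    """
    goods = list(p0)
    e0 = {g: p0[g] * q0[g] for g in goods}
    e1 = {g: p1[g] * q1[g] for g in goods}
    total0, total1 = sum(e0.values()), sum(e1.values())
    log_index = sum(
        0.5 * (e0[g] / total0 + e1[g] / total1) * math.log(p1[g] / p0[g])
        for g in goods
    )
    return math.exp(log_index)


# Hypothetical prices and quantities for a two-crop consumption basket.
p0 = {"rice": 2.0, "wheat": 3.0}
p1 = {"rice": 4.0, "wheat": 6.0}  # every price doubles
q0 = {"rice": 10.0, "wheat": 5.0}
q1 = {"rice": 8.0, "wheat": 7.0}

index = tornqvist_index(p0, p1, q0, q1)

# Deflate a hypothetical nominal output per acre to obtain real income.
nominal_income_per_acre = 30.0
real_income_per_acre = nominal_income_per_acre / index
print(index, real_income_per_acre)
```

Because the averaged expenditure shares sum to one, a uniform doubling of all prices yields an index of 2 whatever the quantities, which is a convenient sanity check on any implementation.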

The key regressor of interest in equation (16) is $\mathrm{RAIL}_{ot}$, a dummy variable that is equal to 1 in all years $t$ in which some part of district $o$ is on the railroad network. I estimate equation (16) using fixed effects at the district ($\beta_o$) and year ($\beta_t$) levels, so that the effect of railroads is identified entirely from variation within districts over time, after accounting for common shocks affecting all districts. The district fixed effect is particularly important because it controls for permanent features of districts that may have made them both agriculturally productive, and attractive places in which to build railroads.

Result 3 states that the coefficient $\gamma$ on district $o$’s railroad access will be positive. A number of alternative theories (whether stressing the gains from goods trade or otherwise) could make similar predictions about the sign of this coefficient. For this reason, in Step 4 below I go beyond the qualitative test of the model provided by the sign of $\gamma$ and assess the quantitative performance of the model in predicting real income changes due to the expansion of the railroad network.³⁶

³⁶ Similarly, various models, like that in Section II and beyond, could motivate potentially important departures from the simple functional form used in equation (16), a functional form chosen to capture only first-order features of the data, in line with the first-order departures from symmetry motivated by Result 3. In principle, these departures could be explored empirically. Step 4 examines the sufficient statistic of Result 4 that, in contrast to the simple specification in equation (16), describes the precise functional form (one that captures nonlinear, heterogeneous treatment effects and treatment spillovers) suggested by the model in Section II, albeit non-analytically.

I begin (in Section VC) by estimating equation (16) using OLS. Unbiased OLS estimates require there to be no correlation between the error term ($\varepsilon_{ot}$) and the regressor ($\mathrm{RAIL}_{ot}$), conditional on the district and year fixed effects. This requirement would fail if railroads were built in districts and years that were expected to experience real agricultural income growth, or if railroads were built in districts that were on differing unobserved trends from non-railroad districts. For this reason, in Section VD I also estimate three different “placebo” specifications in order to assess the potential magnitude of bias in my OLS results due to nonrandom railroad placement.

B. Data

I estimate equation (16) using annual data on real agricultural income (per acre of land) in an unbalanced panel of 192 districts, from 1870 to 1930. This variable (calculated as nominal agricultural output calculated from the physical output of each of the 17 principal crops listed in Section IA valued at local retail prices, deflated by a local consumer price index, and then divided by the district’s land area)³⁷ was described briefly in Section I and in more detail in online Appendix A. The variable $\mathrm{RAIL}_{ot}$ is a dummy variable for the presence of a railroad line anywhere in district $o$ in year $t$.

³⁷ This land area denominator is fixed over time and so is irrelevant here given that equation (16) uses the log of real income per acre and conditions on district fixed effects. Note that this measure of land area therefore allows for increases in the cultivation margin due to rail access to be incorporated into the treatment effect $\gamma$, as seems desirable given that this is a potentially important response.

C. Baseline Results

Column 1 of Table 4 presents OLS estimates of equation (16). The coefficient estimate is 0.164, implying that in the average district, the arrival of the railroad network is associated with a rise in income of over 16 percent. This OLS estimate

is in line with Result 3 and suggests that railroads may have had a large effect on real income in India. In the following subsection I investigate the robustness of this finding to concerns over the nonrandom placement of railroads.

Table 4—Railroads and Real Income Levels: Step 3

Dependent variable: log real agricultural income                   (1)       (2)       (3)       (4)

Railroad in district                                              0.164     0.158     0.160     0.167
                                                                 (0.049)   (0.048)   (0.050)   (0.050)
Unbuilt railroad in district, abandoned after                               0.057
  proposal stage                                                           (0.058)
Unbuilt railroad in district, abandoned after                               0.013
  reconnaissance stage                                                     (0.099)
Unbuilt railroad in district, abandoned after                              −0.069
  survey stage                                                             (0.038)
(Unbuilt railroad in district, included in Lawrence                                   0.067
  Plan 1869–1873) × (post-1871 indicator)                                            (0.104)
(Unbuilt railroad in district, included in Lawrence                                  −0.019
  Plan 1874–1878) × (post-1874 indicator)                                            (0.092)
(Unbuilt railroad in district, included in Lawrence                                   0.095
  Plan 1879–1883) × (post-1879 indicator)                                            (0.084)
(Unbuilt railroad in district, included in Lawrence                                  −0.072
  Plan 1884–1888) × (post-1884 indicator)                                            (0.075)
(Unbuilt railroad in district, included in Lawrence                                   0.047
  Plan 1889–1893) × (post-1889 indicator)                                            (0.049)
(Unbuilt railroad in district, included in Lawrence                                  −0.088
  Plan 1894–1898) × (post-1894 indicator)                                            (0.086)
(Unbuilt railroad in district, included in Kennedy                                           −0.0001
  plan, high-priority) × (year-1848)                                                         (0.002)
(Unbuilt railroad in district, included in Kennedy                                             0.001
  plan, low-priority) × (year-1848)                                                          (0.003)

Observations                                                      7,086     7,086     7,086     7,086
R²                                                                0.848     0.848     0.848     0.848

Notes: OLS regressions estimating equation (16) using real income constructed from crop-level data on 17 principal agricultural crops (listed in online Appendix A), from 192 districts in India, annually from 1870 to 1930. All regressions include district fixed effects and year fixed effects. “Railroad in district” is a dummy variable whose value is 1 if any part of the district in question is penetrated by a railroad line. “Unbuilt railroad in district, abandoned after X stage” is a dummy variable whose value is 1 if a line that was abandoned after “X” stage penetrates a district, in all years after the line was first mentioned as reaching stage “X” in official documents. Stages “X” are: “proposal,” where the line was mentioned in official documents; “reconnaissance,” where the line route was explored by surveyors in rough detail; and “survey,” where the exact route of the line and nature of all engineering works were decided on after detailed survey. “Lawrence 1868 plan” was a proposal for significant railroad expansion by India’s Governor General that was not implemented; the plan detailed proposed dates of construction in 5-year segments over the next 30 years, which are used in the construction of this variable. “Kennedy plan” was an early construction-cost minimizing routes plan drawn up by India’s chief engineer in 1848 (divided into high- and low-priorities), which was rejected in favor of Dalhousie’s direct routes plan. Heteroskedasticity-robust standard errors corrected for clustering at the district level are reported in parentheses.

D. Three “Placebo” Checks

In this subsection I explore the plausibility of concerns about bias due to endogenous railroad placement by estimating the effects of “placebo” railroad lines: over 40,000 km of railroad lines that came close to being constructed but, for three separate reasons, were never actually built. I group these placebo lines into three categories as follows.
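Before turning to the three categories, the logic of a placebo regression can be sketched in a few lines. The example below is a simulation under stated assumptions, not the paper's estimation: it uses a balanced synthetic panel (the actual panel is unbalanced), takes 0.164 purely as a simulation input for the true railroad effect, gives unbuilt lines a true effect of zero, and omits the clustered standard errors reported in Table 4. All names (`rail`, `placebo`, `two_way_demean`) are hypothetical.

```python
import random


def two_way_demean(x, n_i, n_t):
    """Remove district and year fixed effects from a balanced panel
    stored as x[i][t] (within transformation)."""
    row = [sum(x[i]) / n_t for i in range(n_i)]
    col = [sum(x[i][t] for i in range(n_i)) / n_i for t in range(n_t)]
    grand = sum(row) / n_i
    return [[x[i][t] - row[i] - col[t] + grand for t in range(n_t)]
            for i in range(n_i)]


def dot(a, b):
    return sum(a[i][t] * b[i][t]
               for i in range(len(a)) for t in range(len(a[0])))


random.seed(0)
N, T = 30, 20
GAMMA_TRUE = 0.164  # simulation input only

# Districts 0-9 receive a railroad at staggered dates; districts 10-19
# have a line that was planned at staggered dates but never built;
# districts 20-29 are never discussed (the excluded category).
rail = [[1.0 if i < 10 and t >= 5 + i else 0.0 for t in range(T)]
        for i in range(N)]
placebo = [[1.0 if 10 <= i < 20 and t >= i - 5 else 0.0 for t in range(T)]
           for i in range(N)]

district_fe = [random.gauss(0, 1) for _ in range(N)]
year_fe = [random.gauss(0, 1) for _ in range(T)]
y = [[district_fe[i] + year_fe[t] + GAMMA_TRUE * rail[i][t]
      + random.gauss(0, 0.02) for t in range(T)] for i in range(N)]

# Two-regressor OLS on the demeaned data via the 2 x 2 normal equations.
y_d, r_d, p_d = (two_way_demean(v, N, T) for v in (y, rail, placebo))
s_rr, s_pp, s_rp = dot(r_d, r_d), dot(p_d, p_d), dot(r_d, p_d)
s_ry, s_py = dot(r_d, y_d), dot(p_d, y_d)
det = s_rr * s_pp - s_rp ** 2
gamma_hat = (s_pp * s_ry - s_rp * s_py) / det
placebo_hat = (s_rr * s_py - s_rp * s_ry) / det
print(round(gamma_hat, 3), round(placebo_hat, 3))
```

In this simulation placement is unrelated to the error term by construction, so the placebo coefficient is close to zero. If planners had instead selected districts on expected income growth, the unbuilt-line dummy would pick up that growth and show a spurious positive coefficient, which is the pattern the placebo checks look for.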

Four-Stage Planning Hierarchy.—From 1870–1900, India’s Railways Department used one constant system for the evaluation of new railroad projects. Line proposals received from the Indian and provincial governments would appear as “proposed” in the Department’s annual Railway Report. This invited further discussion, and if the proposed line survived this criticism it would be “reconnoitered.” Providing this reconnaissance uncovered no major problems, every meter of the proposed line would then be “surveyed,” this time in painstaking and costly detail (usually taking several years to complete).³⁸ These detailed surveys would provide accurate estimates of expected construction costs, and lines whose surveys revealed modest costs would then be passed on to the Government to be “sanctioned,” or given final approval. The railroad planning process was therefore arranged as a four-stage hierarchy of tests that proposed lines would have to pass.

³⁸ Reconnaissance was a form of low-cost survey of possible track locations (typically within 100 m of their eventual location), along with a statement of all necessary bridges, tunnels, cuttings, and embankments. As Davidson (1868) and the standard engineer’s textbook of the day, Wellington (1877), make clear, surveying was much more detailed. The goal of a survey was to identify the exact position of the intended lines and to provide a precise statement of all engineering works (down to the estimated number of bricks required to build each bridge).

Column 2 of Table 4 presents an estimate of equation (16) that additionally includes regressors for railroad lines abandoned at the first three of these planning stages, with separate coefficients on each.³⁹ If line placement decisions were driven by unobservable determinants of changes in agricultural income then unbuilt lines would exhibit spurious effects (relative to the excluded category, areas in which lines were never even discussed) on agricultural income in OLS regressions with district fixed effects. Further, it is likely that lines that reached later planning stages would exhibit larger spurious effects than the lines abandoned early on (because higher expected benefits would be required to justify the increasingly costly survey process). However, the coefficients on unbuilt lines reported in column 2 are never statistically significantly different from zero, or of a similar magnitude as that corresponding to built lines. Importantly, the coefficients on each hierarchical stage of the approval process do not display a tendency to increase as they reach advanced stages of the planning process. These findings cast doubt on the extent to which India’s Railways Department was selecting districts for railroad projects on the basis of correlation with the error term in equation (16).

³⁹ The fourth stage, sanctioning, appears to have never been reached by an unbuilt line. A previous version of this paper (Donaldson 2010) estimated a coefficient for this category on the basis of one line in Madras but I have subsequently discovered that the line was in fact eventually built.

Lawrence’s Proposal.—In 1868, Viceroy John Lawrence (head of the Government of India) proposed and had surveyed a 30-year railroad expansion plan, broken into 5-year segments, that would begin where Dalhousie’s trunk lines (described in Section IC) left off.⁴⁰ Lawrence consulted widely about the optimal routes for this railroad expansion, and drew upon his 26 years of experience as an administrator in India. Upon his retirement in 1869, construction on Lawrence’s plan had just begun. But Lawrence’s successor, the Earl of Mayo, immediately halted construction and vetoed Lawrence’s proposal. Mayo was a newcomer to administration in India and a fiscal conservative, and he wasted no time in criticizing the high costs of railroad construction in India. Instead, Mayo followed a more cautious approach to railroad expansion and Lawrence’s plan was never built. However, Lawrence’s plan provides a useful window on the trajectory that he and his Government expected in the districts where they planned to expand the railroad network. If anyone was capable of forecasting developments in each district’s trading environment, developments that may be correlated with the error term in equation (16), it was likely to be Lawrence.

⁴⁰ These segments appear in the plan (published in 1868) as “to be built over the next 5 years,” “to be built between 6 and 10 years from now,” etc.

To check for this, I estimate equation (16) and additionally include lines that were part of Lawrence’s proposal. Because Lawrence’s proposal was broken into six five-year segments, I allow for separate coefficients on each of these segments and assume that the stated lines in a given five-year period would have opened at the beginning of the period.⁴¹ This provides an additional check: lines that Lawrence proposed to be built in relatively early time segments were presumably more attractive, higher priority proposals, that in addition were made under a shorter forecast horizon. Therefore, to the extent that Lawrence was able to forecast district-level developments, larger spurious effects should be found on these segments.

⁴¹ One exception concerns the first period (1869–1873) which I allow to take effect with a three-year lag, as seems plausible given typical construction periods and as is necessary to distinguish this regressor from the main effect of $\mathrm{RAIL}_{ot}$. These estimates vary only slightly if this regressor is omitted instead.

Column 3 of Table 4 presents estimates of coefficients on the lines that were identified in Lawrence’s proposal. The coefficients on these lines are never statistically significantly different from zero and substantially smaller than that on built lines. Further, the estimated coefficients on Lawrence’s early proposals are no larger on average than those on his later proposals. This is in contrast to what one would expect if Lawrence were attempting to allocate railroads to districts he expected to grow, but where his ability to forecast growth was weaker at more distant forecast horizons.

Kennedy’s Proposal.—India’s early line placement followed the suggestions of Lord Dalhousie (then head of the Government of India), but only after Dalhousie’s decade-long debate with Major Kennedy (then India’s Chief Engineer, who was charged with planning India’s first railroad lines) over optimal route choice. Kennedy was convinced that railroad construction would be extremely expensive in India (Davidson 1868). He therefore sought to connect Dalhousie’s chosen provincial capitals with a network of lines that followed the gentlest possible gradients, along river gradients and the coastline wherever possible.⁴²

⁴² The network that was built, by contrast, took straight lines in almost all circumstances, requiring in many cases (such as the Thal and Bhor Ghats) some of the most advanced railroad engineering works the world had ever seen (Andrew 1883). By 1869 it was clear that Kennedy’s pessimistic construction cost estimates were, if anything, underestimates. Indeed, high construction costs were a major factor in Mayo’s decision to abort Lawrence’s plan, as described above when introducing my second placebo variable.

Kennedy’s 1848 proposal is useful for my identification strategy because it singles out districts with low perceived railroad construction costs. Geographical features that favor low construction costs (such as topography, vegetation, and climate) may also favor agricultural production, and may result in differential unobservable trends in the real agricultural income of districts with favorable construction conditions; if favorable construction conditions drove railroad placement decisions then OLS estimates of equation (16) would erroneously attribute unobserved trends to railroad construction. I therefore estimate equation (16) while including a variable that is

29 VOL. 108 NO. 4-5 927 DONALDSON: RAILROADS OF THE RAJ an interaction between an indicator variable that captures districts that would have been penetrated by Kennedy’s proposed network and a time trend. If this variable predicts real agricultural income then this would be a concern for my identification strategy as it would suggest that the features that Kennedy found favorable for rail- ( ) road construction features that are presumably just as favorable to his successors are correlated with real agricultural income growth. Because Kennedy subdivided his proposal into high and low priority lines, I also look for differential trends across these designations. Column 4 of Table 4 presents these results, which examine the extent to which ( inexpensive districts in which to locations identified in Major Kennedy’s proposal construct a vast railroad network ) display different real agricultural income trends ( from other districts. The coefficients on Kennedy’s two types of identified lines high and low priority are both close to zero and not statistically significantly different ) from zero. Crucially, the inclusion of this variable does not change appreciably the coefficient on built railroads. This is reassuring, as it suggests that controlling for ( time-varying effects of the ) unobserved geographical features that India’s chief the engineer thought were important for building railroads cheaply has little bearing on the results estimated above. Summary and Relation to “Social Savings” Methodology E. The three sets of “placebo” results in Table 4 display a consistent pattern. 
Regardless of the expert choosing potential railroad lines (India’s public works department, India’s most senior administrator at the height of his 26-year Indian career, or India’s chief engineer), or their motivation in doing so (lines attractive to the government for many potential reasons, commercially attractive lines, or low costs of construction), unbuilt lines that these experts wanted to build are not statistically significantly correlated with time-varying unobservable determinants of real agricultural income growth. These results cast doubt on the extent to which the Government of India was willing or able to allocate railroads to districts on the basis of their expected evolution (or factors correlated with this evolution) in real agricultural income. This is perhaps unsurprising given the strong military motivations for building railroads in India outlined in Section I, the difficulty in forecasting the attractiveness of competing railroad plans (as evidenced by the stark disagreements among top-level Indian administrators described in Section VD), and the challenges of targeting precisely a highly networked infrastructure such as railroads.

Taken together, the results in Table 4 suggest that my key estimate in column 1, that railroads caused a large (16 percent) increase in real agricultural income in India, can be interpreted as a plausibly unbiased estimate of the effect of railroads on real agricultural income in India. This finding is also plausible when considered in the context of the large “social savings” literature on railroads. A social savings calculation in my context estimates the benefits of railroads to be equal to 11.2 percent of agricultural income, which is lower than (but still within the 95 percent confidence interval of) the estimate in Table 4.43 However, because numerous authors have pointed out that the social savings methodology suffers from both positive bias (due, for example, to the typical assumption of elastic transport demand) and negative bias (due, for example, to a neglect of returns to scale as in David 1969), estimates of the benefits of railroads from conventional econometric methodologies that compare exposed to unexposed regions, like that I pursue here, are of additional value.

43 The social savings approach (Fogel 1964) seeks to estimate the decrease in national income that would have resulted had railroads not existed, and if the factors of production used in the railroad sector had instead been employed in their next-best substitute (Fishlow 2000 reviews this literature). The calculation reported here is simply that due to Hurd (1983) expressed as a share of agricultural income. It is not straightforward to compare the reduction in transport prices used by Hurd (1983) in this calculation with those estimated for salt in Table 2 because the constant elasticity of distance functional form in equation (12), chosen here for its similarity to that used prominently in the international and interregional trade literatures, is different from that implicitly used in the social savings approach.

The final step of my empirical analysis explores whether the benefits due to railroads estimated in this section (a 16 percent rise in real income) are plausible in the context of the model in Section II. That is, I explore whether it is plausible that the reduction in trade costs due to railroads (estimated in Step 1), when introduced into the environment of heterogeneous technologies that existed in colonial India (estimated in Step 2), could have raised living standards by the estimated 16 percent.

VI. Empirical Step 4: A Sufficient Statistic for Railroad Impact

Steps 1 and 2 of this paper have argued that railroads significantly improved the ability to move goods cheaply within India. Step 3 demonstrated that railroads also substantially raised the level of real agricultural income. These two sets of results are qualitatively consistent with each other, in the context of the model in Section II: that is, when trade costs fall (and trade flows expand) there should be gains from trade, and these gains will show up as a rise in real income. In this section I explore whether these two sets of results are also quantitatively consistent with each other in the context of the model. Because the reduced-form impact estimated in Step 3 could arise through a number of mechanisms, the exercise in this section can also be thought of as determining the share of the observed reduced-form impact of railroads that can be explained by the trade-based mechanism in the model.

A. Empirical Strategy

In order to compare the reduced-form impact of the railroad network on each district’s real agricultural income (estimated in Step 3) to the impact that is predicted by the model, I exploit Result 4. This result is equation (11), restated here for convenience:

$$\ln\left(\frac{r_{ot}}{\tilde{P}_{ot}}\right) = \Omega + \sum_k \frac{\mu_k}{\theta_k} \ln A^k_{ot} - \sum_k \frac{\mu_k}{\theta_k} \ln \pi^k_{oot}. \tag{17}$$

Result 4 thus states that real agricultural income ($r_{ot}/\tilde{P}_{ot}$) is, up to a constant, a function of only two terms: technology ($A^k_{ot}$) and the “trade share” ($\pi^k_{oot}$, the share of district $o$’s expenditure that it buys from itself), each appropriately summed over all commodities $k$. The former term is taken to be exogenous (and driven by rainfall), while the latter term is endogenous and captures all of the (heterogeneous, general equilibrium) effects that railroads could generate in this model.

To estimate this equation, I substitute in estimates for the unobserved productivity terms $A^k_{ot}$, the unknown parameters $\theta_k$ and $\mu_k$, and the unobserved trade share term $\pi^k_{oot}$.44 I discuss these in turn. First, the goal of Step 2 was to estimate the parameter $\theta_k$ as well as the parameters $\kappa$ in the modeled relationship $\ln A^k_{ot} = \kappa \mathit{RAIN}^k_{ot}$; I use the estimates obtained in Step 2 (in conjunction with the data on $\mathit{RAIN}^k_{ot}$) here.45 Second, the parameters $\mu_k$ are simply consumer expenditure shares and I estimate these as such.46 Finally, I obtain a measure of predicted $\pi^k_{oot}$ by solving for this variable in the model equilibrium (i.e., by solving equation (6)) conditional on all estimated parameters ($\hat{\Theta} \doteq (\hat{\theta}, \hat{\kappa}, \hat{\mu}, \hat{\alpha}, \hat{\delta})$) and the value of all exogenous variables (all districts’ rainfall series, denoted by the vector $\mathbf{RAIN}_t$, the entire transportation network, $\mathbf{R}_t$, and all districts’ land sizes, $\mathbf{L}$).47 I refer to the estimated trade share term as $\pi^k_{oot}(\hat{\Theta}, \mathbf{R}_t, \mathbf{RAIN}_t, \mathbf{L})$ to denote its dependence on both estimated parameters and all exogenous variables. It is important to note that there are multiple reasons to expect this estimated trade share to be unequal to the (unobserved) data equivalent; what matters for the procedure below is an estimate that correlates well (in logs, and conditional on the fixed effects used below) with the data equivalent. This contrasts with the approach pioneered by Dekle, Eaton, and Kortum (2008), feasible in richer data settings, that calibrates all model parameters to match ex ante data exactly.
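The right-hand side of equation (17) can be illustrated numerically. The following is a minimal sketch, assuming made-up values for $\mu_k$, $\theta_k$, $\kappa$, rainfall, and the own-trade share (none of them are estimates from this paper), with $\Omega$ normalized to zero; the function name `log_real_income` is hypothetical.

```python
import math


def log_real_income(mu, theta, kappa, rain, pi_own, omega=0.0):
    """Right-hand side of equation (17) for a single district-year.

    mu, theta, rain and pi_own are lists indexed by commodity k.
    ln A_k is replaced by kappa * rain_k, following the modeled
    relationship between productivity and rainfall.
    """
    if not (len(mu) == len(theta) == len(rain) == len(pi_own)):
        raise ValueError("all commodity-level inputs must have the same length")
    total = omega
    for m, th, r, p in zip(mu, theta, rain, pi_own):
        weight = m / th                  # mu_k / theta_k
        total += weight * kappa * r      # technology (rainfall) term
        total -= weight * math.log(p)    # own-trade-share term
    return total


# Illustrative values only (assumptions, not estimates).
mu = [0.6, 0.4]        # expenditure shares
theta = [4.0, 5.0]     # comparative-advantage parameters
kappa = 0.5            # rainfall-to-productivity coefficient
rain = [1.0, 1.2]      # crop-specific rainfall

autarky = log_real_income(mu, theta, kappa, rain, [1.0, 1.0])
with_trade = log_real_income(mu, theta, kappa, rain, [0.8, 0.7])
gain = with_trade - autarky
print(f"log real income gain from lower own-trade shares: {gain:.4f}")
```

With own-trade shares of one the trade share term vanishes, so the difference between the two calls isolates the second sum in equation (17): a lower own-trade share raises log real income with weight $\mu_k/\theta_k$ per commodity.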
Result 4 (i.e., equation (17)) states that, once rainfall (through the relationship $\ln A^k_{ot} = \kappa \mathit{RAIN}^k_{ot}$, estimated in Step 2 and weighted over commodities $k$ in the manner suggested by this equation) is controlled for, the trade share ($\pi^k_{oot}$) in year $t$ is a sufficient statistic for the impact of the entire railroad network open in year $t$ on real income in year $t$. To explore Result 4 empirically I estimate equation (16) from Step 3 but additionally include the sufficient statistic variable, the trade share ($\pi^k_{oot}$), and adjust for rainfall:

$$\ln\left(\frac{r_{ot}}{\tilde{P}_{ot}}\right) - \sum_k \frac{\hat{\mu}_k}{\hat{\theta}_k} \hat{\kappa}\, \mathit{RAIN}^k_{ot} = \beta_o + \beta_t + \gamma\, \mathit{RAIL}_{ot} + \psi \sum_k \frac{\hat{\mu}_k}{\hat{\theta}_k} \ln \pi^k_{oot}(\hat{\Theta}, \mathbf{R}_t, \mathbf{RAIN}_t, \mathbf{L}) + \varepsilon_{ot}. \tag{18}$$

If the trade share (i.e., $\pi^k_{oot}$) is truly a sufficient statistic for the impact of railroads, as predicted by the model, then when the trade share is included in equation (18) all other railroad variables should lose predictive power. That is, Result 4 states that the coefficient $\gamma$ should be zero in this regression while it was statistically significantly different from zero in Step 3. Further, taking the model equation (17) literally, Result 4 also states that the coefficient $\psi$ will equal −1.48

44 The term $\pi^k_{oot}$ is unobserved because I do not observe the trade of a district with itself (intra-trade block rail shipments were never, to my knowledge, recorded).

45 One exception concerns the estimated values of $A^k_{ot}$ I use for the four main port cities (Bombay, Calcutta, Karachi, and Madras) in India, whose exports to inland Indian destinations include all sea trade imported from foreign countries (in which I do not observe rainfall). Online Appendix C discusses my method for obtaining estimates of $A^k_{ot}$, as well as of $L_o$, for these regions. Another exception concerns the values of $\theta_k$ for the grain crops that are missing or aggregated (all of which I set equal to the estimated value corresponding to the aggregate crop group of bajra and jowar, given the similarities among these grain crops) or whose estimates are negative (which I set equal to the lowest positive estimated value of $\theta_k$).

46 I estimate these Cobb-Douglas weights as the average (over trade blocks and years in which these are available) expenditure share for commodity $k$, where expenditure is calculated as output plus net imports.

47 For the fixed land size $L_o$ of each district I use the average total cultivated area across all years.

Table 5. A Sufficient Statistic for Railroad Impact: Step 4

Dependent variable: log real ag. income, corrected for rainfall

                                            (1)        (2)
Railroad in district                       0.258      0.124
                                          (0.050)    (0.050)
“Trade share,” as computed in model                  −1.587
                                                     (0.177)
Observations                               7,086      7,086
R²                                         0.835      0.844

Notes: OLS regressions estimating equation (18) using real income constructed from crop-level data on 17 principal agricultural crops (listed in online Appendix A), from 192 districts in India, annually from 1870 to 1930. Dependent variable is log real income, corrected for crop-specific rainfall of each of 17 crops, weighted across crops as in equation (18). Regressions include district fixed effects and year fixed effects. “Railroad in district” is a dummy variable whose value is one if any part of the district in question is penetrated by a railroad line. “Trade share” is the share of a district’s expenditure that it buys from itself; this variable is computed in the equilibrium of the model, where the model parameters are set to those estimated in Steps 1 and 2, and the exogenous variables (the transportation network, rainfall, and district land sizes) are as observed. Heteroskedasticity-robust standard errors corrected for clustering at the district level are reported in parentheses.

B. Results

The results from this section are presented in Table 5. As a benchmark, column 1 estimates equation (18) while omitting the “trade share” variable (i.e., $\pi^k_{oot}(\hat{\Theta}, \mathbf{R}_t, \mathbf{RAIN}_t, \mathbf{L})$). The coefficient on the railroad access dummy (i.e., $\mathit{RAIL}_{ot}$) is large and statistically significant. This estimate is larger than that in column 1 of Table 4 but still within its 95 percent confidence interval.
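The two columns of Table 5 differ only in whether the trade share regressor is included. As a rough illustration of that comparison, the sketch below simulates a balanced district-by-year panel in which a railroad dummy affects income only through a mediating variable (the idealized case in which Result 4 holds exactly), then runs OLS with district and year fixed effects, implemented by two-way demeaning, with and without the mediator. All data, parameter values, and function names are synthetic assumptions; nothing here reproduces the paper’s data or its clustered standard errors.

```python
import numpy as np


def two_way_demean(x):
    """Remove district (row) and year (column) means from a balanced panel."""
    return (x - x.mean(axis=1, keepdims=True)
              - x.mean(axis=0, keepdims=True) + x.mean())


def fe_ols(y, regressors):
    """OLS with district and year fixed effects via two-way demeaning."""
    y_dm = two_way_demean(y).ravel()
    x_dm = np.column_stack([two_way_demean(r).ravel() for r in regressors])
    coef, *_ = np.linalg.lstsq(x_dm, y_dm, rcond=None)
    return coef


rng = np.random.default_rng(0)
n_districts, n_years = 100, 30

# Staggered railroad arrival; districts with arrival >= n_years never get one.
arrival = rng.integers(5, n_years + 10, size=n_districts)
years = np.arange(n_years)
rail = (years[None, :] >= arrival[:, None]).astype(float)

# Mediator: stand-in for the weighted log own-trade share, lowered by rail.
log_share = -0.2 * rail + 0.1 * rng.standard_normal((n_districts, n_years))

# Income depends on rail only through the mediator (gamma = 0, psi = -1).
y = (rng.standard_normal((n_districts, 1))      # district effects
     + rng.standard_normal((1, n_years))        # year effects
     - 1.0 * log_share
     + 0.05 * rng.standard_normal((n_districts, n_years)))

gamma_short = fe_ols(y, [rail])[0]              # analogue of column 1
gamma_long, psi = fe_ols(y, [rail, log_share])  # analogue of column 2
print(gamma_short, gamma_long, psi)
```

In this simulated setting the railroad coefficient is close to 0.2 when the mediator is omitted and close to zero once it is included, while the mediator’s coefficient is close to −1. The empirical question is how far the actual estimates depart from this benchmark.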
While the reduced-form result in column 1 could reflect the increased opportunities to trade that railroads brought about (an effect for which I found evidence in Step 1), other possible mechanisms could also be at work.

Following the strategy laid out in equation (18), column 2 of Table 5 adds the trade share variable (i.e., $\pi^k_{oot}(\hat{\Theta}, \mathbf{R}_t, \mathbf{RAIN}_t, \mathbf{L})$) to the regression in column 1. Consistent with Result 4 of the model, the coefficient $\gamma$ on the railroad access dummy variable, which was statistically and economically significant in column 1, falls considerably (by a factor of more than two) though its 95 percent confidence interval does not include zero. This is consistent with the notion that a substantial share of the impact of railroads on real agricultural income is working through the sufficient statistic predicted by the model.

48 The computed trade share term, $\pi^k_{oot}(\hat{\Theta}, \mathbf{R}_t, \mathbf{RAIN}_t, \mathbf{L})$, is a generated regressor, so conventional standard errors obtained when using it will be incorrect. This is of little consequence here, however, because the empirical procedure in this section is concerned primarily with the magnitude of point estimates rather than statistical inference about these estimates.
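The magnitudes implied by Table 5 can be checked by hand. The sketch below uses only the point estimates and standard errors reported in the table. The normal-approximation interval and the z-statistic against −1 are illustrative additions rather than calculations reported in the paper, and because the trade share is a generated regressor (footnote 48) they should be read as rough guides only.

```python
# Point estimates and standard errors as reported in Table 5.
gamma_col1, se_col1 = 0.258, 0.050   # railroad dummy, column 1
gamma_col2, se_col2 = 0.124, 0.050   # railroad dummy, column 2
psi, se_psi = -1.587, 0.177          # trade share, column 2

# Factor by which the railroad coefficient falls between columns.
ratio = gamma_col1 / gamma_col2

# Share of the column 1 effect absorbed by the trade share variable.
share_explained = 1 - gamma_col2 / gamma_col1

# Approximate 95 percent interval for the column 2 railroad coefficient.
ci_col2 = (gamma_col2 - 1.96 * se_col2, gamma_col2 + 1.96 * se_col2)

# Distance of the trade share coefficient from the model's prediction of -1,
# in standard-error units.
z_psi = (psi - (-1.0)) / se_psi

print(f"ratio = {ratio:.2f}, share explained = {share_explained:.2f}")
print(f"95% CI for gamma (col. 2) = ({ci_col2[0]:.3f}, {ci_col2[1]:.3f})")
print(f"z-statistic for psi = -1: {z_psi:.2f}")
```

The ratio exceeds two, the share explained rounds to 0.52, the approximate interval for the column 2 railroad coefficient excludes zero, and the trade share coefficient lies more than three standard errors from −1.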

In further agreement with Result 4, the coefficient on the trade share term is negative and statistically significant, implying that the trade share, when measured in a model-consistent manner, is a strong determinant of real agricultural income. Notably, the model parameters that enter the trade share term were not estimated using data that enter the current estimating equation, so the impressive fit of the trade share term was not preordained. However, it is noteworthy that the model’s prediction of a coefficient of −1 on the trade share is rejected at the 5 percent level.

Finally, taking the point estimate of 0.124 on railroad access ($\mathit{RAIL}_{ot}$) in column 2 seriously implies that a little over one-half (i.e., $1 - \frac{0.124}{0.258} = 0.52$) of the total impact of the railroads estimated in column 1 can be explained by the mechanism of enhanced opportunities to trade according to comparative advantage, represented in the model.

The results in Table 5 establish a quantitative connection between the earlier results in this paper, that railroads improved the ability to trade within India (Steps 1 and 2) and that railroads raised real incomes (Step 3). These results suggest that the important welfare gains that railroads brought about can be well, but by no means fully, accounted for by the specific mechanism (and parameterization) of Ricardian comparative advantage-based gains from trade modeled here.

VII. Conclusion

This paper has aimed to make three contributions to our understanding of the effects of large transportation infrastructure projects in the context of an enormous expansion in transportation infrastructure: the construction of colonial India’s railroad network. Using a new panel of district-level data that I have collected from archival sources, my first contribution is to estimate the effect of India’s railroads on the trading environment there.
I find that railroads reduced the cost of trading, reduced inter-regional price gaps, and increased trade volumes.

My second contribution is to estimate the effect of India’s railroads on a proxy for economic welfare in colonial India. I find that when the railroad network was extended to the average district, real agricultural income in that district rose by approximately 16 percent. While it is possible that railroads were deliberately allocated to districts on the basis of time-varying characteristics unobservable to researchers today, I find little evidence for this potential source of bias to my results in three separate placebo checks. These reduced-form findings suggest that railroads brought welfare gains to colonial India, but say very little about the economic mechanisms behind these gains.

Finally, my third contribution is to shed light on the mechanisms at work by relating the observed railroad-driven reduction in trade costs to the observed railroad-driven increase in welfare. To do so requires an estimable, general equilibrium model of trade with many regions, many goods, and unrestricted trade costs. I extend the work of Eaton and Kortum (2002) to construct such a model and estimate its unknown parameters using auxiliary model equations. The model identifies a sufficient statistic for the effect of trade cost reductions on real income, which, when estimated and computed according to the model’s equilibrium, is a strong predictor of the evolution of real income in Indian districts over time and accounts empirically for more than one-half of the observed real income effect of railroads. This is consistent with a mechanism in which railroads raised real income in India because they reduced the cost of trading, and enabled India’s heterogeneous districts to enjoy previously unexploited gains from trade due to Ricardian comparative advantage. But these results imply substantial scope for other channels of influence from railroads to growth as well.

While the findings in this paper argue that railroads caused an increase in the level of real incomes in India, a component of economic welfare about which this paper has been silent concerns the volatility of real incomes over time. As in much of the developing world today, colonial India’s precarious monsoon rains and its rain-fed agricultural technologies made real income volatility extremely high. Famines were a perennial concern. One potentially important question for future research concerns the extent to which transportation infrastructure systems, like India’s railroad network, can help regions to smooth away the effects of local weather extremes on local well-being.

REFERENCES

Aggarwal, Shugan C. 1937. The Salt Industry in India. Delhi: Government of India Press.
Ahuja, Ravindra K., Thomas L. Magnanti, and James B. Orlin. 1993. Network Flows: Theory, Algorithms, and Applications. Upper Saddle River, NJ: Prentice-Hall.
Alcalá, Francisco, and Antonio Ciccone. 2004. “Trade and Productivity.” Quarterly Journal of Economics 119 (2): 613–46.
Anderson, James E., and Eric van Wincoop. 2004. “Trade Costs.” Journal of Economic Literature 42 (3): 691–751.
Andrabi, Tahir, and Michael Kuehlwein. 2010. “Railways and Price Convergence in British India.” Journal of Economic History 70 (2): 351–77.
Andrew, William P. 1883. Indian Railways. London: W. H. Allen & Co.
Arkolakis, Costas, Arnaud Costinot, and Andrés Rodríguez-Clare. 2012. “New Trade Models, Same Old Gains?” American Economic Review 102 (1): 94–130.
Aschauer, David Alan. 1989. “Is Public Expenditure Productive?” Journal of Monetary Economics 23 (2): 177–200.
Atack, Jeremy, Michael Haines, and Robert A. Margo. 2011. “Railroads and the Rise of the Factory: Evidence for the United States, 1850–1870.” In Economic Evolution and Revolution in Historical Time, edited by Paul W. Rhode, Joshua L. Rosenbloom, and David F. Weiman, 162–79. Stanford, CA: Stanford University Press.
Banerjee, Tarasankar. 1966. Internal Market of India (1834–1900). Calcutta: Academic Publishers.
Chaney, Thomas. 2008. “Distorted Gravity: The Intensive and Extensive Margins of International Trade.” American Economic Review 98 (4): 1707–21.
Costinot, Arnaud, Dave Donaldson, and Ivana Komunjer. 2012. “What Goods Do Countries Trade? A Quantitative Exploration of Ricardo’s Ideas.” Review of Economic Studies 79 (2): 581–608.
David, Paul A. 1969. “Transport Innovation and Economic Growth: Professor Fogel on and off the Rails.” Economic History Review 22 (3): 506–25.
Davidson, Edward. 1868. The Railways of India: With an Account of Their Rise, Progress and Construction. London: E. and F. N. Spon.
Davis, Kingsley. 1951. The Population of India and Pakistan. Princeton: Princeton University Press.
Dekle, Robert, Jonathan Eaton, and Samuel Kortum. 2008. “Global Rebalancing with Gravity: Measuring the Burden of Adjustment.” IMF Staff Papers 55 (3): 511–40.
Deloche, Jean. 1994. Transport and Communications in India Prior to Steam Locomotion: Volume I: Land Transport. Oxford: Oxford University Press.
Deloche, Jean. 1995. Transport and Communication in India Prior to Steam Locomotion: Volume 2: Water Transport. Oxford: Oxford University Press.
Derbyshire, Ian. 1985. “Opening Up the Interior: The Impact of Railways upon the North Indian Economy and Society, 1860–1914.” PhD diss., University of Cambridge.
Dinkelman, Taryn. 2011. “The Effects of Rural Electrification on Employment: New Evidence from South Africa.” American Economic Review 101 (7): 3078–3108.
Directorate of Economics and Statistics. 1967. Indian Crop Calendar. Delhi: Government of India Press.

Donaldson, Dave. 2010. “Railroads of the Raj: Estimating the Impact of Transportation Infrastructure.” NBER Working Paper 16487.
Donaldson, Dave. 2018. “Railroads of the Raj: Estimating the Impact of Transportation Infrastructure: Dataset.” American Economic Review. https://doi.org/10.1257/aer.20101199.
Duflo, Esther, and Rohini Pande. 2007. “Dams.” Quarterly Journal of Economics 122 (2): 601–46.
Dutt, Romesh. 1904. The Economic History of India in the Victorian Age. London: Kegan.
Eaton, Jonathan, and Samuel Kortum. 2002. “Technology, Geography and Trade.” Econometrica 70 (5): 1741–79.
Feyrer, James. 2009. “Trade and Income: Exploiting Time Series in Geography.” NBER Working Paper 14910.
Fishlow, Albert. 2000. “Internal Transportation in the Nineteenth and Early Twentieth Centuries.” In The Cambridge Economic History of the United States, Volume II: The Long Nineteenth Century, edited by Stanley L. Engerman and Robert E. Gallman, 543–642. Cambridge: Cambridge University Press.
Fogel, Robert W. 1964. Railroads and American Economic Growth: Essays in Econometric History. Baltimore: Johns Hopkins Press.
Frankel, Jeffrey A., and David H. Romer. 1999. “Does Trade Cause Growth?” American Economic Review 89 (3): 379–99.
Gandhi, Mahatma. 1938. Indian Home Rule. Ahmedabad: Navajivan Publishing House.
Head, Keith, and Anne-Célia Disdier. 2008. “The Puzzling Persistence of the Distance Effect on Bilateral Trade.” Review of Economics and Statistics 90 (1): 37–48.
Headrick, Daniel R. 1988. The Tentacles of Progress: Technology Transfer in the Age of Imperialism, 1850–1940. New York: Oxford University Press.
Herrendorf, Berthold, James A. Schmitz, and Arilton Teixeira. 2012. “The Role of Transportation in US Economic Development: 1840–1860.” International Economic Review 53 (3): 693–716.
Heston, Alan. 1983. “National Income.” In The Cambridge Economic History of India, Vol. 2, edited by Dharma Kumar and Meghnad Desai, 376–462. Cambridge: Cambridge University Press.
House of Commons Papers. 1853. “Correspondence from Governor General in India, Relative to Railway Undertakings.” British Parliamentary Papers, 1852–53 (787): LXXVI.481.
Hummels, David. 2007. “Transportation Costs and International Trade in the Second Era of Globalization.” Journal of Economic Perspectives 21 (3): 131–54.
Hurd, John M. 1983. “Railways.” In The Cambridge Economic History of India, Vol. 2, edited by Dharma Kumar and Meghnad Desai, 737–61. Cambridge: Cambridge University Press.
Jensen, Robert. 2007. “The Digital Provide: Information (Technology), Market Performance, and Welfare in the South Indian Fisheries Sector.” Quarterly Journal of Economics 122 (3): 879–924.
Johnson, J. 1963. The Economics of Indian Rail Transport. Bombay: Allied Publishers.
Keller, Wolfgang, and Carol H. Shiue. 2008. “Institutions, Technology, and Trade.” NBER Working Paper 13913.
Kerr, Ian J. 2007. Engines of Change: The Railroads that Made India. Westport, CT: Praeger.
Krugman, Paul. 1980. “Scale Economies, Product Differentiation, and the Pattern of Trade.” American Economic Review 70 (5): 950–59.
Macpherson, W. J. 1955. “Investment in Indian Railways, 1845–1875.” Economic History Review 8 (2): 177–86.
Michaels, Guy. 2008. “The Effect of Trade on the Demand for Skill: Evidence from the Interstate Highway System.” Review of Economics and Statistics 90 (4): 683–701.
Munshi, Kaivan, and Mark Rosenzweig. 2016. “Networks and Misallocation: Insurance, Migration, and the Rural-Urban Wage Gap.” American Economic Review 106 (1): 46–98.
Naidu, Narayanaswami. 1936. Coastal Shipping in India. Madras.
Pavcnik, Nina. 2002. “Trade Liberalization, Exit, and Productivity Improvements: Evidence from Chilean Plants.” Review of Economic Studies 69 (1): 245–76.
Risley, Herbert H., and Edward A. Gait. 1903. Census of India 1901, Volume 15, Madras. Calcutta: Superintendent of Government Printing.
Rosenzweig, Mark R., and Oded Stark. 1989. “Consumption Smoothing, Migration, and Marriage: Evidence from Rural India.” Journal of Political Economy 97 (4): 905–26.
Sanyal, Nalinaksha. 1930. Development of Indian Railways. Calcutta: University of Calcutta Press.
Settar, S., ed. 1999. Railway Construction in India: Selected Documents. 3 vols. New Delhi: Northern Book Centre.
Simonovska, Ina, and Michael E. Waugh. 2014. “The Elasticity of Trade: Estimates and Evidence.” Journal of International Economics 92 (1): 34–50.
Stone, Ian. 1984. Canal Irrigation in British India. Cambridge: Cambridge University Press.

Thorner, Daniel. 1950. Investment in Empire: British Railway and Steam Shipping Enterprise in India, 1825–1849. Philadelphia: University of Pennsylvania Press.
Topalova, Petia. 2010. “Factor Immobility and Regional Impacts of Trade Liberalization: Evidence on Poverty from India.” American Economic Journal: Applied Economics 2 (4): 1–41.
Trefler, Daniel. 2004. “The Long and Short of the Canada-US Free Trade Agreement.” American Economic Review 94 (4): 870–95.
United States Census Office. 1902. United States Census; Volume VI (Agriculture), Part II. Washington, DC: United States Census Office.
Wallace, Robert. 1892. Indian Agriculture. Calcutta: Thacker, Spink and Co.
Watt, George. 1889. A Dictionary of the Economic Products of India. London: J. Murray.
Wellington, Arthur Mellen. 1877. The Economic Theory of the Location of Railways. New York: John Wiley & Sons.
Williamson, Jeffrey G. 1974. Late Nineteenth-Century American Development: A General Equilibrium History. Cambridge: Cambridge University Press.
World Bank. 2007. Evaluation of World Bank Support to Transportation Infrastructure. Washington, DC: World Bank Publications.
