The Historical Evolution of the Wealth Distribution - MIT Economics

NBER WORKING PAPER SERIES

THE HISTORICAL EVOLUTION OF THE WEALTH DISTRIBUTION: A QUANTITATIVE-THEORETIC INVESTIGATION

Joachim Hubmer
Per Krusell
Anthony A. Smith, Jr.

Working Paper 23011
http://www.nber.org/papers/w23011

NATIONAL BUREAU OF ECONOMIC RESEARCH
1050 Massachusetts Avenue
Cambridge, MA 02138
December 2016

For helpful comments the authors would like to thank Chris Carroll, Harald Uhlig, and seminar participants at the 2015 SED Meetings, the 2015 Hydra Workshop on Dynamic Macroeconomics, Johns Hopkins, Indiana, Penn State, SOFI, and Yale. The views expressed herein are those of the authors and do not necessarily reflect the views of the National Bureau of Economic Research.

At least one co-author has disclosed a financial relationship of potential relevance for this research. Further information is available online at http://www.nber.org/papers/w23011.ack

NBER working papers are circulated for discussion and comment purposes. They have not been peer-reviewed or been subject to the review by the NBER Board of Directors that accompanies official NBER publications.

© 2016 by Joachim Hubmer, Per Krusell, and Anthony A. Smith, Jr. All rights reserved. Short sections of text, not to exceed two paragraphs, may be quoted without explicit permission provided that full credit, including © notice, is given to the source.

The Historical Evolution of the Wealth Distribution: A Quantitative-Theoretic Investigation
Joachim Hubmer, Per Krusell, and Anthony A. Smith, Jr.
NBER Working Paper No. 23011
December 2016
JEL No. D14, D31, D33, E21, E25, E62, H31

ABSTRACT

This paper employs the benchmark heterogeneous-agent model used in macroeconomics to examine drivers of the rise in wealth inequality in the U.S. over the last thirty years. Several plausible candidates are formulated, calibrated to data, and examined through the lens of the model. There is one main finding: by far the most important driver is the significant drop in tax progressivity that started in the late 1970s, intensified during the Reagan years, and then subsequently flattened out, with only a minor bounce back. The sharp observed increases in earnings inequality, the falling labor share over the recent decades, and potential mechanisms underlying changes in the gap between the interest rate and the growth rate (Piketty's r − g story) all fall far short of accounting for the data.

Joachim Hubmer
Department of Economics
Yale University
28 Hillhouse Avenue
New Haven, CT 06520
[email protected]

Per Krusell
Institute for International Economic Studies
Stockholm University
106 91 STOCKHOLM
SWEDEN
and NBER
[email protected]

Anthony A. Smith, Jr.
Department of Economics
Yale University
P.O. Box 208268
New Haven, CT 06520
and NBER
[email protected]

The Historical Evolution of the Wealth Distribution: A Quantitative-Theoretic Investigation
Joachim Hubmer, Per Krusell, and Anthony A. Smith, Jr.∗
January 7, 2017


1 Introduction

The distribution of wealth in most countries for which there is reliable data is strikingly uneven. There is also recent work suggesting that the wealth distribution has undergone significant movements over time, most recently with a large upward swing in dispersion in several Anglo-Saxon countries.¹ For example, according to the estimates in Saez & Zucman (2016) for the United States, the share of overall wealth held by the top 1% has increased from around 25% in 1980 to over 40% today; for the top 0.1% it has increased from less than 10% to over 20% over the same time period.

The observed developments have generated strong reactions across the political spectrum. In his 2014 book, Capital in the Twenty-First Century, Piketty is obviously motivated by the growing inequality in itself, but he also suggests that further increases in wealth concentration may lead to both economic and democratic instability. Conservatives in the U.S. have expressed worries as well: is the American Dream really still alive, or might it be that a large fraction of the population simply will no longer be able to productively contribute to society? Given, for example, that parental wealth and well-being are important determinants behind children’s human capital accumulation, this appears to be a legitimate concern regardless of one’s political views. As a result of these concerns, a number of policy changes have been proposed and discussed.

∗ The authors’ affiliations are, respectively, Yale University; Institute for International Economic Studies, NBER, and CEPR; and Yale University and NBER. For helpful comments the authors would like to thank Chris Carroll, Harald Uhlig, and seminar participants at the 2015 SED Meetings, the 2015 Hydra Workshop on Dynamic Macroeconomics, Johns Hopkins, Indiana, Penn State, SOFI, and Yale.
¹ See, e.g., Piketty (2014) and Saez & Zucman (2016).

The primary aim of the present paper is to understand the determinants of the observed movements in wealth inequality. This aim is basic but well-motivated: to compare different policy actions, we need a framework for thinking about what causes inequality and for addressing how inequality and other variables are influenced by any policy proposal at hand. In an effort to understand the movements in wealth inequality, Piketty (2014) and its online appendix suggest specific mathematical theories, and as part of the present study we examine those theories.² Our aim, however, is to start instead from a more general, and by now rather standard, quantitatively oriented theory used in the heterogeneous-agent literature within macroeconomics: the Bewley-Huggett-Aiyagari model. This is a very natural setting for the study of inequality. This model incorporates rich detail on the household level along the lines of the applied work in the consumption literature, allowing several sources of heterogeneity among consumers. It is based on incomplete markets and, hence, does not feature the “infinite elasticity of capital supply” of dynastic models with complete markets.³ This model also involves equilibrium interaction: inequality is determined not only by individual households’ reactions to changes in the economic environment in which they operate but also by their interaction, such as in the equilibrium formation of wages and interest rates, two key prices determining the returns to labor and holding wealth, respectively. Our aim is to see to what extent a reasonably calibrated model can account for the movements in wealth inequality from the mid-1960s onward as a function of a number of drivers, the importance of each of which we then evaluate in separate counterfactuals.⁴ In this endeavor, we proceed as follows.
We build on the model studied in Aiyagari (1994), i.e., we use the core setting of the recent literature on heterogeneous agents in macroeconomics.⁵ This kind of theoretical model is quantitative in nature: it is constructed as an aggregate version of the applied work on consumption. Moreover, in it, inequality plays a central role. We calibrate some key parameters of this model to match the wealth and income distributions in the United States in the mid-1960s and treat these distributions as representing a long-run steady state. In the 1960s, too, the dispersion of wealth was striking, and it is not obvious how to make the basic model match the data in this respect. Building on the formulation in Krusell & Smith (1998), we use preference heterogeneity (in particular, stochastic discount rates that vary across the population at a point in time) to generate individual behavior among the very richest characterized by propensities to save that are stochastic but (almost completely) independent of wealth. We also incorporate idiosyncratic random asset returns, for which recent work by Fagereng et al. (2015) and Bach et al. (2015) uncovers evidence in panel data from Norway and Sweden. Hence, our setting can be viewed as a microfoundation for the kind of models entertained in Piketty & Zucman (2015) (who assume linear laws of motion for wealth accumulation and either random saving propensities or random returns). These models generate a wealth distribution whose right tail is Pareto-shaped, a feature shown to characterize the data; we discuss this finding and the relation to a number of other papers building on the same kind of reduced form in detail in the paper.

² The appendix is available here: http://piketty.pse.ens.fr/files/capital21c/en/Piketty2014TechnicalAppendix.pdf. See also Piketty (1995) and Piketty (1997), which develop theories of the dynamics of the wealth distribution.
³ This elasticity refers to the long-run response of a household’s savings to a change in the interest rate: in particular, with infinitely-lived consumers and complete markets the equilibrium interest rate is pinned down by the rate of time preference.
⁴ We do not specifically study Piketty’s “Second Fundamental Law”, which is not a theory about inequality per se but about the aggregate capital-output ratio and which has also been extensively examined in Krusell & Smith (2015).
⁵ The first application in this literature was one to asset pricing (the risk-free rate): Huggett (1993). Aiyagari (1994) addresses the long-run level of precautionary saving, whereas Krusell & Smith (1998) look at business cycles.

With the resulting realistic starting wealth distribution, we then examine a number of potential drivers of wealth inequality over the subsequent period. One is tax rates: beginning around 1980 tax rates fell significantly for top incomes, so that tax progressivity in particular fell substantially. Thus, higher returns to saving in the upper brackets since that time can potentially explain increased wealth gaps between the rich and the poor. Another potential explanation for increased wealth inequality is the rather striking increase in wage/earnings inequality witnessed since the mid-1970s. Since at least Katz & Murphy (1992) it has been well-documented that the education skill premium has risen. Moreover, numerous studies have since documented that the premia associated with other measures of skill have also risen, as have measures of residual, or frictional, wage dispersion.⁶ In terms of the very highest earners, Piketty & Saez (2003) document significant movements toward thicker tails in the upper parts of the distribution. So to the extent that this increased income inequality has translated into savings and wealth inequality, it could explain some of the changes we set out to analyze. Relatedly, the share of total income paid to capital has increased recently, potentially contributing to increased wealth inequality (see, e.g., Karabarbounis & Neiman (2014b)). We consider this factor as well in this study. Finally, we look at the mechanism suggested in Piketty & Zucman (2015) whereby movements in r − g, i.e., the real interest
rate net of the growth rate of real GDP, would increase wealth inequality. The underlying idea here is that capital owners become richer the higher is the return on their savings, r, whereas workers’ economic strength increases with wage (or GDP) growth.

⁶ See, e.g., Acemoglu (2002), Hornstein et al. (2005), and Quadrini & Rios-Rull (2015).

Thus, the overall methodology we follow is to attempt to quantify the mechanisms just mentioned and then to examine their individual (and joint) effects on the evolution of wealth inequality from the 1960s.

For the time period considered, we find that the benchmark model does account for a significant share of the increase in wealth inequality. The model is more or less successful depending on what aspect of wealth distribution is in focus. The shares of wealth held by the top 10% or top 1% exhibit net increases that are very similar in the model and in the data, though for the top 0.1% and 0.01% the model does not deliver enough of an increase, especially for the very top group. For the bottom 50%, the model’s fit is also good. Furthermore, the model delivers a time path for the ratio of capital to net output that is similar to the one in the data. As for the timing of the changes, the model delivers a rather smooth increase in inequality, whereas the data shows faster swings, first down and then up.

Turning to which specific features explain the largest fractions of the increase in wealth inequality, the marked decrease in tax progressivity is by far the most powerful force for increasing wealth inequality. First, other things equal, decreasing tax progressivity spreads out the distribution of after-tax resources available for consumption and saving. Second, decreasing tax progressivity increases the returns to saving, leading to higher wealth accumulation, especially among the rich for whom wages (earnings) play a smaller role in their decision-making. Wage inequality, on the other hand, on net contributes negatively to wealth inequality: wealth inequality increases by more in a model with changes in progressivity unaccompanied by increases in wage inequality than in a model with both types of changes. We follow Heathcote et al. (2010) and Piketty & Saez (2003) in modelling increased wage inequality as an increase in the riskiness of wage realizations around a mean. In a standard additive permanent-plus-transitory model of wages, we use the estimated time series in Heathcote et al. (2010) for the variances of the permanent and transitory shocks to wages. Both of those variances have increased over time, leading to a reduction in wealth inequality for two reasons. First, increasing wage risk dampens the tendency of heterogeneity in discount rates to drive apart the distribution of wealth.⁷ In particular, as wage risk increases, poorer and less patient consumers (who are less well-insured against this risk through their own savings) engage in additional precautionary saving, compressing the distribution of wealth at the low end. Second, with more risk aggregate precautionary savings increase, reducing the equilibrium interest rate and thereby discouraging wealth accumulation amongst the rich for whom wage risk per se is not so important. In sum, the increasing riskiness of wages compresses the wealth distribution at both ends.⁸ In addition, we follow Piketty & Saez (2003) by adding a Pareto-shaped tail to the wage distribution so as to match the concentration of earnings at the top of the earnings distribution; the standard wage process (as in Heathcote et al. (2010)) does not match this extreme right tail well. Moreover, the right tail has thickened over this period, and accordingly we model this thickening as a gradually decreasing Pareto coefficient, based on the estimates in Piketty & Saez (2003).
This element of increased wage inequality does generate more wealth inequality (because it occurs in a segment of the population where most workers are already rather well-insured through their own savings) but it is not so potent as to produce a net overall increase in wealth inequality from higher wage inequality.

To allow for an increasing capital share over time we conduct an experiment using a CES production function with a somewhat higher than unitary elasticity between capital and labor. The resulting paths in this experiment differ only marginally from those in the usual case with unitary elasticity. Finally, we conduct a more stylized r − g experiment in which r increases exogenously by 50% for a decade before returning to its initial steady-state value. This increase decreases inequality in wealth in the short run, because the elasticity of saving to r − g turns out to be stronger for the poorer agents in our calibrated model. In the very long run, an increase in r − g would translate into increased inequality but, as the recent paper by Gabaix et al. (2016) points out, it takes a long time to fill a long right tail. Hence, even if there were some mechanism driving an increase in r − g during the past half-century, it would not help to explain the observed sharp upward movements in wealth inequality over this period.

What are the implications of our dynamic model of wealth inequality for the future? Quite strikingly, if the progressivity of taxes remains at today’s historically low level, then wealth inequality will continue to climb and reach very high levels by, say, 2100: the top 10% will have an additional 10% of all wealth, as will (approximately) the top 1%. Thus, decreasing the progressivity of taxes is a rather powerful mechanism for wealth concentration.

⁷ As Becker (1980) shows, if discount rates are permanently different and there is no wage risk at all, then in the long-run steady state the most patient consumer owns all of the economy’s wealth.
⁸ Similar forces are at play in Krusell et al. (2009), but in the opposite direction: they find that reductions in wage risk that accompany the elimination of business cycles lead to higher wealth inequality.


Our paper begins in Section 2 with a brief literature review, the purpose of which is to put our modeling in a historical perspective. We discuss the data on wealth inequality and its recent trends in Section 3. We describe the basic model in Section 4 and the implied behavior of the very richest in Section 5. Section 6 discusses the calibration in detail and Section 7 the results. We conclude our paper in Section 8 with a brief discussion of potential other candidate explanations behind the increased wealth inequality and, hence, of possible future avenues for research.

2 Connections to the recent macro-inequality literature

The study of inequality in wealth using structural macroeconomic modeling can be said to have started with Bewley (undated), though in Bewley’s paper the focus was not on inequality per se.⁹ Bewley’s paper was not completed (it stops abruptly in the middle) and the first papers to provide a complete analysis of frameworks like his are Huggett (1993) and Aiyagari (1994). A defining characteristic of these models is that long-run household wealth responds smoothly to the interest rate, so long as the interest rate is not too high (higher than the discount rate in the case without growth). In their early papers, neither Bewley nor Huggett nor Aiyagari focused on inequality per se but rather on other phenomena related to inequality (asset pricing and aggregate precautionary saving in the latter two cases, respectively). Soon after, however, the macroeconomic literature that arose from these analyses began to address inequality directly. There were several reasons for this development. One was the interest in building macroeconomic models with microeconomic foundations in which heterogeneity could influence aggregates, i.e., cases that are in some sense far from aggregation and the typical permanent-income behavior that characterize the complete-markets model.¹⁰ Another was an interest in wealth inequality per se and the challenge it posed: the difficulty that these models have in generating significant equilibrium wealth inequality. The difficulty is apparent in Aiyagari (1994), where the wage process is calibrated to PSID data (as an AR(1) in logs): the resulting wealth distribution is slightly more skewed than the wage distribution the model uses as an input, but not by much. The Gini index for wealth, in the stationary distribution of Aiyagari’s model, is only around 0.4, whereas it is around 0.8 in the data.

⁹ This model is of course not the first one with theoretical implications for inequality. An early example is Stiglitz (1969) who, building on his 1966 Ph.D. dissertation, studies the dynamics of the distributions of income and wealth in a neoclassical growth model with exogenous linear savings functions. A defining characteristic of the literature in focus here is that consumers face problems much like those studied in the applied consumption literature: they are risk-averse and choose optimal saving in the presence of earnings shocks for which there is not a full set of state-contingent markets.
¹⁰ See, e.g., Krusell & Smith (1998) and Guerrieri & Lorenzoni (2011) for this line of work.

The purpose here is not to go over the entire literature aiming at matching the wealth distribution, but several different extensions of the model have been proposed in order to match the data better. On some general level, successful paths forward involve introducing “more heterogeneity”: typically in preferences (such as discount factors, as in Krusell & Smith (1998)), in the wage/earnings process (as in Castañeda et al. (2003)), or in occupation (as in Cagetti & De Nardi (2006) or Quadrini (2000)). More recently, a literature has evolved that focuses on explaining the observed Pareto tail at the top of the wealth distribution. Benhabib et al. (2011) show analytically that the stationary wealth distribution in an overlapping-generations (OLG) economy with idiosyncratic capital return risk has a Pareto tail.
Analogously, they provide analytical results for an infinite-horizon economy (Benhabib et al., 2015b). In Benhabib et al. (2015a), they conduct a quantitative investigation of social mobility and the wealth distribution in an OLG economy with idiosyncratic returns, which are fixed over a life-time. In a stylized model, Gabaix et al. (2016) demonstrate that the random growth mechanism that can generate the Pareto tail in the wealth distribution (either through idiosyncratic capital return risk or random discount factors) implies very slow transitional dynamics. Furthermore, Nirei & Aoki (2016) consider a stationary Bewley economy with investment risk. In that setting they find that decreasing top tax rates can explain the increasing concentration of wealth at the top. Most of the literature on Bewley models has considered only the stationary (long-run) wealth distribution. Two recent exceptions are Kaymak & Poschke (2016), who in line with our analysis here aim to quantify the contribution of changes in taxes and transfers and in the earnings distribution to changes in the U.S. wealth distribution, and Aoki & Nirei (forthcoming) who study how a one-time drop in tax rates affects transitional dynamics in a setting with investment risk. Relative to these recent contributions, the present paper builds directly on Aiyagari (1994) and matches the wealth distribution with the aid of stochastic, heterogeneous discount rates and idiosyncratic asset returns. As we show below, the randomness in discount rates and rates of return generates capital accumulation dynamics for the very richest that are similar to those in the recent theoretical studies on Pareto tails just cited, including the very slow transitional dynamics. For earnings, we follow Aiyagari (1994) but add a transitory shock to earnings as well as an exogenous Pareto-shaped tail in earnings. 
Because we also consider transitional dynamics, it is important to investigate how our results might depend on the extent to which agents can foresee the changes in taxes and other exogenous factors; here we consider both perfect foresight and a “myopic” alternative. We do not incorporate assets like land, housing, or stock-market equity but focus on physical capital only. This is potentially an important omission insofar as the returns on these assets are random and have experienced a growing variance over time, as discussed in our concluding remarks in Section 8.

3 Measuring wealth inequality over time

Over the last century, the distribution of wealth in the United States has undergone drastic changes, and we very briefly review data from some key studies here. Throughout the time period considered, wealth was heavily concentrated at the top. Figure 1 shows the evolution of the share of total wealth held by the top 1% and the top 0.1%, as measured using different estimation methods.¹¹ Considering all three methods jointly, top wealth inequality exhibits a U-shaped pattern in the twentieth century. Yet, the magnitude of the increase in wealth concentration in the last thirty years differs substantially among estimation methodologies. We will calibrate the initial steady state of our model to the wealth shares estimated by Saez & Zucman (2016) and consequently compare the model transition to their estimates.

¹¹ In Figure 1, the lines labelled “SCF” display findings from the Survey of Consumer Finances, as reported in Saez & Zucman (2016). The lines labelled “Capitalization” display findings from Saez & Zucman (2016), who back out the stock of wealth held by a tax unit from observed capital income tax data. Finally, the lines labelled “Estate tax multiplier” display findings from Kopczuk & Saez (2004), who use observed estate tax data to make inferences about the distribution of wealth. See Kopczuk (2015) for a detailed comparison of the different measurement methods.

[Figure 1: Top wealth share measurements over time. Vertical axis: wealth share in %. Horizontal axis: year, 1920 to 2010. Series: Capitalization, SCF, and Estate tax multiplier, each for the top 1% and the top 0.1%.]

The Saez & Zucman (2016) estimates are especially useful for us as they allow for considering a group as small as the top 0.01%. Furthermore, they cover a long time period. While the capitalization method that they use to back out wealth estimates does not suffer from the shortcomings of the SCF data (such as concerns about response-rate bias and exclusion of the Forbes 400), it is an indirect way of measuring wealth and as such has other drawbacks. For example, the tax data allows only for a coarse partitioning of capital income in asset classes and within each class returns are effectively assumed to be homogeneous. Therefore, we will in addition contrast our findings to estimates from the Survey of Consumer Finances.¹²

Another takeaway from Figure 1 is that the wealth distribution was quite stable in the 1950s and 1960s. Because, in addition, some of the time series estimates we feed into our model start in 1967, we take this year as the initial steady state in our model.
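To make the object plotted in Figure 1 concrete, the following is a minimal sketch of how a top wealth share can be computed from a (possibly weighted) sample of wealth holdings. It is only an illustration of the definition of a top share: it is not the capitalization, SCF, or estate-tax-multiplier procedure, and the function name `top_share` and its arguments are hypothetical.

```python
import numpy as np

def top_share(wealth, top_fraction, weights=None):
    """Share of total wealth held by the richest `top_fraction` of units.

    `weights` are optional sampling weights (assumed equal if omitted).
    """
    wealth = np.asarray(wealth, dtype=float)
    if weights is None:
        weights = np.ones_like(wealth)
    weights = np.asarray(weights, dtype=float)

    # Sort units from richest to poorest.
    order = np.argsort(wealth)[::-1]
    w, m = wealth[order], weights[order]

    # Cumulative population and wealth shares, starting from the top.
    cum_pop = np.concatenate(([0.0], np.cumsum(m) / m.sum()))
    cum_wealth = np.concatenate(([0.0], np.cumsum(w * m) / np.sum(w * m)))

    # Linear interpolation handles cutoffs that fall inside a unit's weight.
    return float(np.interp(top_fraction, cum_pop, cum_wealth))


# Ten equally weighted units: nine hold 1 each, one holds 91.
sample = [1, 1, 1, 1, 1, 1, 1, 1, 1, 91]
share_top10 = top_share(sample, 0.10)
```

In this toy sample the richest unit is exactly the top 10% and holds 91 of 100 units of wealth, so the computed share is 0.91.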

4 Model framework

In this section, we describe the model economy. We take the framework studied by Aiyagari (1994) as our point of departure. To generate realistic income and wealth heterogeneity, the model features stochastic discount rates and returns to capital as well as an earnings process centered around a persistent and a temporary component.

¹² Bricker et al. (2016) make adjustments to the SCF data, including incorporating the Forbes 400. For the top 0.1% wealth shares these adjustments roughly cancel. For the top 1% shares these adjustments shift the corresponding line in Figure 1 down by approximately 2 to 3 percentage points.

4.1 Consumers

Time is discrete and there is a continuum of infinitely lived, ex ante identical consumers (dynasties). Preferences are defined over infinite streams of consumption with von Neumann-Morgenstern utility in constant relative risk aversion (CRRA) form:

$$u(c) = \frac{c^{1-\gamma}}{1-\gamma}. \tag{1}$$

In period $t$, a consumer discounts the future with an idiosyncratic stochastic factor $\beta_t$ that is the realization of a Markov process characterized by the conditional distribution $\Gamma_\beta(\beta_{t+1} \mid \beta_t)$, giving rise to the following objective:

$$\max_{(c_t)_{t=0}^{\infty}} \left\{ u(c_0) + E_0\left[ \sum_{t=1}^{\infty} \prod_{s=0}^{t-1} \beta_s \, u(c_t) \right] \right\}. \tag{2}$$
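As an illustration of how the stochastic discounting in objective (2) works, the sketch below evaluates the term inside the expectation for one realized path of consumption and discount factors, truncated to a finite horizon. The function names and the finite truncation are assumptions made for this example, and the logarithmic branch for a risk aversion of one is the conventional limiting case rather than something stated in the text.

```python
import numpy as np

def crra_utility(c, gamma):
    """CRRA utility as in equation (1); log utility is used when gamma == 1
    (conventional limiting case, an assumption of this sketch)."""
    c = np.asarray(c, dtype=float)
    if np.isclose(gamma, 1.0):
        return np.log(c)
    return c ** (1.0 - gamma) / (1.0 - gamma)

def realized_lifetime_utility(consumption, betas, gamma):
    """u(c_0) + sum_{t>=1} (beta_0 * ... * beta_{t-1}) * u(c_t) for one path.

    `consumption` holds c_0, ..., c_T and `betas` holds beta_0, ..., beta_{T-1}.
    """
    c = np.asarray(consumption, dtype=float)
    b = np.asarray(betas, dtype=float)
    # The weight on period 0 is 1; the weight on period t is the cumulative
    # product of all discount factors realized before t.
    weights = np.concatenate(([1.0], np.cumprod(b[: len(c) - 1])))
    return float(np.sum(weights * crra_utility(c, gamma)))


# Three periods of unit consumption, discount factors 0.9 then 0.8, gamma = 2.
value = realized_lifetime_utility([1.0, 1.0, 1.0], [0.9, 0.8], gamma=2.0)
```

With unit consumption and a risk aversion of two, per-period utility is −1, so the path value is −(1 + 0.9 + 0.9 × 0.8) = −2.62. The expectation in (2) would average such path values over realizations of the discount-factor process.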

Labor supply is exogenous. Each period $t$, a consumer supplies a stochastic amount $l_t = l_t(p_t, \nu_t)$ of efficiency units of labor to the market that depends on a persistent component $p_t \sim \Gamma_p(p_t \mid p_{t-1})$ and a transitory component $\nu_t \sim \Gamma_\nu(\nu_t)$. Taking as given a competitive wage rate $w_t$, her earnings are $w_t l_t$.

Asset markets are incomplete: consumers cannot fully insure against idiosyncratic shocks, but instead have access only to a single asset that pays a gross return $(1 + r_t \eta_t)$, where $r_t$ is the average market return and $\eta_t \sim \Gamma_\eta(\eta_t)$ is a transitory idiosyncratic shock.¹³
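The persistent-plus-transitory structure of labor efficiency can be illustrated with a short simulation. The sketch below assumes, purely for illustration, that the persistent component follows a Gaussian AR(1) in logs, that the transitory component is i.i.d. Gaussian, and that the two combine additively in logs; the text above leaves the distributions general, and the function name and parameter values are hypothetical rather than the calibration used in the paper.

```python
import numpy as np

def simulate_labor_efficiency(T, rho, sigma_p, sigma_nu, seed=0):
    """Simulate l_t = exp(p_t + nu_t) for one consumer over T periods.

    Assumed (illustrative) processes:
      p_t  = rho * p_{t-1} + sigma_p * e_t,  e_t ~ N(0, 1)   (persistent)
      nu_t = sigma_nu * u_t,                 u_t ~ N(0, 1)   (transitory)
    """
    rng = np.random.default_rng(seed)
    p = np.zeros(T)
    for t in range(1, T):
        p[t] = rho * p[t - 1] + sigma_p * rng.standard_normal()
    nu = sigma_nu * rng.standard_normal(T)
    labor = np.exp(p + nu)  # efficiency units are strictly positive
    return labor, p, nu


# Hypothetical parameter values, chosen only to make the example run.
labor, p, nu = simulate_labor_efficiency(T=50, rho=0.9, sigma_p=0.2, sigma_nu=0.1)
earnings = 1.0 * labor  # earnings w_t * l_t with the wage normalized to one
```

Raising either standard deviation increases the dispersion of earnings around their mean, which is the sense in which increased wage inequality is modelled as increased wage risk.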

The decision problem of the consumer can be stated parsimoniously in recursive form:

$$V_t(x_t, p_t, \beta_t) = \max_{a_{t+1} \ge \underline{a}} \left\{ u(x_t - a_{t+1}) + \beta_t E\left[ V_{t+1}(x_{t+1}, p_{t+1}, \beta_{t+1}) \mid p_t, \beta_t \right] \right\} \tag{3}$$

subject to

$$x_{t+1} = a_{t+1} + y_{t+1} - \tau_{t+1}(y_{t+1}) + T_{t+1}, \tag{4}$$

$$y_{t+1} = r_{t+1} \eta_{t+1} a_{t+1} + w_{t+1} l_{t+1}(p_{t+1}, \nu_{t+1}). \tag{5}$$

Given cash-on-hand $x_t$ (all resources available in period $t$), the optimal savings decision and the resulting value function depend solely on the persistent component in the earnings process $p_t$ and the current discount factor $\beta_t$. Conditional on $(p_t, \beta_t)$, the expectation is taken over $(p_{t+1}, \beta_{t+1})$ as well as the transitory shocks to earnings $\nu_{t+1}$ and the return on capital $\eta_{t+1}$. Gross income $y_t$ is subject to an income tax $\tau_t(\cdot)$ and each consumer receives a uniform lump-sum transfer $T_t$.
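The budget constraint in (4) and (5) can be traced through with a small numerical sketch. The flat tax used here is a hypothetical stand-in chosen only to keep the arithmetic transparent; it is not the tax schedule of the model, and the function names are likewise assumptions of this example.

```python
def flat_tax(y, rate=0.2):
    """Hypothetical flat income tax, used only for illustration."""
    return rate * y

def next_cash_on_hand(a_next, r_next, eta_next, w_next, l_next,
                      transfer_next, tax=flat_tax):
    """Map savings and next-period shocks into next-period cash-on-hand.

    Gross income follows equation (5) and cash-on-hand follows equation (4).
    """
    # (5): capital income at the idiosyncratic return plus labor earnings.
    y_next = r_next * eta_next * a_next + w_next * l_next
    # (4): savings plus after-tax income plus the lump-sum transfer.
    x_next = a_next + y_next - tax(y_next) + transfer_next
    return x_next, y_next


# Savings of 10, market return 5%, return shock 1, wage 1, two efficiency
# units of labor, and a transfer of 0.5.
x_next, y_next = next_cash_on_hand(a_next=10.0, r_next=0.05, eta_next=1.0,
                                   w_next=1.0, l_next=2.0, transfer_next=0.5)
```

Here gross income is 0.5 + 2 = 2.5, the hypothetical tax takes 0.5, and cash-on-hand is 10 + 2.5 − 0.5 + 0.5 = 12.5. Current consumption is the part of current cash-on-hand that is not saved, which is the argument of the utility function in (3).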

4.2 Production, government, and equilibrium

Firms are perfectly competitive and can be described by an aggregate constant-returns-to-scale production function $F(K_t, L)$ that yields a wage rate per efficiency unit of labor $w_t = \frac{\partial F(K_t, L)}{\partial L}$ as well as an (average) market return on capital $r_t = \frac{\partial F(K_t, L)}{\partial K} - \delta$, where $\delta \in (0, 1)$ is the depreciation rate. Aggregate labor supply $L$ is normalized to one throughout.

¹³ Fagereng et al. (2015) and Bach et al. (2015) find not only heterogeneity but also persistence in idiosyncratic asset returns, but a good portion of this persistence stems from richer consumers bearing more aggregate risk, which we do not model here. Furthermore, given that we allow for persistence in discount factors, we find below that we can replicate the wealth distribution in 1967, even in its remotest tails, quite accurately without persistence in idiosyncratic returns.

The government redistributes aggregate income by means of a uniform lump-sum payment, which amounts to a constant fraction $\lambda \in [0, 1]$ of aggregate tax revenues. The remainder is spent in a way such that marginal utilities of agents are not affected.

A steady-state equilibrium of this economy is characterized by a market clearing level of capital $K^\star$ and a lump-sum transfer $T^\star$ such that:

(i) factor prices are given by their respective marginal products, $w^\star = \partial F(K^\star, 1)/\partial L$ and $r^\star = \partial F(K^\star, 1)/\partial K - \delta$;

(ii) given $r^\star$, $w^\star$, and $T^\star$, consumers solve the stationary version of their decision problem, giving rise to an invariant distribution $\Gamma(a, p, \beta, \nu, \eta)$;

(iii) the government redistributes a fraction $\lambda$ of total tax revenues, i.e.,

\[
T^\star = \lambda \int \tau(r^\star \eta a + w^\star l(p, \nu)) \, d\Gamma(a, p, \beta, \nu, \eta);
\]

(iv) and capital markets clear, i.e.,

\[
K^\star = \int a \, d\Gamma(a, p, \beta, \nu, \eta).
\]
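Conditions (iii) and (iv) are integrals over the invariant distribution. One simple way to picture them, assuming the distribution is represented by an equally weighted sample of consumers, is to replace each integral with a sample average. The sketch below is only that: the sample, the flat tax, and the function name are hypothetical, and nothing here reflects the numerical method actually used in the paper.

```python
def aggregate_capital_and_transfer(sample, r, w, lam, tax):
    """Approximate K* and T* by averaging over equally weighted draws.

    Each element of `sample` is a tuple (a, eta, l): assets, the return
    shock, and efficiency units of labor for one simulated consumer.
    """
    n = len(sample)
    # Condition (iv): aggregate capital is the mean of individual assets.
    capital = sum(a for a, _, _ in sample) / n
    # Condition (iii): the transfer is the fraction lam of mean tax revenue.
    revenue = sum(tax(r * eta * a + w * l) for a, eta, l in sample) / n
    return capital, lam * revenue


def flat_tax(y, rate=0.2):
    """Hypothetical flat tax, used only for illustration."""
    return rate * y


# Two hypothetical consumers; lam = 0.6 is the value used later in Section 6.3.
toy_sample = [(1.0, 1.0, 1.0), (3.0, 1.0, 1.0)]
K_star, T_star = aggregate_capital_and_transfer(toy_sample, r=0.05, w=1.0,
                                                lam=0.6, tax=flat_tax)
print(K_star, T_star)
```

In an actual equilibrium computation the sample itself depends on $r^\star$, $w^\star$, and $T^\star$, so these averages would sit inside a fixed-point loop; the sketch shows only the aggregation step.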

In the benchmark perfect-foresight transition experiment, we start the economy in period $t_0$ in some initial steady state, described by a vector $\theta^\star$ that parametrizes the tax schedule and earnings process and by the equilibrium objects $(K^\star, T^\star)$. Agents are fully surprised and learn about a new exogenous environment $(\theta_t)_{t=t_0+1}^{t_1}$ that will prevail over some transition period $t = t_0+1, t_0+2, \ldots, t_1$. From $t_1$ onwards, the exogenous environment will once again be constant and equal to $\theta_{t_1}$. In a perfect-foresight equilibrium, agents are fully informed about future equilibrium objects $(K_t, T_t)_{t=t_0+1}^{\infty}$ too and optimize accordingly. Capital markets clear and the fraction of tax revenues $\lambda$ that is redistributed is fixed.

In an alternative myopic transition experiment, agents are surprised about the new exogenous environment and equilibrium prices every period. That is, in period $t = t_0, t_0+1, \ldots, t_1-1$, given a distribution $\Gamma_t(x_t, p_t, \beta_t)$, they choose a savings decision rule, $a_{t+1} = g_t(x_t, p_t, \beta_t)$, assuming that both $\theta_t$ and $(r_t, w_t, T_t)$ will prevail forever. In period $t+1$, they are accordingly surprised that: one, the exogenous environment has changed to $\theta_{t+1}$; and, two, that equilibrium factor returns $(r_{t+1}, w_{t+1})$ and transfers $T_{t+1}$ result from capital-market clearing and government-budget balance in period $t+1$.14 These two informational structures are, of course, extreme. We chose them because we expect them to bracket a range of informational assumptions. Given that the results, as will be reported below, turn out to be very similar across the two structures, we are confident that our findings are robust to other variations in this dimension.

14 That is, $(r_{t+1}, w_{t+1})$ are the marginal products of the net production function $F(K_{t+1}, 1) - \delta K_{t+1}$, where

\[
K_{t+1} = \int g_t(x_t, p_t, \beta_t) \, d\Gamma_t(a_t, p_t, \beta_t, \nu_t, \eta_t),
\]

and

\[
T_{t+1} = \lambda \int \tau_{t+1}(r_{t+1} \eta a_{t+1} + w_{t+1} l_{t+1}(p_{t+1}, \nu_{t+1})) \, d\Gamma_{t+1}(a_t, p_t, \beta_t, \nu_t, \eta_t),
\]

where $\Gamma_{t+1}$ is the distribution in period $t+1$ generated by the period-$t$ distribution $\Gamma_t$ and the decision rule $g_t$.

5 The right tail of the wealth distribution: approximately Pareto

In this section, we briefly explain the main mechanism that leads to a “fat” Pareto-shaped right tail in the wealth distribution. The same mechanism is at play in the much simpler stochastic-$\beta$ model originally proposed in Krusell & Smith (1998). Formally, we make use of a mathematical result on random growth by Kesten (1973): consider a stochastic process

\[
a_t = s_t a_{t-1} + \epsilon_t, \tag{6}
\]

where $s_t$ and $\epsilon_t$ are (for our purposes positive) i.i.d. random variables. If there exists some $\zeta > 0$ such that $E[s^\zeta] = 1$ as well as $E[\epsilon^\zeta] < \infty$, then $a_t$ converges in probability to a random variable $A$ that satisfies $\lim_{a \to \infty} \mathrm{Prob}(A > a) \propto a^{-\zeta}$, i.e., the right tail of the stationary distribution has a Pareto shape.15
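To make the condition $E[s^\zeta] = 1$ concrete, the sketch below solves it by bisection when $s$ takes finitely many values. This is an illustration under stated assumptions, not the procedure used in the paper: the two-point distribution for $s$ is hypothetical, and the function name is introduced here for convenience.

```python
def kesten_tail_exponent(values, probs, lo=1e-9, hi=200.0, tol=1e-12):
    """Solve E[s**zeta] = 1 for zeta > 0 by bisection, for discrete s."""
    def moment_gap(zeta):
        return sum(p * s ** zeta for s, p in zip(values, probs)) - 1.0

    # A positive root needs E[log s] < 0 (gap negative near zero) and some
    # probability mass on s > 1 (gap eventually positive).
    if moment_gap(lo) >= 0.0 or moment_gap(hi) <= 0.0:
        raise ValueError("no positive root bracketed on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if moment_gap(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical example: s is 0.9 or 1.05 with equal probability.
zeta = kesten_tail_exponent([0.9, 1.05], [0.5, 0.5])
print(zeta)
```

Because $E[s^\zeta]$ is convex in $\zeta$ and equals one at $\zeta = 0$, the positive root is unique when it exists. Scaling every value of $s$ down by a common factor lowers $E[s^\zeta]$ at any given $\zeta$, so the root moves up and the implied tail becomes thinner.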

In our setup, $s_t$ is the asymptotic marginal propensity to save out of wealth, varying in $\beta$, and $\epsilon_t$ represents labor income. Crucially, optimal savings decisions are asymptotically, with increasing wealth, linear in economies with idiosyncratic risk and incomplete markets.16 Assuming a fixed discount rate, Carroll & Kimball (1996) prove in a finite-horizon setting that the consumption function is concave under hyperbolic absolute risk aversion, which comprises most commonly used utility functions (e.g., CRRA). Hence, the savings rule is convex. However, as household wealth increases, the convexity in the savings rule becomes weaker and weaker.17 Intuitively, as wealth grows large consumers can smooth consumption more and more effectively. Moreover, with CRRA preferences decision rules are exactly linear in the absence of risk (or with complete markets against such risk). The slope is then larger (smaller) than one as the discount rate is smaller (larger) than the interest rate.

In the recent literature on the Pareto tail in the wealth distribution, either saving rates or returns to capital (or both, as in this paper) are assumed to vary randomly across consumers. Saving rules are then asymptotically linear with random coefficients: Benhabib et al. (2015b) show analytically that in this case the unique ergodic wealth distribution has a Pareto distribution in its right tail.

Figure 2 shows the marginal propensity to save out of capital holdings (denoted $k$ in the figure) arising from the stochastic-$\beta$ model under study in the present paper.18 As discussed above, the marginal propensity to save increases in wealth, holding earnings constant, and asymptotes to a constant that depends on the consumer’s discount factor. Figure 3 displays the tail behavior of the stationary wealth distribution. In line with the theoretical results in Benhabib et al. (2015b), the logarithm of its counter-cumulative distribution function becomes linear in the logarithm of assets as assets grow large, indicating that the right tail of the distribution follows a Pareto distribution.

15 The exact conditions as well as a very accessible treatment can be found in Gabaix (2009).

16 In fact, the decision rules are almost linear for all but the very poorest agents, i.e., those close to the borrowing constraint. For this reason, approximate aggregation as introduced in Krusell & Smith (1998) typically works very well.

17 A direct proof for a two-period problem can be found in Krusell & Smith (2006); Carroll (2012) proves the asymptotic linearity of the savings rule in a finite-horizon problem as the horizon grows large.

18 The graphs in this section are derived from a simplified model with a flat tax, to focus on the main mechanism.

[Figure 2: Asymptotic marginal propensity to save. Vertical axis: marginal propensity to save; horizontal axis: log(k). Series: high beta, high earnings; high beta, low earnings; low beta, high earnings; low beta, low earnings.]

In light of this result, it is worth noting that the model in Castañeda et al. (2003), which generates substantial wealth inequality using an earnings process featuring a low-probability but transient very-high-earnings state, does not deliver a Pareto tail in wealth. In this model, in which consumers have a common discount rate, marginal propensities to save do not vary but instead converge to the same constant, independently of the level of earnings, and as a result the steady-state distribution of wealth does not feature a Pareto tail. This model can deliver such a Pareto tail, however, if the earnings process itself has a Pareto tail. In the absence of randomness in either discount rates or returns, the wealth distribution then inherits not only the Pareto tail of the earnings distribution but also its Pareto coefficient. Because earnings are considerably less concentrated than wealth, the resulting tail in wealth is too thin to match the data in such an alternative model.

6 Calibration

In this section, we describe how we calibrate our model economy. As indicated in Figure 1, the U.S. wealth distribution was roughly stable in the 1950s and 1960s. Since some of the time series estimates we use start in 1967, we take this year as our initial steady state. We set the model period to a year to conform to the tax system.

[Figure 3: Pareto tail of the wealth distribution. Vertical axis: log(Prob(K > k)); horizontal axis: log(k). Markers: Top 10%, Top 1%, Top 0.1%, Top 0.01%.]

6.1 Basic parameters

We parameterize the production technology and utility function using standard functional forms and parameters. The (gross) production function is given by $F(K, L) = K^\alpha L^{1-\alpha}$. The capital share is set to $\alpha = 0.36$ and depreciation to $\delta = 0.048$ annually. In an extension (see Section 7.4), we check the sensitivity of our results to using a constant-elasticity-of-substitution production function with (gross) elasticity greater than one. The coefficient of relative risk aversion, $\gamma$, is set to 1.5.
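For readers who want to see how these parameters map into prices, the following sketch computes the wage, the net return on capital, and the ratio of capital to net output for the Cobb-Douglas technology. The values of $\alpha$ and $\delta$ are the ones stated above; the capital stock of 5.0 and the function names are hypothetical.

```python
ALPHA = 0.36   # capital share (Section 6.1)
DELTA = 0.048  # annual depreciation rate (Section 6.1)


def factor_prices(capital, labor=1.0, alpha=ALPHA, delta=DELTA):
    """Wage, net return on capital, and gross output for F = K^a L^(1-a)."""
    output = capital ** alpha * labor ** (1.0 - alpha)
    wage = (1.0 - alpha) * output / labor           # marginal product of labor
    net_return = alpha * output / capital - delta   # marginal product of capital, net of depreciation
    return wage, net_return, output


def capital_to_net_output(capital, labor=1.0, alpha=ALPHA, delta=DELTA):
    """Ratio of capital to output net of depreciation."""
    _, _, output = factor_prices(capital, labor, alpha, delta)
    return capital / (output - delta * capital)


# Hypothetical capital stock, with labor normalized to one as in the text.
w, r, y = factor_prices(5.0)
print(w, r, capital_to_net_output(5.0))
```

With constant returns to scale, factor payments exhaust gross output, so `w + (r + DELTA) * K` equals `y` when labor is one; this is a convenient check on the formulas.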

6.2 The earnings process

The earnings process is based on the traditional log-normal framework with $l_t(p_t, \nu_t) = \exp(p_t + \nu_t)$. That is, we assume that the persistent component $p_t$ of the earnings process follows a Gaussian AR(1) process with parameters $(\rho^P, \sigma_t^P)$. The autocorrelation coefficient, $\rho^P$, is fixed over time, while the innovation standard deviation varies. Likewise, the transitory component $\nu_t$ is also assumed to be normally distributed with standard deviation $\sigma_t^T$. We use estimates by Heathcote et al. (2010) that span the period 1967–2000 and assume that the time-varying variances of the innovations are constant thereafter. The left panel of Figure 4 displays the resulting cross-sectional dispersion. The estimates show a significant increase in earnings risk for both components.

As is well known, the resulting log-normal cross-sectional distribution of earnings understates the concentration of top labor income quite severely. Because the observed increase in top labor income shares is potentially an important explanation for the observed increase in wealth inequality at the top, we augment the framework for the top 10% earners in such a way that we can directly match the fraction of labor income going to the top 10%, top 1%, top 0.1% and top 0.01%. In concrete terms, we posit $l_t(p_t, \nu_t) = \psi_t(p_t) \exp(\nu_t)$, where

\[
\psi_t(p_t) =
\begin{cases}
\exp(p_t) & \text{if } F_{p_t}(p_t) \le 0.9, \\
F^{-1}_{\mathrm{Pareto}(\kappa_t)}\left( \dfrac{F_{p_t}(p_t) - 0.9}{1 - 0.9} \right) & \text{if } F_{p_t}(p_t) > 0.9.
\end{cases}
\tag{7}
\]

$F_{p_t}(\cdot)$ is the cdf of $p_t$ and $F^{-1}_{\mathrm{Pareto}(\kappa_t)}(\cdot)$ the inverse cdf for a Pareto distribution with lower bound $F^{-1}_{p_t}(0.9)$ and shape coefficient $\kappa_t$. Effectively, we thus assume that top earnings are spread out according to a (scaled) Pareto distribution, while earnings for the majority of workers are distributed according to a log-normal distribution. The Pareto tail coefficient on labor income $\kappa_t$ is then one additional free parameter to calibrate in each year. We use estimates on top wage shares from an updated series by Piketty & Saez (2003) spanning 1967–2011 as calibration targets. The right panel of Figure 4 displays the calibrated Pareto tail coefficient $\kappa_t$ and Figure 5 displays the resulting top labor income shares. That we can match top labor income shares very well using just a single parameter in each year (i.e., the tail coefficient) simply reflects the fact that the Pareto distribution is a very good description of the cross-sectional earnings distribution at the top.

[Figure 4: Earnings process ingredients. Left panel: Cross-sectional Standard Deviations (persistent component, transitory component); right panel: Pareto Tail Coefficient Earnings. Horizontal axes: 1970–2010.]

We do not explicitly model unemployment, nor voluntary non-employment or retirement. We do, however, introduce a zero-earnings state, occurring with probability $\chi_0 = 0.075$ independently of $(p_t, \nu_t)$ and over time, reflecting both long-term unemployment and shocks that trigger temporary exit from the labor force. This probability is calibrated, together with a borrowing constraint amounting to roughly one yearly lump-sum transfer, so that the initial steady-state wealth distribution matches both the share of wealth held by the bottom 50% and the fraction of the population with negative net wealth.
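The top-earnings adjustment in equation (7) can be sketched in a few lines. The code below is a minimal illustration rather than the authors' implementation, and it rests on assumptions that go beyond the text: the cross-sectional distribution of $p_t$ is taken to be mean-zero normal with a hypothetical standard deviation, and the Pareto lower bound is placed at the level of earnings at the 90th percentile, $\exp(F^{-1}_{p_t}(0.9))$, so that $\psi_t$ is continuous at the cutoff.

```python
import math
from statistics import NormalDist


def psi(p, sigma_p, kappa, cutoff=0.9):
    """Persistent earnings level: log-normal body with a Pareto top tail."""
    # Assumption: p is cross-sectionally N(0, sigma_p**2).
    dist = NormalDist(mu=0.0, sigma=sigma_p)
    u = dist.cdf(p)
    if u <= cutoff:
        return math.exp(p)
    # Assumption: the Pareto lower bound is the earnings level at the cutoff
    # quantile, which makes psi continuous there.
    x_min = math.exp(dist.inv_cdf(cutoff))
    # Rescale the rank within the top group to (0, 1), then apply the Pareto
    # inverse cdf x_min * (1 - v) ** (-1 / kappa).
    v = (u - cutoff) / (1.0 - cutoff)
    return x_min * (1.0 - v) ** (-1.0 / kappa)


# Hypothetical parameters: sigma_p = 0.5 and kappa = 2.0.
median_earner = psi(0.0, sigma_p=0.5, kappa=2.0)
top_earner = psi(NormalDist(0.0, 0.5).inv_cdf(0.99), sigma_p=0.5, kappa=2.0)
print(median_earner, top_earner)
```

A lower `kappa` stretches the top of the distribution without touching the bottom 90%, which is why a single parameter per year is enough to move the top income shares.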

6.3 Tax system

The progressivity of the U.S. tax system has decreased substantially over the model period. To account for these changes, we use estimates on federal effective tax rates by Piketty & Saez (2007) for the period 1967–2000, keeping them constant thereafter. These comprise the four major federal taxes: individual income, corporate income, estate and gift, and payroll taxes.19 Piketty & Saez (2007) calculate effective average tax rates for eleven income brackets, with a particularly detailed decomposition for top income groups (up to the top 0.01%). We translate this data to our model by means of a step-wise tax function $\tau_t(\cdot)$ with eleven steps. For each bracket, the threshold is set to match its income share in the data and the marginal tax rate such that the resulting average tax rate aligns with the data. Figure 6 shows that the U.S. tax system has indeed become much less progressive over the model period. Note that in our model taxes $\tau_t(y_t)$ are a function of total income $y_t$, consistent with the measurement. A weakness of our calibration is that we do not have separate tax rates for different sources of income, but a strength is that we use effective tax rates, thereby accounting for tax avoidance and changing portfolio composition to the extent that these vary systematically with income.

To account for government transfers, we introduce a social safety net in the simplest possible way by assuming that each agent receives an (untaxed) lump-sum transfer $T_t$ every period, its size being a constant fraction $\lambda = 0.6$ of tax revenues.20 Note that the income tax does not distort labor supply in our setting, since we assume the latter is exogenous. This simplification is obviously not a good one for understanding the welfare consequences of changes in tax rates, but because our current focus is on wealth accumulation and its distribution in the population we do not think that it is a major shortcoming.

[Figure 5: Top labor income shares in %. Panels: top 10% share, top 1% share, top 0.1% share, top 0.01% share. Series: model, data. Horizontal axes: 1970–2010.]

[Figure 6: Imputed marginal tax rates for selected total income levels. Series: top rate, 5 * average income, 3 * average income, average income. Horizontal axis: 1970–2000.]

19 Given that our model abstracts from the life cycle, it is appropriate to include the estate tax in the tax on total income, thus effectively smoothing out the incidence of this tax over the life cycle. Ignoring the estate tax would mean omitting a major source of decreasing tax progressivity. Piketty & Saez (2007) assume further that the corporate income tax burden falls entirely (and uniformly) on capital income. They argue that this is a middle-ground assumption (regarding the resulting tax progressivity) between assuming that the tax falls solely on shareholders at one extreme and assuming that it is effectively borne by labor income at the other extreme.
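A step-wise tax function of the kind described in this section can be sketched as a schedule with piecewise-constant marginal rates. The example below is only illustrative: it uses three hypothetical brackets instead of the eleven in the calibration, the thresholds and rates are invented, and treating non-positive income as untaxed is an assumption made here for simplicity.

```python
from bisect import bisect_right


def stepwise_tax(income, thresholds, marginal_rates):
    """Tax owed when marginal rates are constant within each bracket.

    thresholds[i] is the lower edge of bracket i (thresholds[0] must be 0)
    and marginal_rates[i] applies to income inside that bracket.
    """
    if income <= 0.0:
        return 0.0  # assumption: no tax (and no rebate) on non-positive income
    top = bisect_right(thresholds, income) - 1  # bracket containing income
    tax = 0.0
    for i in range(top + 1):
        upper = thresholds[i + 1] if i < top else income
        tax += marginal_rates[i] * (upper - thresholds[i])
    return tax


def average_tax_rate(income, thresholds, marginal_rates):
    """Average rate implied by the schedule, for positive income."""
    return stepwise_tax(income, thresholds, marginal_rates) / income


# Hypothetical schedule: 10% up to 1, 20% from 1 to 3, 40% above 3.
THRESHOLDS = [0.0, 1.0, 3.0]
RATES = [0.1, 0.2, 0.4]
print(stepwise_tax(5.0, THRESHOLDS, RATES), average_tax_rate(5.0, THRESHOLDS, RATES))
```

In a calibration along the lines described above, the marginal rates would be chosen bracket by bracket so that `average_tax_rate` reproduces the effective average rates in the data.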

6.4 Idiosyncratic discount rates and returns to capital

Finally, we calibrate the process for the discount factor as well as the dispersion in the returns to capital to match the right tail of the wealth distribution in the initial steady state. To discipline our exercise, we posit that $\beta$ follows a Gaussian AR(1) process:

\[
\beta_t = \rho^\beta \beta_{t-1} + (1 - \rho^\beta) \mu^\beta + \sigma^\beta \epsilon^\beta_t, \qquad \epsilon^\beta_t \sim N(0, 1).
\]

Moreover, we assume that the idiosyncratic factor in the return to capital is normally distributed: $\eta_t \overset{i.i.d.}{\sim} N(1, \sigma^\eta)$. Importantly, all these parameters are fixed over time (by varying them freely we could of course track the evolution of the wealth distribution more or less exactly). The mean discount factor determines the equilibrium capital-output ratio and we set it to $\mu^\beta = 0.92$ to match a ratio of capital to net output of about 4 in the initial steady state. The calibrated stochastic-$\beta$ parameters are $\rho^\beta = 0.992$ and $\sigma^\beta = 0.0019$, implying that the standard deviation of the cross-sectional distribution of discount factors, which does not vary over time, is 0.0148. The idiosyncratic noise in the return to capital is calibrated to $\sigma^\eta = 0.725$, implying that the gross (pre-tax, net of depreciation) return on capital $(1 + r^\star \eta)$ lies in the interval $[0.9874, 1.1437]$ for 90% of all agents in the initial steady state.21

20 About 60% of total federal outlays are mandatory spending, the bulk of it on Social Security, Medicare, Medicaid, and income security programs (CBO, 2015). The remainder is spent on the Department of Defense and other government agencies as well as on interest payments.

Parameter   $\rho^\beta$   $\sigma^\beta$   $\sigma^\eta$   $\underline{a}$   $\chi_0$
Value       0.992          0.0019           0.725           −0.24             0.075

Target   Top 10% share   Top 1%   Top 0.1%   Top 0.01%   Bottom 50%   Fraction a < 0
Data     70.8%           27.8%    9.4%       3.1%        4.0%         8.0%
Model    70.6%           28.1%    9.5%       2.9%        3.1%         7.0%

Table 1: Matching the 1967 wealth distribution as a steady state

To summarize, Table 1 lists the values of the five parameters (persistence and standard deviation of the discount rates; standard deviation of return shocks; the borrowing constraint; and the probability of zero income) calibrated to match as closely as possible six features of the initial steady-state wealth distribution: the shares held by the top 10%, the top 1%, the top 0.1%, the top 0.01%, and the bottom 50% as well as the fraction of the population with negative net wealth. The fit is excellent at both ends of the distribution.22 To the extent that the right tail of the wealth distribution has a Pareto tail, we are therefore also matching the Pareto coefficient governing its thickness, because this coefficient is pinned down by the ratio of the top 0.01% share to the top 0.1% share, or the ratio of the top 0.1% share to the top 1% share, both of which are roughly one-third, both in the model and in the data.

Two comments are in order. First, when solving the model numerically we truncate the $\beta$ and $\eta$ distributions to ensure that the consumer’s optimization problem is well-defined (with finite present-value utility) and that a stationary distribution of wealth emerges. Unlike in a standard Aiyagari economy without heterogeneity in preferences, in our model some agents temporarily have discount rates that are smaller than the rate of return, a necessary condition for generating a Pareto tail in the wealth distribution (see the discussion in Section 5). It follows that the support of the stationary wealth distribution is not bounded from above.
In practice, we use a large enough upper bound in our numerical implementation so that the resulting truncation error is negligible.23 Second, if our goal were solely to match the Pareto coefficient in the right tail of the wealth distribution, it would be excessive to calibrate as many as five parameters to match features of the wealth distribution. But the tail coefficient is not a sufficient statistic for wealth inequality unless the entire distribution is (counterfactually) Pareto-shaped: even if, say, the top 1% of the wealth distribution can be described exactly by a Pareto distribution, the tail coefficient determines only the distribution of wealth within these top 1% but not the fraction of total wealth held by the top 1%. While stochastic discount factors are the main force driving the shape of the upper tail in the initial steady-state wealth distribution, to achieve our objective of replicating the distribution of wealth on its entire domain we found that introducing in addition a reasonable amount of randomness in returns helped to improve the fit. Moreover, because ownership of primary residences and poorly diversified private equity account for a sizable fraction of net wealth, we view randomness in returns as a realistic feature of individual asset accumulation.24

21 The idiosyncratic variation of returns in our calibration turns out to be close to the amount found by Fagereng et al. (2015) in Norwegian data; see, for example, Panel C of Table 1 in that paper. Bach et al. (2015) find roughly comparable amounts of variation in Swedish data.

22 The data on top wealth shares in Table 1 is from Saez & Zucman (2016), who use a capitalization method to calculate them. Because this method is unreliable for a breakdown of the bottom 90%, the other data moments are based on survey data (SCF and precursors); see Kennickell (2011).

23 Appendix A describes in detail our numerical procedure.
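Two of the numbers reported in this section lend themselves to a quick back-of-the-envelope check. The sketch below is not part of the paper and makes its own assumptions: it treats the reported return interval as the central 90% band of an untruncated normal $\eta$ to back out the implied $r^\star$, and it treats the top of the distribution as an exact Pareto tail, for which the share held by the top fraction $q$ is proportional to $q^{1 - 1/\zeta}$, to back out $\zeta$ from the ratio of adjacent top shares in Table 1.

```python
import math
from statistics import NormalDist

SIGMA_ETA = 0.725               # std. dev. of the return shock (Table 1)
RETURN_BAND = (0.9874, 1.1437)  # reported 90% interval for 1 + r* eta

# Assumption: the band is the 5th to 95th percentile of an untruncated normal.
eta = NormalDist(mu=1.0, sigma=SIGMA_ETA)
eta_lo, eta_hi = eta.inv_cdf(0.05), eta.inv_cdf(0.95)

# Back out r* from each end of the band; the two values should roughly agree.
r_from_lo = (RETURN_BAND[0] - 1.0) / eta_lo
r_from_hi = (RETURN_BAND[1] - 1.0) / eta_hi


def pareto_coefficient(share_ratio, factor=10.0):
    """Tail coefficient implied by S(q / factor) / S(q) for an exact Pareto tail.

    With S(q) proportional to q ** (1 - 1 / zeta), the ratio equals
    factor ** (1 / zeta - 1), which is inverted here for zeta.
    """
    return 1.0 / (1.0 + math.log(share_ratio) / math.log(factor))


# Ratio of the top 0.01% share to the top 0.1% share, from Table 1.
zeta_data = pareto_coefficient(3.1 / 9.4)
zeta_model = pareto_coefficient(2.9 / 9.5)
print(r_from_lo, r_from_hi, zeta_data, zeta_model)
```

Under these assumptions both ends of the band point to an average return of roughly 6.5 percent, and share ratios of about one-third correspond to a tail coefficient of roughly 2.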

7 Results

In Section 6, we showed that our model framework, when properly calibrated, can replicate wealth heterogeneity, including the Pareto-shaped right tail, as well as other macroeconomic moments in the initial steady state. We proceed in this section to report on our main result: the evolution of the wealth distribution in our model economy contrasted with the data. Subsequently, we assess the robustness of our findings to key assumptions about households’ foresight and about how the return to capital responds to capital deepening. We conclude the section with an outlook into the 21st century through the lens of our model.

7.1 Benchmark transition experiment

In the benchmark specification, we start the economy in its initial steady state in 1967 and compute the model transition to a new steady state under perfect foresight. In the new steady state, the tax system is less progressive, earnings risk is higher, and top labor income inequality is higher too, as described in Sections 6.2 and 6.3.

Figure 7 displays the evolution of top wealth shares in the model (solid blue line) compared to the data as measured by Saez & Zucman (2016) using the capitalization method (SZ). In addition, whenever possible the graphs are augmented by survey estimates from the SCF (dashed yellow lines). These shares display an overall upward trend in both the data and the model economy. The model economy matches the magnitude of the changes in the top 10% and top 1% shares over the last half-century, though it fails to capture the U-shaped pattern in the data: these shares drop somewhat between the 1960s and the 1980s before climbing rapidly. Further in the tail, the model economy continues to capture the trend, but fails more and more to replicate the magnitude of the changes. The share of total wealth held by the top 0.1%, for example, increases by more than ten percentage points in the data over the sample period 1967–2012 (according to the SZ estimates) but by less than five in the model.25 Even further in the tail, the share of wealth held by the top 0.01% increases from 2.9% to 4.2% in the model over the sample period (i.e., almost a fifty percent increase), while it more than triples, from 3.1% to 11.2%, in the data (again according to the SZ estimates).

As we discuss further in Section 7.6, the top wealth shares in the model economy continue to increase slowly over a long transition period before reaching the new steady state. This finding is consistent with Gabaix et al. (2016), who argue that the random growth mechanism that drives top wealth inequality tends to produce slow transitions (especially in the tails of the distribution).

Figure 8 shows that the transition in the model economy is broadly consistent with the data along two other key dimensions: first, the ratio of capital to net output shows a steady rise similar to the one in the data (absent business-cycle fluctuations); second, the share of wealth held by the bottom half, already quite small in 1967, drops by two percentage points in the model as compared to about three percentage points in the data.26

[Figure 7: Top wealth shares in %, 1967–2012. Panels: top 10% wealth share, top 1% wealth share, top 0.1% wealth share, top 0.01% wealth share. Series: model, data (SZ), data (SCF).]

24 See Moskowitz & Vissing-Jørgensen (2002), who document extreme concentration of private equity (its total value being similar in magnitude to public equity): 75% of total private equity is held by households for whom it accounts for the majority of their total net worth and entrepreneurs invest on average more than 70% of their private equity holdings in a single company. See also the discussion in McKay (2013) regarding the distribution of returns on mutual funds.

25 In the SCF data, by contrast, the increase is the same as in the model.

7.2 Counterfactuals

Changes in three structural factors (earnings risk, top earnings inequality, and tax progressivity) drive the transitional dynamics in the model economy. To assess which of these is the most important quantitatively, we conducted three experiments in which only one of the three structural factors is allowed to change, the other two being held constant instead at their 1967 values. Which of these changes is the main driver of increases in wealth inequality, particularly in the upper reaches of the distribution? Figure 9 displays the transition paths for the top wealth shares for each of the three counterfactual experiments over the time period 1967 to 2012. These graphs show clearly that the main driver of changes in the right tail of the wealth distribution is changes in taxes. Increases in earnings risk, on the other hand, reduce top wealth inequality, other things equal.

26 Aggregate wealth data is from Piketty & Zucman (2014).

[Figure 8: Capital-output ratio and bottom 50% share (in %), 1967–2012. Left panel: capital - net output ratio, with series model (capital), data (national wealth), data (private wealth); right panel: bottom 50% share, with series model, data (SCF).]

[Figure 9: Top wealth shares in %, 1967–2012. Panels: top 10% wealth share, top 1% wealth share, top 0.1% wealth share, top 0.01% wealth share. Series: taxes, full model, top earnings, earnings risk.]

[Figure 10: Capital-output ratio and bottom 50% share (in %), 1967–2012. Left panel: capital - net output ratio; right panel: bottom 50% share. Series: taxes, full model, top earnings, earnings risk.]

The right panel of Figure 10 displays the transition path for the bottom 50% share, where again changes in taxes are the main driver. Finally, the left panel of Figure 10 displays the transition path for the ratio of capital to net output. Here all three changes contribute positively to its growth from 1967 to 2012, with the increase in earnings risk playing the largest role. Table 2 summarizes the results of the three experiments, quantifying how much each of the factors contributes to the changes in the wealth shares over the time period 1967–2012.

             Earnings risk   Top earnings   Taxes   Combined
Top 10%      −0.78           0.22           1.89    1.32
Top 1%       −0.19           0.05           0.82    0.65
Top 0.1%     −0.08           0.03           0.35    0.31
Top 0.01%    −0.04           0.03           0.16    0.18
Bottom 50%   −0.21           0.33           0.55    0.78

Table 2: Fraction of change in wealth shares explained by model: decomposition by channel

To understand the numbers in the table, focus on the share of total wealth held by the richest percentile. Saez & Zucman (2016) measure an increase in this share from 27.8% to 41.8% from 1967 to 2012. Over the same time period, allowing for changes only in earnings risk and keeping all other parameters fixed at their initial steady-state values, the model predicts a decrease from 28.0% to 25.2%. Changes in earnings risk therefore explain a fraction

\[
\frac{25.2 - 28.0}{28.0} \Big/ \frac{41.8 - 27.8}{27.8} = -0.20
\]

of the actual change.27
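The arithmetic behind this example can be reproduced directly. The helper below reads the "fraction explained" as the relative change in the model divided by the relative change in the data, which is how the worked example above appears to be constructed; the function name is introduced here for illustration.

```python
def fraction_explained(model_start, model_end, data_start, data_end):
    """Relative change in the model divided by the relative change in the data."""
    model_change = (model_end - model_start) / model_start
    data_change = (data_end - data_start) / data_start
    return model_change / data_change


# Top 1% share with only earnings risk changing: the model moves from 28.0%
# to 25.2%, while the data move from 27.8% to 41.8%.
frac = fraction_explained(28.0, 25.2, 27.8, 41.8)
print(round(frac, 2))  # about -0.20
```

A negative value means the channel moves the share in the opposite direction from the data, and a value above one (as for taxes at the top 10%) means the channel alone overshoots the observed change.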

Again, the observed increases in earnings risk reduce inequality, moving it in the opposite direction from the observed changes! (Separate increases in either the persistent or transitory components of earnings risk also reduce inequality.) Instead, as can be seen for all the different distributional statistics, the main driver of the surge in wealth concentration is the changing U.S. tax system. The increase in top earnings inequality (parametrized by changes over time in the Pareto tail coefficient $\kappa_t$ on labor income) has worked in the same direction, although the effect of this channel is much smaller.

27 Note that the fractions generally do not add up to the fraction explained when feeding in all observed changes at the same time, as in our benchmark experiment. The remainder is due to interaction effects in general equilibrium.

Why does an increase in earnings risk reduce wealth inequality? As noted in Section 1, persistent heterogeneity in discount rates is a powerful force driving the wealth distribution apart: with permanently different discount rates and complete markets against earnings risk, the most patient would eventually hold all the economy’s wealth.28 Earnings risk, then, is a friction, or glue, that keeps the distribution from flying apart altogether as in the work of Becker (1980) cited in Section 1. This risk operates especially strongly at the low end of the wealth distribution, where poorer consumers save to move away from borrowing constraints when earnings risk is larger. In our model higher earnings risk also generates a thinner right tail in the wealth distribution because the resulting increase in aggregate precautionary savings drives down the equilibrium interest rate. This drop in the interest rate shifts the distribution of savings propensities to the left, particularly for the well-insured wealthy consumers for whom wage risk is largely immaterial and who therefore have essentially linear decision rules. As discussed in Section 5, the Pareto tail coefficient, $\zeta$, is defined implicitly by the equation $E[s^\zeta] = 1$, where $s$ is the (asymptotic) marginal propensity to save out of wealth. As $s$ falls for all discount-factor types, $\zeta$ must increase to compensate, i.e., the Pareto tail becomes thinner.29

Why have changes in the tax system induced such large changes in wealth inequality? Note first that the average tax rate (i.e., total tax revenues as a fraction of net GDP) in our model increases from 0.23 to 0.27 over the period 1967–2012. An increase in average taxes tends to reduce effective earnings risk (because the tax is multiplicative), increasing inequality for the same reason (but in the opposite direction) that the observed increases in (pre-tax) earnings risk reduce inequality. This effect, however, is a small one unless the average tax rate changes dramatically.
Much more important quantitatively is the dramatic decrease in tax progressivity, where even small changes have large effects on inequality, especially at the high end of the wealth distribution. To see why, note that the marginal saving propensity for a well-insured consumer is approximately β(1 + r(1 − τ′(y))), where β is the consumer’s current discount factor and τ′(y) is the consumer’s current marginal tax rate. This tax rate varies with the consumer’s income, y, but it is persistent over time because income is persistent. Tax progressivity, therefore, generates persistent differences across consumers that act like persistent differences either in the consumers’ after-tax rates of return, r(1 − τ′(y)), or, equivalently, in consumers’ discount factors. Consequently, decreases in progressivity have the same effect as increasing the spread of discount factors, a powerful force for generating differential savings behavior and as a result higher wealth inequality.

In a representative-agent model the increase in average taxes would lead in equilibrium to a decrease in the capital-to-output ratio, but it does not in our heterogeneous-agent model for three reasons. First, the (smallish) increase in average taxes does not offset the even larger increase in the riskiness of pre-tax earnings, leading to more precautionary savings in the aggregate. Second, decreases in tax progressivity provide powerful incentives for the rich to save even more. Third, the increasingly “thick” right tail in earnings provides the rich (who tend to be those with high earnings) with additional resources for saving. These three forces combine to generate a fairly large increase in the ratio of capital to net output over the period 1967–2012.

28 In our model, idiosyncratic returns are statistically independent across time, but were they persistent they would act much like persistent heterogeneity in discount rates to spread out the wealth distribution, because discount rates and returns enter similarly in consumers’ Euler equations.
29 Nirei & Aoki (2016) observe the same effect.

Figure 11: Contrasting informational assumptions. [Panels: capital - net output ratio; top 1% wealth share. Series: perfect foresight, myopic. Horizontal axis: 1970–2010.]
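The approximation β(1 + r(1 − τ′(y))) for the marginal saving propensity of a well-insured consumer, used in the discussion of tax progressivity above, can be illustrated numerically. The sketch below assumes a hypothetical tax schedule of the form τ(y) = y − λy^(1−γ), where γ indexes progressivity; this functional form and all parameter values are assumptions chosen for illustration and are not the tax schedule estimated in the paper.

```python
def marginal_tax_rate(y, lam, gamma):
    """tau'(y) for the hypothetical schedule tau(y) = y - lam * y**(1 - gamma)."""
    return 1.0 - lam * (1.0 - gamma) * y ** (-gamma)

def saving_propensity(beta, r, y, lam, gamma):
    """Approximate marginal saving propensity beta * (1 + r * (1 - tau'(y)))."""
    return beta * (1.0 + r * (1.0 - marginal_tax_rate(y, lam, gamma)))

# Hypothetical values: discount factor, pre-tax return, tax level parameter.
beta, r, lam = 0.95, 0.06, 0.8
high_income = 10.0            # income relative to a normalized level of one

s_progressive = saving_propensity(beta, r, high_income, lam, gamma=0.2)
s_less_progressive = saving_propensity(beta, r, high_income, lam, gamma=0.1)
s_flat_low = saving_propensity(beta, r, 0.5, lam, gamma=0.0)
s_flat_high = saving_propensity(beta, r, high_income, lam, gamma=0.0)

# With a flat marginal rate (gamma = 0) the propensity does not depend on income;
# with progressivity, a high earner's propensity rises when gamma falls.
print(s_progressive, s_less_progressive, s_flat_low, s_flat_high)
```

Because the marginal rate enters only through the after-tax return r(1 − τ′(y)), a change in that return is indistinguishable, in this approximation, from a change in β for the consumer concerned.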

7.3 Robustness to myopic agents

It is surely bold to assume that agents have perfect foresight on the entire path of the tax schedule, the parameters governing the earnings process, and the resulting equilibrium prices. To gauge the sensitivity of our findings to this assumption, we computed the transitional dynamics under complete myopia, i.e., a polar opposite case in terms of agents’ ability to predict. That is, in every period agents believe that the current environment will prevail forever and, accordingly, they are surprised to learn about their forecasting mistake in the subsequent period.30 Figure 11 shows selected model moments: we find that the differences are very small and conclude that the perfect-foresight assumption is not critically driving the results.

7.4 Robustness to the elasticity of substitution in production

The stability of the fraction of income accruing to labor, for a long time a central pillar of macroeconomic models, has recently been questioned. Karabarbounis & Neiman (2014b), among others, document a visible (though not large) decline in the labor share. Using a production function with a constant elasticity of substitution (CES), they estimate an elasticity of substitution between capital and labor of 1.25. To look into the possibility of a falling labor share, we use a standard CES production function, 

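$$F_{CES}(K_t, L) = A_{CES} \left[ \alpha_{CES} K_t^{\frac{\sigma-1}{\sigma}} + (1 - \alpha_{CES}) L^{\frac{\sigma-1}{\sigma}} \right]^{\frac{\sigma}{\sigma-1}}, \tag{8}$$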

where $A_{CES}$ and $\alpha_{CES}$ are chosen such that the initial steady state is identical to the Cobb-Douglas benchmark. Over time, there is capital deepening, leading to a lower labor share because the elasticity of substitution is above one. We find, however, only very small differences as compared to the Cobb-Douglas benchmark (see Figure 12). Capital deepening leads to a smaller reaction of the interest rate, so the rise in the capital-output ratio is slightly larger in equilibrium and the Gini coefficient on gross income increases a small amount more (relative to the benchmark).31 At the same time, we find that top wealth shares increase more slowly; unlike for the decline in tax progressivity, higher equilibrium interest rates induce more savings across the whole wealth distribution. In other words, at least over the time frame considered, the saving of the poor tends to be more elastic with respect to the interest rate than the saving of the rich.

30 See Section 4.2 for an exact description of how this experiment is conducted.

Figure 12: Cobb-Douglas vs. CES production function with elasticity σ = 1.25. [Panels: capital - net output ratio; interest rate (pre-tax); top 1% wealth share; Gini gross income. Series: Cobb-Douglas, CES sigma=1.25. Horizontal axis: 1970–2010.]
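The choice of $A_{CES}$ and $\alpha_{CES}$ in the CES production function (8) can be sketched as follows. We assume here that "identical initial steady state" means matching output and the capital share of a Cobb-Douglas technology at the initial capital stock; the Cobb-Douglas capital share of 0.36, the initial capital stock of 3.6, the normalization L = 1, and the omission of depreciation are illustrative assumptions, not the paper's calibration.

```python
def ces_output(K, L, A, alpha, sigma):
    """CES production function with elasticity of substitution sigma != 1."""
    rho = (sigma - 1.0) / sigma
    return A * (alpha * K**rho + (1.0 - alpha) * L**rho) ** (1.0 / rho)

def ces_capital_share(K, L, alpha, sigma):
    """Share of output paid to capital under competitive factor pricing."""
    rho = (sigma - 1.0) / sigma
    capital_term = alpha * K**rho
    return capital_term / (capital_term + (1.0 - alpha) * L**rho)

def calibrate_ces(K0, L, cd_share, sigma, cd_tfp=1.0):
    """Pick (A, alpha) so CES matches Cobb-Douglas output and shares at (K0, L)."""
    rho = (sigma - 1.0) / sigma
    ratio = cd_share / (1.0 - cd_share) * (L / K0) ** rho
    alpha = ratio / (1.0 + ratio)
    y0 = cd_tfp * K0**cd_share * L ** (1.0 - cd_share)
    A = y0 / (alpha * K0**rho + (1.0 - alpha) * L**rho) ** (1.0 / rho)
    return A, alpha

# Hypothetical initial steady state and the elasticity used in this section.
K0, L, cd_share, sigma = 3.6, 1.0, 0.36, 1.25
A_ces, alpha_ces = calibrate_ces(K0, L, cd_share, sigma)

labor_share_initial = 1.0 - ces_capital_share(K0, L, alpha_ces, sigma)
labor_share_deeper = 1.0 - ces_capital_share(4.2, L, alpha_ces, sigma)
print(labor_share_initial, labor_share_deeper)   # the second is lower when sigma > 1
```

Matching output and the capital share at the initial point also matches the marginal products, since the marginal product of capital equals the capital share times the output-capital ratio.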

7.5 r − g?

Piketty & Zucman (2015) discuss the link between r − g, i.e., the difference between the real interest rate and the growth rate of GDP, and wealth inequality. In a model with linear, random savings rules they show that a permanent increase in r − g decreases the Pareto coefficient characterizing the right tail of the wealth distribution. In our model, too, an increase in r − g would increase savings rates for all consumers and thus decrease the Pareto coefficient in the right tail in steady state, as shown in the analysis in Section 5. But this result leaves open both the question of how an increase in r − g affects overall wealth inequality and the question of how wealth inequality evolves during the transition to a steady state after an increase in r − g.

31 In addition, the gross labor share falls by about one percentage point over the period 1967–2012 in our model, though the net labor share actually rises a little. Karabarbounis & Neiman (2014a) report that since 1975 the gross labor share in the U.S. has fallen by about five percentage points and the net labor share by about two-and-a-half.

Figure 13: Impulse responses to interest-rate shock (partial equilibrium). [Panels: pre-tax interest rate (%); top 1% wealth share (%); Gini Coefficient for Wealth; Gini Coefficient for Income (series: pre-tax income, post-tax income). Horizontal axis: year, 0–100.]

To investigate these issues we conducted a partial-equilibrium experiment, starting from the distribution of wealth in 1967, in which the pre-tax real interest rate increases by 50 percent for a ten-year period and then returns to its original level.32 The resulting impulse responses are displayed in Figure 13. Surprisingly, the wealth share of the top 1% as well as the Gini coefficient for wealth initially decrease sharply.33 Wealth inequality then rebounds and overshoots slightly before settling back to the steady-state level. As in the experiment in Section 7.4, the model predicts a short-run elasticity of savings with respect to changes in r that is larger for poorer agents. In the very long run, wealth inequality indeed increases in response to a rise in r − g, but as pointed out by Gabaix et al. (2016) the dynamics implied by the random growth mechanism work very slowly: it takes a long time to fill a long Pareto tail. In our framework, then, movements in r − g certainly cannot explain the observed sharp changes in wealth inequality over the period 1967–2012 upon which we focus. The partial equilibrium assumption is not crucial for this finding. Finally, although wealth inequality falls on impact, income inequality increases in this experiment because capital income constitutes a large fraction of total income for high-income agents and the interest rate rises. The Gini coefficient for total income is plotted in the lower right panel of Figure 13, showing a sharp increase as the interest-rate shock hits the economy.

32 There is no growth, i.e., we set g = 0.
33 The dynamics are very similar for other groups such as the top 10% or the top 0.1%; conversely, for groups such as the bottom 50% wealth shares increase before returning to their original levels.

Figure 14: Top wealth shares in %, long run. [Panels: top 10% wealth share; top 1% wealth share; top 0.1% wealth share; top 0.01% wealth share. Series: model, data (SZ). Horizontal axis: 1980–2100.]

7.6 The long run

We have focussed so far on the transitional dynamics of the wealth distribution over the period 1967–2012, but what are the longer-run implications of the changes in earnings risk and, especially, tax progressivity that have occurred over this time period? Figure 14 shows the model’s prediction for the evolution of top wealth inequality in the 21st century, assuming no further changes in either earnings risk or taxes after 2012. The predictions are striking: the model suggests that the adjustment to the new fundamentals is far from completion and that wealth inequality is likely to rise even more. As pointed out before, the wealth distribution is a slow-moving object, especially in a setting with random growth in which the right tail of the wealth distribution is Pareto-shaped. Changes in fundamentals (such as the structure of taxes) that influence the consumption-savings decision differently for consumers with different wealth levels are bound, then, to have long-lasting effects. The contrast between the behavior of the wealth distribution over the transitional period and the eventual long-run steady-state wealth distribution (assuming an unchanged environment going forward) underscores the hazards of focussing solely on steady states when attempting to quantify how fundamentals affect wealth inequality. At the same time, we urge caution in interpreting Figure 14 as a prediction for the future, because no doubt the environment will not remain unchanged going forward, and perhaps changes in wealth inequality will themselves lead to changes in, for example, the structure of taxes. Nonetheless, the long-run analysis does emphasize how powerfully tax progressivity can shape the wealth distribution, particularly in its right tail.

8 Concluding remarks

In this paper we use established macroeconomic theory with heterogeneous agents, the Bewley-Huggett-Aiyagari setting, to examine a set of candidate explanations for the increase in U.S. wealth inequality over the last 30 or so years. The method we follow is thus to (i) independently measure changes in the environment, such as in the tax code and the earnings processes facing individuals; (ii) feed these into the model assuming that the economy is in a steady state in 1967; (iii) examine the resulting wealth distribution path; and (iv) conduct counterfactuals. We find that the model generates a path for inequality that is reasonably close to that observed, the main exception being that the rise in inequality at the very top of the distribution is under-predicted. The satisfactory performance of the model in predicting the overall path for wealth inequality notwithstanding, the main contribution is the conclusion that the most important factor, by far, behind the developments is the significant decline in tax progressivity that began in the late 1970s. Declining tax progressivity, together with increasing earnings risk and higher earnings inequality among top earners, can also account for the rise in the capital-to-net-output ratio and at least some of the decline in the (gross) labor share when the elasticity of substitution between capital and labor is larger than one as in Karabarbounis & Neiman (2014b). Our model thus provides an alternative to the central mechanism, declining growth rates, to which Piketty (2014) draws attention in attempting to connect these macroeconomic trends to rising inequality.

Our findings merit several remarks. Although we find that tax progressivity has played a central role in increasing inequality, our model is designed primarily as a positive rather than a normative tool.
To evaluate the pros and cons of, say, reversing the changes in tax progressivity, it is important to account for the distortions created by labor taxation; in the present setting, labor earnings are exogenous and taxation is levied jointly on all incomes. We do not think that the introduction of distortionary labor taxation would change the model’s predictions for wealth inequality measurably, but it would be central for understanding the welfare consequences of tax changes. Further research contrasting the larger distortions of increased tax progressivity with the accompanying reductions in inequality seems very promising.

To improve the model’s ability to account for the data, we conjecture that introducing a richer asset framework could play a large role. We do not include aggregate asset-price movements in our present setting (we do not consider either stock or housing as explicit, additional assets) and more generally we do not allow changes in the heterogeneity of returns to assets. The latter is likely an important aspect for the very top of the distribution, where in addition labor and asset incomes may be inextricably linked in practice. In particular, many modern entrepreneurs hold a large part of their private assets in their own companies, giving them an unbalanced portfolio that pays off greatly in successful states. With better access to tax records and register data more generally, there are hopes to measure how the idiosyncratic returns to assets have evolved over time and we suspect that the variance here has risen considerably, especially at the top. This is another great avenue for future research.34

Regardless of one’s normative views on wealth inequality, there are many reasons to care about its future course, as there are now many research contributions suggesting that the macroeconomy works quite differently when there is significant heterogeneity among consumers. This goes for fiscal as well as monetary policy; for examples, see Heathcote (2005), McKay & Reis (2016), and Brinca et al. (2016) for fiscal policy and Auclert (2015), McKay et al. (2016), and Kaplan et al. (2016) for monetary policy. The prediction from the present paper is that, barring reverses in the tax code, wealth inequality will go up even further, thus potentially strengthening the case for further research on the heterogeneous-agent approach to macroeconomics.

34 As we noted in Section 1, Fagereng et al. (2015) and Bach et al. (2015) have already embarked down this avenue using data from Sweden and Norway.


A Computational strategy

A.1 Dynamic programming problem

The consumers’ dynamic programming problem is solved by value-function iteration using Carroll (2006)’s endogenous grid-point method (EGM) on a grid for cash-on-hand and the persistent idiosyncratic shocks (β, p). Unlike in the plain Aiyagari (1994) model, the support of the ergodic wealth distribution is unbounded in this framework. We use a log-spaced grid with 100 points for cash-on-hand $(x_i)_{i=1}^{100}$ with a very large upper bound (one million times average wealth) to minimize the truncation error.35 Cubic splines are used to interpolate the value function along the wealth dimension. The grid for the persistent component of individual productivity $(p_j)_{j=1}^{17}$ is chosen to account for the long right tail in earnings. First, we choose the grid points as the 0.0001, 0.01, 0.1, 0.25, 0.5, 0.75, 0.9, 0.925, 0.95, 0.975, 0.99, 0.999, ..., and 0.99999999 quantiles of the unconditional (i.e., cross-sectional) p-distribution (which is a normal). Second, we compute the corresponding grid in actual efficiency units of labor $(\psi(p_1), ..., \psi(p_{17}))$. Third, given that in the current period $p = p_j$ for j = 1, ..., 17, we use Gauss-Hermite quadrature to integrate over p′|p, the value of idiosyncratic productivity in the next period, when updating the value function. In doing so, we use linear interpolation in ψ(p)-space to evaluate the value function off the grid (the value function is much more non-linear in p-space than in ψ(p)-space).36 Regarding the discount factor, we choose the grid points $(\beta_m)_{m=1}^{15}$ as the Gauss-Hermite quadrature points of the unconditional (i.e., cross-sectional) β-distribution (this will turn out to be useful when integrating over the joint distribution to compute aggregate wealth). Again, when updating the value function, we integrate over β′|β using Gauss-Hermite quadrature and linear interpolation in β-space.

In addition to these three state variables, the setup requires numerical integration over the two idiosyncratic i.i.d. shocks to earnings ν′ and capital returns η′ (as they affect next period’s cash-on-hand x′). As both shocks are normally distributed, we use Gauss-Hermite quadrature once again.
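Two numerical building blocks recur throughout this appendix: a log-spaced grid and Gauss-Hermite quadrature for expectations over normally distributed shocks. The following is a minimal sketch of both using NumPy; the function names, the strictly positive lower bound of the grid, and the example parameters are our own illustrative assumptions rather than the exact implementation.

```python
import numpy as np

def log_spaced_grid(lower, upper, n=100):
    """Grid with constant ratio between neighbours (requires lower > 0)."""
    return np.exp(np.linspace(np.log(lower), np.log(upper), n))

def gauss_hermite_expectation(f, mu, sigma, n_nodes=10):
    """Approximate E[f(z)] for z ~ N(mu, sigma**2), sigma a standard deviation."""
    x, w = np.polynomial.hermite.hermgauss(n_nodes)   # nodes, weights for exp(-x**2)
    z = mu + np.sqrt(2.0) * sigma * x                 # change of variables
    return (w @ f(z)) / np.sqrt(np.pi)

# Hypothetical bounds: a small positive number up to one million.
grid = log_spaced_grid(1e-3, 1e6, n=100)

# Check against known moments of a normal and of a lognormal variable.
second_moment = gauss_hermite_expectation(lambda z: z**2, mu=0.5, sigma=0.2)
lognormal_mean = gauss_hermite_expectation(np.exp, mu=0.5, sigma=0.2)
print(second_moment, lognormal_mean)
```

The division by the square root of π normalizes the quadrature weights to sum to one, which is the normalization referred to later for the transitory shocks.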

A.2 Computing the ergodic distribution

The focus on tiny population groups such as the top 0.01% of the wealth distribution implies that solving for the ergodic distribution directly is more efficient than simulating a large number of agents and applying the ergodic theorem. In doing so, simulation error is eliminated; instead one can directly control the numerical error by updating the distribution until convergence is reached. Specifically, note that the EGM entails using a grid for assets $(a_i)_{i=1}^{100}$. Given $p_j$ and $\beta_m$, saving $a_i$ is optimal with cash-on-hand $x(a_i; p_j, \beta_m)$ that solves

$$u'(x(a_i; p_j, \beta_m) - a_i) = \beta_m E\left[ \left(1 + r\eta'\left(1 - \tau'(y')\right)\right) V_1(a_i + y' - \tau(y') + T, p', \beta') \,\middle|\, p_j, \beta_m \right].$$
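The inversion step behind the Euler equation above can be sketched in a few lines. The sketch assumes CRRA utility, u′(c) = c^(−γ), which is our illustrative assumption, and it takes the expected right-hand side at each savings level as given (the numbers below are hypothetical placeholders, not model output).

```python
import numpy as np

def egm_step(a_grid, expected_rhs, gamma):
    """Return endogenous cash-on-hand and consumption for each savings level.

    expected_rhs[i] stands for the right-hand side of the Euler equation
    evaluated at savings a_grid[i]; gamma is the assumed CRRA coefficient.
    """
    a = np.asarray(a_grid, dtype=float)
    rhs = np.asarray(expected_rhs, dtype=float)
    c = rhs ** (-1.0 / gamma)      # invert u'(c) = c**(-gamma)
    x = a + c                      # cash-on-hand at which saving a is optimal
    return x, c

a_grid = np.array([0.0, 0.5, 1.0, 2.0])
expected_rhs = np.array([1.2, 0.9, 0.7, 0.5])   # hypothetical, decreasing in a
x_endog, c_endog = egm_step(a_grid, expected_rhs, gamma=2.0)

# Consumption at an arbitrary cash-on-hand level follows by interpolation
# on the endogenous grid (linear here for brevity).
c_at_2 = np.interp(2.0, x_endog, c_endog)
print(x_endog, c_at_2)
```

No root finding or maximization is needed: the exogenous object is the savings grid, and the cash-on-hand grid is the output.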

35 Alternatively, given that the Pareto tail has stabilized at some $\bar{x}$, one could in principle also impute the distribution for $x > \bar{x}$. However, this did not turn out to be necessary as the log-spaced grid (which works well as the curvature of the value function is high only close to the borrowing constraint) allows for selecting a very large upper bound while keeping the number of grid points computationally feasible.
36 Note that these linear interpolation coefficients can be pre-computed, resulting in a 17 × 17 matrix $w^p$, where $w^p_{j,\cdot}$ are the integration weights for evaluating next period’s value function on $(p_1, ..., p_{17})$ given that in the current period $p = p_j$.


While the main advantage of the EGM is efficiency ($x(a_i; p_j, \beta_m)$ can be found without maximizing the right-hand side of the Bellman equation), it is also convenient that the savings function is already inverted. First, for all $p_j, \beta_m, \nu_q, \eta_h$ and for all $a_i$, i = 1, ..., 100, there exists a unique level of asset holdings $a = s^{-1}(a_i; p_j, \beta_m, \nu_q, \eta_h)$ such that saving $a_i$ is optimal.37 Second, we define a finer grid for asset holdings $(k_i)_{i=1}^{1000}$ and interpolate (using a cubic spline) to find the inverse savings function $s^{-1}(k_i; p_j, \beta_m, \nu_q, \eta_h)$. Note that the borrowing constraint is binding for all $k \le s^{-1}(k_1; p_j, \beta_m, \nu_q, \eta_h)$. Finally, we can solve for the ergodic distribution $G(k_i; p_j, \beta_m) \equiv Prob(k \le k_i \mid p = p_j, \beta = \beta_m)$ at the grid points $(k_i)_{i=1}^{1000}$, $(p_j)_{j=1}^{17}$ and $(\beta_m)_{m=1}^{15}$. To simplify notation, we will denote by $G_{j,m}(k_i)$ this conditional cdf evaluated at grid points $(p_j, \beta_m)$. This distribution has to satisfy

$$G_{j,m}(k_i) = \int_p \int_\beta \int_\nu \int_\eta G(s^{-1}(k_i; p, \beta, \nu, \eta); p, \beta)\, d\Gamma_\eta(\eta)\, d\Gamma_\nu(\nu)\, d\Gamma_\beta(\beta|\beta_m)\, d\Gamma_p(p|p_j). \tag{9}$$

Note that $p_j$ and $\beta_m$ are the realizations of the shock in period t + 1 and the integration is over the shock values in period t. Nevertheless, e.g., $\Gamma_\beta(\beta|\beta_m)$ is the correct distribution as for any stationary Gaussian AR(1) process $z_t$ the conditional random variables $z_t|z_{t+1}$ and $z_{t+1}|z_t$ have the same distribution.38 Starting from some initial distribution $G^0_{j,m}(k_i)$ and using the short-hand notation $s^{-1}_{j,m,q,h}(k_i) = s^{-1}(k_i; p_j, \beta_m, \nu_q, \eta_h)$, we update until convergence according to

$$G^1_{j',m'}(k_i) = \sum_j w^p_{j',j} \sum_m w^\beta_{m',m} \sum_q w^\nu_q \sum_h w^\eta_h\, \hat{G}^0_{j,m}(s^{-1}_{j,m,q,h}(k_i)). \tag{10}$$

In (10), $w^\nu_q$ and $w^\eta_h$ are the Gauss-Hermite quadrature weights for the transitory shocks ν and η (normalized to sum to one). The construction of the integration weights for the persistent shocks p and β is based on linear interpolation in ψ(p)- and β-space, respectively (see details below). $\hat{G}^0_{j,m}(\cdot)$ linearly interpolates $G^0_{j,m}(k_i)$ off the grid in the k-dimension.
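The time-reversal property of a stationary Gaussian AR(1) process invoked above, namely that $z_t|z_{t+1}$ and $z_{t+1}|z_t$ have the same distribution, can be checked directly from the joint normal distribution of two consecutive observations. The sketch below uses our own notation (sigma_eps is the innovation standard deviation) and hypothetical parameter values.

```python
import numpy as np

def stationary_ar1_joint(rho, mu, sigma_eps):
    """Mean vector and covariance of (z_t, z_{t+1}) for a stationary AR(1)."""
    var = sigma_eps**2 / (1.0 - rho**2)          # stationary variance
    mean = np.array([mu, mu])
    cov = np.array([[var, rho * var], [rho * var, var]])
    return mean, cov

def conditional_normal(mean, cov, given_index, given_value):
    """Mean and variance of the other component, given one component."""
    other = 1 - given_index
    slope = cov[other, given_index] / cov[given_index, given_index]
    cond_mean = mean[other] + slope * (given_value - mean[given_index])
    cond_var = cov[other, other] - slope * cov[given_index, other]
    return cond_mean, cond_var

mean, cov = stationary_ar1_joint(rho=0.9, mu=1.0, sigma_eps=0.2)
forward = conditional_normal(mean, cov, given_index=0, given_value=1.5)   # z_{t+1} | z_t
backward = conditional_normal(mean, cov, given_index=1, given_value=1.5)  # z_t | z_{t+1}
print(forward, backward)   # identical conditional mean and variance
```

Both conditionals have mean μ + ρ(v − μ) and variance equal to the innovation variance, which is why the same conditional distribution can be used for the backward-looking integration in (9).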

Integration weights $w^p_{j',j}$ and $w^\beta_{m',m}$. Consider the persistent earnings shock p. Conditional on its value in the next period being $p' = p_{j'}$ for some fixed $j' \in \{1, ..., 17\}$, the integration over the current period value p is with respect to the distribution of p, conditional on p′, where $p|p' \sim N(\rho^P p' + (1 - \rho^P)\mu^P, \sigma^P)$. Gauss-Hermite quadrature, here with ten sample points, entails evaluating the function of interest $G(s^{-1}(k_i; p, \beta, \nu, \eta); p, \beta)$ at $(\tilde{p}_n)_{n=1}^{10}$, where $\tilde{p}_n = \rho^P p' + (1 - \rho^P)\mu^P + \sqrt{2}\sigma^P \tilde{x}_n$ and $(\tilde{x}_n)_{n=1}^{10}$ are the roots of the Hermite polynomial, and approximating the integral using the associated weights $(\tilde{w}_n)_{n=1}^{10}$ as

$$\approx \frac{1}{\sqrt{\pi}} \sum_{n=1}^{10} \tilde{w}_n\, G(s^{-1}(k_i; \tilde{p}_n, \beta, \nu, \eta); \tilde{p}_n, \beta).$$

Of course, $\tilde{p}_n$ will in general not lie on the $p_j$-grid, where the function value is known. Hence, we have to interpolate. Using linear interpolation, we can pre-compute the integration weights $(w^p_{j',j})_{j=1}^{17}$ we put on evaluating the function of interest at $(G(s^{-1}(k_i; p_j, \beta, \nu, \eta); p_j, \beta))_{j=1}^{17}$ in an efficient manner: for n = 1, ..., 10, locate j(n) such that $p_{j(n)} \le \tilde{p}_n \le p_{j(n)+1}$ and compute the linear interpolation coefficient in ψ(p)-space $\lambda_n$ as

$$\lambda_n = \frac{\psi(\tilde{p}_n) - \psi(p_{j(n)})}{\psi(p_{j(n)+1}) - \psi(p_{j(n)})}.$$

Then, looping over n = 1, ..., 10, add $(1 - \lambda_n)\frac{1}{\sqrt{\pi}}\tilde{w}_n$ to $w^p_{j',j(n)}$ and $\lambda_n \frac{1}{\sqrt{\pi}}\tilde{w}_n$ to $w^p_{j',j(n)+1}$. The construction of the integration weights for β is analogous, except that linear interpolation can be performed directly in β-space.

Computing moments of the distribution. For example, aggregate wealth is given by

$$K = \int_p \int_\beta \left( \int_k k\, dG(k|p, \beta) \right) f_p(p) f_\beta(\beta)\, dp\, d\beta,$$

where $f_p(\cdot)$ and $f_\beta(\cdot)$ are the unconditional (i.e., cross-sectional) normal densities of the persistent shocks p and β. We integrate numerically according to

$$\hat{K} = \sum_{j=1}^{17} \bar{w}^p_j \sum_{m=1}^{15} \bar{w}^\beta_m \left( k_1 G_{j,m}(k_1) + \sum_{i=2}^{1000} \frac{k_{i-1} + k_i}{2} \left( G_{j,m}(k_i) - G_{j,m}(k_{i-1}) \right) \right). \tag{11}$$

As the discount factor grid $(\beta_m)_{m=1}^{15}$ was chosen as the Gauss-Hermite sample points, we set $(\bar{w}^\beta_m)_{m=1}^{15}$ to be the associated Gauss-Hermite quadrature weights. Recall that the Pareto tail transformation of the persistent earnings component p prompted us to define a grid $(p_j)_{j=1}^{17}$ with a particular emphasis on the right tail. Hence, we (pre-)compute the integration weights $(\bar{w}^p_j)_{j=1}^{17}$ manually: (i) define a very fine equally spaced grid $(\hat{p}_n)_{n=1}^{N}$ (if, say, N = 100,000, this has to be carried out only once) that covers the coarser grid $(p_j)_{j=1}^{17}$; (ii) for all n = 1, ..., N, locate j(n) and compute $\lambda_n$ as above; (iii) looping over n = 1, ..., N, add $(1 - \lambda_n) f_p(\hat{p}_n)$ to $\bar{w}^p_{j(n)}$ and $\lambda_n f_p(\hat{p}_n)$ to $\bar{w}^p_{j(n)+1}$ ($f_p(\cdot)$ is the pdf of $p \sim N(\mu^P, \frac{\sigma^P}{1 - \rho^P})$); and (iv) finally, normalize such that $\sum_{j=1}^{17} \bar{w}^p_j = 1$.39

37 $s^{-1}(a_i; p_j, \beta_m, \nu_q, \eta_h)$ is defined as the unique a that solves $x(a_i; p_j, \beta_m) = a + y - \tau(y) + T$, where $y = r\eta_h a + w\,l(p_j, \nu_q)$.
38 That is, the densities satisfy $f_{z_t|z_{t+1}}(x|y) = f_{z_{t+1}|z_t}(x|y)$.
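The construction of the integration weights for the persistent shock described in this section (Gauss-Hermite nodes spread onto neighbouring grid points by linear interpolation) can be sketched as follows. The grid, the parameter values, the default identity transformation for ψ, and the clamping of nodes that fall outside the grid are our own illustrative assumptions; sigma is treated as a standard deviation here.

```python
import numpy as np

def persistent_shock_weights(p_grid, rho, mu, sigma, psi=lambda p: p, n_nodes=10):
    """Matrix W with W[jp, j] = weight on grid point j given p' = p_grid[jp]."""
    p_grid = np.asarray(p_grid, dtype=float)
    J = len(p_grid)
    x, w = np.polynomial.hermite.hermgauss(n_nodes)
    w = w / np.sqrt(np.pi)                       # weights now sum to one
    psi_grid = psi(p_grid)
    W = np.zeros((J, J))
    for jp in range(J):
        nodes = rho * p_grid[jp] + (1.0 - rho) * mu + np.sqrt(2.0) * sigma * x
        # Assumption: nodes outside the grid are clamped to the nearest end point.
        nodes = np.clip(nodes, p_grid[0], p_grid[-1])
        # Bracketing index j(n) with p_grid[j] <= node <= p_grid[j + 1].
        j = np.clip(np.searchsorted(p_grid, nodes, side="right") - 1, 0, J - 2)
        lam = (psi(nodes) - psi_grid[j]) / (psi_grid[j + 1] - psi_grid[j])
        np.add.at(W[jp], j, (1.0 - lam) * w)     # accumulate on the left neighbour
        np.add.at(W[jp], j + 1, lam * w)         # and on the right neighbour
    return W

p_grid = np.linspace(-6.0, 6.0, 17)              # hypothetical evenly spaced grid
W = persistent_shock_weights(p_grid, rho=0.9, mu=0.0, sigma=0.2)
print(W.sum(axis=1))                             # every row sums to one
```

Because the weights depend only on the grid and the process parameters, the matrix can be computed once and reused in every iteration of the distribution update.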

A.3 Transition experiments

The perfect-foresight transition experiment is computationally straightforward. Given the calibrated initial steady state $(K^\star, T^\star)$, the new steady state $(K^{\star\star}, T^{\star\star})$ is computed under the new exogenous environment. We then search for a fixed point in $(K_t, T_t)_{t=t_0+1}^{t_1}$-space where $t_1 - t_0$ is chosen to be large enough that $(K_{t_1}, T_{t_1}) \approx (K^{\star\star}, T^{\star\star})$. For each iteration, we first solve for the value functions and corresponding (inverse) savings decisions backwards and subsequently roll the distribution forward, as described in the previous sections for the steady state. Note that now the grids and integration weights for the earnings process components are time-varying.40

39 Of course one could also use Gauss-Hermite quadrature here, as the corresponding weights and results coincide for all practical purposes.
40 In particular, as the variance of the innovation term of the persistent earnings component $\sigma^P_t$ is time-varying, $p_{t|t+1}$ is no longer equal to $p_{t+1|t}$ in distribution (but still normal); hence the integration weights for the decision problem (forward-looking) and the cross-sectional distribution (backward-looking) differ.


The myopic transition experiment is conceptually very different. Given a period t distribution $G^t_{j,m}(k_i)$ and savings decisions $s^t_{j,m,q,h}(k)$ (reflecting factor prices $r_t$, $w_t$, transfers $T_t$ and exogenous environment $\theta_t$, all naively assumed to persist forever), $G^{t+1}_{j,m}(k_i)$ is obtained as in (10).41 In turn, $G^{t+1}_{j,m}(k_i)$ and $\theta_{t+1}$ determine $K_{t+1}$ (thus $r_{t+1}$, $w_{t+1}$) and $T_{t+1}$. The surprised agents expect this new endogenous and exogenous environment to prevail forever and hence we solve the dynamic programming problem given this environment and accordingly obtain $s^{t+1}_{j,m,q,h}(k)$. Note that no fixed point problem has to be solved and the capital stock converges to the same new steady state as under perfect foresight. Theoretically, this strategy could give rise to oscillatory paths of capital. However, this turns out not to be the case in our application.

41 Again, the grids and integration weights for the earnings process components are time-varying.
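The control flow of the myopic experiment described above can be sketched as a simple forward loop. The three callables and the toy economy used to exercise the loop (a scalar capital stock with a Solow-style accumulation rule) are hypothetical stand-ins introduced only for illustration; they are not the model's dynamic programming problem or distribution update.

```python
def myopic_transition(dist0, environments, solve_policy, roll_forward, aggregate):
    """Forward loop: each period agents treat the current environment as permanent."""
    dist = dist0
    path = []
    for theta in environments:
        prices = aggregate(dist, theta)          # current aggregates and prices
        policy = solve_policy(prices, theta)     # stationary problem, no foresight
        path.append(prices)
        dist = roll_forward(dist, policy, theta) # next period's distribution
    return path, dist

# Toy stand-ins: the "distribution" is a scalar capital stock.
ALPHA, DELTA = 0.36, 0.1

def aggregate(K, theta):
    return K                                     # aggregates summarized by K

def solve_policy(K_now, theta):
    # Hypothetical saving rule; theta plays the role of a saving rate.
    return lambda k: theta * k**ALPHA + (1.0 - DELTA) * k

def roll_forward(K, policy, theta):
    return policy(K)

path, K_final = myopic_transition(1.0, [0.2] * 500, solve_policy, roll_forward, aggregate)
print(K_final)                                   # approaches (0.2 / 0.1) ** (1 / 0.64)
```

In contrast to the perfect-foresight case, the loop makes a single forward pass: no guess for the whole path of aggregates has to be iterated on.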


References

Acemoglu, D. (2002). Technical Change, Inequality, and the Labor Market. Journal of Economic Literature, 40(1), 7–72.
Aiyagari, S. R. (1994). Uninsured Idiosyncratic Risk and Aggregate Saving. The Quarterly Journal of Economics, 109(3), 659–684.
Aoki, S., & Nirei, M. (forthcoming). Zipf’s Law, Pareto’s Law, and the Evolution of Top Incomes in the United States. American Economic Journal: Macroeconomics.
Auclert, A. (2015). Monetary Policy and the Redistribution Channel. Working paper.
Bach, L., Calvet, L. E., & Sodini, P. (2015). Rich Pickings? Risk, Return, and Skill in the Portfolios of the Wealthy. Working paper.
Becker, R. A. (1980). On the Long-Run Steady State in a Simple Dynamic Model of Equilibrium with Heterogeneous Households. The Quarterly Journal of Economics, 95(2), 375–382.
Benhabib, J., Bisin, A., & Luo, M. (2015a). Wealth Distribution and Social Mobility in the US: A Quantitative Approach. NBER Working Paper 21721, National Bureau of Economic Research, Inc.
Benhabib, J., Bisin, A., & Zhu, S. (2011). The Distribution of Wealth and Fiscal Policy in Economies With Finitely Lived Agents. Econometrica, 79(1), 123–157.
Benhabib, J., Bisin, A., & Zhu, S. (2015b). The Wealth Distribution in Bewley Economies with Capital Income Risk. Journal of Economic Theory, 159, Part A, 489–515. URL http://www.sciencedirect.com/science/article/pii/S0022053115001362
Bewley, T. (undated). Interest Bearing Money and the Equilibrium Stock of Capital. Manuscript.
Bricker, J., Henriques, A., Krimmel, J., & Sabelhaus, J. (2016). Measuring Income and Wealth at the Top Using Administrative and Survey Data. Brookings Papers on Economic Activity, Spring 2016, 261–331.
Brinca, P., Holter, H., Krusell, P., & Malafry, L. (2016). Fiscal Multipliers in the 21st Century. Journal of Monetary Economics, 77, 53–69.
Cagetti, M., & De Nardi, M. (2006). Entrepreneurship, Frictions, and Wealth. Journal of Political Economy, 114(5), 835–870.
Carroll, C. D. (2006). The Method of Endogenous Gridpoints for Solving Dynamic Stochastic Optimization Problems. Economics Letters, 91(3), 312–320.
Carroll, C. D. (2012). Theoretical Foundations of Buffer Stock Saving. Working paper.
Carroll, C. D., & Kimball, M. S. (1996). On the Concavity of the Consumption Function. Econometrica, 64(4), 981–92.
Castañeda, A., Díaz-Giménez, J., & Ríos-Rull, J.-V. (2003). Accounting for the U.S. Earnings and Wealth Inequality. Journal of Political Economy, 111(4), 818–857.
CBO (2015). The Budget and Economic Outlook: 2015 to 2025. Tech. rep., Congressional Budget Office.
Fagereng, A., Guiso, L., Malacrino, D., & Pistaferri, L. (2015). Heterogeneity and Persistence in Returns to Wealth. Working paper.
Gabaix, X. (2009). Power Laws in Economics and Finance. Annual Review of Economics, 1(1), 255–294.
Gabaix, X., Lasry, J.-M., Lions, P.-L., & Moll, B. (2016). The Dynamics of Inequality. Econometrica, 84(6), 2071–2111. URL http://dx.doi.org/10.3982/ECTA13569
Guerrieri, V., & Lorenzoni, G. (2011). Credit Crises, Precautionary Savings, and the Liquidity Trap. NBER Working Papers 17583, National Bureau of Economic Research, Inc.
Heathcote, J. (2005). Fiscal policy with heterogeneous agents and incomplete markets. Review of Economic Studies, 72(1), 161–188.
Heathcote, J., Storesletten, K., & Violante, G. L. (2010). The Macroeconomic Implications of Rising Wage Inequality in the United States. Journal of Political Economy, 118(4), 681–722.
Hornstein, A., Krusell, P., & Violante, G. (2005). The Effects of Technical Change on Labor Market Inequalities, vol. 1 of Handbook of Economic Growth. Elsevier.
Huggett, M. (1993). The Risk-Free Rate in Heterogeneous-Agent Incomplete-Insurance Economies. Journal of Economic Dynamics and Control, 17(5-6), 953–969.
Kaplan, G., Moll, B., & Violante, G. L. (2016). Monetary Policy According to HANK. Working paper.
Karabarbounis, L., & Neiman, B. (2014a). Capital Depreciation and Labor Shares Around the World: Measurement and Implications. Working paper.
Karabarbounis, L., & Neiman, B. (2014b). The Global Decline of the Labor Share. The Quarterly Journal of Economics, 129(1), 61–103.
Katz, L. F., & Murphy, K. M. (1992). Changes in Relative Wages, 1963-1987: Supply and Demand Factors. The Quarterly Journal of Economics, 107(1), 35–78.
Kaymak, B., & Poschke, M. (2016). The Evolution of Wealth Inequality over Half a Century: The Role of Taxes, Transfers and Technology. Journal of Monetary Economics, 77(C), 1–25.
Kennickell, A. B. (2011). Tossed and Turned: Wealth Dynamics of U.S. Households 2007-2009. Finance and Economics Discussion Series 2011-51, Board of Governors of the Federal Reserve System.
Kesten, H. (1973). Random Difference Equations and Renewal Theory for Products of Random Matrices. Acta Mathematica, 131(1), 207–248.
Kopczuk, W. (2015). What Do We Know about the Evolution of Top Wealth Shares in the United States? Journal of Economic Perspectives, 29(1), 47–66.
Kopczuk, W., & Saez, E. (2004). Top Wealth Shares in the United States, 1916-2000: Evidence from Estate Tax Returns. National Tax Journal, 2, Part 2, 445–487.
Krusell, P., Mukoyama, T., Şahin, A., & Smith, Jr., A. (2009). Revisiting the Welfare Effects of Eliminating Business Cycles. Review of Economic Dynamics, 12, 393–404.
Krusell, P., & Smith, Jr., A. (1998). Income and Wealth Heterogeneity in the Macroeconomy. Journal of Political Economy, 106(5), 867–896.
Krusell, P., & Smith, Jr., A. (2006). Quantitative Macroeconomic Models with Heterogeneous Agents. In R. Blundell, W. Newey, & T. Persson (Eds.) Advances in Economics and Econometrics: Theory and Applications, Ninth World Congress, Econometric Society Monographs, 41, (pp. 298–340). Cambridge University Press.
Krusell, P., & Smith, Jr., A. (2015). Is Piketty’s “Second Law of Capitalism” Fundamental? Journal of Political Economy, 123(4), 725–748.
McKay, A. (2013). Search for Financial Returns and Social Security Privatization. Review of Economic Dynamics, 16(2), 253–270.
McKay, A., Nakamura, E., & Steinsson, J. (2016). The Power of Forward Guidance Revisited. American Economic Review, 106(10), 3133–3158.
McKay, A., & Reis, R. (2016). The Role of Automatic Stabilizers in the U.S. Business Cycle. Econometrica, 84(1), 141–194.
Moskowitz, T. J., & Vissing-Jørgensen, A. (2002). The Returns to Entrepreneurial Investment: A Private Equity Premium Puzzle? American Economic Review, 92(4), 745–778.
Nirei, M., & Aoki, S. (2016). Pareto Distribution of Income in Neoclassical Growth Models. Review of Economic Dynamics, 20(1), 25–42.
Piketty, T. (1995). Social Mobility and Redistributive Politics. The Quarterly Journal of Economics, 110(3), 551–84.
Piketty, T. (1997). The Dynamics of the Wealth Distribution and the Interest Rate with Credit Rationing. Review of Economic Studies, 64, 173–189.
Piketty, T. (2014). Capital in the Twenty-First Century. Translated by Arthur Goldhammer. Cambridge, MA: Belknap.
Piketty, T., & Saez, E. (2003). Income Inequality in the United States, 1913-1998. The Quarterly Journal of Economics, 118(1), 1–41.
Piketty, T., & Saez, E. (2007). How Progressive is the U.S. Federal Tax System? A Historical and International Perspective. Journal of Economic Perspectives, 21(1), 3–24.
Piketty, T., & Zucman, G. (2014). Capital is Back: Wealth-Income Ratios in Rich Countries 1700-2010. The Quarterly Journal of Economics, 129(3), 1255–1310.
Piketty, T., & Zucman, G. (2015). Wealth and Inheritance in the Long Run (Chapter 15), vol. 2 of Handbook of Income Distribution. Elsevier.
Quadrini, V. (2000). Entrepreneurship, Saving, and Social Mobility. Review of Economic Dynamics, 3(1), 1–40.
Quadrini, V., & Rios-Rull, J.-V. (2015). Inequality in Macroeconomics (Chapter 14), vol. 2 of Handbook of Income Distribution. Elsevier.
Saez, E., & Zucman, G. (2016). Wealth Inequality in the United States since 1913: Evidence from Capitalized Income Tax Data. Quarterly Journal of Economics, 2, 519–578.
Stiglitz, J. E. (1969). Distribution of Income and Wealth Among Individuals. Econometrica, 37(3), 382–397.